Lecture Notes in Computer Science Edited by G. Goos, J. Hartmanis, and J. van Leeuwen
2489
Berlin Heidelberg New York Barcelona Hong Kong London Milan Paris Tokyo
Takeyoshi Dohi Ron Kikinis (Eds.)
Medical Image Computing and Computer-Assisted Intervention – MICCAI 2002 5th International Conference Tokyo, Japan, September 25-28, 2002 Proceedings, Part II
Series Editors
Gerhard Goos, Karlsruhe University, Germany
Juris Hartmanis, Cornell University, NY, USA
Jan van Leeuwen, Utrecht University, The Netherlands

Volume Editors

Takeyoshi Dohi
Department of Mechano-informatics, Graduate School of Information Science and Technology,
University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, 113-8656 Tokyo, Japan
E-mail: [email protected]

Ron Kikinis
Department of Radiology, Brigham and Women's Hospital,
75 Francis St., Boston, MA 02115, USA
E-mail: [email protected]

Cataloging-in-Publication Data applied for

Die Deutsche Bibliothek – CIP-Einheitsaufnahme
Medical image computing and computer assisted intervention: 5th international conference; proceedings / MICCAI 2002, Tokyo, Japan, September 25-28, 2002. Takeyoshi Dohi; Ron Kikinis (ed.). – Berlin; Heidelberg; New York; Hong Kong; London; Milan; Paris; Tokyo: Springer
Pt. 2 (2002)
(Lecture notes in computer science; Vol. 2489)
ISBN 3-540-44225-1

CR Subject Classification (1998): I.5, I.4, I.3.5-8, I.2.9-10, J.3
ISSN 0302-9743
ISBN 3-540-44225-1 Springer-Verlag Berlin Heidelberg New York

This work is subject to copyright. All rights are reserved, whether the whole or part of the material is concerned, specifically the rights of translation, reprinting, re-use of illustrations, recitation, broadcasting, reproduction on microfilms or in any other way, and storage in data banks. Duplication of this publication or parts thereof is permitted only under the provisions of the German Copyright Law of September 9, 1965, in its current version, and permission for use must always be obtained from Springer-Verlag. Violations are liable for prosecution under the German Copyright Law.

Springer-Verlag Berlin Heidelberg New York, a member of BertelsmannSpringer Science+Business Media GmbH
http://www.springer.de

© Springer-Verlag Berlin Heidelberg 2002
Printed in Germany

Typesetting: Camera-ready by author, data conversion by Olgun Computergrafik
Printed on acid-free paper SPIN: 10870643 06/3142 5 4 3 2 1 0
Preface
The 5th International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI 2002) was held in Tokyo from September 25th to 28th, 2002. This was the first time that the conference was held in Asia since its foundation in 1998. The objective of the conference is to offer clinicians and scientists the opportunity to collaboratively create and explore this new medical field. Specifically, MICCAI offers a forum for discussion of the state of the art in computer-assisted intervention, medical robotics, and image processing among experts from multi-disciplinary professions, including but not limited to clinical doctors, computer scientists, and mechanical and biomedical engineers. The expectations of society are very high; the advancement of medicine will depend on computer and device technology in the coming decades, as it has in the last decades.

We received 321 manuscripts, of which 41 were chosen for oral presentation and 143 for poster presentation. Each paper is included in these proceedings in an eight-page full-paper format, without any differentiation between oral and poster papers. Adherence to this full-paper format, along with the number of manuscripts surpassing all our expectations, has led us to issue two proceedings volumes for the first time in MICCAI's history. Keeping to a single volume by assigning fewer pages to each paper was certainly an option, considering our budget constraints. However, we decided to increase the volume to offer authors maximum opportunity to present the state of the art in their work and to initiate constructive discussions among the MICCAI audience.

It was our great pleasure to welcome all MICCAI 2002 attendees to Tokyo. Japan, in fall, is known for its beautiful foliage all over the country, and traditional Japanese architecture always catches the eye of visitors to Japan. We hope that all the MICCAI attendees took the opportunity to enjoy Japan and that they had a scientifically fruitful time at the conference. Those who could not attend the conference should find the proceedings a valuable source of information for their academic activities. We look forward to seeing you at another successful MICCAI in Toronto in 2003.
July 2002
Takeyoshi Dohi and Ron Kikinis
Organizing Committee
Honorary Chair
Kintomo Takakura, Tokyo Women's Medical University, Japan

General Chair
Takeyoshi Dohi, The University of Tokyo, Japan
Terry Peters, University of Western Ontario, Canada
Junichiro Toriwaki, Nagoya University, Japan

Program Chair
Ron Kikinis, Harvard Medical School and Brigham and Women's Hospital, USA

Program Co-chairs
Randy Ellis, Queen's University at Kingston, Canada
Koji Ikuta, Nagoya University, Japan
Gabor Szekely, Swiss Federal Institute of Technology, ETH Zentrum, Switzerland

Tutorial Chair
Yoshinobu Sato, Osaka University, Japan

Industrial Liaison
Masakatsu Fujie, Waseda University, Japan
Makoto Hashizume, Kyushu University, Japan
Hiroshi Iseki, Tokyo Women's Medical University, Japan
Program Review Committee
Alan Colchester, University of Kent at Canterbury, UK
Wei-Qi Wang, Dept. of E., Fudan University, China
Yongmei Wang, The Chinese University of Hong Kong, China
Jocelyne Troccaz, TIMC Laboratory, France
Erwin Keeve, Research Center Caesar, Germany
Frank Tendick, University of California, San Francisco, USA
Sun I. Kim, Hanyang University, Korea
Pierre Hellier, INRIA Rennes, France
Pheng Ann Heng, The Chinese University of Hong Kong, China
Gabor Szekely, Swiss Federal Institute of Technology Zurich, Switzerland
Kirby Vosburgh, CIMIT/MGH/Harvard Medical School, USA
Allison M. Okamura, Johns Hopkins University, USA
James S. Duncan, Yale University, USA
Baba Vemuri, University of Florida, USA
Terry M. Peters, The John P. Robarts Research Institute, Canada
Allen Tannenbaum, Georgia Institute of Technology, USA
Richard A. Robb, Mayo Clinic, USA
Brian Davies, Imperial College London, UK
David Hawkes, King's College London, UK
Carl-Fredrik Westin, Harvard Medical School, USA
Chris Taylor, University of Manchester, UK
Derek Hill, King's College London, UK
Ramin Shahidi, Stanford University, USA
Demetri Terzopoulos, New York University, USA
Shuqian Luo, Capital University of Medical Sciences, China
Paul Thompson, UCLA School of Medicine, USA
Simon Warfield, Harvard Medical School, USA
Gregory D. Hager, Johns Hopkins University, USA
Kiyoyuki Chinzei, AIST, Japan
Shinichi Tamura, Osaka University, Japan
Jun Toriwaki, Nagoya University, Japan
Yukio Kosugi, Tokyo Institute of Technology, Japan
Jing Bai, Tsinghua University, China
Philippe Cinquin, UJF (University Joseph Fourier), France
Xavier Pennec, INRIA Sophia-Antipolis, France
Frithjof Kruggel, Max-Planck-Institute for Cognitive Neuroscience, Germany
Ewert Bengtsson, Uppsala University, Sweden
Ève Coste-Manière, INRIA Sophia Antipolis, France
Milan Sonka, University of Iowa, USA
Branislav Jaramaz, West Penn Hospital, USA
Dimitris Metaxas, Rutgers University, USA
Tianzi Jiang, Chinese Academy of Sciences, China
Tian-ge Zhuang, Shanghai Jiao Tong University, China
Masakatsu G. Fujie, Waseda University, Japan
Takehide Asano, Chiba University, Japan
Ichiro Sakuma, The University of Tokyo, Japan
Alison Noble, University of Oxford, UK
Heinz U. Lemke, Technical University Berlin, Germany
Robert Howe, Harvard University, USA
Michael I. Miga, Vanderbilt University, USA
Hervé Delingette, INRIA Sophia Antipolis, France
D. Louis Collins, Montreal Neurological Institute, McGill University, Canada
Kunio Doi, University of Chicago, USA
Scott Delp, Stanford University, USA
Louis L. Whitcomb, Johns Hopkins University, USA
Michael W. Vannier, University of Iowa, USA
Jin-Ho Cho, Kyungpook National University, Korea
Yukio Yamada, University of Electro-Communications, Japan
Yuji Ohta, Ochanomizu University, Japan
Karol Miller, The University of Western Australia, Australia
William (Sandy) Wells, Harvard Medical School, Brigham and Women's Hosp., USA
Kevin Montgomery, National Biocomputation Center/Stanford University, USA
Kiyoshi Naemura, Tokyo Women's Medical University, Japan
Yoshihiko Nakamura, The University of Tokyo, Japan
Toshio Nakagohri, National Cancer Center Hospital East, Japan
Yasushi Yamauchi, AIST, Japan
Masaki Kitajima, Keio University, Japan
Hiroshi Iseki, Tokyo Women's Medical University, Japan
Yoshinobu Sato, Osaka University, Japan
Amami Kato, Osaka University School of Medicine, Japan
Eiju Watanabe, Tokyo Metropolitan Police Hospital, Japan
Miguel Angel Gonzalez Ballester, INRIA Sophia Antipolis, France
Yoshihiro Muragaki, Tokyo Women's Medical University, Japan
Makoto Hashizume, Kyushu University, Japan
Paul Suetens, K.U. Leuven, Medical Image Computing, Belgium
Michael D. Sherar, Ontario Cancer Institute/University of Toronto, Canada
Kyojiro Nambu, Medical Systems Company, Toshiba Corporation, Japan
Naoki Suzuki, Institute for High Dimensional Medical Imaging, Jikei University School of Medicine, Japan
Nobuhiko Sugano, Osaka University, Japan
Etsuko Kobayashi, The University of Tokyo, Japan
Grégoire Malandain, INRIA Sophia Antipolis, France
Russell H. Taylor, Johns Hopkins University, USA
Maryellen Giger, University of Chicago, USA
Hideaki Koizumi, Advanced Research Laboratory, Hitachi, Ltd., Japan
Örjan Smedby, Linköping University, Sweden
Karl Heinz Höhne, University of Hamburg, Germany
Sherif Makram-Ebeid, Philips Research France
Stéphane Lavallée, PRAXIM, France
Josien Pluim, University Medical Center Utrecht, The Netherlands
Darwin G. Caldwell, University of Salford, England
Régis Vaillant, GEMS, Switzerland
Nassir Navab, Siemens Corporate Research, USA
Eric Grimson, MIT AI Lab, USA
Wiro Niessen, University Medical Center Utrecht, The Netherlands
Richard Satava, Yale University School of Medicine, USA
Takeyoshi Dohi, The University of Tokyo, Japan
Guido Gerig, UNC Chapel Hill, Department of Computer Science, USA
Ferenc Jolesz, Brigham and Women's Hospital, Harvard Medical School, USA
Leo Joskowicz, The Hebrew University of Jerusalem, Israel
Antonio Bicchi, University of Pisa, Italy
Wolfgang Schlegel, DKFZ, Germany
Richard Bucholz, Saint Louis University School of Medicine, USA
Robert Galloway, Vanderbilt University, USA
Juan Ruiz-Alzola, University of Las Palmas de Gran Canaria, Spain
Tim Salcudean, University of British Columbia, Canada
Stephen Pizer, University of North Carolina, USA
J. Michael Fitzpatrick, Vanderbilt University, USA
Gabor Fichtinger, Johns Hopkins University, USA
Koji Ikuta, Nagoya University, Japan
Jean Louis Coatrieux, University of Rennes-INSERM, France
Jaydev P. Desai, Drexel University, USA
Chris Johnson, Scientific Computing and Imaging Institute, USA
Luc Soler, IRCAD, France
Wieslaw L. Nowinski, Biomedical Imaging Lab, Singapore
Andreas Pommert, University Hospital Hamburg-Eppendorf, Germany
Heinz-Otto Peitgen, MeVis, Germany
Rudolf Fahlbusch, Neurochirurgische Klinik, Germany
Simon Wildermuth, University Hospital Zurich, Inst. Diagnostic Radiology, Switzerland
Chuck Meyer, University of Michigan, USA
Johan Van Cleynenbreugel, Medical Image Computing, ESAT-Radiologie, K.U. Leuven, Belgium
Dirk Vandermeulen, K.U. Leuven, Belgium
Karl Rohr, International University in Germany, Germany
Martin Styner, Duke Image Analysis Lab / UNC Neuro Image Analysis Lab, USA
Catherina R. Burghart, University of Karlsruhe, Germany
Fernando Bello, Imperial College of Science, Technology and Medicine, UK
Colin Studholme, University of California, San Francisco, USA
Dinesh Pai, University of British Columbia, Canada
Paul Milgram, University of Toronto, Canada
Michael Bronskill, University of Toronto/Sunnybrook Hospital, Canada
Nobuhiko Hata, The University of Tokyo, Japan
Ron Kikinis, Brigham and Women's Hospital and Harvard Medical School, USA
Lutz Nolte, University of Bern, Switzerland
Ralph Mösges, IMSIE, Univ. of Cologne, Germany
Bart M. ter Haar Romeny, Eindhoven University of Technology, The Netherlands
Steven Haker, Brigham and Women's Hospital and Harvard Medical School, USA
Local Organizing Committee
Ichiro Sakuma, The University of Tokyo, Japan
Mitsuo Shimada, Kyushu University, Japan
Nobuhiko Hata, The University of Tokyo, Japan
Etsuko Kobayashi, The University of Tokyo, Japan
MICCAI Board
Alan C.F. Colchester (General Chair), University of Kent at Canterbury, UK
Nicholas Ayache, INRIA Sophia Antipolis, France
Anthony M. DiGioia, UPMC Shadyside Hospital, Pittsburgh, USA
Takeyoshi Dohi, University of Tokyo, Japan
James Duncan, Yale University, New Haven, USA
Karl Heinz Höhne, University of Hamburg, Germany
Ron Kikinis, Harvard Medical School, Boston, USA
Stephen M. Pizer, University of North Carolina, Chapel Hill, USA
Richard A. Robb, Mayo Clinic, Rochester, USA
Russell H. Taylor, Johns Hopkins University, Baltimore, USA
Jocelyne Troccaz, University of Grenoble, France
Max A. Viergever, University Medical Center Utrecht, The Netherlands
Table of Contents, Part I
Robotics – Endoscopic Device Using an Endoscopic Solo Surgery Simulator for Quantitative Evaluation of Human-Machine Interface in Robotic Camera Positioning Systems . . . 1 A. Nishikawa, D. Negoro, H. Kakutani, F. Miyazaki, M. Sekimoto, M. Yasui, S. Takiguchi, M. Monden Automatic 3-D Positioning of Surgical Instruments during Robotized Laparoscopic Surgery Using Automatic Visual Feedback . . . 9 A. Krupa, M. de Mathelin, C. Doignon, J. Gangloff, G. Morel, L. Soler, J. Leroy, J. Marescaux
Development of a Compact Cable-Driven Laparoscopic Endoscope Manipulator . . . 17 P.J. Berkelman, P. Cinquin, J. Troccaz, J.-M. Ayoubi, C. Létoublon Flexible Calibration of Actuated Stereoscopic Endoscope for Overlay in Robot Assisted Surgery . . . 25 F. Mourgues, È. Coste-Manière Metrics for Laparoscopic Skills Trainers: The Weakest Link! . . . 35 S. Cotin, N. Stylopoulos, M. Ottensmeyer, P. Neumann, D. Rattner, S. Dawson Surgical Skill Evaluation by Force Data for Endoscopic Sinus Surgery Training System . . . 44 Y. Yamauchi, J. Yamashita, O. Morikawa, R. Hashimoto, M. Mochimaru, Y. Fukui, H. Uno, K. Yokoyama Development of a Master Slave Combined Manipulator for Laparoscopic Surgery – Functional Model and Its Evaluation . . . 52 M. Jinno, N. Matsuhira, T. Sunaoshi, T. Hato, T. Miyagawa, Y. Morikawa, T. Furukawa, S. Ozawa, M. Kitajima, K. Nakazawa Development of Three-Dimensional Endoscopic Ultrasound System with Optical Tracking . . . 60 N. Koizumi, K. Sumiyama, N. Suzuki, A. Hattori, H. Tajiri, A. Uchiyama Real-Time Haptic Feedback in Laparoscopic Tools for Use in Gastro-Intestinal Surgery . . . 66 T. Hu, A.E. Castellanos, G. Tholey, J.P. Desai Small Occupancy Robotic Mechanisms for Endoscopic Surgery . . . 75 Y. Kobayashi, S. Chiyoda, K. Watabe, M. Okada, Y. Nakamura
Robotics in Image-Guided Surgery Development of MR Compatible Surgical Manipulator toward a Unified Support System for Diagnosis and Treatment of Heart Disease . . . 83 F. Tajima, K. Kishi, K. Nishizawa, K. Kan, Y. Nemoto, H. Takeda, S. Umemura, H. Takeuchi, M.G. Fujie, T. Dohi, K. Sudo, S. Takamoto Transrectal Prostate Biopsy Inside Closed MRI Scanner with Remote Actuation, under Real-Time Image Guidance . . . 91 G. Fichtinger, A. Krieger, R.C. Susil, A. Tanacs, L.L. Whitcomb, E. Atalar A New, Compact MR-Compatible Surgical Manipulator for Minimally Invasive Liver Surgery . . . 99 D. Kim, E. Kobayashi, T. Dohi, I. Sakuma Micro-grasping Forceps Manipulator for MR-Guided Neurosurgery . . . 107 N. Miyata, E. Kobayashi, D. Kim, K. Masamune, I. Sakuma, N. Yahagi, T. Tsuji, H. Inada, T. Dohi, H. Iseki, K. Takakura Endoscope Manipulator for Trans-nasal Neurosurgery, Optimized for and Compatible to Vertical Field Open MRI . . . 114 Y. Koseki, T. Washio, K. Chinzei, H. Iseki A Motion Adaptable Needle Placement Instrument Based on Tumor Specific Ultrasonic Image Segmentation . . . 122 J.-S. Hong, T. Dohi, M. Hashizume, K. Konishi, N. Hata
Robotics – Tele-operation Experiment of Wireless Tele-echography System by Controlling Echographic Diagnosis Robot . . . 130 K. Masuda, N. Tateishi, Y. Suzuki, E. Kimura, Y. Wie, K. Ishihara Experiments with the TER Tele-echography Robot . . . 138 A. Vilchis, J. Troccaz, P. Cinquin, A. Guerraz, F. Pellisier, P. Thorel, B. Tondu, F. Courrèges, G. Poisson, M. Althuser, J.-M. Ayoubi The Effect of Visual and Haptic Feedback on Manual and Teleoperated Needle Insertion . . . 147 O. Gerovichev, P. Marayong, A.M. Okamura Analysis of Suture Manipulation Forces for Teleoperation with Force Feedback . . . 155 M. Kitagawa, A.M. Okamura, B.T. Bethea, V.L. Gott, W.A. Baumgartner
Remote Microsurgery System for Deep and Narrow Space – Development of New Surgical Procedure and Micro-robotic Tool . . . . . . . . 163 K. Ikuta, K. Sasaki, K. Yamamoto, T. Shimada Hyper-finger for Remote Minimally Invasive Surgery in Deep Area . . . . . . . 173 K. Ikuta, S. Daifu, T. Hasegawa, H. Higashikawa
Robotics – Device Safety-Active Catheter with Multiple-Segments Driven by Micro-hydraulic Actuators . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 182 K. Ikuta, H. Ichikawa, K. Suzuki A Stem Cell Harvesting Manipulator with Flexible Drilling Unit for Bone Marrow Transplantation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 192 K. Ohashi, N. Hata, T. Matsumura, N. Yahagi, I. Sakuma, T. Dohi Liver Tumor Biopsy in a Respiring Phantom with the Assistance of a Novel Electromagnetic Navigation Device . . . . . . . . . . . . . . . . . . . . . . . . . 200 F. Banovac, N. Glossop, D. Lindisch, D. Tanaka, E. Levy, K. Cleary Non-invasive Measurement of Biomechanical Properties of in vivo Soft Tissues . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 208 Lianghao Han, Michael Burcher, J. Alison Noble Measurement of the Tip and Friction Force Acting on a Needle during Penetration . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 216 H. Kataoka, T. Washio, K. Chinzei, K. Mizuhara, C. Simone, A.M. Okamura Contact Force Evaluation of Orthoses for the Treatment of Malformed Ears . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 224 A. Hanafusa, T. Isomura, Y. Sekiguchi, H. Takahashi, T. Dohi Computer-Assisted Correction of Bone Deformities Using A 6-DOF Parallel Spatial Mechanism . . . . . . . . . . . . . . . . . . . . . . . . . . . 232 O. Iyun, D.P. Borschneck, R.E. Ellis
Robotics – System Development of 4-Dimensional Human Model System for the Patient after Total Hip Arthroplasty . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 241 Y. Otake, K. Hagio, N. Suzuki, A. Hattori, N. Sugano, K. Yonenobu, T. Ochi Development of a Training System for Cardiac Muscle Palpation . . . . . . . . 248 T. Tokuyasu, S. Oota, K. Asami, T. Kitamura, G. Sakaguchi, T. Koyama, M. Komeda
Preliminary Results of an Early Clinical Experience with the Acrobot™ System for Total Knee Replacement Surgery . . . 256 M. Jakopec, S.J. Harris, F. Rodriguez y Baena, P. Gomes, J. Cobb, B.L. Davies A Prostate Brachytherapy Training Rehearsal System – Simulation of Deformable Needle Insertion . . . 264 A. Kimura, J. Camp, R. Robb, B. Davis A Versatile System for Computer Integrated Mini-invasive Robotic Surgery . . . 272 L. Adhami, È. Coste-Manière Measurements of Soft-Tissue Mechanical Properties to Support Development of a Physically Based Virtual Animal Model . . . 282 C. Bruyns, M. Ottensmeyer
Validation Validation of Tissue Modelization and Classification Techniques in T1-Weighted MR Brain Images . . . 290 M. Bach Cuadra, B. Platel, E. Solanas, T. Butz, J.-Ph. Thiran Validation of Image Segmentation and Expert Quality with an Expectation-Maximization Algorithm . . . 298 S.K. Warfield, K.H. Zou, W.M. Wells Validation of Volume-Preserving Non-rigid Registration: Application to Contrast-Enhanced MR-Mammography . . . 307 C. Tanner, J.A. Schnabel, A. Degenhard, A.D. Castellano-Smith, C. Hayes, M.O. Leach, D.R. Hose, D.L.G. Hill, D.J. Hawkes Statistical Validation of Automated Probabilistic Segmentation against Composite Latent Expert Ground Truth in MR Imaging of Brain Tumors . . . 315 K.H. Zou, W.M. Wells III, M.R. Kaus, R. Kikinis, F.A. Jolesz, S.K. Warfield A Posteriori Validation of Pre-operative Planning in Functional Neurosurgery by Quantification of Brain Pneumocephalus . . . 323 É. Bardinet, P. Cathier, A. Roche, N. Ayache, D. Dormont Affine Transformations and Atlases: Assessing a New Navigation Tool for Knee Arthroplasty . . . 331 B. Ma, J.F. Rudan, R.E. Ellis Effectiveness of the ROBODOC System during Total Hip Arthroplasty in Preventing Intraoperative Pulmonary Embolism . . . 339 K. Hagio, N. Sugano, M. Takashina, T. Nishii, H. Yoshikawa, T. Ochi
Medical Image Synthesis via Monte Carlo Simulation . . . . . . . . . . . . . . . . . . 347 J.Z. Chen, S.M. Pizer, E.L. Chaney, S. Joshi Performance Issues in Shape Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . 355 S.J. Timoner, P. Golland, R. Kikinis, M.E. Shenton, W.E.L. Grimson, W.M. Wells III
Brain-Tumor, Cortex, Vascular Structure Statistical Analysis of Longitudinal MRI Data: Applications for Detection of Disease Activity in MS . . . 363 S. Prima, N. Ayache, A. Janke, S.J. Francis, D.L. Arnold, D.L. Collins Automatic Brain and Tumor Segmentation . . . 372 N. Moon, E. Bullitt, K. van Leemput, G. Gerig Atlas-Based Segmentation of Pathological Brains Using a Model of Tumor Growth . . . 380 M. Bach Cuadra, J. Gomez, P. Hagmann, C. Pollo, J.-G. Villemure, B.M. Dawant, J.-Ph. Thiran Recognizing Deviations from Normalcy for Brain Tumor Segmentation . . . 388 D.T. Gering, W.E.L. Grimson, R. Kikinis 3D-Visualization and Registration for Neurovascular Compression Syndrome Analysis . . . 396 P. Hastreiter, R. Naraghi, B. Tomandl, M. Bauer, R. Fahlbusch 3D Guide Wire Reconstruction from Biplane Image Sequences for 3D Navigation in Endovascular Interventions . . . 404 S.A.M. Baert, E.B. van der Kraats, W.J. Niessen Standardized Analysis of Intracranial Aneurysms Using Digital Video Sequences . . . 411 S. Iserhardt-Bauer, P. Hastreiter, B. Tomandl, N. Köstner, M. Schempershofe, U. Nissen, T. Ertl Demarcation of Aneurysms Using the Seed and Cull Algorithm . . . 419 R.A. McLaughlin, J.A. Noble Gyral Parcellation of the Cortical Surface Using Geodesic Voronoï Diagrams . . . 427 A. Cachia, J.-F. Mangin, D. Rivière, D. Papadopoulos-Orfanos, I. Bloch, J. Régis Regularized Stochastic White Matter Tractography Using Diffusion Tensor MRI . . . 435 M. Björnemo, A. Brun, R. Kikinis, C.-F. Westin
Sulcal Segmentation for Cortical Thickness Measurements . . . . . . . . . . . . . . 443 C. Hutton, E. De Vita, R. Turner Labeling the Brain Surface Using a Deformable Multiresolution Mesh . . . . . 451 S. Jaume, B. Macq, S.K. Warfield
Brain – Imaging and Analysis New Approaches to Estimation of White Matter Connectivity in Diffusion Tensor MRI: Elliptic PDEs and Geodesics in a Tensor-Warped Space . . . . . 459 L. O’Donnell, S. Haker, C.-F. Westin Improved Detection Sensitivity in Functional MRI Data Using a Brain Parcelling Technique . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 467 G. Flandin, F. Kherif, X. Pennec, G. Malandain, N. Ayache, J.-B. Poline A Spin Glass Based Framework to Untangle Fiber Crossing in MR Diffusion Based Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 475 Y. Cointepas, C. Poupon, D. Le Bihan, J.-F. Mangin Automated Approximation of Lateral Ventricular Shape in Magnetic Resonance Images of Multiple Sclerosis Patients . . . . . . . . . . . . 483 B. Sturm, D. Meier, E. Fisher An Intensity Consistent Approach to the Cross Sectional Analysis of Deformation Tensor Derived Maps of Brain Shape . . . . . . . . . . . . . . . . . . . 492 C. Studholme, V. Cardenas, A. Maudsley, M. Weiner Detection of Inter-hemispheric Asymmetries of Brain Perfusion in SPECT . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 500 B. Aubert-Broche, C. Grova, P. Jannin, I. Buvat, H. Benali, B. Gibaud Discriminative Analysis for Image-Based Studies . . . . . . . . . . . . . . . . . . . . . . . 508 P. Golland, B. Fischl, M. Spiridon, N. Kanwisher, R.L. Buckner, M.E. Shenton, R. Kikinis, A. Dale, W.E.L. Grimson Automatic Generation of Training Data for Brain Tissue Classification from MRI . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 516 C.A. Cocosco, A.P. Zijdenbos, A.C. Evans The Putamen Intensity Gradient in CJD Diagnosis . . . . . . . . . . . . . . . . . . . . 524 A. Hojjat, D. Collie, A.C.F. Colchester A Dynamic Brain Atlas . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 532 D.L.G. Hill, J.V. Hajnal, D. Rueckert, S.M. Smith, T. Hartkens, K. McLeish
Model Library for Deformable Model-Based Segmentation of 3-D Brain MR-Images . . . 540 J. Koikkalainen, J. Lötjönen Co-registration of Histological, Optical and MR Data of the Human Brain . . . 548 É. Bardinet, S. Ourselin, D. Dormont, G. Malandain, D. Tandé, K. Parain, N. Ayache, J. Yelnik
Segmentation An Automated Segmentation Method of Kidney Using Statistical Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 556 B. Tsagaan, A. Shimizu, H. Kobatake, K. Miyakawa Incorporating Non-rigid Registration into Expectation Maximization Algorithm to Segment MR Images . . . . . . . . . . . . . . . . . . . . . . . 564 K.M. Pohl, W.M. Wells, A. Guimond, K. Kasai, M.E. Shenton, R. Kikinis, W.E.L. Grimson, S.K. Warfield Segmentation of 3D Medical Structures Using Robust Ray Propagation . . . 572 H. Tek, M. Bergtholdt, D. Comaniciu, J. Williams MAP MRF Joint Segmentation and Registration . . . . . . . . . . . . . . . . . . . . . . . 580 P.P. Wyatt, J.A. Noble Statistical Neighbor Distance Influence in Active Contours . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 588 J. Yang, L.H. Staib, J.S. Duncan Active Watersheds: Combining 3D Watershed Segmentation and Active Contours to Extract Abdominal Organs from MR Images . . . . . 596 R.J. Lapeer, A.C. Tan, R. Aldridge
Cardiac Application Coronary Intervention Planning Using Hybrid 3D Reconstruction . . . . . . . . 604 O. Wink, R. Kemkers, S.J. Chen, J.D. Carroll Deformation Modelling Based on PLSR for Cardiac Magnetic Resonance Perfusion Imaging . . . . . . . . . . . . . . . . . . . . 612 J. Gao, N. Ablitt, A. Elkington, G.-Z. Yang Automated Segmentation of the Left and Right Ventricles in 4D Cardiac SPAMM Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 620 A. Montillo, D. Metaxas, L. Axel Stochastic Finite Element Framework for Cardiac Kinematics Function and Material Property Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 634 P. Shi, H. Liu
Atlas-Based Segmentation and Tracking of 3D Cardiac MR Images Using Non-rigid Registration . . . 642 M. Lorenzo-Valdés, G.I. Sanchez-Ortiz, R. Mohiaddin, D. Rueckert Myocardial Delineation via Registration in a Polar Coordinate System . . . 651 N.M.I. Noble, D.L.G. Hill, M. Breeuwer, J.A. Schnabel, D.J. Hawkes, F.A. Gerritsen, R. Razavi Integrated Image Registration for Cardiac MR Perfusion Data . . . 659 R. Bansal, G. Funka-Lea 4D Active Surfaces for Cardiac Analysis . . . 667 A. Yezzi, A. Tannenbaum A Computer Diagnosing System of Dementia Using Smooth Pursuit Oculogyration . . . 674 I. Fukumoto Combinative Multi-scale Level Set Framework for Echocardiographic Image Segmentation . . . 682 N. Lin, W. Yu, J.S. Duncan Automatic Hybrid Segmentation of Dual Contrast Cardiac MR Data . . . 690 A. Pednekar, I.A. Kakadiaris, V. Zavaletta, R. Muthupillai, S. Flamm Efficient Partial Volume Tissue Classification in MRI Scans . . . 698 A. Noe, J.C. Gee In-vivo Strain and Stress Estimation of the Left Ventricle from MRI Images . . . 706 Z. Hu, D. Metaxas, L. Axel Biomechanical Model Construction from Different Modalities: Application to Cardiac Images . . . 714 M. Sermesant, C. Forest, X. Pennec, H. Delingette, N. Ayache Comparison of Cardiac Motion Across Subjects Using Non-rigid Registration . . . 722 A. Rao, G.I. Sanchez-Ortiz, R. Chandrashekara, M. Lorenzo-Valdés, R. Mohiaddin, D. Rueckert
Computer Assisted Diagnosis From Colour to Tissue Histology: Physics Based Interpretation of Images of Pigmented Skin Lesions . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 730 E. Claridge, S. Cotton, P. Hall, M. Moncrieff In-vivo Molecular Investigations of Live Tissues Using Diffracting Sources . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 739 V. Ntziachristos, J. Ripoll, E. Graves, R. Weissleder
Automatic Detection of Nodules Attached to Vessels in Lung CT by Volume Projection Analysis . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 746 G.-Q. Wei, L. Fan, J.Z. Qian LV-RV Shape Modeling Based on a Blended Parameterized Model . . . . . . . 753 K. Park, D.N. Metaxas, L. Axel Characterization of Regional Pulmonary Mechanics from Serial MRI Data . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 762 J. Gee, T. Sundaram, I. Hasegawa, H. Uematsu, H. Hatabu Using Voxel-Based Morphometry to Examine Atrophy-Behavior Correlates in Alzheimer’s Disease and Frontotemporal Dementia . . . . . . . . . . . . . . . . . . . 770 M.P. Lin, C. Devita, J.C. Gee, M. Grossman Detecting Wedge Shaped Defects in Polarimetric Images of the Retinal Nerve Fiber Layer . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 777 K. Vermeer, F. Vos, H. Lemij, A. Vossepoel Automatic Statistical Identification of Neuroanatomical Abnormalities between Different Populations . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 785 A. Guimond, S. Egorova, R.J. Killiany, M.S. Albert, C.R.G. Guttmann Example-Based Assisting Approach for Pulmonary Nodule Classification in 3-D Thoracic CT Images . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 793 Y. Kawata, N. Niki, H. Ohmatsu, N. Moriyama
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 801
Table of Contents, Part II
Tubular Structures Automated Nomenclature Labeling of the Bronchial Tree in 3D-CT Lung Images . . . 1 H. Kitaoka, Y. Park, J. Tschirren, J. Reinhardt, M. Sonka, G. McLennan, E.A. Hoffman
Segmentation, Skeletonization, and Branchpoint Matching – A Fully Automated Quantitative Evaluation of Human Intrathoracic Airway Trees . . . 12 J. Tschirren, K. Palágyi, J.M. Reinhardt, E.A. Hoffman, M. Sonka Improving Virtual Endoscopy for the Intestinal Tract . . . 20 M. Harders, S. Wildermuth, D. Weishaupt, G. Székely Finding a Non-continuous Tube by Fuzzy Inference for Segmenting the MR Cholangiography Image . . . 28 C. Yasuba, S. Kobashi, K. Kondo, Y. Hata, S. Imawaki, M. Ishikawa Level-Set Based Carotid Artery Segmentation for Stenosis Grading . . . 36 C.M. van Bemmel, L.J. Spreeuwers, M.A. Viergever, W.J. Niessen
Interventions – Augmented Reality PC-Based Control Unit for a Head Mounted Operating Microscope for Augmented Reality Visualization in Surgical Navigation . . . . . . . . . . . . . 44 M. Figl, W. Birkfellner, F. Watzinger, F. Wanschitz, J. Hummel, R. Hanel, R. Ewers, H. Bergmann Technical Developments for MR-Guided Microwave Thermocoagulation Therapy of Liver Tumors . . . . . . . . . . . . . . . . . . . . . . . . . 52 S. Morikawa, T. Inubushi, Y. Kurumi, S. Naka, K. Sato, T. Tani, N. Hata, V. Seshan, H.A. Haque Robust Automatic C-Arm Calibration for Fluoroscopy-Based Navigation: A Practical Approach . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 60 H. Livyatan, Z. Yaniv, L. Joskowicz Application of a Population Based Electrophysiological Database to the Planning and Guidance of Deep Brain Stereotactic Neurosurgery . . . . . . . . 69 K.W. Finnis, Y.P. Starreveld, A.G. Parrent, A.F. Sadikot, T.M. Peters
An Image Overlay System with Enhanced Reality for Percutaneous Therapy Performed Inside CT Scanner . . . . . . . . . . . . . . . 77 K. Masamune, G. Fichtinger, A. Deguet, D. Matsuka, R. Taylor High-Resolution Stereoscopic Surgical Display Using Parallel Integral Videography and Multi-projector . . . . . . . . . . . . . . . 85 H. Liao, N. Hata, M. Iwahara, S. Nakajima, I. Sakuma, T. Dohi Three-Dimensional Display for Multi-sourced Activities and Their Relations in the Human Brain by Information Flow between Estimated Dipoles . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 93 N. Take, Y. Kosugi, T. Musha
Interventions – Navigation 2D Guide Wire Tracking during Endovascular Interventions . . . 101 S.A.M. Baert, W.J. Niessen Specification Method of Surface Measurement for Surgical Navigation: Ridgeline Based Organ Registration . . . 109 N. Furushiro, T. Saito, Y. Masutani, I. Sakuma An Augmented Reality Navigation System with a Single-Camera Tracker: System Design and Needle Biopsy Phantom Trial . . . 116 F. Sauer, A. Khamene, S. Vogt A Novel Laser Guidance System for Alignment of Linear Surgical Tools: Its Principles and Performance Evaluation as a Man–Machine System . . . 125 T. Sasama, N. Sugano, Y. Sato, Y. Momoi, T. Koyama, Y. Nakajima, I. Sakuma, M. Fujie, K. Yonenobu, T. Ochi, S. Tamura Navigation of High Intensity Focused Ultrasound Applicator with an Integrated Three-Dimensional Ultrasound Imaging System . . . 133 I. Sakuma, Y. Takai, E. Kobayashi, H. Inada, K. Fujimoto, T. Asano Robust Registration of Multi-modal Images: Towards Real-Time Clinical Applications . . . 140 S. Ourselin, R. Stefanescu, X. Pennec 3D Ultrasound System Using a Magneto-optic Hybrid Tracker for Augmented Reality Visualization in Laparoscopic Liver Surgery . . . 148 M. Nakamoto, Y. Sato, M. Miyamoto, Y. Nakajima, K. Konishi, M. Shimada, M. Hashizume, S. Tamura Interactive Intra-operative 3D Ultrasound Reconstruction and Visualization . . . 156 D.G. Gobbi, T.M. Peters
Projection Profile Matching for Intraoperative MRI Registration Embedded in MR Imaging Sequence . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 164 N. Hata, J. Tokuda, S. Morikawa, T. Dohi
Simulation A New Tool for Surgical Training in Knee Arthroscopy . . . 170 G. Megali, O. Tonet, M. Mazzoni, P. Dario, A. Vascellari, M. Marcacci Combining Volumetric Soft Tissue Cuts for Interventional Surgery Simulation . . . 178 M. Nakao, T. Kuroda, H. Oyama, M. Komori, T. Matsuda, T. Takahashi Virtual Endoscopy Using Cubic QuickTime-VR Panorama Views . . . 186 U. Tiede, N. von Sternberg-Gospos, P. Steiner, K.H. Höhne High Level Simulation & Modeling for Medical Applications – Ultrasound Case . . . 193 A. Chihoub Generation of Pathologies for Surgical Training Simulators . . . 202 R. Sierra, G. Székely, M. Bajka Collision Detection Algorithm for Deformable Objects Using OpenGL . . . 211 S. Aharon, C. Lenglet Online Multiresolution Volumetric Mass Spring Model for Real Time Soft Tissue Deformation . . . 219 C. Paloc, F. Bello, R.I. Kitney, A. Darzi Orthosis Design System for Malformed Ears Based on Spline Approximation . . . 227 A. Hanafusa, T. Isomura, Y. Sekiguchi, H. Takahashi, T. Dohi Cutting Simulation of Manifold Volumetric Meshes . . . 235 C. Forest, H. Delingette, N. Ayache Simulation of Guide Wire Propagation for Minimally Invasive Vascular Interventions . . . 245 T. Alderliesten, M.K. Konings, W.J. Niessen Needle Insertion Modelling for the Interactive Simulation of Percutaneous Procedures . . . 253 S.P. DiMaio, S.E. Salcudean 3D Analysis of the Alignment of the Lower Extremity in High Tibial Osteotomy . . . 261 H. Kawakami, N. Sugano, T. Nagaoka, K. Hagio, K. Yonenobu, H. Yoshikawa, T. Ochi, A. Hattori, N. Suzuki
Simulation of Intra-operative 3D Coronary Angiography for Enhanced Minimally Invasive Robotic Cardiac Intervention . . . . . . . . . . 268 G. Lehmann, D. Habets, D.W. Holdsworth, T. Peters, M. Drangova Computer Investigation into the Anatomical Location of the Axes of Rotation in the Normal Knee . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 276 S. Martelli, A. Visani
Modeling Macroscopic Modeling of Vascular Systems . . . 284 D. Szczerba, G. Székely Spatio-temporal Directional Filtering for Improved Inversion of MR Elastography Images . . . 293 A. Manduca, D.S. Lake, R.L. Ehman RBF-Based Representation of Volumetric Data: Application in Visualization and Segmentation . . . 300 Y. Masutani An Anatomical Model of the Knee Joint Obtained by Computer Dissection . . . 308 S. Martelli, F. Acquaroli, V. Pinskerova, A. Spettol, A. Visani Models for Planning and Simulation in Computer Assisted Orthognatic Surgery . . . 315 M. Chabanas, C. Marecaux, Y. Payan, F. Boutault Simulation of the Exophthalmia Reduction Using a Finite Element Model of the Orbital Soft Tissues . . . 323 V. Luboz, A. Pedrono, P. Swider, F. Boutault, Y. Payan A Real-Time Deformable Model for Flexible Instruments Inserted into Tubular Structures . . . 331 M. Kukuk, B. Geiger Modeling of the Human Orbit from MR Images . . . 339 Z. Li, C.-K. Chui, Y. Cai, S. Amrith, P.-S. Goh, J.H. Anderson, J. Teo, C. Liu, I. Kusuma, Y.-S. Siow, W.L. Nowinski Accurate and High Quality Triangle Models from 3D Grey Scale Images . . . 348 P.W. de Bruin, P.M. van Meeteren, F.M. Vos, A.M. Vossepoel, F.H. Post Intraoperative Fast 3D Shape Recovery of Abdominal Organs in Laparoscopy . . . 356 M. Hayashibe, N. Suzuki, A. Hattori, Y. Nakamura
Statistical Shape Modeling Integrated Approach for Matching Statistical Shape Models with Intra-operative 2D and 3D Data . . . 364 M. Fleute, S. Lavallée, L. Desbat Building and Testing a Statistical Shape Model of the Human Ear Canal . . . 373 R. Paulsen, R. Larsen, C. Nielsen, S. Laugesen, B. Ersbøll Shape Characterization of the Corpus Callosum in Schizophrenia Using Template Deformation . . . 381 A. Dubb, B. Avants, R. Gur, J. Gee 3D Prostate Surface Detection from Ultrasound Images Based on Level Set Method . . . 389 S. Fan, L.K. Voon, N.W. Sing A Bayesian Approach to in vivo Kidney Ultrasound Contour Detection Using Markov Random Fields . . . 397 M. Martín, C. Alberola Level Set Based Integration of Segmentation and Computational Fluid Dynamics for Flow Correction in Phase Contrast Angiography . . . 405 M. Watanabe, R. Kikinis, C.-F. Westin Comparative Exudate Classification Using Support Vector Machines and Neural Networks . . . 413 A. Osareh, M. Mirmehdi, B. Thomas, R. Markham A Statistical Shape Model for the Liver . . . 421 H. Lamecker, T. Lange, M. Seebass Statistical 2D and 3D Shape Analysis Using Non-Euclidean Metrics . . . 428 R. Larsen, K.B. Hilger, M.C. Wrobel Kernel Fisher for Shape Based Classification in Epilepsy . . . 436 N. Vohra, B.C. Vemuri, A. Rangarajan, R.L. Gilmore, S.N. Roper, C.M. Leonard A Noise Robust Statistical Texture Model . . . 444 K.B. Hilger, M.B. Stegmann, R. Larsen A Combined Statistical and Biomechanical Model for Estimation of Intra-operative Prostate Deformation . . . 452 A. Mohamed, C. Davatzikos, R. Taylor
Registration – 2D/3D Fusion “Gold Standard” 2D/3D Registration of X-Ray to CT and MR Images . . . 461 D. Tomaževič, B. Likar, F. Pernuš
A Novel Image Similarity Measure for Registration of 3-D MR Images and X-Ray Projection Images . . . 469 T. Rohlfing, C.R. Maurer Jr. Registration of Preoperative CTA and Intraoperative Fluoroscopic Images for Assisting Aortic Stent Grafting . . . 477 H. Imamura, N. Ida, N. Sugimoto, S. Eiho, S. Urayama, K. Ueno, K. Inoue Preoperative Analysis of Optimal Imaging Orientation in Fluoroscopy for Voxel-Based 2-D/3-D Registration . . . 485 Y. Nakajima, Y. Tamura, Y. Sato, T. Tashiro, N. Sugano, K. Yonenobu, H. Yoshikawa, T. Ochi, S. Tamura
Registration – Similarity Measures A New Similarity Measure for Nonrigid Volume Registration Using Known Joint Distribution of Target Tissue: Application to Dynamic CT Data of the Liver . . . . . . . . . . . . . . . . . . . . . . . . 493 J. Masumoto, Y. Sato, M. Hori, T. Murakami, T. Johkoh, H. Nakamura, S. Tamura 2D-3D Intensity Based Registration of DSA and MRA – A Comparison of Similarity Measures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 501 J.H. Hipwell, G.P. Penney, T.C. Cox, J.V. Byrne, D.J. Hawkes Model Based Spatial and Temporal Similarity Measures between Series of Functional Magnetic Resonance Images . . . . . . . . . . . . . . . 509 F. Kherif, G. Flandin, P. Ciuciu, H. Benali, O. Simon, J.-B. Poline A Comparison of 2D-3D Intensity-Based Registration and Feature-Based Registration for Neurointerventions . . . . . . . . . . . . . . . . . . 517 R.A. McLaughlin, J. Hipwell, D.J. Hawkes, J.A. Noble, J.V. Byrne, T. Cox Multi-modal Image Registration by Minimising Kullback-Leibler Distance . 525 A.C.S. Chung, W.M. Wells III, A. Norbash, W.E.L. Grimson Cortical Surface Registration Using Texture Mapped Point Clouds and Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 533 T.K. Sinha, D.M. Cash, R.J. Weil, R.L. Galloway, M.I. Miga
Non-rigid Registration A Viscous Fluid Model for Multimodal Non-rigid Image Registration Using Mutual Information . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 541 E. D’Agostino, F. Maes, D. Vandermeulen, P. Suetens
Non-rigid Registration with Use of Hardware-Based 3D Bézier Functions . . . 549 G. Soza, M. Bauer, P. Hastreiter, C. Nimsky, G. Greiner Brownian Warps: A Least Committed Prior for Non-rigid Registration . . . 557 M. Nielsen, P. Johansen, A.D. Jackson, B. Lautrup Using Points and Surfaces to Improve Voxel-Based Non-rigid Registration . . . 565 T. Hartkens, D.L.G. Hill, A.D. Castellano-Smith, D.J. Hawkes, C.R. Maurer Jr., A.J. Martin, W.A. Hall, H. Liu, C.L. Truwit Intra-patient Prone to Supine Colon Registration for Synchronized Virtual Colonoscopy . . . 573 D. Nain, S. Haker, W.E.L. Grimson, E. Cosman Jr., W.W. Wells, H. Ji, R. Kikinis, C.-F. Westin Nonrigid Registration Using Regularized Matching Weighted by Local Structure . . . 581 E. Suárez, C.-F. Westin, E. Rovaris, J. Ruiz-Alzola Inter-subject Registration of Functional and Anatomical Data Using SPM . . . 590 P. Hellier, J. Ashburner, I. Corouge, C. Barillot, K.J. Friston
Visualization Evaluation of Image Quality in Medical Volume Visualization: The State of the Art . . . 598 A. Pommert, K.H. Höhne Shear-Warp Volume Rendering Algorithms Using Linear Level Octree for PC-Based Medical Simulation . . . 606 Z. Wang, C.-K. Chui, C.-H. Ang, W.L. Nowinski Line Integral Convolution for Visualization of Fiber Tract Maps from DTI . . . 615 T. McGraw, B.C. Vemuri, Z. Wang, Y. Chen, M. Rao, T. Mareci On the Accuracy of Isosurfaces in Tomographic Volume Visualization . . . 623 A. Pommert, U. Tiede, K.H. Höhne A Method for Detecting Undisplayed Regions in Virtual Colonoscopy and Its Application to Quantitative Evaluation of Fly-Through Methods . . . 631 Y. Hayashi, K. Mori, J. Hasegawa, Y. Suenaga, J. Toriwaki
Novel Imaging Techniques 3D Respiratory Motion Compensation by Template Propagation . . . 639 P. Rösch, T. Netsch, M. Quist, J. Weese
An Efficient Observer Model for Assessing Signal Detection Performance of Lossy-Compressed Images . . . 647 B.M. Schmanske, M.H. Loew Statistical Modeling of Pairs of Sulci in the Context of Neuroimaging Probabilistic Atlas . . . 655 I. Corouge, C. Barillot Two-Stage Alignment of fMRI Time Series Using the Experiment Profile to Discard Activation-Related Bias . . . 663 L. Freire, J.-F. Mangin Real-Time DRR Generation Using Cylindrical Harmonics . . . 671 F. Wang, T.E. Davis, B.C. Vemuri Strengthening the Potential of Magnetic Resonance Cholangiopancreatography (MRCP) by a Combination of High-Resolution Data Acquisition and Omni-directional Stereoscopic Viewing . . . 679 T. Yamagishi, K.H. Höhne, T. Saito, K. Abe, J. Ishida, R. Nishimura, T. Kudo
Author Index . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 687
Automated Nomenclature Labeling of the Bronchial Tree in 3D-CT Lung Images

Hiroko Kitaoka 1,6, Yongsup Park 2, Juerg Tschirren 3, Joseph Reinhardt 4, Milan Sonka 3, Geoffrey McLennan 5, and Eric A. Hoffman 1,4

1 Division of Physiologic Imaging, Dept. of Radiology, College of Medicine, University of Iowa, 200 Hawkins Drive, Iowa City, Iowa 52242, USA
2 Dept. of Informatics and Mathematical Science, Graduate School of Engineering Science, Osaka University, 2-2 Yamadaoka, Suita, Osaka 565-0871, Japan
3 Dept. of Electrical and Computer Engineering, College of Engineering, University of Iowa, 1402 SC, Iowa City, Iowa 52242, USA
4 Dept. of Biomedical Engineering, College of Engineering, University of Iowa, 1402 SC, Iowa City, Iowa 52242, USA
5 Dept. of Internal Medicine, College of Medicine, the University of Iowa, 200 Hawkins Drive, Iowa City, Iowa 52242, USA
6 Biomedical Physics Laboratory, Brussels Free University, Campus Erasme cp 613/3, 808 Route de Lennik, 1070 Brussels, Belgium

Abstract. A nomenclature labeling algorithm for the human bronchial tree down to sub-lobar segments is proposed, as a means of inter- and intra-subject comparison for the evaluation of lung structure and function. The algorithm is a weighted maximum clique search of an association graph between a reference tree and an object tree. The adjacency between nodes in the association graph is defined so as to reflect the consistency between the bronchial name in the reference tree and the node connectivity in the object tree. Nodes in the association graph are weighted according to the similarity between the two tree nodes in the respective trees. This algorithm is robust to various branching patterns and to false branches that arise during segmentation processing. Experiments have been performed on nine airway trees extracted automatically from clinical 3D-CT data, each containing approximately 250 branches. Of these, 95% were accurately named.
1 Introduction
Isotropic volume data acquisition for medical imaging is now rapidly spreading in clinical use due to technological progress in multi-detector CT scanners. 3D image processing techniques have enabled precise structural analysis of living organs. Anatomical nomenclature is an important step in sharing a common understanding of organ structure. Inter-individual and intra-individual comparisons are meaningful only when accurate nomenclatures are applied to the structures. Accuracy of nomenclature is also critical for diagnosis and surgical planning. However, translating the anatomical knowledge used to establish the nomenclature of a biological structure into robust computational algorithms is challenging, because of the nature of biologic complexity and diversity. Discrepancy of anatomical nomenclature even between experts is not uncommon.

The human airway tree is a typical example of the difficulty of nomenclature and labeling because of its hierarchical properties and the considerable variation in branching pattern. Mori et al. reported a knowledge-based labeling method for the bronchial branches and applied it to seven cases of CT images with a slice thickness of 2 or 3 mm [1]. In their experiment, the number of extracted branches for each subject was about thirty, and none of the extracted trees from the seven cases contained all segmental bronchi. With modern multi-detector CT scanners, more than a hundred bronchial branches can be extracted. The increase in the number of branches identified increases the complexity of establishing a robust labeling scheme.

In this paper, we first explain how the bronchial nomenclature is constructed in terms of graph theory, and then introduce an algorithm based on a weighted maximum clique search of an association graph between a reference tree and an object tree. We demonstrate its performance on volumetric human lung CT data sets. We believe the proposed algorithm will be applicable to tree systems not only in the lung but also in other organs and across species.
2 Principles of Bronchial Nomenclature
2.1 General Aspects of Bronchial Nomenclature
The human airway tree begins at the trachea and branches repeatedly into smaller and smaller bronchi, ending in the terminal bronchioles, whose diameter is about 0.5 mm. The total number of airway branches is over 50,000 in the normal adult human [2], and bronchial nomenclatures are defined for the 74 proximal branches down to the sub-segmental bronchi [3], [4], [5]. Currently, the most clinically important nomenclatures comprise the 32 branches down to the segmental bronchi. Peripheral bronchi that lie downstream from a segmental bronchus are usually named using the nomenclature of the parent segmental bronchus. The bronchial nomenclature is assigned according to the region of the lung to which a bronchus supplies air. There is a clear definition for the spatial division of the lung, as shown in Figure 1. The classes of lobe, segment, and subsegment form a hierarchic structure, and the set of all members of one class covers the whole lung without overlapping. There is an exact one-to-one correspondence between a branch and the lung region supplied by that branch, because there are no loops in the airway tree. Therefore, the bronchial nomenclatures are based upon the regional nomenclatures: lobar bronchus, segmental bronchus, and so on.

The most common way to describe the airway tree mathematically is by a graph representation using a rooted tree. However, for the purpose of bronchial nomenclature, a tree representation can lead to confusion, because the hierarchy of the rooted tree does not correspond to the nomenclature hierarchy. Figure 2 shows a standard branching pattern of the human bronchial tree [3], where thick lines indicate bronchi having anatomical nomenclatures. In this branching pattern, the levels of the segmental bronchi range from the 3rd to the 7th. Furthermore, as shown in Figure 3, there are differences in branching patterns even across normal subjects. It is obvious that the same nomenclature does not imply the same level in the tree representations of different branching patterns.
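To make the distinction concrete, the following minimal sketch (our illustration, not code from the paper) represents an airway tree as a rooted tree whose nodes carry an optional anatomical label. The label is deliberately kept independent of tree depth, since the same segmental name can occur at different levels in different subjects.

```python
from dataclasses import dataclass, field
from typing import List, Optional

@dataclass(eq=False)  # identity-based equality: each branch is a distinct tree node
class Branch:
    name: Optional[str] = None            # e.g. "RB1"; None for unnamed branches
    children: List["Branch"] = field(default_factory=list)

def level_of(root: Branch, label: str, level: int = 0) -> Optional[int]:
    """Tree level at which the named bronchus occurs, or None if absent."""
    if root.name == label:
        return level
    for child in root.children:
        found = level_of(child, label, level + 1)
        if found is not None:
            return found
    return None

# In one subject RB1 may sit at level 3; in another, extra intermediate
# branches push it deeper -- the name is a node attribute, not a depth.
trachea = Branch("trachea", [
    Branch("right main bronchus", [
        Branch("right upper lobar bronchus",
               [Branch("RB1"), Branch("RB2"), Branch("RB3")])])])
assert level_of(trachea, "RB1") == 3
```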
Fig. 1. Hierarchy of space division of the lung. s: lung segment. ss: sub-segment
Fig. 2. A typical example of the human bronchial tree. s: segmental bronchus
Fig. 3. Branching patterns of segmental bronchi arising from the right upper lobar bronchus (UB). Frequencies for respective branching patterns are according to [3]
Since there are only five lobar bronchi and there is little variation in their branching pattern, assigning nomenclature to the lobar bronchi is not difficult. On the other hand, determining nomenclature for the segmental bronchi is much more difficult because of the large variety of branching patterns.

2.2 Nomenclature of the Segmental Bronchus
There are ten lung segments in the right lung and eight segments in the left lung. The names of lung segments describe their locations within the lung. For example, there are apical, lateral, and anterior segments of the right upper lobe. For simplicity, the numbers 1 to 10 are used for distinguishing locations. Both the right and left lower lobes sometimes have accessory segments called sub-superior segments, which are often located below the superior segmental bronchus (B6). They are usually denoted by an asterisk (*) instead of a number [3], [4], [5]. Since branchpoints in the bronchial tree have only one upward branch, it is reasonable to assign bronchial nomenclatures to branchpoints, as shown in Figure 4. Each segmental bronchus is located neither upstream nor downstream from any other segmental bronchus, since their supplying regions are independent of each other. In addition, the segmental bronchi are always located distal to their parent lobar bronchi regardless of the branching order, because each lobe is composed of its member segments. These two relationships appear trivial but are very important for clarifying the node connectivity in a rooted tree in terms of graph theory. Meanwhile, intermediate branches between lobar and segmental bronchi have no anatomical names because of their ambiguous relationships. These relationships do not change even if a tree contains false branches or missing true branches due to image processing steps, including segmentation and skeletonization. Errors occurring in the segmentation and skeletonization steps are a primary source of difficulty when seeking to label the bronchial tree automatically.
Fig. 4. Scheme of the bronchial nomenclatures. Each segmental node is connected upward to its segmental bronchus. Some of the lobar nodes presented here differ from the traditional definitions; see the text in Sect. 3.2
There is one more important characteristic of the bronchial nomenclature that can provide node attributes in the airway tree. Each lung segment is supplied with air by its corresponding segmental bronchus, and all branches within the segment are descendants of the segmental bronchus. Therefore, the position and direction of a segmental bronchus correspond to the position and central axis of its associated lung segment. The segmental bronchial nomenclature is defined according to this correspondence, regardless of branching order.
3 Bronchial Nomenclature Algorithm
Automated bronchial nomenclature and labeling can be viewed as a tree matching problem between an object tree and a standard airway tree. Nomenclature labeling is then performed by giving each node in the object tree the same name as its corresponding node in the reference tree. The algorithm is based on a weighted tree matching method proposed by Pelillo et al. [6]. Their method seeks the maximum weight clique in a tree association graph (TAG), which is equivalent to finding the maximum similarity subtree isomorphism between two trees. We modify the definition of adjacency of TAG nodes and construct a similarity measure between a reference tree and an object tree according to the properties of the bronchial nomenclature explained in the previous section. Before explaining our algorithm, Pelillo's original method is briefly described.
3.1 Weighted Tree Association Graph by Pelillo et al.
Let G = (V, E) be a graph, where V is the set of nodes and E is the set of edges. Let T1 = (V1, E1) and T2 = (V2, E2) be two rooted trees, and let u1, v1 ∈ V1 and u2, v2 ∈ V2 be distinct nodes of the respective trees. The tree association graph (TAG) of T1 and T2 is the graph G = (V, E) where V = V1 × V2, and TAG nodes (u1, u2) and (v1, v2) are adjacent when the connectivity between u1 and v1 is equivalent to that between u2 and v2. Pelillo et al. define equivalence between two pairs of nodes in the respective trees by comparing path length and level difference in the tree hierarchy. Under this definition, there exists a one-to-one correspondence between maximal subtree isomorphisms and maximal cliques of the TAG of the two trees, so searching for a maximal clique in the TAG is equivalent to tree matching.

Next, let T = (V, E, α) be an attributed tree, where α is a function that assigns an attribute vector α(u) to each node u ∈ V, and let σ be a similarity measure in attribute space. The subtree isomorphism with the largest similarity is called the "maximum similarity subtree isomorphism". The weighted TAG (WTAG) of two attributed trees T1 and T2 is the weighted graph G = (V, E, ω), where ω is a function that assigns a positive weight to each node z = (u, v) ∈ V as ω(z) = ω(u, v) = σ(α1(u), α2(v)). The weight matrix W = (mij) is defined as follows:

  mij = 1 − 0.5 σmin / ω(ui)                 if i = j,
  mij = 1                                    if i ≠ j and ui is adjacent to uj,
  0 ≤ mij < 1 − 0.5 σmin / (ω(ui) + ω(uj))   otherwise,

where σmin denotes the minimum value of the similarity measure σ.

Pelillo et al. used the following method to search for a maximum weight clique in a weighted TAG. Let G = (V, E, ω) be an arbitrary weighted graph of order n. The characteristic vector xC of any subset of nodes C ⊆ V is defined as follows:

  xiC = ω(ui) / Ω(C)   if ui ∈ C,
  xiC = 0              otherwise,

where Ω(C) is the total weight on C. It has been proved that C is a maximum weight clique of G if and only if xC is a global maximizer of the function xTWx, where xT denotes the transpose of x [7], [8]. Pelillo et al. used a replicator dynamical system to seek the maximizer [9].
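To make this optimization step concrete, the following minimal sketch (not the authors' implementation) iterates discrete-time replicator dynamics on the weight matrix W over the standard simplex; the support of the converged vector indicates the members of the extracted clique.

```python
import numpy as np

def replicator_dynamics(W, n_iter=2000, tol=1e-12):
    """Seek a local maximizer of x^T W x on the simplex by iterating
    x_i <- x_i * (Wx)_i / (x^T W x), starting at the barycenter."""
    n = W.shape[0]
    x = np.full(n, 1.0 / n)                 # barycenter start
    for _ in range(n_iter):
        Wx = W @ x
        x_next = x * Wx / (x @ Wx)          # discrete replicator update
        if np.abs(x_next - x).sum() < tol:  # converged to a fixed point
            return x_next
        x = x_next
    return x

# Nodes with weight above a small threshold form the extracted clique:
# clique = {i for i, xi in enumerate(x) if xi > 1e-4}
```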
3.2 Modification of Weighted TAG for Bronchial Nomenclature
Pelillo et al. defined TAG-node adjacency as an exact agreement between the connectivity of two nodes in one tree and that of their corresponding nodes in the other tree [6]. We propose an alternative definition of TAG-node adjacency constructed for the purpose of bronchial nomenclature. First, a relationship function r between nodes in a rooted tree is defined as follows: when u is located upstream from v, r(u, v) = 1 and r(v, u) = −1; when u is located neither upstream nor downstream of w, r(u, w) = r(w, u) = 0.
Another relationship function q in the reference tree is defined as follows: the basic relationship is the same as r; however, when u and v are both segmental nodes, q(u, v) = q(v, u) = 2, and when u is a node having no nomenclature, q(u, v) = q(v, u) = 3. Both relationship functions are applicable to multiple branching. From the two relationship functions, the adjacency A of TAG nodes is defined as follows:

  A = 1 if q < 2 and r = q,
  A = 1 if q = 2 and r = 0,
  A = 0 if q = 2 and r ≠ 0, or q < 2 and r ≠ q,
  A = 0 if q = 3.

The definition of the lobar node in the algorithm is slightly different from the anatomical definition of the lobar bronchi. The bilateral lower lobe bronchi are very short, and the superior segmental bronchi of the lower lobes (B6) sometimes arise from the right intermediate bronchus and the left main bronchus. Therefore, the basal bronchi are used in the algorithm instead of the lower lobe bronchi. The reference tree is shown in Figure 4; it contains 30 branches with anatomical names, excluding the two lower lobe bronchi.

The node attribute vector α(u) is constructed from the position of the node, denoted Pu, and the direction of its upward edge, denoted Vu. The similarity measure σ is defined as follows:

  σ(α(u1), α(u2)) = 1 − β1 (1 − (Vu1, Vu2)) − β2 |Pu1 − Pu2|,
  σ(u1, u2) = σmin if σ(u1, u2) < σmin,

where β1, β2, and σmin (> 0) were determined experimentally as 0.5, 0.1/cm, and 0.1, respectively. In order to compare node positions in different trees, size normalization and approximate registration are necessary; the practical methods are described in the next section.
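For illustration, the relationship functions and the adjacency test can be transcribed directly; the tree interface used below (ancestors, is_segmental, has_name) is hypothetical and stands in for whatever tree data structure is actually used.

```python
def r(tree, u, v):
    """Relationship in a rooted tree: 1 if u is upstream of v,
    -1 if downstream, 0 if neither."""
    if u in tree.ancestors(v):      # hypothetical helper: nodes on v's path to the root
        return 1
    if v in tree.ancestors(u):
        return -1
    return 0

def q(ref, u, v):
    """Relationship function in the reference tree."""
    if not ref.has_name(u) or not ref.has_name(v):   # node without nomenclature
        return 3
    if ref.is_segmental(u) and ref.is_segmental(v):  # both are segmental nodes
        return 2
    return r(ref, u, v)

def tag_adjacency(ref, obj, u1, u2, v1, v2):
    """Adjacency A of WTAG nodes (u1, u2) and (v1, v2);
    u1, v1 lie in the reference tree, u2, v2 in the object tree."""
    qv, rv = q(ref, u1, v1), r(obj, u2, v2)
    if qv == 3:
        return 0
    if qv == 2:
        return 1 if rv == 0 else 0
    return 1 if rv == qv else 0
```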
3.3 Correction of Labeled Nodes
In the above algorithm, a descendant of a true segmental node may be labeled as the segmental node when its similarity is higher than that of the true segmental node. Therefore, it is necessary to check whether a true segmental node exists among the ancestors of a labeled node. Since no descendant of a segmental node is a descendant of any other segmental node, a sibling of a segmental node should have at least one different segmental node among its descendants, including itself. If the sibling does not, one of its ancestors should be the true segmental node. Thus, correction of a segmental node is performed by moving the segmental label upwards until a sibling having at least one segmental node is found. If the parent is labeled as a lobar node even though its sibling does not have any segmental node, the sibling and its descendants are regarded as belonging to an unknown segmental node. If unknown segmental nodes remain after checking all labeled segmental nodes, each is given the nomenclature with the highest similarity. Proximal branches beyond the segmental nodes are relabeled using the relationship between lobes and segments shown in Figure 1. For example, if a node is located upstream of all unilateral segments, the node is assigned as the main bronchus. When false branches are generated in a proximal branch, the above algorithm does not recognize all parts of the branch; by adding this step, however, all proximal branches are obtained, excluding the false branches.
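A sketch of the upward correction is given below; it covers only the walk-up described above (the handling of unknown segmental nodes is omitted), and the helpers parent, is_lobar, siblings, and subtree_has_segmental_label are hypothetical.

```python
def correct_segmental_node(tree, node):
    """Move a segmental label upwards until some sibling's subtree
    (including the sibling itself) contains another segmental label."""
    while True:
        parent = tree.parent(node)
        if parent is None or tree.is_lobar(parent):
            return node      # reached a lobar node; see the text for this case
        if any(tree.subtree_has_segmental_label(s) for s in tree.siblings(node)):
            return node      # a sibling carries a segmental node: done
        node = parent        # otherwise the true segmental node lies higher up
```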
4 Experiments
Nine 3D-CT data sets of human lungs were used for testing. Scanning was performed with the lung volume held near total lung capacity and with subjects lying in the supine posture. The slice thickness was 1.25 mm with 0.6 mm spacing, and the pixel size ranged from 0.58 mm to 0.72 mm. All subjects were studied under an approved University of Iowa IRB protocol. Segmentation and skeletonization of the airway were performed by the methods reported by Kiraly et al. [10] and Palagyi et al. [11], respectively. More than 190 branches were extracted for each case. In most cases, there were several false branches in the proximal portion of the tree, which could be automatically recognized as false. However, in some cases there were several clusters of numerous false branches in the peripheral lung regions distal to the segmental bronchi. These peripheral false branches were due to incorrect segmentation at the periphery, and they could not be automatically recognized as false. Therefore, peripheral false branches were manually identified and excluded from the evaluation. False branches located in the proximal part were evaluated as to whether they were correctly labeled as false. The gold standards for the bronchial nomenclatures were established by careful observation of the CT images by one of the authors, a pulmonologist expert in chest CT.

An existing 3D mathematical model of the human airway tree [12] was slightly modified and used as the reference tree. The branching pattern was designed to represent a standard airway tree [3], and bilateral sub-superior segmental bronchi were added as shown in Figure 4. Since the maximum thoracic width of this model is fixed at 30 cm [12], size normalization was performed according to the maximum thoracic width in the CT images. Approximate registration was performed by matching the carina point of the normalized object tree to that of the reference tree (a minimal sketch of these steps is given after Table 1). Automated detection of the carina point was performed by finding the longest branch located at the center of the thorax. Only branchnodes in an extracted airway tree were subjected to the nomenclature-labeling algorithm, and terminal nodes were labeled later, because the extracted branches extended peripherally beyond the segmental bronchi. Table 1 shows the number of extracted branches, labeled branches, and correctly labeled branches for each subject. Almost all branches were accurately labeled except for subject 3. Overall accuracy for the nine cases was 95%.

Table 1. Result of automatic labeling of the bronchial tree extracted from 3D-CT data
| Subject | Extracted branches | Labeled branches | Correctly labeled | Accuracy (%) |
|---------|--------------------|------------------|-------------------|--------------|
| 1       | 245                | 245              | 245               | 100          |
| 2       | 197                | 197              | 169               | 96           |
| 3       | 203                | 203              | 148               | 74           |
| 4       | 245                | 244              | 244               | 100          |
| 5       | 192                | 192              | 192               | 100          |
| 6       | 327                | 327              | 327               | 100          |
| 7       | 268                | 266              | 266               | 99           |
| 8       | 195                | 195              | 195               | 100          |
| 9       | 301                | 298              | 288               | 96           |
| Total   | 2,173              | 2,167            | 2,074             | 95           |
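As referenced above, the size normalization and carina-based registration amount to a similarity transform; the following sketch assumes node positions are given as an (N, 3) array and that the carina points have already been detected.

```python
import numpy as np

def normalize_and_register(obj_points, obj_thoracic_width_cm,
                           obj_carina, ref_carina, ref_width_cm=30.0):
    """Scale object-tree node positions so the maximum thoracic width matches
    the reference model (fixed at 30 cm [12]), then translate so the carina
    points coincide. `obj_points` is an (N, 3) array of node coordinates."""
    scale = ref_width_cm / obj_thoracic_width_cm
    scaled = np.asarray(obj_points) * scale
    offset = np.asarray(ref_carina) - np.asarray(obj_carina) * scale
    return scaled + offset
```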
Figures 5, 6, and 7 show the bronchial trees of subjects 1, 2, and 3, respectively. In these figures, the segmented bronchial regions and their skeletons are superimposed. Each segmental bronchus and its descendants are distinguished by color. Proximal branches beyond the segmental bronchi are colored white, and incorrectly labeled branches are colored gray. In Figure 5, even though there were several false branches at the proximal bronchi, the nomenclature labeling was successfully performed with an accuracy of 100%. The false branches in the proximal bronchi are correctly labeled as false, and the left sub-superior bronchus (B*) was correctly labeled.
Fig. 5. Labeled bronchial tree in subject 1. Anterior view (left), right lateral view (middle), and left lateral view (right). All branches are correctly labeled, including false branches
Fig. 6. Labeled bronchial tree in subject 2. There are two mislabeled sub-segmental bronchi
Fig. 7. Labeled bronchial tree in subject 3. The right main and intermediate bronchi are unlabeled. Half of the right segmental bronchi are mislabeled
There were several incorrectly labeled branches in subject 2. Mislabeling occurred at the level of the sub-segmental bronchus, as shown in Figure 6. In this case, two sub-segmental bronchi of the anterior segmental bronchus of the left lower lobe (B8) arose without a common trunk. One sub-segmental bronchus was correctly labeled as B8, but the other was labeled as the lateral segmental bronchus of the left lower lobe (B9), as shown by an arrow in Figure 6. One of the sub-segmental bronchi of the apico-posterior segmental bronchus of the left upper lobe (B1+2) was mislabeled as the anterior segmental bronchus of the left upper lobe (B3): because the true B3 arose at a lower position than usual, the apical sub-segmental bronchus of B1+2 was labeled as B3 first, as shown by an arrowhead. The mislabelings in subjects 7 and 9 were likewise due to the lack of a common trunk of sub-segmental bronchi, as for the left B8 in subject 2. Subject 3 had a very rare variant branching pattern in which the apical segmental bronchus of the right upper lobe (B1) arose from the right main bronchus, as shown in Figure 7. Since the right upper lobe was much larger than usual, the positions and directions of the other branches differed from those in a usual airway tree. Therefore, only half of the branches in the right lung were correctly labeled. The right main bronchus and the intermediate bronchus were left unlabeled, colored gray in Figure 7, because of inconsistency in the relationship between lobe and segments; this indicates that the branching abnormality occurred at the level of the main bronchus. There were two unlabeled branches in the left lung: although they were sub-segmental bronchi, they were terminal branches in the extracted tree.
5 Discussion
The experimental results indicate that the proposed algorithm is useful for bronchial nomenclature labeling up to the segmental level of the airway tree in human CT images, with 95% accuracy. The lowest accuracy was seen in subject 3, in which a very rare branching pattern was observed. According to Yamashita [3], such a pattern did not occur in the 170 specimens studied. It is unlikely that automated nomenclature labeling will succeed in such cases, and manual correction by an expert will be required. The proposed algorithm can alert the user when such unusual patterns are encountered by leaving the proximal bronchi unlabeled. Excluding subject 3, the average accuracy of the nomenclature was 98%, which is considered satisfactory for practical application.

The main cause of mislabeling was the lack of a common trunk of sub-segmental bronchi. Although the mislabeled bronchus in the left lower lobe of subject 2 was recognized as a sub-segmental bronchus of B8 by one of the authors, other experts may name it a sub-segmental bronchus of B9 or B*. Branching patterns of bronchi at the sub-segmental level are more varied than at the segmental level [3], and hence labeling sub-segmental bronchi is much more difficult. To address this problem, extending the proposed algorithm to the sub-segmental level will be useful.

There are three parameters in the weighted TAG that determine the similarity between a reference tree node and an object tree node. Although fixed values were used in the experiment, optimal values should be investigated as the number of clinical cases increases. Accuracy of the nomenclature labeling is also influenced by the generality of the reference tree. We used a model-derived tree [12] as the reference in the experiment. The model-derived tree consists of the most common branching pattern in the respective lobes and contains accessory segmental bronchi, which will rarely be found in real cases. Refinement of the reference tree is expected from statistical analysis of morphometric data of airway trees in 3D-CT images as we continue to study additional subjects.

Mori et al. proposed a knowledge-based labeling method for the airway tree. Nomenclature labeling in their method proceeds from the trachea toward the periphery with a depth-first search [1]. However, as they discussed in their paper, the depth-first search propagates proximal mislabeling into the periphery. Searching for a global solution, as in our algorithm, may be more suitable for bronchial nomenclature labeling. Krass et al. reported automated bronchial labeling based on graph theory [13], but details of the algorithm were not described in their paper.

Automated nomenclature labeling of the airway tree in 3D-CT images is a promising technique for both clinical and fundamental imaging investigations. One can easily begin to recognize branching patterns and to catalogue the spatial distribution of the airway tree. Although it is difficult to obtain a precise segmentation of the peripheral small airways, it is possible to label the pulmonary arteries adjacent to labeled airways and to track more peripheral branches of the pulmonary arteries. Arterial labels can then likely be transferred to their adjacent airways. These processes will provide a better understanding of the segmental anatomy of the lung.
6 Conclusion
We have proposed a bronchial nomenclature labeling algorithm that is robust to various branching patterns and to false branches arising during image segmentation and skeletonization. The results show very accurate labeling for more than 200 branches. This technique will be useful for both clinical and fundamental imaging investigations of the lung.
Acknowledgements: This work was supported in part by NIH HL-04368 and HL-060158 and NSF 0092758.
References

1. Mori, K., Hasegawa, J., Suenaga, Y., Toriwaki, J.: Automated Anatomical Labeling of the Bronchial Branch and Its Application to the Virtual Bronchoscopy System. IEEE Trans. Med. Imag. 19 (2000) 103-114
2. Weibel, E.R.: Morphometry of the Human Lung. Academic Press, New York (1963)
3. Yamashita, H.: Roentgenologic Anatomy of the Lung. Igaku-shoin, Tokyo (1978)
4. Moore, K.L.: Clinically Oriented Anatomy. Williams & Wilkins, Baltimore (1985) 49-148
5. Agur, A.M.R., Lee, M.J.: Grant's Atlas of Anatomy. Williams & Wilkins, Baltimore (1991) 1-76
6. Pelillo, M., Siddiqi, K., Zucker, S.W.: Matching Hierarchical Structures Using Association Graphs. IEEE Trans. PAMI 21 (1999) 1105-1120
7. Motzkin, T.S., Straus, E.G.: Maxima for Graphs and a New Proof of a Theorem of Turan. Canadian J. Math. 17 (1965) 533-540
8. Bomze, I.M., Budinich, M., Pardalos, P.M., Pelillo, M.: The Maximum Clique Problem. In: Du, D.-Z., Pardalos, P.M. (eds.): Handbook of Combinatorial Optimization, Vol. 4. Kluwer Academic, Boston, Mass. (1999)
9. Pelillo, M.: The Dynamics of Nonlinear Relaxation Labeling Processes. J. Math. Imag. and Vision 7 (1997) 309-323
10. Kiraly, A., Higgins, W.E., Hoffman, E.A., McLennan, G., Reinhardt, J.M.: 3D Human Airway Segmentation for Virtual Bronchoscopy. In: Proc. of SPIE Conf. on Medical Imaging (2002) (in press)
11. Palagyi, K., Sorantin, E., Balogh, E., Kuba, A., Halmai, C., Erdohelyi, B., Hausegger, K.: A Sequential 3D Thinning Algorithm and its Medical Applications. In: 17th Int. Conf. IPMI (2001) 409-415
12. Kitaoka, H., Takaki, R., Suki, B.: A Three-Dimensional Model of the Human Airway Tree. J. Appl. Physiol. 87 (1999) 2207-2217
13. Krass, S., Selle, D., Boehm, D., Jend, H.H., Kriete, A., Rau, W.S., Peitgen, H.O.: Determination of Bronchopulmonary Segments Based on HRCT Data. In: Lemke, H.U., et al. (eds.): Computer Assisted Radiology and Surgery. Elsevier, Amsterdam (2000) 584-589
Segmentation, Skeletonization, and Branchpoint Matching – A Fully Automated Quantitative Evaluation of Human Intrathoracic Airway Trees

J. Tschirren1, K. Palágyi4, J.M. Reinhardt2, E.A. Hoffman3,2, and M. Sonka1

1 Department of Electrical and Computer Engineering
2 Department of Biomedical Engineering
3 Department of Radiology
The University of Iowa, Iowa City, IA 52242, USA
4 Department of Applied Informatics, University of Szeged, Hungary
Abstract. Modern multislice X-ray CT scanners provide high-resolution volumetric image data containing a wealth of structural and functional information. The size of the volumes makes it more and more difficult for human observers to visually evaluate their contents. Similar to other areas of medical image analysis, highly automated extraction and quantitative assessment of volumetric data is increasingly important in pulmonary physiology, diagnosis, and treatment. We present a method for a fully automated segmentation of a human airway tree, its skeletonization, identification of airway branches and branchpoints, as well as a method for matching the airway trees, branches, and branchpoints for the same subject over time and across subjects. The validation of our method shows a high correlation between the automatically obtained results and reference data provided by human observers.
1 Introduction
Quantitative assessment of intrathoracic airway trees is critically important for objective evaluation of bronchial tree structure and function. Several approaches to three-dimensional reconstruction of the airway tree have been developed in the past. None of them, however, allows direct comparison of airway trees across and within subjects. Functional understanding of pulmonary anatomy as well as the natural course of respiratory diseases like asthma, emphysema, cystic fibrosis, and many others is limited by our inability to repeatedly evaluate the same region of the lungs time after time and perform accurate and reliable positionally corresponding measurements. Consequently, quantitative analysis of disease status and its progression and regression, as well as longitudinal physiologic and functional analyses are impossible. In this paper, we describe an integrated approach to quantitative analysis of intrathoracic airway trees and inter-tree matching using high-resolution volumetric computed tomography (CT) images.
2 Methods
The reported system consists of three main blocks: airway tree segmentation, skeletonization and branchpoint localization, and branchpoint matching. Each of these blocks is described separately in the following subsections.

2.1 Airway tree segmentation
The airway segmentation takes advantage of the relatively high contrast in CT images between the center of an airway and the airway wall. A seeded region growing is employed, starting from an automatically identified seedpoint within the trachea. New voxels are added to the region if they have an X-ray density similar to that of a neighboring voxel that already belongs to the region. The similarity measure is designed so that the region growing can overcome subtle gray-level changes (such as those caused by beam hardening). On the other hand, “leaking” into the surrounding lung tissue has to be avoided; this is realized by setting an upper limit on the allowed difference in gray value between two neighboring voxels. Our region growing algorithm utilizes a breadth-first search [1], which allows a fast and memory-friendly implementation. After airway segmentation, a binary subvolume is formed that represents the extracted airway tree.
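A minimal sketch of such a breadth-first region growing is given below; the seed point, the gray-value difference limit, and the 6-connectivity are illustrative assumptions rather than the exact parameters of the reported system.

```python
import numpy as np
from collections import deque

def grow_airway(volume, seed, max_diff):
    """Seeded region growing with a breadth-first search: a voxel joins the
    region if its gray value differs from that of an already accepted
    neighbor by at most `max_diff` (limits leaking into lung tissue)."""
    region = np.zeros(volume.shape, dtype=bool)
    region[seed] = True
    queue = deque([seed])
    offsets = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            if all(0 <= n[i] < volume.shape[i] for i in range(3)) and not region[n]:
                if abs(int(volume[n]) - int(volume[z, y, x])) <= max_diff:
                    region[n] = True
                    queue.append(n)
    return region
```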
2.2 Skeletonization
The binary airway tree formed in the previous step is skeletonized to identify the three-dimensional centerlines of individual branches and to determine the branchpoint locations. A sequential 3D thinning algorithm reported by Palágyi et al. [2] was customized for our application. To obtain the skeleton, a thinning function deletes border voxels that can be removed without changing the topology of the tree. This thinning step is applied repeatedly until no more points can be deleted. The thinning is performed symmetrically, and the resulting skeleton is guaranteed to lie in the middle of the cylindrically shaped airway segments. After completion of the thinning step, the skeleton is smoothed, false branches are pruned, the locations of the branchpoints are identified, and the complete tree is converted into a graph structure using an adjacency-list representation. Fig. 1 shows a close-up view of a skeleton produced by the algorithm. Skeleton branchpoints are identified as skeleton points with more than two neighboring skeleton points.
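The final branchpoint criterion reduces to a neighbor count on the skeleton; the sketch below assumes a boolean skeleton volume and a 26-neighborhood.

```python
import numpy as np
from scipy import ndimage

def find_branchpoints(skeleton):
    """Mark skeleton voxels with more than two skeleton neighbors
    (26-neighborhood) as branchpoints."""
    kernel = np.ones((3, 3, 3), dtype=int)
    kernel[1, 1, 1] = 0                      # exclude the voxel itself
    neighbor_count = ndimage.convolve(skeleton.astype(int), kernel, mode="constant")
    return skeleton & (neighbor_count > 2)
```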
2.3 Branchpoint matching
The goal of branchpoint matching is to find anatomically corresponding branchpoints in two different airway trees. Two types of matching are of utmost interest: intra-subject and inter-subject matching. In the first case, trees coming from different scans of the same subject are matched; in the second case, two or more trees originating from different subjects are matched. The latter case only allows matching of the primary branchpoints (the first three or four generations).
Fig. 1. Example of segmentation and skeletonization applied to an airway tree phantom.
These primary branchpoints are frequently (although not universally) identical among humans. The branching pattern of higher airway generations varies from subject to subject, much like fingerprints do. In the mathematical sense, an airway tree is a graph (rooted tree): the branchpoints correspond to vertices and the airway segments correspond to graph edges. There are many graph-theoretic approaches to graph matching. A widely used method for matching hierarchical relational structures is to map them onto an association graph and then find its maximum clique [3], with many variations existing [4, 5]. To the best of our knowledge, this method has been employed for matching airway trees only once [6]. A disadvantage of finding the maximum clique is its NP-completeness [7]; for all but small graphs, an exhaustive search is not feasible. There are two basic ways of decreasing the computational complexity: minimizing the overall problem size or splitting the problem into several smaller subproblems. Our method uses both of these strategies.

Terminal branches that are shorter than a predefined length are mostly spurious (caused by inaccuracies in the segmentation and skeletonization processes) and are pruned out of the tree in the late stages of the skeletonization process. Additionally, the major vertices (branchpoints) are identified. A vertex is considered to be major if it has at least N vertices hierarchically underneath it, and if these vertices have a spatial extent that exceeds a predefined threshold. The spatial extent is defined as the maximum of the three differences xmax − xmin, ymax − ymin, and zmax − zmin. Next, the two trees undergo a rigid registration, using the major branchpoints as landmark points. The major branchpoints are matched using an association graph. After that, a separate association graph is created for every subtree starting from a set of matched major branchpoints. When creating the association graphs for the sub-trees, only vertex pairs that lie relatively close to each other are considered, which reduces the size of the association graph.

Edges are added to the association graph based on topological and geometrical distances, inheritance relationships, and geometrical lengths and directions. For all of these measures, tolerances are allowed. For the topological distance, a tolerance of ±2 segments is allowed. A parent–child and a child–parent relationship are regarded as equivalent if the geometrical distance between the two branchpoints does not exceed 2 mm in both trees; this introduces tolerance for cases where two branches are very close to each other and, due to tolerances in segmentation and skeletonization, the order of the two branchpoints is swapped between the two trees. For the lengths and angles of segments, tolerances of ±20% and ±0.2 radians are allowed, respectively. Allowing for these tolerances introduces robustness against false and missing branches. In a final step, the maximum clique is found for every association graph.
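The spatial-extent test for major vertices follows directly from its definition; in the sketch below, descendants and position are hypothetical tree accessors, and the count and extent thresholds are left as parameters.

```python
def spatial_extent(points):
    """max of (xmax - xmin, ymax - ymin, zmax - zmin) over a set of 3D points."""
    xs, ys, zs = zip(*points)
    return max(max(xs) - min(xs), max(ys) - min(ys), max(zs) - min(zs))

def is_major_vertex(tree, v, min_count, min_extent):
    """A vertex is major if at least `min_count` vertices lie hierarchically
    beneath it and their spatial extent exceeds `min_extent`."""
    below = tree.descendants(v)                  # hypothetical helper
    if len(below) < min_count:
        return False
    return spatial_extent([tree.position(u) for u in below]) > min_extent
```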
Fig. 2. Result of branchpoint matching for an in vivo scan (TLC and FRC); total view and detail view of the same matching. The two trees are shown in bold black and bold gray; the matches are represented by fine black lines.
3 Experimental Methods

To test the method, CT scans of two different physical phantoms and in vivo scans of the human airway trees were used.

3.1 Data
Two different phantoms were available. The first is a hollow rigid plastic phantom (Fig. 3a), made by a rapid prototyping machine. The phantom's geometry is based on real human data; consequently, a human-like airway tree with parameters known to a high degree of accuracy is available. This phantom consists of about 100 airway tree branches with about 50 branchpoints (not counting the terminal points of airway segments). The second phantom is a hollow rubber phantom (Fig. 3b) made from a human airway tree cast. This second phantom is more complex, consisting of about 400 branches and 200 branchpoints. The rubber phantom was scanned in a Perspex container filled with potato flakes to resemble the texture of surrounding pulmonary parenchyma. Since this rubber phantom was not built using a numerical rapid prototyping approach and is not rigid, exact branchpoint locations were not known.

The rigid phantom was CT-scanned at three different angles (0°, 10°, and 25°) by rotating it on the scanner table (rotation around one axis). The rubber phantom was scanned twice, rotated in a similar way as the rigid phantom; the rotation angle was 8° in this case. The voxel size was 0.49 × 0.49 × 0.60 mm³ for the rigid phantom and 0.68 × 0.68 × 0.60 mm³ for the rubber phantom. The volume sizes were 512 × 512 × 500–600 voxels.
Fig. 3. Phantoms. (a) Rigid phantom, (b) Rubber phantom. In both phantoms, all the airway segments are hollow.
Two scans were available for each of 18 in vivo subjects, for a total of 36 volumetric high-resolution in vivo CT scans. For each subject, a scan close to total lung capacity (TLC) was acquired (at 85% lung volume), and a scan close to functional residual capacity (FRC) was acquired (at 55% lung volume). All in vivo scans have a nearly isotropic resolution of 0.7 × 0.7 × 0.6 mm³ and consist of 500–600 image slices, 512 × 512 pixels each. In two of these 18 CT data pairs (4 volumes, two from a diseased and two from a normal subject), branchpoints were manually identified by human observers and were used for quantitative validation.

3.2 Validation indices
The validation was done in two parts. First, the reproducibility of the segmentation and skeletonization was tested. Next, the accuracy of the branchpoint matching was examined. The reproducibility of the segmentation and skeletonization was measured by comparing the lengths of corresponding airway segments between the different scans of the two phantoms.
The accuracy of the branchpoint matching was measured by comparing the results obtained using the automated method with the results of manual matching. The manual matching was done separately and independently by six different observers. A matched pair of branchpoints was only included in the independent standard if it was matched by a majority of human observers involved.
4 Results
Our method was successfully applied to all 5 phantom and 36 human datasets. In all cases, the method generated reliable trees, well-positioned skeletons and branchpoints, and provided consistent intra-subject matches. Quantitative validation results are reported below. Fig. 4 gives a comparison of airway segment lengths. The p-values are calculated by analysis of variance (ANOVA), using an F-statistic, with the null hypothesis that the mean values are equal. The means and standard deviations of the segment length differences were:

  Rigid phantom, 0° versus 10°:  µ = 0.03 mm,  σ = 0.86 mm
  Rigid phantom, 10° versus 25°: µ = −0.07 mm, σ = 2.45 mm
  Rigid phantom, 0° versus 25°:  µ = −0.31 mm, σ = 1.96 mm
  Rubber phantom, 0° versus 8°:  µ = 0.24 mm,  σ = 1.04 mm
Table 1. Results for accuracy assessment of branchpoint matching.

|                                                               | Rigid phantom 0° vs. 10° | In vivo normal | In vivo diseased |
|---------------------------------------------------------------|--------------------------|----------------|------------------|
| Correct matches: computer-determined vs. independent standard | 38/39 (97%)              | 11/13 (85%)    | 17/19 (89%)      |
| Wrong matches                                                 | 0                        | 1              | 0                |
| Missing matches                                               | 1                        | 1              | 2                |
| Total computer matches                                        | 47                       | 46             | 31               |
Table 1 lists the results for the branchpoint matching. The segmentation, skeletonization, and matching processes execute very fast on a 1.2 GHz AMD Athlon-based Linux system. For an image volume containing 512 × 512 × 524 voxels, the segmentation step finishes in less than one second; the complete skeletonization, smoothing, and graph-generation process executes in about 48 seconds; and matching of two trees containing 150–200 branchpoints each requires one to two seconds. Consequently, a pair of trees can be analyzed and matched in about 100 seconds using our moderate-speed hardware.
Fig. 4. Segment length comparison for the rigid phantom (0° vs. 10°, 10° vs. 25°, and 0° vs. 25°) and the rubber phantom (0° vs. 8°). The four scatter plots of corresponding segment lengths yielded regression slopes of 0.96–0.98, intercepts of 0.14–0.70 mm, correlation coefficients r of 0.97–0.99, and p-values of 0.76–0.99 (N = 36, 42, 30, and 121 segments).
5 Discussion
The comparison of segment lengths as determined in phantoms showed high correlation between the reference data and the computer-determined data (Fig. 4). Agreement between segment lengths identified in the 0° and 10° rotated phantoms and in the 10° and 25° rotated phantoms was very good. For 0° and 25°, somewhat larger differences between the lengths were observed. This is mainly caused by a few outliers likely to be associated with the relatively large change of the CT scanning conditions, and it is not practically important, as 25° differences between long-axis orientations of human subjects in a CT scanner are unlikely.

The comparison of computer-matched and hand-matched branchpoints shows a high matching rate in the phantom cases (97%), as well as in the human data (85–89%). Notice that the human data contained a relatively high number of non-matching branches in the pairs of matched TLC and FRC datasets. Indeed, there is a considerable difference in the number of branches and in the identifiable parts of the tree structures between FRC and TLC scans, due to changes of lung volume and consequently lung geometry.

When comparing the matches identified manually and automatically, it is important to distinguish between missing and extra matches. Comparing between these two classes only, a missing match is preferred over an extra match, since no incorrect information is introduced. As can be seen in Table 1, only a single incorrect extra match was observed in the tested in vivo datasets. At the same time, a total of only four missing matches occurred, an encouraging sign considering that 77 correct matches were identified overall in the in vivo datasets and that additional correct matches were found using the computer approach that were not identified manually.

The current implementation is not free of shortcomings. The segmentation step is currently limited to the first 6 to 8 generations of airway tree segments; while substantially better than any of our previously reported approaches, additional improvements are under development. The branchpoint matching process is under review with the goal of avoiding the small number of mismatches present in the current study. Needless to say, additional datasets are being manually analyzed by human observers to form a larger and more representative set of independent standard data for future validation studies.
6 Conclusion
We presented an approach that allows reliable segmentation, skeletonization, and branchpoint matching in human airway trees. When tested in two kinds of physical phantoms derived from human airway tree geometry and in 36 in vivo acquired airway trees of normal subjects as well as of subjects suffering from various pulmonary diseases, the method's performance was incomparably faster than manual analysis and yielded close-to-identical results.
Acknowledgements This work was supported in part by the NIH grant HL-064368.
References

1. J. Silvela and J. Portillo, “Breadth-first search and its application to image processing problems,” IEEE Transactions on Image Processing, vol. 10, no. 8, pp. 1194–1199, 2001.
2. K. Palágyi, E. Sorantin, E. Balogh, A. Kuba, C. Halmai, B. Erdohelyi, and K. Hausegger, “A sequential 3D thinning algorithm and its medical applications,” in 17th Int. Conf. Information Processing in Medical Imaging, IPMI 2001, Davis, CA, USA, Lecture Notes in Computer Science 2082, pp. 409–415, 2001.
3. A. P. Ambler, H. G. Barrow, C. M. Brown, R. M. Burstall, and R. J. Popplestone, “A versatile computer-controlled assembly system,” in Proceedings of the International Joint Conference on Artificial Intelligence, pp. 298–307, 1973.
4. D. H. Ballard and C. M. Brown, Computer Vision. Prentice Hall PTR, 1982.
5. M. Pelillo, K. Siddiqi, and S. W. Zucker, “Matching hierarchical structures using association graphs,” IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 21, no. 11, pp. 1105–1120, 1999.
6. Y. Park, Registration of linear structures in 3-D medical images. PhD thesis, Department of Informatics and Mathematical Science, Osaka University, Japan, 2002.
7. T. H. Cormen, C. E. Leiserson, and R. L. Rivest, Introduction to Algorithms. MIT Press, 1990.
Improving Virtual Endoscopy for the Intestinal Tract

Matthias Harders1, Simon Wildermuth2, Dominik Weishaupt2, and Gábor Székely1

1 Swiss Federal Institute of Technology, Communication Technology Laboratory, ETH Zentrum, CH-8092 Zürich, Switzerland
2 University Hospital Zurich, Institute of Diagnostic Radiology, Raemistrasse 100, CH-8091 Zürich, Switzerland
{mharders,szekely}@vision.ee.ethz.ch, {dominik.weishaupt,simon.wildermuth}@dmr.usz.ch
Abstract. We present a system that opens the way to apply virtual endoscopy on the small intestines. A high-quality image acquisition technique based on MR as well as a haptically assisted interactive segmentation tool was developed. The system was used to generate a topologically correct model of the small intestines. The influence of haptic interaction on the efficiency of centerline definition has been demonstrated by a user study.
1 Introduction
The importance of radiologic imaging in the diagnosis of diseases of the intestinal tract has increased dramatically in recent years. One precursor of this development is virtual colonoscopy, which represents a promising method for colorectal cancer screening. In the early 1990s, Vining et al. [10] were the first to report on the technical feasibility of virtual colonoscopy simulating conventional endoscopic examinations. Its advantages are increased patient comfort, due to non-invasiveness, and reduced cost as well as reduced sedation time. Results from recent studies [1, 7] show the accuracy to be comparable to conventional colonoscopy for the detection of polyps of significant size. Nevertheless, virtual endoscopic evaluation of the intestines has so far been limited to the colon, yet several diseases exist that also necessitate a radiologic exam of the small intestines, especially since the small bowel cannot be assessed completely by conventional methods. Virtual endoscopy of the small intestines is much more difficult than virtual colonoscopy because the tubular structure often follows a tortuous and curved path through 3D space. This makes accurate tracing of the geometry an extremely difficult task. Furthermore, the tightly folded structure is often sliced at an oblique angle, resulting in extreme deterioration of image quality as the tangential slicing direction is approached. Apart from these limitations, further general problems hinder a wide dissemination of virtual endoscopy of the intestinal tract as a primary population screening procedure. These include the relatively lengthy time required for data interpretation, poor patient compliance regarding bowel cleansing, and concerns over the CT radiation dose. Our current research is directed at solving these problems by using MR imaging for virtual endoscopy of the intestinal tract, especially the small bowel. To improve patient compliance, we propose a new concept with an oral approach that avoids the need for invasive intubation and is more acceptable to the patient. Furthermore, we enhance the image analysis process with interactive haptic segmentation methods.
2 Medical Background
The prevalence of small bowel disease, the most common being Crohn's disease (chronic inflammatory bowel disease) and small bowel carcinoid tumor or tumor metastasis, is low, and the clinical diagnosis is complicated by nonspecific symptoms and a low index of suspicion. This frequently leads to delays in diagnosis and treatment. An accurate radiologic examination is therefore important not only for recognition of small bowel disease but also to help reliably document normal morphology [8]. The limitations of conventional enteroclysis (small bowel barium contrast x-ray) investigation, which requires invasive nasoduodenal intubation for contrast material application, have been recognized for a long time [4].
Fig. 1. Small intestine image data. (a) 2D slice view. (b) Thresholded 3D view.
Despite advances in fibre-optic endoscopy, the majority of the small bowel still remains inaccessible. Although recently developed small endoscopes allow true endoscopy of the duodenum and proximal jejunum, conventional and cross-sectional gastroenterologic imaging methods currently represent the only reliable technique for evaluating the small bowel. The functional information, soft-tissue contrast, direct multiplanar capabilities, and lack of ionizing radiation suggest that MR imaging has a greater potential than other techniques to become the ideal diagnostic method for imaging of the small bowel. After acquisition of the volumetric data with MR, the mesenteric small bowel has to be assessed by a radiologist. However, doing this on cross-sectional images is somewhat analogous to separating out and identifying a large number of writhing snakes in a crowded reptile tank. A more promising approach is to use virtual endoscopy techniques, which have been a major research focus in recent years. The creation of three-dimensional images with perspective distortion promises to be an advancement in diagnostics for the small bowel as well. Nevertheless, virtual endoscopy of the small bowel is much more difficult than virtual colonoscopy and cannot yet be performed by the currently available postprocessing tools. Manual path definition proves difficult in the sharp turns of the small bowel, and loss of orientation is the most obvious problem (Figure 1). The most crucial task for future integration of small bowel examination into clinical routine is the development of more reliable segmentation tools and path-finding systems for virtual endoscopy of the small intestine. As a consequence, we aim at enhancing this process with a new interactive haptic tool for segmentation and centerline definition.
3 Image Acquisition
Driven by public concern about medical radiation exposure, we developed a robust, albeit complex, technique for high-quality MR imaging [11, 7]. Prior to MR imaging of the small bowel, patients were prepared by oral ingestion of four doses of a stool softener spiked with a clinically used MR contrast agent, starting three hours before the examination. This mixture forms a viscous hydrogel within the intestinal lumen, giving good luminal distension, constant signal homogeneity, sufficient demarcation of the bowel content from surrounding tissues, and a low rate of artifacts, thus permitting non-invasive high-quality MRI of the small bowel. According to the reports of three volunteers and twelve patients, the oral mixture was well tolerated apart from slight abdominal discomfort and a sensation of being full. Data acquisition was performed breath-hold in the coronal plane, with the patient in a prone position. This near-isotropic volume acquisition strategy permits multiplanar and three-dimensional reconstructions. Because MR imaging remains a motion-sensitive technique, bowel peristalsis is reduced by intravenous administration of a spasmolytic drug. The availability of high-performance gradient systems allows for the acquisition of large data volumes within a single breathhold [6], thereby eliminating respiratory motion artifacts. To assure data acquisition in apnea, imaging times are kept under 30 seconds, limiting the number of contiguous 2 mm sections to 48–64. The technique is based on the use of very short echo and repetition times, rendering most tissues, including fat, dark. The signal is evident only within regions containing T1-shortening contrast in a concentration sufficient to reduce T1-relaxation times to levels below 50 ms [9].
4 Interactive Segmentation System
After acquiring the image data of the small intestines, the data sets have to be segmented into their major structural components before any high-level reasoning can be applied. As a consequence of the complex, tightly packed geometry of the small bowel, no method has so far been available that could reliably provide a topologically correct segmentation. Even manual identification of the organ outline on 2D slices, usually the last resort when no other alternative is available, proved to be inappropriate due to the difficulties discussed in the introduction. Therefore, we applied a new virtual reality-based interaction metaphor for semi-automatic segmentation of medical 3D volume data [2, 3]. The mouse-based, manual initialization of deformable surfaces in 3D represents a major bottleneck in interactive segmentation. In our multi-modal system we enhance this process with additional sensory feedback. A 3D haptic device is used to extract the centerline of a tubular structure. Based on the obtained path, a cylinder with varying diameter is generated, which in turn is used as the initial guess for a deformable surface. In the following sections we describe our approach in detail.

4.1 Data Preparation
The initial step of our multi-modal approach is the haptically assisted extraction of the centerline of a tubular structure. First we create a binarization of our data volume by thresholding. We have to emphasize that this step is not sufficient for a complete segmentation of the datasets we are interested in, due to the often low quality of the image data caused by unevenly distributed contrast agent, pathological changes, and partial volume effects. Nevertheless, in this initial step we are not interested in a topologically correct segmentation; on the contrary, we only need a rough approximation of our object of interest. For each voxel that is part of the tubular structure we compute the Euclidean distance to the nearest voxel of the surrounding tissue. In the next step we negate the 3D distance map and approximate the gradients by central differences. Moreover, to ensure the smoothness of the computed forces, we apply a 5×5×5 binomial filter. This force map is precomputed before the actual interaction to ensure a stable force update. Because the force vectors are located at discrete voxel positions, we perform trilinear interpolation to obtain the continuous gradient force map needed for stable haptic interaction. Furthermore, we apply a low-pass filter in time to further suppress instabilities.
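A sketch of this precomputation with SciPy is given below; the separable [1, 4, 6, 4, 1]/16 kernel is one standard realization of a 5×5×5 binomial filter, and the sign convention of the forces follows the text.

```python
import numpy as np
from scipy import ndimage

def precompute_force_map(binary_tube):
    """Negated Euclidean distance map -> smoothed central-difference gradients."""
    dist = -ndimage.distance_transform_edt(binary_tube)   # negate the distance map
    forces = np.stack(np.gradient(dist), axis=0)          # central differences
    b = np.array([1.0, 4.0, 6.0, 4.0, 1.0]) / 16.0        # 1D binomial filter
    for axis in range(1, 4):                              # separable 5x5x5 smoothing
        forces = ndimage.convolve1d(forces, b, axis=axis)
    return forces

def force_at(forces, p):
    """Trilinear interpolation of the force field at continuous position p."""
    coords = [[c] for c in p]
    return np.array([ndimage.map_coordinates(f, coords, order=1)[0] for f in forces])
```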
4.2 Centerline Extraction
The goal of the centerline extraction process is to identify the ridge line through the resulting distance map. As in most object identification tasks, the basic problem is to ensure the connectivity of the result by closing the gaps through areas where the ridge is less pronounced. Haptic feedback proved to be a very efficient and intuitive metaphor to solve this problem. In the optimal case of good data quality, the user “falls through” the data set guided along the 3D ridge created by the forces. While moving along the path, control points are set, which are used to approximate the path with a spline. In regions with less clear image information, an expert can use his or her knowledge to guide the 3D cursor through fuzzy or ambiguously defined areas by exerting force on the haptic device to actively support path definition.
4.3 Segmentation
The next step is to use the extracted centerline to generate a good initialization for a deformable surface model. To do this, we create a tube around the path with varying thickness according to the precomputed distance map. This object is then deformed subject to a thin-plate-under-tension model. Assuming position-independent damping and homogeneous material properties, and using discrete approximations of the differential operators, we can use Gauss-Seidel iteration to solve the resulting system of Euler-Lagrange equations:

  γ v_t − τ Δv + (1 − τ) Δ²v = −δP/δv

Due to the good initialization, only a few steps are needed to approximate the desired object. The path initialization can be seen in Figure 2(a); note that the 3D data is rendered semi-transparent to visualize the path in the lower left portion of the data. Figure 2(b) depicts the surface model during deformation.
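As an illustration, the sketch below advances such a discretized model with explicit time steps and an umbrella-operator Laplacian; this is a stand-in for the Gauss-Seidel solver actually used, under the stated assumptions of homogeneous material and position-independent damping.

```python
import numpy as np

def deform_step(v, neighbors, grad_P, gamma=1.0, tau=0.5, dt=0.1):
    """One explicit Euler step of gamma*v_t = tau*Lap(v) - (1-tau)*Lap^2(v) - dP/dv
    for a mesh given as a vertex array `v` (N x 3) and adjacency list `neighbors`;
    `grad_P` holds the image-potential gradient sampled at the vertices."""
    def laplacian(u):
        out = np.zeros_like(u)
        for i, nb in enumerate(neighbors):
            out[i] = u[nb].mean(axis=0) - u[i]   # umbrella operator
        return out
    lap = laplacian(v)
    lap2 = laplacian(lap)                        # discrete biharmonic term
    return v + (dt / gamma) * (tau * lap - (1.0 - tau) * lap2 - grad_P)
```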
Fig. 2. Interactive segmentation. (a) Initialized path. (b) Deforming tube.
In order to further improve the interaction with complicated data sets, we adopt a step-by-step segmentation approach by hiding already segmented loops. This allows a user to focus his attention on the parts that still have to be extracted. For this purpose we have to turn the 3D surface model back into voxels, which should happen fast enough to maintain real-time interaction. To achieve this goal we make use of the graphics hardware by implementing a z-buffer-based approach as described in [5]. This process is shown in Figure 3.
Fig. 3. Hiding segmented parts. (a) Voxelization. (b) Removed segmented part.
4.4 System Evaluation
We carried out an initial test study to evaluate the influence of haptic interaction on the performance of centerline extraction and the subsequent segmentation. The experiment followed a within-subjects repeated-measures design. Five participants took part in the study; only one had used a haptic device before.
Fig. 4. Centerline extraction. (a) Start of process. (b) Complete extraction.
Subjects were introduced to the interactive segmentation tool and were informed how to set a centerline with the system. Also, to familiarize the subjects with force-feedback, we presented them with haptic rendering of the surface of the voxel objects based on data gradients. Each subject carried out the experiment under two conditions, without and with haptic enhancement for centerline tracing. The segmentation task was performed on an artificial and a real data set. The performance measure was the interaction time in seconds for model initialization. After setting the path, a 3D deformable surface was initialized and deformed without user interaction. The effect of the path quality on the segmentation process was also examined in our study. The data samples taken are paired, allowing an analysis of differences to be undertaken. The distributions were successfully tested for normality, thus allowing the use of a paired t-test. A summary of the acquired data is shown in Table 1.

This initial study has shown that there is a statistically significant performance improvement in the trial time (t = 3.59, df = 9, p ≤ 0.007) when using haptically enhanced interaction in 3D segmentation. Also, in the haptic condition the quality of segmentation was always superior to that without force-feedback. In seven out of ten cases, the deformable surface, initialized based only on visual feedback, collapsed in parts of the structure, thus requiring additional user interaction. This is due to imprecise initialization of the centerline, which causes the deformable model in poorly initialized regions to fail to automatically extract the object of interest. Subjects reported that 3D positioning was substantially facilitated by the force-feedback. Moreover, although most of the participants expressed a need for longer training in haptic interaction itself, all of them were already successful in taking advantage of the technology. Generally, subjects stated that in the haptically assisted condition they mainly focused on using the forces for guidance, while the visual feedback was only used for fine-tuning.

Table 1. Results of initial study.

| Initialization time | Visual only | With haptics |
|---------------------|-------------|--------------|
| Mean trial time (s) | 147.0       | 73.1         |
| Standard deviation  | 85.6        | 37.5         |
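The reported comparison corresponds to a standard paired t-test; the sketch below uses placeholder timing values, not the study's raw data (with ten paired samples, df = 9 as in the text).

```python
from scipy import stats

# Paired trial times (s); these values are illustrative placeholders only.
visual_only  = [250.0, 180.0, 90.0, 140.0, 75.0, 210.0, 160.0, 120.0, 185.0, 60.0]
with_haptics = [110.0,  80.0, 55.0,  70.0, 40.0,  95.0,  85.0,  60.0,  90.0, 45.0]

t, p = stats.ttest_rel(visual_only, with_haptics)   # paired (repeated-measures) t-test
print(f"t = {t:.2f}, p = {p:.4f}")
```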
5 Results
Three healthy volunteers (without any history of gastrointestinal disease or surgery) and twelve patients (under evaluation for small bowel obstruction or chronic inflammatory bowel disease) participated in this preliminary study. We were able to use our system to obtain the centerline through the small intestines. Figure 4(a) shows the start of the process and Figure 4(b) displays the final outcome. Please note that some of the shorter sections in the first image were combined into longer ones.
6 Conclusions
We have shown a new approach to generating computer models for virtual endoscopy based on MR image acquisition and haptically enhanced interactive segmentation. We acquired high-quality images of the small intestines and used our system to completely segment the small bowel; to the best of our knowledge, this has not been achieved before. Whether our approach will readily replace currently used methods of small bowel imaging will depend on how it can be integrated into the clinical setting in a practical manner that is acceptable to patients, referring clinicians, and surgeons. To be the primary method for investigation of small-bowel disease, MR imaging will have to provide reliable evidence of normalcy, allow diagnosis of early or subtle structural abnormalities, and influence treatment decisions in patient care. Further research and experience will help clarify whether our approach should be the primary method for investigation of the small bowel or used only as a problem-solving examination. Preliminary clinical investigations using the described system have given rise to the following recommended improvements: increased spatial and temporal resolution in MR imaging of the small bowel to achieve true isotropic imaging, assessment of the ideal timing between intake of the oral contrast agent and imaging, optimization of small bowel distension, and further refinement of the current tools for segmentation and path definition. Nevertheless, the developed imaging, segmentation, and navigation methods have already opened a way for the extension of virtual endoscopy investigations to the whole intestinal tract.
Acknowledgment. This work has been performed within the framework of the Swiss National Center of Competence for Research in Computer Aided and Image Guided Medical Interventions (NCCR CO-ME), supported by the Swiss National Science Foundation.
Finding a Non-continuous Tube by Fuzzy Inference
C. Yasuba et al.
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 28–35, 2002. © Springer-Verlag Berlin Heidelberg 2002
Level-Set Based Carotid Artery Segmentation for Stenosis Grading
C.M. van Bemmel, L.J. Spreeuwers, M.A. Viergever, and W.J. Niessen
Image Sciences Institute, Room E 01.334, University Medical Center Utrecht, Heidelberglaan 100, 3584 CX Utrecht, The Netherlands
{kees,luuk,max,wiro}@isi.uu.nl
Abstract. A semi-automated method is presented for the determination of the degree of stenosis of the internal carotid artery (ICA) in 3D contrast-enhanced (CE) MR angiograms. To this end, we determine the central vessel axis (CA), which is subsequently used as an initialization for a level-set based segmentation of the stenosed carotid artery. The degree of stenosis is determined by calculating the average diameters of cross-sectional planes along the CA. For twelve ICAs the degree of stenosis was determined and correlated with the scores of two experts (NASCET criterion). The Spearman correlation coefficient for the proposed method was 0.96 (p < 0.001), versus 0.89 and 0.88 (p < 0.001) for the manual scores, and a smaller bias and tighter confidence bounds were found for the automated method.
1 Introduction
Stroke is the third leading cause of death in the western world and is responsible for major disability among survivors [1]. Intra-arterial digital subtraction angiography (IA-DSA) has historically been regarded as the gold standard for visualization of the internal carotid artery (ICA). In particular, the degree of stenosis is an important measure for selecting patients for carotid endarterectomy [2]. However, IA-DSA carries a risk of stroke in patients with atherosclerosis. Clinical studies have demonstrated that less invasive techniques, like magnetic resonance angiography (MRA) and computerized tomographic angiography (CTA), are useful alternatives for patient selection [3, 4]. Therefore, there is an interest in automated, reproducible stenosis grading of carotid arteries from CT and MR data. Westenberg et al. performed vessel-diameter measurements on Maximum Intensity Projections (MIPs) of Gadolinium (Gd) Contrast-Enhanced (CE) MRA datasets along a line drawn perpendicular to the vessel [5]. Frangi et al. used a method in which a surface represented by a B-spline was fitted to the data using prior knowledge of the image formation process [6]. In this paper an alternative approach is adopted for semi-automatic segmentation of the ICA. First, a path-tracking tool is utilized, which automatically determines the central vessel axis (CA) of a stenosed carotid artery,
based on user-defined points. Subsequently, the stenosed artery is segmented using level-set techniques with the CA as initialization. The segmentation is used to determine the degree of stenosis. Previously, Lorigo et al. [7] presented level-set techniques for vessel segmentation. However, in their approach the CA is not used as initialization, which is an essential step in our approach. Moreover, their method has not been applied to, or evaluated for, carotid artery stenosis grading. The technique presented in this paper was compared to manual scores on CE-MRA datasets and DSA images of twelve stenosed carotid arteries. (This work is funded by Philips Medical Systems, Best, The Netherlands.)
2 Method
The presented method segments the ICA using level-set techniques, with the CA serving as a collection of seed points. Figure 1 shows a block diagram outlining the key steps in our approach. First, a method is applied that determines the CA (Section 2.1). Subsequently, the CA is used as initialization for the level-set based segmentation of the ICA (Section 2.2). Finally, the vessel quantification is described in Section 2.3.

Fig. 1. Overall block diagram of the proposed method: 3D dataset → determination of the central axis → segmentation → quantification → diagnosis.
2.1 Central Axis Determination
The CA is determined as the minimum-cost path between two user-defined points [8]. Costs are given by the reciprocal value of an image in which vessel-like structures are enhanced, using the vesselness filter described by Frangi et al. [9] (see also Section 2.2). A bi-directional search tree is started from both the starting node and the goal node simultaneously. The evolution of the search tree is continued until the two fronts meet. The accuracy of the CA with respect to manual tracings has been demonstrated in [10].
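The minimum-cost path computation is not spelled out further in the paper; the sketch below shows one standard way such a search could be realized with Dijkstra's algorithm on a voxel grid, with the reciprocal of a precomputed vesselness volume as the cost. The single-front formulation stands in for the bi-directional search described above.

```python
# Hypothetical sketch: minimum-cost path through a 3D cost volume,
# where cost = 1 / (vesselness + eps). Single-front Dijkstra for brevity;
# the paper uses a bi-directional variant.
import heapq
import numpy as np

def minimum_cost_path(vesselness, start, goal, eps=1e-6):
    cost = 1.0 / (vesselness + eps)
    dist = np.full(vesselness.shape, np.inf)
    dist[start] = 0.0
    parent = {start: None}
    heap = [(0.0, start)]
    while heap:
        d, p = heapq.heappop(heap)
        if p == goal:
            break
        if d > dist[p]:
            continue  # stale heap entry
        z, y, x = p
        for dz, dy, dx in [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]:
            q = (z + dz, y + dy, x + dx)
            if all(0 <= q[i] < vesselness.shape[i] for i in range(3)):
                nd = d + cost[q]
                if nd < dist[q]:
                    dist[q] = nd
                    parent[q] = p
                    heapq.heappush(heap, (nd, q))
    # Backtrack from the goal to recover the central axis.
    path, p = [], goal
    while p is not None:
        path.append(p)
        p = parent[p]
    return path[::-1]
```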
2.2 Level-Set Based Vessel Segmentation
In this section we describe the level-set technique, as formulated by Osher [11] and Sethian [12], applied to segmentation of the ICA, with the CA as initialization. This segmentation can be regarded as the evolution of a front, or interface,
towards the vessel boundaries. Rather than evolving the interface itself, it is represented by the zero level-set of a higher-dimensional function. To formalize these notions, let Γ(t) denote a time-dependent closed 2-dimensional surface representing the evolving segmentation. This interface evolves in its normal direction according to:

Γt(t) = F · N.    (1)

Here F denotes the speed function and N is the normal vector to the surface, pointing outwards. Now, a 3-dimensional function φ(t) is defined such that {φ(t) = 0} = Γ(t), i.e. Γ(t) is represented by the zero level-set of φ(t) at all times. It can easily be shown that if Γ(t) evolves according to Equation 1, the evolution of φ(t) is given by:

φt(t) + F |∇φ(t)| = 0.    (2)
Thus, the evolution of the zero level-set of φ(t) equals the evolution of Γ(t). Therefore, in level-set based image segmentation, the evolution of Γ(t) is implicitly defined by evolving φ(t). This approach has the advantage that topological changes in Γ(t) (like breaking and merging) are handled naturally. In order to capture the vessel boundaries, an appropriate speed function F needs to be selected. For this purpose we investigated the effect of grey-level-, gradient- and vesselness-based speed functions.

1. Grey-level based speed function. The histogram of the CE-MRA dataset shows two distinct peaks, representing the background and the vasculature. Therefore, two normal distributions were fitted to the histogram of the dataset. The parameters that describe the distributions of the background and the vasculature (N(µb, σb) and N(µv, σv), respectively) were determined using the expectation-maximization algorithm [13]. Since in CE-MR angiography vessels give a higher signal than the background, it is clear that µv > µb. Based on these parameters, the grey-level-based speed term is defined as the cumulative distribution of the vessel intensities:

FI(x) = Σ_{i=0}^{x} (1 / (σv √(2π))) e^(−½((i−µv)/σv)²),    (3)

where x is the image grey-value. Note that with this approach no ad hoc threshold parameter is selected, as it is derived from image information.

2. Gradient-based speed function. The gradient image is calculated by convolving the dataset with the first-order derivative of the Gaussian kernel. The gradient-based speed function is given by:

F∇(x) = 1 / (1 + |∇I(x, σgrad)|),    (4)

where ∇I(x, σgrad) is the gradient computed at scale σgrad at position x.
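As an illustration, the two speed terms could be computed along the following lines; this is a minimal sketch assuming scikit-learn's GaussianMixture as a stand-in for the EM fit and SciPy for the Gaussian derivatives, neither of which is prescribed by the paper.

```python
# Sketch of the grey-level (Eq. 3) and gradient (Eq. 4) speed terms.
# Library choices (scikit-learn, SciPy) are illustrative assumptions.
import numpy as np
from scipy.ndimage import gaussian_gradient_magnitude
from scipy.stats import norm
from sklearn.mixture import GaussianMixture

def grey_level_speed(image):
    # EM fit of a two-component Gaussian mixture to the intensities.
    gmm = GaussianMixture(n_components=2).fit(image.reshape(-1, 1))
    v = int(np.argmax(gmm.means_))            # vasculature = brighter component
    mu_v = gmm.means_[v, 0]
    sigma_v = np.sqrt(gmm.covariances_[v, 0, 0])
    # Eq. 3 sums the vessel pdf up to grey-value x, i.e. a cumulative distribution.
    return norm.cdf(image, loc=mu_v, scale=sigma_v)

def gradient_speed(image, sigma_grad=0.75):
    # Eq. 4: speed near 0 at strong edges, near 1 in homogeneous regions.
    grad = gaussian_gradient_magnitude(image.astype(float), sigma=sigma_grad)
    return 1.0 / (1.0 + grad)
```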
3. Vesselness-based speed function. This speed term, FV, equals the vesselness function described in [9], and is based on the eigenvalues |λ1| ≤ |λ2| ≤ |λ3| of the Hessian matrix, which is determined by convolving the image with the second-order derivatives of the Gaussian kernel at scale σ. In the case of an ideal tubular structure |λ1| = 0, |λ1| ≪ |λ2|, and λ2 = λ3. Moreover, the total magnitude of the eigenvalues reflects the amount of second-order structureness. Therefore, from the eigenvalues, three terms are constructed which are used in the vesselness filter, viz.:

RA = |λ2| / |λ3|,   RB = |λ1| / √(|λ2 λ3|),   and   S = √(Σj λj²).

Here RA is essential for distinguishing between plate-like and line-like structures, RB accounts for the deviation from a blob-like structure, and S is the measure of second-order structureness. The vesselness-based speed function is a discriminant function based on these three terms:

FV(x) = 0   if λ2 > 0 or λ3 > 0,
FV(x) = max_{σmin ≤ σ ≤ σmax} v(x, σ)   otherwise,    (5)

where

v(x, σ) = (1 − e^(−RA²/2α²)) · e^(−RB²/2β²) · (1 − e^(−S²/2γ²)).    (5a)
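A compact single-scale version of this filter could look as follows; this is a sketch, and a faithful implementation would iterate over the scale range σmin to σmax and keep the per-voxel maximum, as Equation 5 prescribes.

```python
# Sketch of a Frangi-style vesselness measure at a single scale sigma.
import numpy as np
from scipy.ndimage import gaussian_filter

def vesselness(image, sigma, alpha=0.5, beta=0.5, gamma=None):
    img = image.astype(float)
    # Hessian via Gaussian second derivatives at scale sigma.
    H = np.empty(img.shape + (3, 3))
    for i, j in [(0, 0), (0, 1), (0, 2), (1, 1), (1, 2), (2, 2)]:
        order = tuple(int(k == i) + int(k == j) for k in range(3))
        H[..., i, j] = H[..., j, i] = gaussian_filter(img, sigma, order=order)
    lam = np.linalg.eigvalsh(H.reshape(-1, 3, 3))
    lam = np.take_along_axis(lam, np.argsort(np.abs(lam), axis=1), axis=1)
    l1, l2, l3 = lam[:, 0], lam[:, 1], lam[:, 2]      # |l1| <= |l2| <= |l3|
    eps = 1e-12
    Ra = np.abs(l2) / (np.abs(l3) + eps)
    Rb = np.abs(l1) / (np.sqrt(np.abs(l2 * l3)) + eps)
    S = np.sqrt(l1**2 + l2**2 + l3**2)
    if gamma is None:
        gamma = 0.25 * img.max()   # 25% of the maximum grey value (Sect. 3.1)
    v = (1 - np.exp(-Ra**2 / (2 * alpha**2))) \
        * np.exp(-Rb**2 / (2 * beta**2)) \
        * (1 - np.exp(-S**2 / (2 * gamma**2)))
    v[(l2 > 0) | (l3 > 0)] = 0.0                      # keep bright tubes only
    return v.reshape(img.shape)
```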
The parameters α, β, and γ tune the sensitivity of the filter to deviations in RA, RB, and S, respectively. The filter is applied at multiple scales that span the expected vessel widths.

4. Combined speed function. Since the speed terms mentioned above have different properties, a combined speed function can be composed by multiplying them:

F = FI · F∇ · FV,    (6)

where a speed term is set to 1 if it is not included. All speed terms are normalized, so values are in the range [0, 1].
2.3 Vessel Quantification
The degree of stenosis according to the NASCET criterion [2] is given by (see also Figure 2):

(1 − Minimal Residual Lumen / Distal ICA Lumen Diameter) · 100%.    (7)
This measure is defined for DSA data, which are projection images. In the method we propose, the degree of stenosis is determined from cross-sectional MR slices. In order to determine a degree of stenosis that is comparable to the NASCET criterion, we used the average diameter of the cross-sectional planes along the CA. In this study 12 stenosed carotid arteries were screened. For all carotid arteries both a DSA dataset (consisting of three projections: posteroanterior, oblique, and lateral) and a CE-MRA dataset were available.
Fig. 2. Schematic view of the linear lumen reduction measuring method according to the NASCET stenosis criterion ((1 − a/b) · 100%) used for the internal carotid artery.
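Expressed in code, the criterion is a one-liner; the function below is an illustrative helper, not part of the authors' software.

```python
def nascet_stenosis(minimal_residual_lumen, distal_ica_lumen_diameter):
    """Degree of stenosis in percent according to the NASCET criterion (Eq. 7)."""
    return (1.0 - minimal_residual_lumen / distal_ica_lumen_diameter) * 100.0

# Example: a 2.1 mm residual lumen against a 6.0 mm distal diameter -> 65% stenosis.
print(nascet_stenosis(2.1, 6.0))
```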
3 Results

3.1 Central Axis Determination
In order to determine the CA, the vesselness image is computed at 25 scales (exponentially increasing) in the range σ = 0.25 – 7.5 mm. For the vessel enhancement, the parameters α and β were both fixed at 0.5, while γ equals 25% of the maximum occurring pixel value in the 3D dataset. If one of the eigenvalues is large, S will be large; the output of this filtering process is rather insensitive to the value of γ. In all datasets, the CA was everywhere located inside the lumen and could be used as initialization for the level-set based segmentation.
3.2 Level-Set Based Vessel Segmentation
Vessel segmentation is achieved via level-set techniques using the CA as initialization. To this end, we implemented Equation 2 using a simple Euler forward scheme with time-step ∆t = 0.1. We tested the influence of the different speed functions as given by Equations 3 through 6 separately. The gradient-based speed image was computed using σgrad = 0.75 mm, which is a trade-off between noise suppression on the one hand, and taking the width of the ICA into account on the other. The parameters of the vesselness-based speed image were equal to those used for the CA determination (see Section 3.1). It was found that the segmentation was most robustly estimated using a combination of the speed terms. Therefore, the evaluation on all datasets was carried out by evolving a front utilizing the CA as initialization and the speed function given by F = FI F∇ FV. In Figure 3 a typical segmentation and a diameter-vs-length plot are shown.
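A minimal sketch of such an Euler forward update of Equation 2 is given below; for brevity it uses central differences for |∇φ|, whereas a production implementation would normally use an upwind scheme and a narrow band for stability and speed.

```python
# Sketch: explicit Euler update of phi_t + F * |grad(phi)| = 0 (Eq. 2).
import numpy as np

def evolve_level_set(phi, F, dt=0.1, n_iter=500):
    for _ in range(n_iter):
        grads = np.gradient(phi)                  # central differences
        grad_mag = np.sqrt(sum(g**2 for g in grads))
        phi = phi - dt * F * grad_mag             # outward motion where F > 0
    return phi  # segmentation = {phi <= 0} for a signed-distance initialization

# phi would be initialized as a signed distance to small spheres centered
# on the central-axis points; F is the combined speed of Eq. 6.
```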
3.3 Stenosis Grading
Expert grading of the DSA images was performed by averaging the scores from all available projections without vessel over-projection. Quantification of the CE-MR angiograms was done by two experts by averaging the degree of stenosis computed from MIPs in posteroanterior, oblique, and lateral views without vessel over-projection. The same ICAs were graded with the level-set based technique by determining the average diameter of cross-sectional planes along the CA, which was resampled every 0.5 millimeter. Table 1 shows the results of the comparison between DSA and CE-MRA for the two experts and the level-set based technique. The correlation coefficients indicate a better agreement with DSA for the level-set based technique than for the experts. Figure 4 shows the linear regressions with the 95% confidence intervals (correlation coefficients 0.89, 0.88, and 0.96 for expert I, expert II, and the level-set based method, respectively).

Fig. 3. Maximum Intensity Projection (MIP) of a 3D CE-MR angiogram of the ICA (left) with the corresponding segmentation (middle) and the diameter-vs-length plot (right) from which the stenosis grade can be determined.

Fig. 4. CE-MRA vs DSA: degree of stenosis measured in 12 carotid arteries. Linear regression for expert I (left), expert II (middle), and the level-set based technique (right); dashed lines indicate the 95% confidence bounds. The semi-automatic method correlates better with the gold standard provided by DSA; moreover, the bias introduced by the method is smaller and the confidence bounds are tighter.
4 Discussion
A method is presented for segmentation of the ICA, which is based on level-set techniques. By using the CA as initialization, the method is better suited for segmenting vascular structures, since the initialization is everywhere near the vessel wall. The method has been applied to carotid artery stenosis grading in CE-MRA data, and compared to measurements made by clinical experts. The results show that the presented method correlates better (Spearman’s correlation
coefficient 0.96 (p < 0.001)) with DSA than the manual measurements do (Spearman correlation coefficients 0.89 and 0.88 (p < 0.001), respectively). The reproducibility also increased (tighter confidence bounds). Since only minimal user interaction is needed, and no restrictions are imposed on the shape of the segmentation (e.g. circular cross-sections), this technique is a powerful tool for the quick and accurate determination of the degree of stenosis. Future work includes the evaluation of this technique by applying it to more patient datasets.

Table 1. DSA vs CE-MRA results for both experts (I and II) and the level-set based technique. Bias and 95% bounds of agreement are in units of %D.

                  Slope    %D Bias (±1.96 SD)    Spearman's rs (p)
Obs. I            1.034    +10.8 (±30.9)         0.89 (<0.001)
Obs. II           0.970    +10.1 (±30.1)         0.88 (<0.001)
Level-set based   1.032    −2.9 (±18.6)          0.96 (<0.001)
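The agreement statistics in Table 1 could be reproduced from paired gradings along the following lines; the two arrays are placeholders for the per-artery %D values, which are not listed in the paper.

```python
# Sketch of the agreement analysis behind Table 1 (placeholder data).
import numpy as np
from scipy import stats

dsa = np.array([10, 25, 35, 40, 50, 55, 60, 65, 70, 80, 85, 90], float)  # hypothetical
mra = np.array([12, 22, 38, 37, 54, 52, 63, 62, 74, 78, 88, 87], float)  # hypothetical

rs, p = stats.spearmanr(dsa, mra)
slope, intercept, *_ = stats.linregress(dsa, mra)

# Bland-Altman style bias and 95% limits of agreement.
diff = mra - dsa
bias, loa = diff.mean(), 1.96 * diff.std(ddof=1)
print(f"rs = {rs:.2f} (p = {p:.3g}), slope = {slope:.3f}, "
      f"bias = {bias:+.1f} (+/-{loa:.1f}) %D")
```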
References
1. W.S. Moore, J.P. Mohr, H. Najafi, J.T. Robertson, R.J. Stoney, and J.F. Toole, "Carotid Endarterectomy: Practical Guidelines," Journal of Vascular Surgery, vol. 15, pp. 469–79, 1992.
2. North American Symptomatic Carotid Endarterectomy Trial (NASCET) Steering Committee, "NASCET. Methods, Patient Characteristics, and Progress," Stroke, vol. 22, pp. 711–720, 1991.
3. M.R. Patel, K.M. Kuntz, R.A. Klufas, D. Kim, and J. Kramer, "Preoperative Assessment of the Carotid Bifurcation. Can Magnetic Resonance Angiography and Duplex Ultrasonography replace Contrast Arteriography?," Stroke, vol. 26, pp. 1753–58, 1995.
4. E.H. Dillon, M.S. van Leeuwen, M.A. Fernandez, B.C. Eikelboom, and M. Mali, "CT Angiography: Application to the Evaluation of Carotid Artery Stenosis," Radiology, vol. 189, pp. 211–19, 1993.
5. J.J.M. Westenberg, R.J. van der Geest, M.N.J.M. Wasser, E.L. van der Linden, Th. van Walsum, H.C. van Assen, A. de Roos, and J.H.C. Reiber, "Vessel Diameter Measurements in Gadolinium Contrast-Enhanced Three-Dimensional MRA of Peripheral Arteries," Magnetic Resonance Imaging, vol. 18, no. 1, pp. 13–22, 2000.
6. A.F. Frangi, W.J. Niessen, P.J. Nederkoorn, J. Bakker, W.P.Th.M. Mali, and M.A. Viergever, "Quantitative Analysis of Vessel Morphology from 3D MR Angiograms: in Vitro and in Vivo Results," Magnetic Resonance in Medicine, vol. 45, no. 2, pp. 311–22, 2001.
7. L.M. Lorigo, O. Faugeras, W.E.L. Grimson, R. Keriven, R. Kikinis, A. Nabavi, and C.-F. Westin, "CURVES: Curve Evolution for Vessel Segmentation," Medical Image Analysis, vol. 5, pp. 195–206, 2001.
8. O. Wink, W.J. Niessen, and M.A. Viergever, "Minimum Cost Path Determination Using a Simple Heuristic Function," in Proc. International Conference on Pattern Recognition, A. Sanfeliu, J.J. Villanueva, M. Vanrell, R. Alquézar, T. Huang, and J. Serra, Eds., 2000, pp. 1010–1013, IEEE Computer Society, Piscataway, NJ.
9. A.F. Frangi, W.J. Niessen, K.L. Vincken, and M.A. Viergever, "Multiscale Vessel Enhancement Filtering," in Proc. Medical Image Computing and Computer-Assisted Intervention, W.M. Wells, A. Colchester, and S. Delp, Eds., 1998, Lecture Notes in Computer Science, pp. 130–137, Springer Verlag, Berlin.
10. C.M. van Bemmel, W.J. Niessen, O. Wink, B. Verdonck, and M.A. Viergever, "Blood Pool Agent CE-MRA: Improved Arterial Visualization of the Aortoiliac Vasculature in the Steady-State Using First Pass Data," in Proc. Medical Image Computing and Computer-Assisted Intervention, W.J. Niessen and M.A. Viergever, Eds., 2001, Lecture Notes in Computer Science, pp. 699–706, Springer Verlag, Berlin.
11. S. Osher and J.A. Sethian, "Fronts Propagating with Curvature Dependent Speed: Algorithms Based on Hamilton-Jacobi Formulations," Journal of Computational Physics, vol. 79, pp. 12–49, 1988.
12. J.A. Sethian, Level Set Methods and Fast Marching Methods, Cambridge University Press, second edition, 1999.
13. C.M. Bishop, Neural Networks for Pattern Recognition, Oxford University Press Inc., New York, 1995.
PC-Based Control Unit for a Head Mounted Operating Microscope for Augmented Reality Visualization in Surgical Navigation
Michael Figl 1,4, Wolfgang Birkfellner 1,2, Franz Watzinger 3, Felix Wanschitz 3, Johann Hummel 1, Rudolf Hanel 1, Rolf Ewers 3, and Helmar Bergmann 1,4
1 Department of Biomedical Engineering and Physics, AKH, Vienna, Austria
2 CARCAS Group, University Hospital of Basle, Switzerland
3 Department of Oral and Maxillofacial Surgery, AKH, Vienna, Austria
4 Ludwig-Boltzmann Institute of Nuclear Medicine, Vienna, Austria
Abstract. We have adapted a miniature head-mounted operating microscope for AR by integrating two very small computer displays. To calibrate the projection parameters of this so-called Varioscope AR we used Tsai's algorithm for camera calibration. Connection to a surgical navigation system was achieved by defining an open interface to the control unit of the Varioscope AR. The control unit consists of a standard PC with a dual-head graphics adapter. We connected this control unit to a computer-aided surgery (CAS) system by a TCP/IP interface. In this paper we present the control unit for the HMD and its software design. We tested two different optical tracking systems: the FlashPoint (Image Guided Technologies, Boulder, CO), which provided about 10 frames per second, and the Polaris (Northern Digital, Ontario, Canada), which provided at least 30 frames per second, both with a time delay of one frame.
1 Introduction
Due to the fact that computer aided surgery (CAS) was first introduced to the field of neurosurgery, many efforts were focused on introducing Augmented Reality (AR), the overlay of computer-generated graphics providing "target information", into operating microscopes [3,4,5,6]. Here the computer graphics are displayed in the optical system of the microscope; therefore no focusing problems are encountered when merging the computer graphics and the optical image. On the other hand, the operating microscope is bulky, expensive, and has a rather limited field of clinical applications. In parallel, head-mounted displays (HMD) were also used for AR in medicine [7,8]. HMDs are lightweight, cheaper, and can be attached directly to the surgeon's head. A major problem lies in the fact that a common focal plane is not easily achieved in commercial HMDs. Furthermore, rapid head motions cause perceptible delay in the display of the computer graphics, thus causing simulator sickness and reduced acceptance. Two methods have been proposed in the past to overcome these problems. In [7] and [8], the authors describe video see-through systems. The real world
scenery is being captured by two miniature video cameras, and the augmentation by computer graphics is achieved by mixing the video signals with the computer graphics using a dedicated control unit for the HMD. The user sees the augmented scenery on miniature video monitors. While this allows for strain-free viewing, since the complete scene appears focused on the HMD's miniature monitors, the quality of the miniature video monitors is generally poor, and in the case of total system failure the surgeon is blinded. Furthermore, mixing the video signals within a reasonable time span requires a considerable amount of computing power. Another possibility to solve the focusing problem is to move the beamsplitter closer to the scenery. This can be achieved by using semi-transparent panels [9,1]. The resulting parallax problem can be solved by using an integral photography approach as pointed out in [2]. The resulting displays are, however, somewhat bulky, and have to be placed close to the sterile operating field. We ended up developing the Varioscope AR, a prototype of a small head-mounted operating binocular; among the design challenges were the simple integration into existing CAS systems and the minimization of latencies in the display of the computer graphics. Our solution to these problems is presented in this paper.
2 Materials and Methods

2.1 The Head Mounted Display
The Varioscope AF3, an optical miniature head-mounted operating binocular developed and produced by Life Optics, Vienna (http://www.lifeoptics.com), was adapted for AR by adding two miniature LCD displays with VGA resolution (AMEL640.480.24, 640 × 480 pixels, Planar Microdisplays Inc., Beaverton/OR) with adequate projection optics. We refer to this prototype as the Varioscope AR. The projection optics is designed so that the eye lens of the Varioscope magnifies both the image from the main lens and the image from the projection optics, both providing an image in the focal plane of the main lens, thus achieving common focus for the optical real-world image and the computer graphics. Calibration from world coordinates to display coordinates was established by applying Tsai's camera calibration algorithm [10] for each optical channel separately, thus achieving true stereoscopic vision with parallax correction for the left and right eye. Photos and details on the design of the Varioscope AR and on the calibration can be found in [11].
2.2 System Setup
The Varioscope AR is designed as an add-on for an arbitrary CAS system with a visualization engine and an optical tracking system. Since sufficiently strict assumptions on the performance of a CAS system cannot be made, we decided to split the computational tasks: preoperative planning and visualization of radiological images on the one hand, and the generation of
the augmented reality scenery in the HMD on the other hand. This split was realized by an asynchronous TCP/IP connection between the CAS workstation and the control unit for the Varioscope AR. The CAS workstation used for this experimental setup was VISIT, a surgical navigation system developed at our institution [12,13]. VISIT supports two optical tracking systems, the FlashPoint 5000 (Image Guided Technologies, Boulder, CO) and the active Polaris (Northern Digital Inc., Waterloo, Canada). VISIT was adapted in such a manner that it
– acquires position data from the optical tracker through the HMD control unit. This was necessary in order to make sure that a continuous stream of position data from the optical tracker is available to the control unit for rendering the augmentation scenery according to the HMD's actual position, independently of the computational load on the CAS system;
– provides raw surgical planning data in the patient coordinate system for generating an OpenGL scenery used for AR visualization in the control unit of the HMD;
– supports a set of commands for additional communication between the CAS workstation and the control unit. The commands can be found in Table 1.
Table 1. The commands sent from the CAS system to the control unit of the HMD.

Command   Function
D         send position data
T         read tool list
S         read new scene file
I         read probe IDs
O         read tool offset
X         read data from the tracker
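The paper specifies only the single-letter commands, not the wire format; the following sketch of a control-unit command loop is therefore purely illustrative, assuming newline-terminated ASCII messages over the TCP/IP socket.

```python
# Illustrative command dispatcher for the HMD control unit.
# The wire format (newline-terminated ASCII) and handler bodies are assumptions.
import socket

def handle(cmd: bytes) -> bytes:
    # Stub actions corresponding to Table 1; a real control unit would
    # talk to the tracker and the renderer here.
    actions = {
        b"D": b"position data",
        b"T": b"tool list",
        b"S": b"scene file loaded",
        b"I": b"probe IDs",
        b"O": b"tool offset",
        b"X": b"tracker data",   # default command
    }
    return actions.get(cmd, actions[b"X"])

def serve(host="127.0.0.1", port=5000):
    with socket.create_server((host, port)) as srv:
        conn, _ = srv.accept()
        with conn, conn.makefile("rwb") as stream:
            for line in stream:
                stream.write(handle(line.strip()[:1]) + b"\n")
                stream.flush()
```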
The rendering of the scene visualized in the HMD is done by means of OpenGL, using the freely available Mesa 3.2 library (http://www.mesa3d.org). The interprocess communication between the two processes that visualize the AR scenery on the two displays was implemented using System V semaphores. The hardware for the control unit was a standard PC system with an Intel Pentium III CPU at 933 MHz, 512 MB RAM, an Ethernet network adapter (3Com 3C905TX), and an ASUS V7100/2V1D graphics adapter with a GeForce 2 MX graphics processor, running SuSE Linux 7.1 (SuSE GmbH, Nürnberg, Germany).
2.3 Notations and System Preparation
The vector consisting of the coordinates of the point x in the system A is expressed by xA , the transformation from system A to system B consists of a rotation RAB and a translation TAB , thus: xB = RAB xA + TAB . We use five
coordinate systems: the world coordinate system, xW, which is the system of the calibration grid; the camera system, xC = (Xc, Yc, Zc)^t, which is the system of a display in the HMD; the source system, xS, which is the system of a trackable frame rigidly mounted to the calibration grid; the system of the HMD itself, xH; and finally the system of the dynamic reference frame (DRF), xR, which is mounted on the patient. Xi, Yi denote the image coordinates, and Xf, Yf the row and column numbers of the point in the display. The constant d is the center-to-center distance from a pixel to its neighbour. The calibration of the projection parameters, as described in [11], provides the transformation RWC, TWC, the grid coordinates of the center Cx, Cy, and the effective focal length f. Immediately after this camera calibration the tracker is used to provide RSH, TSH. The transformation RWS, TWS is calculated by measuring several points of the system xW in coordinates of xS using the tracker. The last transformation, RHR, THR, is provided by the tracker while the control system is running. From the relations:
t xS = RSH (xH − TSH ) xC = RW C xW + TW C Yc Yi = f Zc 1 Yf = Yi + Cy d
(1) (2) (3) (4)
we derive xC , as a function of xR and RHR , THR . Then Xi , Yi is the result of a perspective projection, realized by the OpenGL function glFrustum with the near clipping plane in the distance f and the far clipping plane in the distance of 800 mm. The row and column numbers Xf , Yf are calculated by use of the center Cx , Cy this transformation is realized by the function glViewport, for information about OpenGL see [14]. The typical image which we want to display consists of surgical planning data which we will refer to as the scene, and the surgical instrument tool. The coordinates of the elements of the scene are given in a coordinate system which is defined by surgical mini screws, it is registered to the system xR of the DRF, by point-to-point registration which provides also the coordinates of the scene in the system xR , see [13] 2.4
Program
The program consists of three processes, which are executed in the order father, child and grandchild. The fathers main task is the communication with the CAS system and the tracker whereas the child and the grandchild display the two images. The flow diagram of the main program is shown in the left of figure 1, is contains the following steps for the father process (righthanded course): – Initialization of the semaphores to achieve the desired order of processes running which is father, child and grandchild.
48
M. Figl et al.
– Opening a TCP/IP socket and synchronize with the CAS system. At this point the control unit is waiting for the CAS system to continue. – Opening a pipe to communicate with the child and generating a child process by the fork command. – Having received a command from the CAS system the process has to respond as shown in table 1. Otherwise the default command X is assumed and the recent position data, namely RHR , THR , RDR , TDR is read from the tracker. – Sending the command of the CAS system together with the position data to the child process. After having made the child process runnable the father process returns to the beginning of its main loop. The child and grandchild processes are shown in the lower left side of the flow diagram figure 1, the lefthanded course after the first fork(), and their glut main loops are shown in the right of fig.1. – Loading of the calibration data. – Opening a pipe to communicate with the grandchild and generating a grandchild process by the fork command. – Initializing the glut window, ie starting the OpenGL window and entering the main loop. – Reading the command and the position data out of the pipe. – Calculating the minimal distance between the surgical instrument and the planned scene. If this distance is below a previously chosen threshold the scene is painted using wire framed surfaces otherwise it is painted using full surfaces. – Depending on the command received from the CAS system via the pipe the two processes have to draw the scene using new position data or to load the new scene. – Command and position data are sent to the grandchild. After having made the succeeding process runnable the processes return to the beginning of their main loops.
3
Results
In an experimental study we attached 16 steel spheres with 4 mm diameter to the base of a skull. They were identified as preoperative planned targets and the xR coordinates of their locations were sent to the control unit of the HMD. Figure 2b shows the view as seen in the HMD, the image was taken by a camera mounted on the left eyepiece. The Flashpoint 5000 tracking system flashes the IR-LEDs separately one after the other with a maximum frequency of about 300 IR-LEDs per second. Therefore we expected to get lower frame rates for higher numbers of LEDs, whereas the Polaris system illuminates the LEDs all at the same time. We achieved at least 30 frames per second using the Polaris tracker, and no more than 18.5 with the Flashpoint tracker. The results are described in table 2, the framerate is the average framerate of 10 measurements, each having
PC-Based Control Unit for a Head Mounted Operating Microscope
49
Fig. 1. Flow diagram of the main program and the childs main loop. The father process communicates with the CAS system and the tracker. The child and the grandchild process display the two images.
Fig. 2. a) The skull as seen through the Varioscope AR before augmentation is switched on. b) The augmented scenery as seen through the Varioscope AR. The steel spheres are overlaid by their virtual counterparts on CT. A needle-shaped surgical instrument approaches from the right.
a duration of 10 seconds. It is evident that the framerate varies considerably when using the Polaris tracker. The reason for this behavior is not completely clear, we assume a synchronization problem in the tracker’s internal hardware to be the source.
50
M. Figl et al.
Table 2. The framerate using the two different optical trackers with different numbers of tools and LEDs. σ denotes the standard deviation Polaris Tools 3 Tools, 11 LEDs 3 Tools, 12 LEDs 2 Tools, 8 LEDs 2 Tools, 7 LEDs
4
Framerate 45.33 40.35 33.54 32.8
σ 10.18 11.45 3.52 2.62
Flashpoint Tools 3 Tools, 19 LEDs 3 Tools, 14 LEDs 3 Tools, 8 LEDs 2 Tools, 6 LEDs
Framerate 13.45 17.71 18.37 18.5
σ 0.05 0.03 0.05 0.01
4 Discussion and Conclusions
We believe that the future of CAS depends on two factors. First, the development of satisfying solutions for a wide variety of clinical specialties beyond the current classic application fields is to be fostered. Second, the intuitive man-machine interface for these solutions has to reach a stage where the CAS-system becomes an ’embedded’ system in the sense that it is no longer a dominant additional high-tech apparatus in the operating room but a simple, easy-to-use piece of equipment providing an aid to the surgeon in clinical routine use. AR will be a part of this development for providing an intuitive visual feedback on surgical progress, provided that interfacing such an AR system with sufficient performance concerning image quality, ease of use, and latency is possible for existing CAS infrastructure at reasonable cost. The possible performance of such an approach for visualization in the Varioscope AR is documented in this paper. With a simple off-the-shelf computer, stereoscopic visualization is achieved at frame rates in the range of 40 Hz; due to the sophisticated structure of the HMD-control software, the only limiting factor for even increased visualization speed is the tracking system. The fact that all information necessary (mainly the OpenGL-scenery in patient space provided by the external CAS system, and the optical tracker data provided by the HMD control system) is shared using a standard TCP/IP connection makes the integration of such a system into an arbitrary navigation system easy.
Acknowledgment. The authors wish to thank Dr. M. Lehrl and her colleagues at Life Optics for the close cooperation and the excellent relationship. This work was supported by the Austrian Science Foundation FWF under research grant P 12464 MED.
References 1. M. Blackwell, C. Nikou, A. M. DiGioia 3rd, T. Kanade: ”An image overlay system for medical data visualization”, Med Image Anal 4(1), 67-72, (2000). 2. S. Nakajima, K. Nakamura, K. Masamune, I. Sakuma, T. Dohi: ”Three-dimensional medical imaging display with computer-generated integral photography”, Comput Med Imaging Graph 25(3), 235-241, (2001).
3. D. W. Roberts, J. W. Strohbehn, J. F. Hatch, W. Murray, H. Kettenberger: "A frameless stereotaxic integration of computerized tomographic imaging and the operating microscope", J Neurosurg 65(4), 545-549, (1986).
4. D. W. Roberts, T. Nakajima, B. Brodwater, J. Pavlidis, E. Friets, E. Fagan, A. Hartov, J. Strohbehn: "Further development and clinical application of the stereotactic operating microscope", Stereotact Funct Neurosurg 58(1-4), 114-117, (1992).
5. R. Shahidi, R. Mezrich, D. Silver: "Proposed simulation of volumetric image navigation using a surgical microscope", J Image Guid Surg 1(5), 249-265, (1995).
6. P. J. Edwards, A. P. King, C. R. Maurer Jr, D. A. de Cunha, D. J. Hawkes, D. L. Hill, R. P. Gaston, M. R. Fenlon, A. Jusczyzck, A. J. Strong, C. L. Chandler, M. J. Gleeson: "Design and evaluation of a system for microscope-assisted guided interventions (MAGI)", IEEE Trans Med Imaging 19(11), 1082-1093, (2000).
7. H. Fuchs, M. A. Livingston, R. Raskar, et al.: "Augmented Reality Visualization for Laparoscopic Surgery", in W. M. Wells, A. Colchester, S. Delp (eds.): "Medical Image Computing and Computer-Assisted Intervention - MICCAI'98", Springer LNCS 1496, 934 pp., (1998).
8. C. Maurer Jr, F. Sauer, B. Hu, B. Bascle, B. Geiger, F. Wenzel, F. Recchi, T. Rohlfing, C. Brown, R. Bakos, R. Maciunas, A. Bani-Hashemi: "Augmented reality visualization of brain structures with stereo and kinetic depth cues: System description and initial evaluation with head phantom", in Medical Imaging 2001: Visualization, Display and Image-Guided Procedures, Seong Ki Mun, Editor, Proceedings of the SPIE Vol. 4319, 445-456, (2001).
9. M. Blackwell, F. Morgan, A. M. DiGioia 3rd: "Augmented reality and its future in orthopaedics", Clin Orthop 354, 111-122, (1998).
10. R. Y. Tsai: "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses", IEEE Trans Robotic Autom RA-3(4), 323-344, (1987).
11. W. Birkfellner, M. Figl, K. Huber, J. Hummel, R. Hanel, P. Homolka, F. Watzinger, F. Wanschitz, R. Ewers, H. Bergmann: "Calibration of projection parameters in the Varioscope AR, a head-mounted display for augmented reality visualization in image-guided therapy", in Medical Imaging 2001: Visualization, Display, and Image-Guided Procedures, Seong Ki Mun, Editor, Proceedings of SPIE Vol. 4319, 471-480, (2001).
12. W. Birkfellner, K. Huber, A. Larson, D. Hanson, M. Diemling, P. Homolka, H. Bergmann: "A modular software system for computer-aided surgery and its first application in oral implantology", IEEE Trans Med Imaging 19(6), 616-620, (2000).
13. W. Birkfellner, P. Solar, A. Gahleitner, K. Huber, F. Kainberger, J. Kettenbach, P. Homolka, M. Diemling, G. Watzek, H. Bergmann: "In-vitro assessment of a registration protocol for image guided implant dentistry", Clin Oral Implants Res 12(1), 69-78, (2001).
14. OpenGL Architecture Review Board: "OpenGL Reference Manual - Second Edition", Editors: R. Kempf, C. Frazier, Addison-Wesley, (1997).
Technical Developments for MR-Guided Microwave Thermocoagulation Therapy of Liver Tumors
Shigehiro Morikawa 1, Toshiro Inubushi 1, Yoshimasa Kurumi 2, Shigeyuki Naka 2, Koichiro Sato 2, Tohru Tani 2, Nobuhiko Hata 3, Viswanathan Seshan 4, and Hasnine A. Haque 4
1 Molecular Neuroscience Research Center, 2 1st Dept. of Surgery, and Department of Neurosurgery, Shiga University of Medical Science, Seta Tukinowa-cho, Ohtsu, Shiga 520-2192, Japan
3 Graduate School of Information Science and Technology, The University of Tokyo, Hongo 7-3-1, Bunkyo-ku, Tokyo 113-8656, Japan
4 GE-Yokogawa Medical Systems, 4-7-127, Asahigaoka, Hino, Tokyo 191-8503, Japan
Abstract. We have started clinical studies of MR-guided thermocoagulation therapy of liver tumors. Over two years of this therapy, we have developed new instruments, such as a filter to reduce the noise induced in MRI by the microwave coagulator, a new MR-compatible electrode allowing easy detection of the tip position, and an MR-compatible endoscopic system for a trans-diaphragmatic approach to liver tumors just below the diaphragm. Concerning software, a program was modified for the real-time display of the MR temperature map with a scale bar. The navigation software 3D Slicer was customized to add real-time image navigation capability. Re-sliced images in two perpendicular planes complemented the limitations of the real-time MR image, which is acquired in 2–3 seconds. These technical developments play important roles in more accurate, safer, and easier treatment.
Introduction

An open configuration MR system has enabled new minimally invasive surgical techniques under the guidance of MR images [1]. Laser beams [2], radiofrequency ablation [3], and cryosurgery [4] have been utilized as thermo-ablation therapies of various tumors in the MR environment. In Japan, microwave coagulators operating at 2.45 GHz have been developed in liver surgery for hemostasis and tissue destruction for over two decades [5, 6]. They have also been used in interventional therapy for liver tumors under ultrasonographic or laparoscopic guidance [7, 8]. Microwave thermocoagulation therapy has already been established as a useful minimally invasive therapy for liver tumors in Japan. Since microwave ablation does not cause electromagnetic interference in MR images [9], MR temperature monitoring during ablation might be possible for real-time evaluation of the therapeutic effects. Therefore, we have started clinical studies of MR-guided microwave thermocoagulation therapy of liver tumors and have treated more than 100 cases since January 2000. In the course of this therapy over two years, we have developed new MR-compatible surgical instruments and a computer-assisted image navigation system for more accurate, safer, and
easier treatment. In the present study, the technical developments for this therapy will be presented.
Methods

All MR data were collected on a double-donut type 0.5 T SIGNA SP/i system (GE Medical Systems, Milwaukee, WI) (Fig. 1A). In this system, a FlashPoint Model 5000 (Image Guided Technologies Inc., Boulder, CO) was included for image plane control. The surgeon can determine the image plane, which exactly includes the path of the needle, with a 3-point hand piece carrying 3 LEDs (Fig. 1B); three detectors for the infrared light from the LEDs are fixed on the ceiling. Real-time MR images of 256 × 128 resolution for fluoroscopy were collected using a spoiled gradient echo (SPGR) sequence with 14 ms TR, 3.4 ms TE, and 30 cm FOV. The flip angle was adjusted in a range of 30–70 degrees to obtain good contrast of the target. Temperature mapping data were acquired using SPGR with 50 ms TR, 12 ms TE, and 24 cm FOV. Temperature changes were calculated by the proton resonance frequency method [10].

A microwave coagulator, Microtaze (Model OT-110M, Azwell, Osaka, Japan), operated at 2.45 GHz, was used as the heating device (Fig. 2A). A custom-made notch filter was inserted in the output line in order to reduce noise during ablation. MR-compatible needle-type electrodes (250 mm long, 1.6 mm in diameter) were custom-made from brass coated with silver and gold (Fig. 2B). Generally, the procedures were carried out under general anesthesia. Liver tumors were percutaneously punctured with a 15 cm long 14G MR-compatible biopsy needle (Daum, Schwerin, Germany) under the guidance of real-time MRI. The electrode was inserted into the tumor through the outer sheath of the needle. Usually, three 60-second ablations were repeated at the same point. Preliminary studies during laparotomy revealed that such ablations caused an oval-shaped coagulated area 20 mm in diameter and 30 mm long along the axis of the electrode. Ablations and punctures were repeated depending on the size and number of tumors.

To prepare an MR-compatible endoscopic system, the ferromagnetic parts of a CCD camera and C-mount lens (CN42H and T627R, Elmo, Japan) were replaced with non-magnetic ones and combined with a 220 mm long non-magnetic telescope (K7210AWA, Storz, Germany). This endoscopic system was used as a thoracoscope for the percutaneous, trans-diaphragmatic puncture of liver tumors just below the diaphragm. The endoscopic image was combined with the MR images using the picture-in-picture function of a video mixer (MX-1, Futek, Tokyo, Japan).

The navigation software 3D Slicer [11, 12], which was developed at Brigham & Women's Hospital, Boston, MA, was installed on an independent SUN Ultra 60 workstation connected with the MR system by network. High-resolution 3D MR volume data were acquired just before the surgical procedures and were registered in the 3D Slicer. The display of the workstation was converted to an NTSC signal by a downconverter (DC 65A, Imagenics, Tokyo, Japan) and sent to the surgeons through a video switcher of the MR system.
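The proton resonance frequency (PRF) method referenced above computes temperature change from the phase difference between gradient-echo images; a common formulation is ∆T = ∆φ / (α · γ · B0 · TE), with α ≈ −0.01 ppm/°C. The sketch below illustrates this relation; the constants and the two-image workflow are standard PRF-thermometry practice rather than details taken from this paper.

```python
# Sketch of PRF-shift temperature mapping: dT = dphi / (alpha * gamma * B0 * TE).
# Constants are standard literature values, not parameters from this paper.
import numpy as np

GAMMA = 2 * np.pi * 42.576e6   # proton gyromagnetic ratio [rad/s/T]
ALPHA = -0.01e-6               # PRF thermal coefficient [per degC]

def temperature_change(phase_ref, phase_hot, b0=0.5, te=12e-3):
    """Temperature-change map [degC] from reference and heated phase images [rad]."""
    dphi = np.angle(np.exp(1j * (phase_hot - phase_ref)))  # wrap to (-pi, pi]
    return dphi / (ALPHA * GAMMA * b0 * te)
```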
Fig. 1. (A) A double-donut type 0.5 T SIGNA SP/i system (GE Medical Systems, Milwaukee, WI) and (B) a hand piece for the image plane navigation system, FlashPoint Model 5000 (Image Guided Technologies Inc., Boulder, CO)
Fig. 2. (A) A microwave coagulator, Microtaze (Model OT-110M, Azwell, Osaka, Japan). A custom-made notch filter was inserted in the output line (arrow). (B) MR-compatible needle type electrodes
Results and Discussion

Instruments for Microwave Ablation

Initially, as shown in Fig. 3A, electromagnetic interference appeared during microwave irradiation. A notch filter, which attenuates 40 dB at 21.25 MHz (the 1H resonance frequency at 0.5 T) and 0.3 dB at 2.45 GHz, was inserted in the output line (Fig. 2A, arrow). After the installation of the filter, the noise in MRI disappeared even during microwave ablation (Fig. 3B). For the microwave ablation, the electrode tip should protrude at least 2 cm from the needle. The 14G needle was easily detected in real-time MRI, but the electrode tip was not always visible, because its susceptibility effect was only slight. This was advantageous for the data collection of temperature mapping. The length of the protrusion was controlled using a plastic tube as a stopper. However, in some cases, accurate detection of the electrode tip position was required. For such cases, another type of "firefly" electrode, which contains a small stainless steel part in the tip, was made (Fig. 4A). The signal defect exactly corresponded to the tip position under our real-
time MR imaging condition. It could not be used for temperature data acquisition, but it enabled a safer and more accurate puncture of tumors near large vessels (Fig. 4B and C).
Fig. 3. Real time MR images of the liver without (A) and with (B) a notch filter
Fig. 4. (A) An MR image of standard (a) and “firefly” (b) type electrodes in an agar phantom. (B) Axial and (C) sagittal view of liver during puncture for a tumor near the inferior vena cava using the “firefly” type electrode
MR-Compatible Endoscopic System

For the treatment of a tumor located just below the diaphragm, percutaneous puncture from the abdominal wall is not easy. As shown in Fig. 5, the route of the needle is long and might injure vessels or bile ducts. For such a tumor, a trans-diaphragmatic approach is much easier and safer. To realize this approach, an MR-compatible thoracoscope was combined with MR guidance. Both the MR images and the endoscopic image were sent to the surgeon using the picture-in-picture function (Fig. 6). Main and accessory images were exchanged depending on the surgeon's request. In this approach, the right lung is collapsed under general anesthesia using a double-lumen endotracheal tube. The microwave ablation can then be carried out without concern about heat injury to the lung. The thoracoscope could also detect bleeding from the diaphragm or chest wall, which could easily be stopped by the microwave. The combination of surface information from the endoscope and depth information from MRI was found to be very useful for increasing the safety, reliability, and availability of this procedure.
Fig. 5. Real time MR images during the puncture of a liver tumor located just below the diaphragm from the abdominal wall
Fig. 6. Combined display of real time MR and thoracoscopic images for the transdiaphragmatic puncture of the liver tumor
MR Temperature Monitoring

Initially, MR temperature monitoring during microwave ablation was impossible because of the severe noise. After the installation of the filter, good temperature maps could be obtained with an agar phantom and beef liver (Fig. 7). The software was modified for real-time color display of the temperature map with a scale. In the clinical cases, however, temperature monitoring was not easy. The most serious factor is the movement of the liver. Movement of the surgeons and surgical instruments in the operating field also affected the magnetic field and the results of the temperature calculation. MR temperature data during microwave ablation were therefore collected while suspending the artificial ventilation (Fig. 8). In practice, it also takes some time to change the acquisition parameters from real-time MRI to temperature mapping, which makes temperature monitoring difficult; at present, temperature monitoring of every ablation in the procedure is impossible. In order to utilize the MR temperature map for the evaluation of the therapeutic effects of this procedure, further developments are required. A dynamic study using contrast media at the end of the procedure is currently used for the evaluation.

Customization and Utilization of the Navigation Software 3D Slicer

The 3D Slicer, which we obtained from Brigham & Women's Hospital, did not originally support real-time image navigation. A program was prepared on the SIGNA SP system to transfer the information of the hand piece position and the real-time MR images to the 3D Slicer continuously. The 3D Slicer was customized to display the transferred real-time MR image and two re-sliced images (in the same plane and the plane perpendicular to the real-
time image) from the 3D volume data registered beforehand (Fig. 9). In other words, all three images were controlled by the surgeon using the hand piece. The resolution and quality of the real-time MR images were not necessarily satisfactory, because only a limited time of 2–3 seconds was allowed for their acquisition. The two re-sliced images showed better image quality and a faster response to the hand piece. The images in the two perpendicular planes made the recognition of the target position easy. Of course, the needle position must be recognized on the real-time image, but the re-sliced images complemented the limitations of the real-time MR image. This combination enabled us to accomplish more accurate and safer image navigation in the liver. The tumor in Fig. 9 (white arrow), which was located near the hepatic vein and the inferior vena cava, was successfully punctured without injuring the vessels. The information about the hand piece status (visible or blocked) is very important, because the needle might be dislocated from the target. To convey this status to the surgeon, a button was added to the menu bar (Fig. 9, black arrow). In addition, a flip function for each image display (for the convenience of the procedure) and a color display with a scale bar for the real-time temperature map were added to the 3D Slicer. In the cases with the thoracoscope, the endoscopic image was combined into the 3D Slicer display and sent to the surgeon (Fig. 10). This software is now used routinely in this procedure.
Fig. 7. Temperature maps of an agar phantom (upper) and beef liver (lower) during microwave ablation at 0, 1, 2, and 3 minutes
Fig. 8. Temperature maps of a liver tumor during 1-min microwave thermo-coagulation therapy
Fig. 9. Real time image navigation for microwave thermocoagulation therapy of a liver tumor using the 3D Slicer. The upper left is a real time MR image and the lower two are re-sliced images (same and perpendicular planes to the real-time image) from the 3D volume data. The tumor is indicated with white arrows. The hand piece status (visible or blocked) is displayed in the menu bar (black arrow)
Fig. 10. Real-time image navigation for thoracoscope-assisted microwave thermocoagulation therapy of a liver tumor using the 3D Slicer. The upper right is an endoscopic image showing a needle passing through the diaphragm. The arrangement of the other windows is the same as in Fig. 9. The button (black arrow) indicates in red that the hand piece is blocked
Conclusions

For the MR-guided thermocoagulation therapy of liver tumors, we have developed new instruments, such as the notch filter, the electrodes, and the endoscope, as well as software for temperature monitoring and image navigation. These technical developments play important roles in more accurate, safer, and easier treatment.
Acknowledgments We are indebted to Dr. F. Jolesz and Dr. R. Kikinis (Brigham & Women’s Hospital) for donating the 3D Slicer.
References
1. Schenck, J.F., Jolesz, F.A., Roemer, P.B., et al.: Superconducting open-configuration MR imaging system for image-guided therapy. Radiology 195 (1995) 805–814.
2. Schulze, C.P., Kahn, T., Harth, T., Schwarzmaier, H.-J., Schober, R.: Correlation of neuropathologic findings and phase-based MRI temperature maps in experimental laser-induced interstitial thermotherapy. J. Magn. Reson. Imaging 8 (1998) 115–120.
3. Lewin, J.S., Connell, C.F., Duerk, J.L., et al.: Interactive MRI-guided radiofrequency interstitial thermal ablation of abdominal tumors: clinical trial for evaluation of safety and feasibility. J. Magn. Reson. Imaging 8 (1998) 40–47.
4. Tacke, J., Adam, G., Haage, P., Sellhaus, B., Grosskortenhaus, S., Gunther, R.W.: MR-guided percutaneous cryotherapy of the liver: in vivo evaluation with histologic correlation in an animal model. J. Magn. Reson. Imaging 13 (2001) 50–56.
5. Tabuse, K.: A new operative procedure of hepatic surgery using a microwave tissue coagulator. Arch. Jpn. Chir. 48 (1979) 160–172.
6. Hamazoe, R., Hirooka, Y., Ohtani, S., Kato, T., Kaibara, N.: Intraoperative tissue coagulation as treatment for patients with nonresectable hepatocellular carcinoma. Cancer 75 (1995) 794–800.
7. Seki, T., Wakabayashi, M., Nakagawa, T., et al.: Ultrasonically guided percutaneous microwave coagulation therapy for small hepatocellular carcinoma. Cancer 74 (1994) 817–825.
8. Ido, K., Isoda, N., Kawamoto, C., et al.: Laparoscopic microwave coagulation for solitary hepatocellular carcinoma performed under laparoscopic ultrasonography. Gastrointest. Endosc. 45 (1997) 415–420.
9. Chen, J.C., Moriarty, J.A., Derbyshire, J.A., et al.: Prostate cancer: MR imaging and thermometry during microwave thermal ablation: initial experience. Radiology 214 (2000) 290–297.
10. Ishihara, Y., Calderon, A., Watanabe, H., et al.: A precise and fast temperature mapping using water proton chemical shift. Magn. Reson. Med. 34 (1995) 814–823.
11. Jolesz, F.A., Nabavi, A., Kikinis, R.: Integration of interventional MRI with computer-assisted surgery. J. Magn. Reson. Imaging 13 (2001) 69–77.
12. Gering, D.T., Nabavi, A., Kikinis, R., et al.: An integrated visualization system for surgical planning and guidance using image fusion and an open MR. J. Magn. Reson. Imaging 13 (2001) 967–975.
Robust Automatic C-Arm Calibration for Fluoroscopy-Based Navigation: A Practical Approach H. Livyatan, Z. Yaniv, and L. Joskowicz School of Computer Science and Engineering The Hebrew University of Jerusalem, Jerusalem 91904, Israel. {livyatan,zivy,josko}@cs.huji.ac.il
Abstract. This paper presents a new on-line automatic X-ray fluoroscopic C-arm calibration method for intraoperative use. C-arm calibration is an essential prerequisite for accurate X-ray fluoroscopy-based navigation and image-based registration. Our method utilizes a custom-designed calibration ring with a two-plane pattern of fiducials that attaches to the C-arm image intensifier, and an on-line calibration algorithm. The algorithm is robust, fully automatic, and works with images containing anatomy and surgical instruments which cause fiducial occlusions. It consists of three steps: fiducial localization, distortion correction, and camera calibration. Our experimental results show submillimetric accuracy for calibration and tip localization with occluded fiducials.
1 Introduction
Research in computer-aided surgery (CAS) has focused on developing X-ray fluoroscopy-based systems to improve the surgeons’ hand/eye coordination, to improve the accuracy and repeatability of surgical gestures, to reduce cumulative radiation, and to shorten surgery times. Fluoroscopy-based CAS systems include virtual fluoroscopy navigation, CT-based navigation with 2D/3D image registration, and tool and robot localization [6]. An essential prerequisite in all these systems is the calibration of the X-ray fluoroscopic unit. Recently, many works have focused on X-ray fluoroscopy calibration both in academia [1,4,8,9,12,13] and in industry (FluoroNav, Medtronic Sofamor Danek, USA, and SurgiGATE AG, Medivision, Switzerland). These works show that the localization error is significant (up to 5mm in older units) and that the calibration parameters are orientation dependent, so they must be corrected independently for each orientation. Two approaches have been proposed: an off-line approach [1,12] and an on-line approach [4,8,9,13]. In the off-line approach, the calibration parameters are computed for a fixed set of C-arm orientations before the surgery starts. In the on-line approach, the
This research was supported in part by a grant from the IZMEL Consortium on Image-Guided Therapy, Israel Ministry of Industry and Trade. We thank Neil Glossop from Traxtal Technologies for the design and manufacturing of the phantom.
parameters are computed anew for each image during surgery. The off-line approach allows for larger calibration phantoms and denser grids, and produces images which are simpler to analyze, since only fiducials are present in the image. However, it requires a separate calibration step performed by an X-ray technician and limits the viewing angles that can be used. The on-line approach does not have these limitations but requires a more sophisticated image processing algorithm. It presents trade-offs between calibration phantom size, grid density, and accessibility on the one hand, and robustness and accuracy on the other. The latter critically depend on the precise localization of the fiducial centers and their pattern. Published methods only address these problems partially.
2 Materials and Methods
We have developed a fully automatic on-line X-ray fluoroscopic image calibration method that includes a calibration phantom and a calibration algorithm. The phantom and the algorithm were developed after considering and experimenting with the trade-offs of on-line calibration, and with special attention to robustness and accuracy in actual surgical situations in which X-ray images include anatomy, implants, and surgical tools that occlude some of the calibration fiducials. The algorithm detects when the calibration cannot be performed accurately due to poor image quality or too many occluded fiducials. We co-designed with Traxtal Technologies (Toronto, Canada) the FluoroTrax, an optically tracked on-line C-arm calibration phantom. The phantom consists of a ring frame on which 32 LEDs for optical tracking are mounted and two parallel radiolucent plates 76mm apart with 120 embedded fiducial steel balls of two diameters (2mm and 3mm). The small fiducials are arranged in a regular Cartesian grid pattern with 20mm spacing between their centers. The big fiducials (21 of them) are all in the upper plate and form a U-shaped pattern consisting of a pair of parallel lines intersected by an orthogonal line. The ring frame attaches with three fast-release clamps to the C-arm’s image intensifier (in our case, a Philips BV29 unit with a 9” field of view). The phantom is isolated from the sterile surgical field by wrapping it with a transparent plastic sheet. The three-step calibration algorithm computes the calibration parameters from a single X-ray image and from a spatial model of the fiducial centers. First, it locates the fiducials and their patterns and matches them to the model. It then computes the calibration parameters and the distortion correction.
2.1 Fiducial and Calibration Pattern Localization
Accurate and robust localization of the fiducials and their patterns is the most important step of the calibration process, since all parameters critically depend on it. The localization consists of finding the fiducial centers in the X-ray fluoroscopic image and pairing them with the corresponding fiducial centers in the model. To be practical, this step must be real-time, fully automatic, and take into account local intensity variations across the image. Fig. 1 illustrates the process.
Fig. 1. Illustration of the fiducial and calibration pattern localization step: (a) original image; (b) after subtraction and NCC thresholding; (c) U-shaped pattern localization; (d) missing fiducials localization; (e) entropy of the grid point projections as a function of their orientation. White dots show fiducial centers. Fiducials inside black boxes are U-shaped pattern fiducials, inside white boxes are grid pattern fiducials, and inside gray boxes are recovered fiducials.
The algorithm finds the fiducials and their centers and then matches the U-shaped and the grid patterns to them. The search for fiducials proceeds as follows. First, the algorithm subtracts from the original X-ray fluoroscopic image a background image of it from which the fiducials have been morphologically removed. The background image is computed by performing a median filter on the original image with a square kernel whose size is the small fiducials’ diameter [9]. Since the gray level values of the pixels occupied by the fiducials in the original image are lower than that of the background, the gray level values at those pixels will be negative after the subtraction. Those pixels are candidates for fiducial locations (Fig. 1(b)). To determine which pixels are fiducial pixels, the algorithm computes the Normalized Cross Correlation (NCC) values with two circle templates (one for big and one for small fiducials) at those pixels. Candidate pixels whose NCC value is greater than a predetermined threshold value are fiducial pixels. Those pixels form clusters of candidate fiducial center locations. To find the actual fiducial centers, the algorithm segments out each fiducial with a local gray-level threshold computed in the vicinity of the candidate fiducial pixels. For each fiducial cluster, it finds the pixel with the highest NCC value. Two concentric squares are centered at this pixel. The pixel gray-level mode value inside the inner square is defined as the fiducial gray-level value. The pixel gray-level mode value inside the outer square is defined as the background gray-level value. The average of these two defines the local fiducial segmentation threshold. The fiducial center is the weighted center of gravity of the resulting segmentation. To find the U-shaped pattern which determines the coordinate frame of the phantom, the algorithm first finds the orientation of the phantom grid by rotating the fiducial centers c_i by an angle θ_j and projecting them on the horizontal axis, obtaining a value x_i(θ_j). The algorithm computes the histogram entropy of the projected values {x_i(θ_j)} for uniformly sampled angles θ_j:

$$\mathrm{entropy}(\theta_j) = -\sum_{i=1}^{n} p\bigl(x_i(\theta_j)\bigr)\,\log_2 p\bigl(x_i(\theta_j)\bigr)$$
where p(x) is the probability of x in its histogram. The angle θ_j which yields the lowest entropy corresponds to the grid orientation. Fig. 1(e) shows an example of the entropy graph for a range of angles. Note that the graph has two very similar minimum values. The first minimum corresponds to grid lines, and the second corresponds to grid diagonals. Due to sampling and numerical inaccuracies, the two values might be interchanged, leading to the wrong conclusion. The algorithm determines which one of the two minima corresponds to grid lines by comparing the distances between the lines in both orientations and choosing the smallest, since the distance between diagonals is smaller than the distance between grid lines.
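For concreteness, the entropy criterion can be sketched as follows (a minimal NumPy illustration, not the authors' implementation; the angular sampling and the histogram bin width are assumed, since the paper does not specify them):

```python
import numpy as np

def grid_orientation(centers, n_angles=120, bin_width=2.0):
    """Estimate the grid orientation by minimizing the histogram entropy of
    the fiducial centers projected onto the horizontal axis (Section 2.1).
    centers: (n, 2) array of fiducial center coordinates in pixels.
    bin_width: histogram bin width in pixels (an assumed parameter)."""
    best_theta, best_entropy = None, np.inf
    for theta in np.linspace(0.0, np.pi, n_angles, endpoint=False):
        c, s = np.cos(theta), np.sin(theta)
        x = centers[:, 0] * c - centers[:, 1] * s  # rotate, keep x-projection
        edges = np.arange(x.min(), x.max() + bin_width, bin_width)
        hist, _ = np.histogram(x, bins=edges)
        p = hist[hist > 0] / hist.sum()            # empirical probabilities
        entropy = -(p * np.log2(p)).sum()
        if entropy < best_entropy:
            best_theta, best_entropy = theta, entropy
    return best_theta, best_entropy
```

As the text notes, the returned minimum must still be disambiguated from the near-equal diagonal minimum by comparing the line spacings in the two candidate orientations.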
Fig. 2. Robust fiducial and U-shaped pattern localization on two X-ray fluoroscopic images. The left image illustrates robust pattern detection with missing fiducials (at the top left of the image). The right image illustrates robust fiducial detection with fiducial occlusions (indicated by the arrows in the image).
The algorithm proceeds to identify the fiducials forming the U-shaped pattern by assigning a weight to each fiducial according to its NCC and gray-level pixel values. The weights define two clusters, for the small and big fiducials. The two parallel lines and the orthogonal line with the largest number of big fiducials are the U-shaped pattern (Fig. 1(c)). Finally, the algorithm attempts to recover additional fiducials at grid intersections where no fiducials were found by repeating the steps above with a lower threshold on the NCC value (Fig. 1(d)). Fig. 2 illustrates fiducial and pattern localization on two realistic images.
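The detection front end described at the beginning of this section can be sketched compactly (SciPy median filter for the background image, scikit-image normalized cross-correlation; the kernel size and NCC threshold are assumed values, and the subsequent local-threshold segmentation and weighted center-of-gravity computation are omitted):

```python
import numpy as np
from scipy.ndimage import median_filter
from skimage.feature import match_template

def candidate_fiducial_pixels(image, fid_diameter_px=9, ncc_threshold=0.6):
    """Candidate fiducial pixels as in Section 2.1: background subtraction
    followed by NCC thresholding against a circle template."""
    # Background with fiducials morphologically removed: median filter with
    # a square kernel whose size matches the small-fiducial diameter [9].
    background = median_filter(image, size=fid_diameter_px)
    diff = image.astype(float) - background  # fiducial pixels become negative
    candidates = diff < 0
    # Dark-disk template of the fiducial diameter (fiducials appear darker
    # than their surroundings in the X-ray image).
    r = fid_diameter_px // 2
    yy, xx = np.mgrid[-r:r + 1, -r:r + 1]
    template = 1.0 - (xx**2 + yy**2 <= r**2).astype(float)
    ncc = match_template(image.astype(float), template, pad_input=True)
    return candidates & (ncc > ncc_threshold)
```

A second pass with a lower threshold recovers missing fiducials at predicted grid intersections, as in Fig. 1(d).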
2.2 Camera Calibration
We model the C-arm as a pin-hole camera with distortion [1,4,12]. Camera calibration consists of computing intrinsic (focal length, image center coordinates, horizontal and vertical pixel scales) and extrinsic (spatial position and orientation of the camera) camera parameters. We implemented two well-known camera calibration methods: Faugeras’ method [2] and the linear method described in [3,11]. Faugeras’ method requires at least six points in general position, while the linear method requires nine points on two planes. Both methods are applicable to the FluoroTrax, which has more points. As we will see next, both algorithms are experimentally comparable. The calibration result is significantly dependent on the fiducial center locations passed to either calibration method. Using fiducial center locations as found in the distorted image will lead to incorrect results. A simple solution is to use only fiducials central to the image where the distortion is minimal. We offer a more complex solution that leads to better results. In order to calculate the locations of the dewarped fiducial centers belonging to the upper plate of the FluoroTrax, we need to find the correct location of its dewarped grid. The center of the grid might be computed by intersecting the grid lines passing through it. The grid interval is a predefined constant and should not
be computed, due to the adjacency of the upper plate to the image intensifier. The locations of the dewarped fiducial centers belonging to the lower plate are computed by the two-pass mesh warping algorithm described in [10], based on the correction of the fiducial centers in the upper plate. These dewarped fiducial centers enable proper calibration.
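Either calibration method estimates a projective camera from matched 3D–2D fiducial correspondences. As a generic stand-in — not the exact formulation of [2] or [3,11] — the standard direct linear transform recovers the 3x4 pin-hole projection matrix from six or more points in general position:

```python
import numpy as np

def dlt_projection_matrix(world_pts, image_pts):
    """Least-squares estimate of a 3x4 projection matrix P from n >= 6
    correspondences (standard DLT sketch).
    world_pts: (n, 3) fiducial centers in phantom coordinates (mm).
    image_pts: (n, 2) dewarped fiducial centers in pixels."""
    rows = []
    for (X, Y, Z), (u, v) in zip(world_pts, image_pts):
        rows.append([X, Y, Z, 1, 0, 0, 0, 0, -u*X, -u*Y, -u*Z, -u])
        rows.append([0, 0, 0, 0, X, Y, Z, 1, -v*X, -v*Y, -v*Z, -v])
    A = np.asarray(rows, dtype=float)
    # Null-space solution: right singular vector of the smallest singular value.
    _, _, vt = np.linalg.svd(A)
    return vt[-1].reshape(3, 4)
```

The intrinsic parameters (focal length, image center, pixel scales) then follow from a standard decomposition of the left 3x3 block of P.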
2.3 Distortion Correction
Distortion correction consists of computing a dewarping function that takes a point in the image and returns its corrected location. The dewarping function is computed by comparing the locations of the fiducial centers with their spatial locations projected by the calibrated camera. The dewarping function can be computed for the entire image (globally) or for portions of it (locally). In the global approach, the dewarping function parameters are computed by fitting a surface, such as a bi-cubic polynomial, to the fiducial centers [4,13]. In the local approach, each grid square defined by four contiguous fiducials defines a region with its own dewarping function [8,12]. The local approach requires that all fiducial centers be detected, which is only realistic in off-line calibration. The global approach is better suited for on-line calibration since we can fit a surface to the fiducial centers even when some are missing, although their absence degrades the accuracy. We fit a cubic B-spline to each horizontal and vertical line of fiducial centers and then apply the two-pass mesh warping algorithm described in [10].
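The global approach can be illustrated with a bi-cubic polynomial fit as in [4,13] (a least-squares sketch; the authors' own correction uses cubic B-splines per grid line and two-pass mesh warping [10], which this does not reproduce):

```python
import numpy as np

def fit_polynomial_dewarp(distorted, ideal, degree=3):
    """Fit one bi-cubic polynomial per output coordinate, mapping distorted
    fiducial centers to their ideal (projected) locations.
    distorted, ideal: (n, 2) matched fiducial centers in pixels."""
    x, y = distorted[:, 0], distorted[:, 1]
    # Design matrix of monomials x^i * y^j with i + j <= degree.
    terms = [x**i * y**j for i in range(degree + 1)
             for j in range(degree + 1 - i)]
    A = np.stack(terms, axis=1)
    coef_u, *_ = np.linalg.lstsq(A, ideal[:, 0], rcond=None)
    coef_v, *_ = np.linalg.lstsq(A, ideal[:, 1], rcond=None)
    # Apply the same monomial basis to any image point to dewarp it.
    return coef_u, coef_v
```

Because the fit is global, it degrades gracefully when some fiducials are occluded, which is what makes it suitable for on-line calibration.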
3 Experimental Results
We conducted three sets of experiments to evaluate the robustness and accuracy of our calibration method. The first quantifies the accuracy of the distortion correction process. The second quantifies the sensitivity of the calibration process to missing fiducials. The third quantifies the accuracy of the entire process and compares the two camera calibration methods discussed above.
To quantify the accuracy of the distortion correction step, we attach the FluoroTrax to the image intensifier, acquire a reference image, rotate the FluoroTrax by an unknown amount around the image intensifier cylinder, and acquire a test image. We compute the dewarping function from the reference image and apply it to the test image. We then compute the distance between the expected location of the fiducial centers and their corrected location in the test image. The following table summarizes the results (values are in pixels, millimeters in parentheses):

Image     Number of points   Maximum       Minimum       Mean          Std. Dev.
1         97                 1.96 (0.84)   0.06 (0.03)   0.83 (0.36)   0.45 (0.19)
2         101                2.23 (0.96)   0.04 (0.02)   0.83 (0.36)   0.50 (0.21)
3         97                 2.39 (1.03)   0.16 (0.07)   0.95 (0.41)   0.50 (0.21)
4         98                 1.99 (0.86)   0.06 (0.03)   0.77 (0.33)   0.49 (0.21)
Average   98                 2.14 (0.36)   0.08 (0.03)   0.84 (0.36)   0.48 (0.21)
To quantify the sensitivity of the calibration process to missing fiducials, we manually removed fiducials from the images and recorded the calibration parameter values. The following table summarizes the results:

Number of points   Focal length (mm)   Focal point (mm)      Image center (pixels)   Pixel size (mm)
61                 885.55              5.42, -0.07, 922.55   376.24, 288.02          0.43, 0.43
56                 883.48              6.15, 0.46, 920.48    374.57, 286.62          0.43, 0.43
51                 864.56              6.02, 1.05, 901.56    374.93, 285.29          0.43, 0.43
46                 865.63              6.43, 1.02, 902.63    373.88, 285.34          0.43, 0.43
41                 853.65              5.77, 1.90, 890.65    375.45, 283.33          0.43, 0.43
Max. diff.         31.9                0.66, 2.06, 31.80     2.36, 4.69              0.00, 0.00
Max. diff. (%)     3.6%                10.3%                 10.6%                   0.0%
Note that although the individual parameter variations can be up to 10% as the number of fiducial points decreases, the values are still meaningful even when only 41 out of 120 fiducials are detected.
To quantify the accuracy of the entire process, we compare the tracked position of a tool tip with its location in the X-ray image. We place an optically tracked pointing device with a spherical tip (Polaris, Northern Digital, Ontario, Canada) in the C-arm field of view and compute two distances. The first is the distance between the tip location in the image and its projection according to its spatial location given by the tracking system. This distance quantifies the accuracy of fluoroscopy-based navigation. The second is the distance between the ray defined by the camera focal point and the tip center in the image, and the tip center spatial location given by the tracking system. This distance quantifies the accuracy of 2D/3D image-based registration. We compute both distances twice, with the linear method [3,11] and with Faugeras’ calibration method [2]. Fig. 3 shows the results of the first distance. The following table summarizes the results (all measurements are in millimeters, with standard deviation in parentheses):

           1. Distance in image plane      2. Distance between ray and tip
Image      linear method   Faugeras'       linear method   Faugeras'
1          1.07            1.13            0.80            0.82
2          1.38            1.39            1.00            0.98
3          0.80            0.86            0.52            0.54
4          0.93            0.89            0.45            0.41
5          1.01            1.02            0.76            0.75
6          0.56            0.57            0.37            0.37
7          1.04            1.05            0.63            0.62
8          0.63            0.59            0.30            0.27
Average    0.92 (0.26)     0.93 (0.27)     0.60 (0.23)     0.59 (0.23)

These results indicate that both camera calibration algorithms are comparable and that the accuracy is sufficient for X-ray fluoroscopy-based procedures.
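Both error measures reduce to elementary geometry; the second, for instance, is a point-to-ray distance, computed along these lines (a sketch with hypothetical inputs, all expressed in one common 3D frame):

```python
import numpy as np

def ray_to_point_distance(focal_point, tip_backprojected, tip_tracked):
    """Distance between the ray through the camera focal point and the
    back-projected image tip, and the tracked 3D tip position (mm)."""
    d = tip_backprojected - focal_point
    d = d / np.linalg.norm(d)               # unit ray direction
    w = tip_tracked - focal_point
    # Length of the component of w orthogonal to the ray.
    return float(np.linalg.norm(w - np.dot(w, d) * d))
```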
Fig. 3. Projection of tracked pointer tip location onto the X-ray image after calibration.
4 Discussion and Conclusions
We have presented a new on-line X-ray fluoroscopic C-arm calibration method for intraoperative use. The method is robust, fully automatic, and works with images in which calibration fiducials are occluded by anatomy and surgical instruments. Our experimental results show that: (1) the on-line dewarping accuracy, while not as good as the off-line accuracy in [12] (405 fiducials: mean = 0.10mm, σ = 0.06mm, min = 0.01mm, max = 0.20mm; vs. 120 fiducials: mean = 0.36mm, σ = 0.21mm, min = 0.03mm, max = 0.36mm), is still very good and is not the main contributor to the calibration inaccuracy; (2) the calibration parameters are very sensitive to the fiducial center locations and to how many are used to compute the calibration parameters; and (3) the accuracy of the entire calibration process is below a millimeter with either the Faugeras [2] or the linear camera calibration [3,11] method, which yield very similar results. We are currently using the calibration method in two orthopaedic applications: 2D/3D image-based registration for fracture reduction [5] and image-based robot positioning for distal locking [7]. We are also evaluating the in-vitro accuracy and robustness of the entire X-ray image-based registration.
References
1. Brack C., Burgkart R., Czopf A., Gotte H., ”Accurate X-Ray-Based Navigation in Computer-Assisted Orthopedic Surgery”, Proc. 12th Int. Symp. on Computer Assisted Radiology and Surgery, H.U. Lemke et al. eds., 1998.
2. Faugeras O., 3D Computer Vision: A Geometric Viewpoint, MIT Press, 1993.
3. Gremban K., Thorpe C., Kanade T., ”Geometric Camera Calibration Using a System of Linear Equations”, Proc. of IEEE Int. Conf. on Robotics and Automation, 1988.
4. Hofstetter R., Slomczykowski M., Sati M., Nolte L.P., ”Fluoroscopy as an Imaging Means for Computer-Assisted Surgical Navigation”, Comp. Aided Surgery 4(2), 1999.
5. Joskowicz L., Milgrom C., Simkin A. et al., ”FRACAS: A System for Computer-Aided Image-Guided Long Bone Fracture Surgery”, Comp. Aided Surgery 3(6), 1999.
6. Joskowicz L., ”Fluoroscopy-Based Navigation in Computer-Aided Orthopedic Surgery”, Proc. of the IFAC Conf. on Mechatronic Systems, 2000.
7. Joskowicz L., Milgrom C., Shoham M., Yaniv Z., and Simkin A., “Robot-Guided Long Bone Intramedullary Distal Locking: Concept and Preliminary Results”, Proc. of the 3rd Int. Symposium on Robotics and Automation, Mexico, Sept. 2002.
8. Tang S.Y.T., ”Calibration and Point-Based Registration of Fluoroscopic Images”, MSc Thesis, Dept. of Computer Science, Queen’s Univ., Ontario, Canada, 1999.
9. Tate P.M., Lachine V., Fu L. et al., ”Performance and Robustness of Automatic Fluoroscopic Image Calibration in a New Computer Assisted Surgery System”, Proc. of Medical Image Computing and Computer Assisted Intervention, 2001.
10. Wolberg G., Digital Image Warping, IEEE Press, 1990.
11. Yakimovsky Y., Cunningham R., ”A system for extracting 3D measurements from a stereo pair of TV cameras”, Computer Graphics and Image Processing, 1978.
12. Yaniv Z., Joskowicz L., Simkin A., Garza-Jinich M., Milgrom C., ”Fluoroscopic Image Processing for Computer-Aided Orthopedic Surgery”, 1st Int. Conf. on Medical Image Computing and Computer-Assisted Intervention, Lecture Notes in Computer Science 1496, Springer, Wells et al. eds., 1998.
13. Yao J., Taylor R.H., Goldberg R.P., ”A C-Arm Fluoroscopy-Guided Progressive Cut Refinement Strategy Using a Surgical Robot”, Comp. Aided Surgery 5(6), 2000.
Application of a Population Based Electrophysiological Database to the Planning and Guidance of Deep Brain Stereotactic Neurosurgery
Kirk W. Finnis1, Yves P. Starreveld1,2, Andrew G. Parrent2, Abbas F. Sadikot3, and Terry M. Peters1
1 The John P. Robarts Research Institute, Imaging Research Laboratory
2 The London Health Sciences Centre, Department of Neurosurgery, 100 Perth Drive, London, ON, Canada, N6A 5K8
3 Montreal Neurological Institute and Hospital, Department of Neurosurgery, 3801 University St, Montreal, QC, Canada, H3A 2B4

Abstract. Stereotactic neurosurgery for movement disorders involves the accurate localization of functionally distinct nuclei deep within the brain. These surgical targets exist within anatomy that appears homogeneous on preoperative magnetic resonance images (MRIs), making direct radiographic localization impossible. We have developed a visualization-oriented, searchable, and expandable database of functional organization representing bilaterally the sensorimotor thalamus, pallidum, internal capsule, and subthalamic nucleus. Data were obtained through microelectrode recording and stimulation mapping routinely performed during 145 functional stereotactic procedures. Electrophysiologic data were standardized using a multi-parameter coding system and annotated to their respective MRIs at the appropriate position in patient stereotactic space. To accommodate for normal anatomical variability, we have developed an intensity-based nonlinear registration algorithm that rapidly warps a patient’s volumetric MRI to an MRI average brain considered representative of the patient population. The annotated functional data are subsequently transformed into the average brain coordinate system using the displacement grids generated by the algorithm. When the database is searched, clustering of like inter-patient physiologic responses within target anatomy and adjacent structures is revealed. These data may in turn be registered to a preoperative MRI using a desktop computer, enabling interactive delineation of surgical targets prior to surgery.
1 Introduction
1.1 Image-Guided Stereotaxy
Surgical guidance software used during image-guided neurosurgery provides surgeons with image-based information for precise targeting of specific regions or pathologies of the brain. Such precision is only possible when the target can be seen on the patient’s preoperative magnetic resonance image (MRI) or computed tomographic (CT) image. When desired targets are functionally but not anatomically distinct, as in stereotactic functional procedures for movement disorders or chronic pain, the gross target structures may often be visualized (the thalamus, subthalamic nucleus, and the globus pallidus internus) but the important functional subdivisions
they contain cannot. In these cases, the exact location of the surgical target must be approximated and subsequently refined intra-operatively. To facilitate the approximation of indiscernible targets, printed or digitized versions of atlases containing photographs of human post-mortem brain slices [1,2] may be scaled to align with visible anatomical landmarks in the preoperative image and displayed as an overlay. Integrating digitized atlases of anatomy into computer guidance for functional neurosurgery has proven to be useful in the operating room and encouraged the development of several independent and commercially available atlas-based stereotactic planning systems. The atlas-to-patient image registration is achieved using linear scaling techniques based upon the length of an imaginary line joining the anterior and posterior commissures (AC-PC line).
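For illustration, proportional AC-PC scaling can be sketched as below — a deliberately simplified, uniform scaling about the anterior commissure that omits the rotational alignment a full 9-DOF fit would include; all coordinates are hypothetical:

```python
import numpy as np

def acpc_scale_atlas_point(p_atlas, ac_atlas, pc_atlas, ac_patient, pc_patient):
    """Map an atlas point into patient space by translating AC onto AC and
    scaling by the ratio of the patient's to the atlas' AC-PC length.
    All arguments are 3-vectors (NumPy arrays) in millimeters."""
    scale = (np.linalg.norm(pc_patient - ac_patient)
             / np.linalg.norm(pc_atlas - ac_atlas))
    return ac_patient + scale * (p_atlas - ac_atlas)
```

The mismatch discussed in the next subsection arises precisely because such a linear mapping cannot capture normal anatomical variability.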
1.2 Drawbacks of Anatomical Atlases
Anatomical atlases are used extensively for stereotactic surgical guidance. Despite their clinical acceptance, atlases of brain anatomy are not ideal predictors of surgical targets that are identified primarily by their function rather than their morphology. For example, the Schaltenbrand Wahren [1] anatomical atlas has poor volume sampling and uneven interslice distances (1-4 mm). As a result, the surgeon must frequently select the atlas plate that most closely approximates the region of patient brain being explored because no atlas plate corresponding to this region exists. A problem inherent to any atlas of anatomy is that it contains no information about normal anatomical variability. The inaccuracy of the target localization step can be largely attributed to normal anatomical variability in the relationships of the target nuclei with the structures used to indirectly approximate their locations [3]. These effects are made apparent by the considerable mismatch that frequently exists once an anatomical atlas has been registered to a patient image with standard anterior commissure-posterior commissure (AC-PC) linear fitting techniques.
1.3 Electrophysiologic Exploration
The electrophysiologic environment of the target structure is interrogated through the use of electrodes introduced under stereotactic conditions into the deep brain region identified by the atlas. The atlas-approximated target is refined using a surgical electrode capable of electrical stimulation or recording of neuronal action potentials through repeated insertion into the predicted target region. The patient is awake throughout this procedure. The functional organization within and adjacent to the suspected target can be mapped based on the stereotypical responses elicited by stimulation of functionally different anatomy or through noting the changes in neuronal firing patterns correlated with physical stimuli or movement of the patient’s body during microelectrode recording. These electrophysiological data provide the surgeon with the necessary information to mentally construct an evolving map of function specific to that patient’s brain. Comparison of these data with that provided in the literature allows the surgeon to estimate the electrode tip position relative to surrounding landmarks. However, clinical studies demonstrate that the position of the final electrophysiologically refined target can vary dramatically from its predicted location [4]. Since multiple trajectories increase the risk of inducing patient trauma, it is highly desirable to reduce the number of electrode tracks required.
1.4 Nonlinear Registration
While functional stereotactic procedures generate enormous volumes of intraoperative data that are specific to one individual’s brain, it has become common practice to map these responses to a standard coordinate space using a piecewise-linear image registration approach [2]. However, registering inter-subject data to a single coordinate system using a standard 9 degree-of-freedom (translation, rotation, scale) linear technique will not compensate for normal anatomical variability and generates poor clustering of like functional data [5]. It is the goal of our research to overcome the limitations of this piecewise-linear approach. To address the normal anatomical variability found among patients, we have developed a nonlinear registration algorithm that successfully registers, or ‘warps’, a patient’s volumetric MRI to match a standardized brain MRI while providing a quantitative measure of warp accuracy for the user [6]. The algorithm incorporates an intensity-driven, multiresolution strategy that attempts to maximize the normalized cross-correlation of gradient magnitudes in successively less blurred images until an acceptable registration is obtained. It is unsupervised, platform independent, and multi-threaded to take advantage of multi-processor computers and cluster computing environments. This algorithm was implemented using a series of multi-threaded hierarchical transformation classes that we have contributed to the vtk code repository [7,8].
1.5 Electrophysiology Database
We present here a collection of intra-operatively recorded electroanatomic observations collected during 163 functional stereotactic procedures performed on 145 patients at two institutions. The ultimate goal of the electrophysiologic database is to provide surgical guidance using the pooled functional information collected from a human population rather than, or in tandem with, an anatomical atlas. Each patient typically contributes between 50 and 200 individually coded responses to the database. Our nonlinear registration algorithm accommodates for normal anatomical variability by determining the best intensity-based correlation of subcortical anatomy between the patient images and a standardized brain MRI. We have integrated the functional atlas into ASP, the image-guidance software package developed in our lab [8], and linked it to an interactive graphical user interface (GUI) to simplify data entry and retrieval.
2 Method
2.1 Patients and Imaging
To date there are over 145 patients included in the database, representing varying degrees of parkinsonism, essential tremor, localized tremor, and chronic pain from institutions in Montreal, Quebec and London, Ontario. Excluded from the study were patients who exhibited space-occupying lesions or other pathology that could distort the functional organization of the brain or compromise the quality of the nonlinear registration. Patients with congenital anomalies or amputations were included for later analysis but not inserted into the primary database.
2.2 Functional Data Entry
2.2.1 Visualization System
For our purposes, we have extended the ASP platform to encode, store, and display electrophysiologic data as individual three-dimensional polygon-based objects. Virtual probes custom-designed to match the specifications and characteristics of those used in the operating room have been integrated.
Fig. 1. Database GUI
2.2.2 Standardization of Functional Data
As outlined previously [9], we assign a six-parameter code to each data point entered using the ASP system. To facilitate this process, we designed a GUI that rapidly guides the user through code creation and prompts for information necessary to satisfy all parameters. Integrated into this interface is a subdivided clickable model of the human body. Each anatomical subdivision and group of subdivisions contained in the model was assigned a unique identification number so that once clicked with the mouse, the number for the selected body part appears in the appropriate entry field in the GUI (figure 1).

2.2.3 Annotation in Native MRI-Space
Within the ASP system, coded functional data are plotted directly onto the patient’s preoperative image along a virtual trajectory that corresponds to the position, declination, and azimuth of the operative trajectory (figure 2). A virtual electrode matching the dimensions of the physical tool determines the placement of the data point along the trajectory in millimeters from target. Each functional data-point is displayed as an individual sphere, colour-coded to reflect the response type, with diameter indicating the amount of current (µA) or voltage used to evoke the response.

Fig. 2. Probe (purple) registered to patient MRI with crosshairs (yellow) placed at target. Spheres indicate responses obtained with stimulation. Slice plane orientations indicated. III: 3rd ventricle.

Images are automatically registered to the coordinate system defined by the stereotactic head frame using a “Frame Finder” software algorithm incorporated into the visualization platform [10]. Once tagged to their respective imaging volumes, the x, y, and z Cartesian coordinates of these data in patient image-space, along with the corresponding ASP-generated codes, are saved in a text file for addition to the central database. A secondary code that describes the sex, age, pathological condition, and handedness of the patient, the surgical procedure, and
specifications of the probes used during the procedure is assigned to the header of each data file. These data need only be entered once per procedure.
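A record in the database might be organized as sketched below. This is purely illustrative: the paper assigns a six-parameter code [9] plus a per-procedure secondary header, but the individual fields are not enumerated in this section, so the names here are assumptions:

```python
from dataclasses import dataclass

@dataclass
class CodedResponse:
    """One annotated electrophysiologic data point (field names assumed)."""
    response_type: int   # e.g. stimulation-evoked sensation vs. tremor cell
    body_part_id: int    # ID from the clickable body model (Fig. 1)
    intensity: float     # current (uA) or voltage used to evoke the response
    depth_mm: float      # placement along the trajectory, mm from target
    x: float             # patient image-space Cartesian coordinates
    y: float
    z: float

@dataclass
class ProcedureHeader:
    """Secondary code assigned once to the header of each data file."""
    sex: str
    age: int
    diagnosis: str
    handedness: str
    procedure: str
    probe_spec: str
```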
2.3 Pooling Population Data
2.3.1 Nonlinear Registration Algorithm
Nonlinear registration of one image volume to another using our algorithm involves two separate steps. The first generates a global affine transformation that maximizes the normalized cross-correlation between the two volumes. The second computes a deformation grid, using the affine transformation as a starting point, to maximize the same similarity metric on successively smaller sub-volumes of the images. The output of the algorithm is an MRI-specific deformation grid that describes the transformation for every voxel in native patient MRI-space to the coordinate system of the target volume. Since we are only interested in registering the deep anatomical structures of the brain, we limit the registration to only those voxels residing within the volume of a user-specified closed polygonal surface. With the use of such a mask, we can compute a deformation grid describing the best nonlinear fit of the patient deep brain anatomy to that of the standardized brain in less than five minutes (dual 933MHz PIII, Linux).

2.3.2 Standardized Brain Volume
We use the standard brain template CJH27 [11] as the repository for the database information. Once a patient image was warped to this high-resolution volume, the functional data annotated to the source image could subsequently be re-mapped into standard brain image space. The standard image thus acts as the common coordinate system for collecting and analyzing all population electrophysiologic data. To view the population database registered to a patient’s MRI image, the inverse of the MRI-specific deformation grid generated as part of this procedure is used to re-map the pooled data into patient image-space.

2.3.3 Quantifying Registration Accuracy
While visual inspection provides an assessment of registration quality, we also need to assess it quantitatively. Our nonlinear warping algorithm maximizes the normalized cross-correlation of two images by modifying a field of deformation vectors until the highest correlation value is achieved. This alignment metric is calculated in four separate domains or blurring levels (12, 8, 4, and 2mm FWHM Gaussian blurs). Computing and averaging the correlations for each domain across the entire image volume creates a three-dimensional map of cross-correlation values. When this map is simultaneously displayed with the warped target image in ASP, a visual check will rapidly pinpoint any region of poor registration.
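A single evaluation of the similarity metric of Sections 1.4 and 2.3.1 — normalized cross-correlation of gradient magnitudes inside the deep-brain mask at one blur level — might be sketched as follows; the multiresolution optimizer that actually adjusts the deformation grid is not shown:

```python
import numpy as np
from scipy.ndimage import gaussian_filter, sobel

def masked_gradient_ncc(vol_a, vol_b, mask, fwhm_mm, voxel_mm=1.0):
    """Normalized cross-correlation of gradient magnitudes of two blurred
    volumes, evaluated only inside a boolean mask. The paper quotes blur
    levels of 12, 8, 4, and 2 mm FWHM; sigma = FWHM / 2.355 is the standard
    conversion."""
    sigma = fwhm_mm / (2.355 * voxel_mm)

    def grad_mag(v):
        v = gaussian_filter(v.astype(float), sigma)
        return np.sqrt(sum(sobel(v, axis=ax)**2 for ax in range(v.ndim)))

    a = grad_mag(vol_a)[mask]
    b = grad_mag(vol_b)[mask]
    a = a - a.mean()
    b = b - b.mean()
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))
```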
2.4 Searching the Database
The database may be selectively searched with the same GUI used to enter the functional data. Database codes relating to a specific response and selected anatomical regions can be retrieved and displayed as clusters of spheres in 3D space or density maps on the slice-plane images (figure 3). Our software permits interactive adjustment of the properties of selected spheres, such as opacity, colour, and shading, or toggling of their presence in the virtual scene to facilitate interpretation of the search results.
Fig. 3. Results of database search for codes indicating stimulation-induced tingling sensations. Response is indicative of stimulation of the ventrocaudal sensory thalamic nucleus. Left: Each sphere indicates a distinct response. Right: Density map for the same data cluster shown in left image (density colour scale also shown). CN: caudate nucleus, III: third ventricle, LT: left thalamus.
3 Results
3.1 Clustering of Population Data
Following nonlinear registration of patient electrophysiologic data to the database mapped to the standard brain MRI, we can demonstrate meaningful clustering of like physiologic responses within surgical targets.

3.1.1 Thalamus
In thalamotomy for tremor-dominant movement disorders, the placement of the therapeutic lesion is determined by the location of neurons that fire in synchrony with physical tremor, or tremor cells, located within the Ventralis intermedius (Vim) nucleus. We show in figure 4 the bilateral clustering of 350 tremor cell codes found within the left and right thalami as detected by microelectrode recording during 39 thalamotomies. These data provide a probabilistic localization of the Vim nuclei and the regions of highest probability for locating tremor synchronous neurons. The greater number of tremor cells in the left thalamus (n=224) versus the right (n=126) is representative of the greater number of left-sided procedures currently in the database.
Fig. 4. Bilateral clustering of tremor cells. LT/RT: Left and right thalamus.
3.1.2 Globus Pallidus
We have gathered electrophysiological data bilaterally for the internal globus pallidus (GPi). During pallidotomy, surgeons locate the optic tract positioned just inferior to the apex of the anatomy by evoking visual phenomena with electrical stimulation. The optic tract acts not only as an anatomical landmark for probe placement but also signifies a region of brain that must be avoided while lesioning the GPi or implanting a chronic stimulator within it. Figure 5 (left) demonstrates a clustering of 115 stimulation spheres around and within a segmentation of the left optic tract, where 29 patients reported seeing flashing lights correlated with stimulation. Sphere size indicates relative stimulation intensity and, as one would expect, the further from the optic tract the responses were evoked, the larger the diameter of the spheres.
Fig. 5. Left: Nonlinearly registered spheres (red) representing visual phenomena evoked through stimulation of optic tract. Right: Same data as left registered using a 9DOF linear fit. Note poor clustering around surface rendered optic tract (yellow). LON: Left optic nerve, OC: Optic chiasm, LOT: Left optic tract, IV: Fourth ventricle.
3.2 Nonlinear Versus Linear Registration
In figure 5 (right) we demonstrate the efficacy of our nonlinear approach by registering the same optic tract related data described in 3.1.2 to the standardized brain using linear registration. The linear procedure is a 9-DOF registration technique, similar to that commonly used to register anatomical atlases to the anatomy of patient MRIs. Responses of the type presented here can only be evoked by stimulation when the probe tip is within the optic tract or in very close proximity (<1mm) to its surface. Not only is the clustering of population data visibly tighter with nonlinear registration: 82% of the spheres are contained within the limits described above, compared with only 22% achieved with linear registration.
4 Discussion
4.1 Technological Advances
We have described a method and presented encouraging results for the first truly three-dimensional collection of subcortical electrophysiology capable of being nonlinearly registered to an MRI volume. Nonlinear registration was used in all cases to accommodate for normal anatomical variability. Incorporating nonlinear registration into the pooling and analyses of population electrophysiology produces tighter clustering than standard linear registration techniques. Our rapid algorithm allows nonlinear registration to be achieved in a clinically acceptable timeframe (3-12 minutes depending on hardware configuration and number of processors) for a typical 3D MRI volume. When database contents are selectively displayed in our visualization program, delineation of functional borders within homogeneous-appearing anatomy is possible and high-probability tremor areas can be identified. A digital probabilistic atlas of this nature that utilizes population data will improve in accuracy over time and achieve better statistics with the addition of more data.
Our interactive GUI makes generic or detailed searches possible. Using the interface, the surgeon can extract and display only those data most closely approximating a patient’s age, sex, handedness, and diagnosis, or conversely choose to view only data representative of a larger cross section of the database population. The interactive GUI facilitates rapid, detailed coding of patient data that, in our experience, does not impede the normal flow of the surgical procedure when used intra-operatively.
Acknowledgements
This work was supported by grants MT-11540 and GR-14973 from the Canadian Institute for Health Research and also by a grant from the Institute for Robotics and Intelligent Systems.
References
1. G. Schaltenbrand and W. Wahren, Atlas for Stereotaxy of the Human Brain. Thieme, Stuttgart, 1977.
2. J. Talairach and P. Tournoux, Co-planar Stereotactic Atlas of the Human Brain. Thieme, New York, 1988.
3. J.M. Burns, S. Wilkinson, J. Kieltyka, J. Overman, T. Lundsgaarde, T. Tollefson, W.C. Koller, R. Pahwa, A.I. Troster, K.E. Lyons, S. Batnitzky, L. Wetzel, and M.A. Gordon, “Analysis of pallidotomy lesion positions using three-dimensional reconstruction of pallidal lesions, the basal ganglia, and the optic tract,” Neurosurgery 41(6), pp. 1303-1318, 1997.
4. R.L. Alterman, D. Sterio, A. Beric, and P.J. Kelly, “Microelectrode recording during posteroventral pallidotomy: Impact on target selection and complications,” Neurosurgery 44(2), pp. 315-323, 1999.
5. C. Giorgi, P.J. Kelly, D.C. Eaton, G. Guiot, and P. Derome, “A study on the tridimensional distribution of somatosensory evoked responses in human thalamus to aid the placement of stimulating electrodes for treatment of pain,” Acta Neurochir (Wien) 30, pp. 279-287, 1980.
6. Y.P. Starreveld, K.W. Finnis, D.G. Gobbi, A.G. Parrent, D.L. Collins, and T.M. Peters, “Fast nonlinear deformation for functional neurosurgical planning and analysis,” CAS (submitted), 2002.
7. Y.P. Starreveld, D.G. Gobbi, K.W. Finnis, and T.M. Peters, “Reusable software components for rapid development of medical image visualization and surgical planning applications,” SPIE-2001 Medical Imaging International Symposium, San Diego, February 17-22, 2001, Poster 4319-65S12.
8. D.G. Gobbi and T.M. Peters, “Generalized 3D Nonlinear Transformations for Medical Imaging: An Object-Oriented Implementation in VTK,” CMIG, in press.
9. K.W. Finnis, Y.P. Starreveld, A.G. Parrent, and T.M. Peters, “A 3-dimensional database of functional anatomy, and its application to image-guided neurosurgery,” in Proc. of MICCAI ’00, pp. 1-8, Springer, 2000.
10. Y.P. Starreveld, D.G. Gobbi, K.W. Finnis, and T.M. Peters, “An automatic volume based image to frame coordinate transformation for MRI stereotactic surgical planning,” Neurosurgery, submitted, 2002.
11. C.J. Holmes, R. Hoge, D.L. Collins, R. Woods, A.W. Toga, and A.C. Evans, “Enhancement of MR images using registration for signal averaging,” JCAT 22:324-333, 1998.
An Image Overlay System with Enhanced Reality for Percutaneous Therapy Performed Inside CT Scanner
Ken Masamune1, Gabor Fichtinger2, Anton Deguet2, Daisuke Matsuka1, Russell Taylor2
1 College of Sciences and Engineering, Tokyo Denki University, Ishizaka, Hatoyama, Hiki, Saitama, 350-0394, Japan
{masasharade}@atl.b.dendai.ac.jp
2 Center for Computer Integrated Surgical Systems and Technology, Johns Hopkins University, MD 21218, U.S.A.
{gabor,anton,rht}@cs.jhu.edu
Abstract. We describe a simple, safe, and inexpensive image overlay system to assist surgical interventions inside a conventional CT scanner. The overlay system is mounted non-invasively on the gantry of the CT scanner and it consists of a seven degrees-of-freedom passive mounting arm, a flat LCD display, and a light brown acrylic plate as a half mirror. In a pre-operative calibration process, the display, half-mirror, and imaging plane of the scanner are spatially registered by imaging a triangular calibration object. Following the calibration, the patient is brought into the scanner, an image is acquired and sent to the overlay display via DICOM transfer. Looking at the patient through the half-mirror, the CT image appears to be floating inside the patient in correct size and position. This vision enables the physician to see both the surface and the inside of the patient at the same time, which can be utilized in guiding a surgical intervention. The complete system fits into a carry-on suitcase (except the mounting adapter), it is easy to calibrate, and it mounts non-invasively on the scanner, without utilizing vendor-specific features of the scanner.
1 Introduction
During the last decade, significant research has been devoted to three-dimensional (3D) medical images for surgical simulation and planning. Typically, these 3D medical images are shown on a flat 2D display on the wall of the operating room; the surgeon must mentally register the computer model with the anatomy seen in the surgical field and use hand-eye coordination to execute the surgical plan. Therefore, the actions of the physician are not truly objective, primarily because images and patient are not registered accurately. In search of better comprehension of pre-acquired images, augmented reality displays are becoming more popular in the medical field. Roberts et al. [7] created a frameless image-overlay system with a stereoscopic microscope for neurosurgery. Fuchs et al. [8] and Sauer et al. [12] built augmented reality systems fusing ultrasound and MRI images with video input. In these systems, the visualization method of the images uses the principle of stereoscopic display, e.g. a polarized shutter system or two small monitors located on the user’s head. The authors have previously discussed potential errors of the binocular
system approach [3] and hypothesized that the ultimate intra-operative visualization technique may be a simultaneous view of the surgical field as seen by the operating surgeon and the patient’s internal anatomy. Navab et al. [11] also followed this train of thought and blended video input and C-arm fluoroscopy with the use of a video camera co-registered with the X-ray source. In reference [3], we presented an image overlay system that provided accurate depth perception and overlay of image slices of a pre-acquired 3D volume on the patient’s body. In a subsequent publication, Stetten reported the use of this overlay method for displaying real-time transcutaneous ultrasound images [9, 10]. The presented work evolved from our earlier research [3] by applying the principle of 2D image overlay to in-scanner CT-guided interventional procedures. We attached the overlay system to the gantry of the CT scanner and made the CT image “float” inside the patient, in correct size and position. This solution gives the surgeon 2D “X-ray vision” that promises to be especially useful in in-plane needle placement procedures, when the needle is inserted within the plane of the current CT slice. Our system differs from most virtual reality displays by its inherent simplicity, safety, and low cost.
2 Materials and Method
2.1 Principles
The task of the system was to make a CT image slice “appear” in the exact same location where it was acquired, while the patient is still in the scanner. By superimposing the image on the patient, the physician can see the surface and the inside of the body at the same time. This is expected to help the physician find the optimal access to the target, while minimizing the collateral damage to normal healthy structures. The basic principle of the CT overlay system is shown in Fig. 1, and we will describe each component further below. Perhaps the most important element of the principle is that we do not project the CT image into the surgical field; we merely make it appear there, hence the name “virtual reality” display. The central piece of the system is a half-mirror, which has two useful properties: one can see through the half-mirror and see the scene behind it, while at the same time one can also see reflections in the half-mirror. The reflections are fainter than they would be in a full mirror and the transparent scene is also fainter than it would be through clear glass, but with proper lighting, a reasonable compromise can be achieved between the two. The overall concept is as simple as this: acquire a CT slice, render the image on a distortion-free flat panel display, and position the display above the half-mirror so that the reflection of the CT slice coincides with the patient’s body that we see behind the mirror. As we demonstrated in [3], this system results in an optically stable alignment of the real view and the image in the mirror, provided that the mirror, the display, and the scanner were properly aligned earlier. The key questions are how to build a system like this which is also lightweight, inexpensive, easy to set up, easy to calibrate, independent of the actual CT scanner, and ergonomic to both patient and physician.
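The optics behind this can be made concrete: the virtual image of each display pixel is its mirror reflection across the half-mirror plane, so calibration amounts to making the reflected display plane coincide with the CT imaging plane. A minimal sketch (not from the paper; the point and plane coordinates are hypothetical):

```python
import numpy as np

def reflect_across_plane(p, plane_point, plane_normal):
    """Virtual-image position of a display point p: its reflection across
    the half-mirror plane, given by a point on the plane and its normal."""
    n = plane_normal / np.linalg.norm(plane_normal)
    return p - 2.0 * np.dot(p - plane_point, n) * n
```

Reflecting all four display corners this way predicts where the rendered CT slice will appear to float behind the mirror.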
Fig. 1. The configuration of the X-ray CT image overlay system: a slice image display and half-transparent mirror on a detachable holding attachment at the X-ray CT gantry; the mirrored image appears at the imaging plane inside the patient on the scanner bed, as viewed by the surgeon or nurse; images are transferred via DICOM to a PC for image processing.
Fig. 2. (Left) An overview of the CT image overlay system: holding attachment, display, half-mirror, 7-DOF passive arm, and notebook PC. (Right) The 7-DOF passive arm, with display and mirror rotation synchronized by a parallel link.
2.2 System Description
The overview of the system is shown in Fig. 2 (left). It comprises a holding arm and a unit including an LCD flat panel monitor (LT-1500MA, SHARP Co. Ltd., Japan) and a half-mirror. A personal computer acquires the X-ray CT image via a DICOM link and calculates the image’s size and rotation for displaying the slice on the LCD.

The Holding Attachment
In current CT imaging systems, an image slice is taken at the center of the gantry, and so we have to position the mirror of the overlay system accurately on the center of the gantry. The simplest way to achieve this would be to attach the devices directly onto the CT gantry, but safety regulations prohibit us from making alterations to the
scanner. In order to achieve non-invasive mounting, we constructed a wooden holding attachment that hung down from the gantry and secured the 7-DOF passive arm (Fig. 2, left). This made the system easily detachable and installable on a variety of CT gantries.

Degrees of Freedom of the Overlay Device
To fit any position and direction of the image slice, a seven degree-of-freedom (7-DOF) passive arm was designed (see Fig. 2, right). We could adjust the angle and distance from the mirror to the display, and the angle and distance from the display to the CT gantry.

Sterilization Considerations
The system was conceived for in-scanner percutaneous procedures, for which sterilization is not a difficult issue, as the needle is not in direct contact with the image overlay device. When the device is used in open surgical procedures, sterilization will become a more significant issue. Draping the entire device with a transparent sheet would result in foggy images. In order to overcome this problem, we designed the display and the mirror to be easily detachable and easy to sterilize, making it necessary to cover only the display. We used a light brown acrylic plate for the mirror, and the arms were made of aluminum with a black alumite satin finish; all parts are suitable for EOG sterilization.
3 Registration Method
Registration of the CT image and patient is perhaps the most critical element of the system. Although many registration methods have been studied in the field of augmented reality [4, 5], they do not lend themselves well to a lightweight design, as they depend heavily on dedicated instrumentation and intensive calculation. A complicated calibration procedure would certainly be prohibitive to widespread use of the device. We applied a simple triangle fiducial marker with a single image acquisition. Three acrylic plates form a triangular prism shown in Fig. 3(1). A flowchart of the registration is shown in Fig. 3.

Step 1. Place the triangle marker on the CT scanner bed and turn on the laser marker of the scanner to indicate the plane of acquisition.
Step 2. Take a transverse CT slice, transfer the image to the personal computer via a DICOM link, and flip the left-right image direction. (We need to flip the image so that in the mirror we can see it in the correct orientation.)
Step 2.5. Pre-calibrate the display resolution to obtain a full-sized image.
Step 3. Adjust the image to its full size.
Step 4. Render the image on the flat panel display, aligning the virtual image in the mirror with the edges of the real object behind the mirror.

Upon completing registration, we can observe an image slice in an accurate position. We replace the calibration object with the patient, while making sure that the mirror and display are not moved. The above calibration procedure was executed by a software package developed by the authors using VC++6.0 with a Windows 2000 operating system.
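Steps 2-3 can be made concrete as below, assuming pydicom for the DICOM read and a pre-calibrated display resolution in pixels per millimeter (Step 2.5); the nearest-neighbor resampling is a simplification standing in for whatever scaling the authors' VC++ software performs:

```python
import numpy as np
import pydicom

def full_size_mirrored_slice(dicom_path, display_px_per_mm):
    """Load a CT slice, flip it left-right for the mirror (Step 2), and
    resample it so one displayed millimeter equals one physical millimeter
    (Step 3)."""
    ds = pydicom.dcmread(dicom_path)
    img = np.fliplr(ds.pixel_array)          # mirrored image
    row_mm, col_mm = (float(s) for s in ds.PixelSpacing)
    zoom_r = row_mm * display_px_per_mm      # display pixels per CT pixel
    zoom_c = col_mm * display_px_per_mm
    rows = int(round(img.shape[0] * zoom_r))
    cols = int(round(img.shape[1] * zoom_c))
    r_idx = np.minimum((np.arange(rows) / zoom_r).astype(int), img.shape[0] - 1)
    c_idx = np.minimum((np.arange(cols) / zoom_c).astype(int), img.shape[1] - 1)
    return img[np.ix_(r_idx, c_idx)]         # full-sized mirrored slice
```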
Fig. 3. Simple registration procedure (panels: (1) put the triangle marker on the scanner bed and align it to the CT laser; (2) image transfer via DICOM; (2.5) pre-calibration of the display resolution; (3) adjust the image size and mirroring on the PC and output to the display; (4) align the floating image with the real image using the triangle marker edges, by the surgeon's perception).
4 Experiment
We developed a proof-of-concept prototype to confirm the principle, simplicity, and potential clinical applicability of the system. In reference [3], we presented a method for evaluating the accuracy of human perception using the overlay device. Now we present the results of a phantom experiment, in which we performed in-plane needle insertions into a honeydew melon to a pre-determined target. First, the acrylic prism was placed on the scanner bed and the registration procedure was completed; then we removed the calibration object from the scanner. A small steel ball marker (diameter = 1 mm) was placed deep inside a honeydew melon as the target. We secured the melon on the scanner bed and took a scout view image. The steel ball in the scout view indicated the plane of imaging and needle insertion. We moved the scanner bed accordingly, acquired the image, and sent it to the computer via the DICOM link. The full-sized mirrored image was created by the image processing software and rendered on the flat panel display. Because the system retained its calibration, the mirrored CT image of the melon coincided with the silhouette of the real melon behind the mirror. Figure 4 shows a picture from the experimental results. Then, using only this image guidance system, a surgeon successfully inserted a needle into the melon. The surgeon selected the entry point and oriented the needle so that the line of the needle passed through the target and the transverse laser plane of the scanner was cast on the needle. Once correct orientation was confirmed, the surgeon advanced the needle to the pre-calculated depth. As Figure 4 shows, the surgeon made contact with the pre-implanted steel ball inside the melon.
5 Discussion
The experiment was an overall success: we mounted the overlay device in a noninvasive manner on the gantry, calibrated it, and hit the pre-implanted target with a
standard 18 G aspiration needle. We also identified many future extensions that could increase the clinical usefulness of the system and, at the same time, we learned about shortcomings of the current prototype.
Fig. 4. Phantom experiment. The tip of the needle hits the steel ball inside the honeydew melon.
Mounting proved to be quite cumbersome. In the next prototype, we plan to change the size of the LCD monitor and to build a more rigid adapter. Likewise, setting up the 7-DOF passive arm and then adjusting the mirror with respect to the display need to be more intuitive before clinical use on human subjects is considered. As also seen in Figure 2, the location of the crossbar of the mount forced us to lower the display and mirror, thus not leaving enough room for a potential human subject. Therefore, the mount also needs to be re-designed to make the system ergonomically feasible. The current display and mirror are relatively small and rigidly fixed on the mount at the time of calibration. This confines the field of view and target to a fairly small area. The primary application we keep in focus is percutaneous needle placement in the spine, so the current hardware is expected to be sufficient for that application, but in the long run some “panning function” within the plane of imaging will be useful. In this proof-of-concept prototype system, we did not provide any assistance to the surgeon other than the overlaid raw CT image. In a more advanced system, image processing and path planning tools will guide the surgeon. The physician could either mark the entry point on the skin pre-operatively or select it intra-operatively in the acquired CT image. In either case, the trajectory of the needle can be drawn in the image, so the surgeon can easily align the needle on the line, while keeping the needle in the plane of the scan using the CT laser. We must also note that as the surgeon enters the needle into the body, the needle disappears from the view. If we reacquire the image, the needle will re-appear, this time inside the body, supposedly coinciding with the trajectory pre-determined for the needle. Doing so could mimic the functionality of CT-fluoroscopy, without the harmful side effect of a large X-ray dose to both surgeon and patient. Even with the current hardware and software, we could mimic 3D functionality on a slice-by-slice basis, by acquiring a volumetric CT, moving the table back and forth, and injecting the matching CT slice into the overlay display. With more sophisticated image processing tools one can add multimodal images into the overlay. For example, we can co-register pre-operative MRI images
to intra-operative CT, then re-slice the MRI in the current plane of imaging and inject that cut into the overlay display. With more sophisticated hardware one could change the orientation of the mirror and display, enabling re-formatted images in an oblique plane to be injected onto the field. With the current system, we have used simple, raw cross-sectional CT images that are familiar to the surgeon. The concept of a 3D image using a half-mirror is not a new one; related research has been presented by several investigators, including our group [1, 2, 3, 4, 6, 9, 10]. As mentioned earlier, Stetten's ultrasound display mechanism uses the same overlay concept, yet it targets an entirely different set of applications and clinical problems. Stetten's system is a versatile hand-held device that is rather difficult to keep in alignment with the surgical site in a long or delicate procedure. The quality and accuracy of CT images often allow for more accurate targeting than ultrasound. CT guidance promises to be particularly useful in spine punctures (nerve blocks, facet joint injections), in biopsies of the head, neck, and perhaps lung, and possibly in many other procedures. The system also appears most promising when applied to helical CT or cone-beam CT, with which we can acquire near-real-time images and update the view rapidly, in order to keep up with the effects of organ shift and tissue deformation. Development of a clinically feasible version is in progress, which we will present at the conference, together with a more thorough experimental evaluation.
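To make the multimodal extension concrete, here is a hedged sketch of re-slicing a co-registered MRI volume in the current CT imaging plane. The transform name, the grid spacing, and the assumption of a 1 mm isotropic MRI grid are illustrative, not from the paper.

```python
# Sketch: sample a co-registered MRI volume on the current CT slice grid.
# T_mr_from_ct is an assumed 4x4 rigid transform (CT -> MRI coordinates)
# from a prior registration; mri_vol is assumed indexed on a 1 mm grid.
import numpy as np
from scipy.ndimage import map_coordinates

def reslice_mri(mri_vol, T_mr_from_ct, table_z, shape=(512, 512), pitch=1.0):
    ys, xs = np.mgrid[0:shape[0], 0:shape[1]].astype(np.float64)
    # Homogeneous CT-plane points at table position table_z (mm)
    ct_pts = np.stack([xs * pitch, ys * pitch,
                       np.full(xs.shape, table_z), np.ones(xs.shape)])
    mr_pts = np.tensordot(T_mr_from_ct, ct_pts, axes=1)   # (4, H, W) in MRI frame
    # map_coordinates expects (z, y, x) index order
    return map_coordinates(mri_vol, [mr_pts[2], mr_pts[1], mr_pts[0]],
                           order=1, cval=0.0)
```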
6 Conclusions
We demonstrated a simple, inexpensive, and accurate 2D image overlay system with real depth perception inside a CT scanner. Results from a phantom experiment indicate that an ergonomically advanced version of the system may be useful in a wide range of needle-placement applications. Research and development continues toward that goal.
Acknowledgements This research project is partly supported by the National Science Foundation under the Engineering Research Center grant #EEC9731478, the Nakatani Electronic Measuring Technology Association of Japan, and the Grant-in-Aid for Scientific Research #14702071. We are grateful to Drs. James S. Zinreich and Bruce Wasserman for housing the experiments in the Division of Neuroradiology of the Johns Hopkins Hospital. Last, but not least, we are indebted to Vincent L. Lerie, R.T., for spending long hours with us in the scanner room, without whom our research would not have been as much fun.
References
1. Y. Masutani, T. Dohi, Y. Nishi, M. Iwahara, H. Iseki and K. Takakura, VOLUMEGRAPH – An integral photography based enhanced reality visualization system, Proc. of CAR'96, p.1051, 1996
2. T. Ono, K. Masamune, M. Suzuki, T. Dohi, H. Iseki and K. Takakura, Study on three-dimensional surgical support display using enhanced reality, Proc. of 7th annual meeting of Japan Society of Computer Aided Surgery, pp.113-114, 1997 (in Japanese)
3. K. Masamune, Y. Masutani, S. Nakajima, I. Sakuma, T. Dohi, H. Iseki, K. Takakura, "Three-dimensional Slice Image Overlay System with Accurate Depth Perception for Surgery", MICCAI2000, LNCS 1935, pp.395-402, 2000
4. M. Blackwell, C. Nikou, A. M. DiGioia and T. Kanade, An Image Overlay System for Medical Data Visualization, MICCAI98, pp.232-240, 1998
5. W. E. L. Grimson, et al.: An Automatic Registration Method for Frameless Stereotaxy, Image Guided Surgery, and Enhanced Reality Visualization, IEEE Trans. Med. Imag., Vol.15, No.2, 1996
6. S. Nakajima, K. Masamune, I. Sakuma, T. Dohi, Development of a 3-D Display System for Surgical Navigation, Trans. of the Institute of Electronics, Information and Communication Eng. D-II, Vol. J83-D-II, No.1, pp.387-395, 2000
7. E. M. Friets, J. W. Strohbehn, K. F. Hatch, D. W. Roberts, "A Frameless Stereotaxic Operating Microscope for Neurosurgery", IEEE Transactions on Biomedical Engineering, 36(6), pp.608-617, 1989
8. M. Rosenthal, A. State, J. Lee, G. Hirota, J. Ackerman, K. Keller, E. D. Pisano, M. Jiroutek, K. Muller, H. Fuchs, "Augmented Reality Guidance for Needle Biopsies: A Randomized, Controlled Trial in Phantoms", MICCAI2001, LNCS 2208, pp.240-248, 2001
9. G. Stetten, V. S. Chib, R. J. Tamburo, "Tomographic Reflection to Merge Ultrasound Images with Direct Vision", IEEE Proc. Applied Imagery Pattern Recognition (AIPR) annual workshop, pp.200-205, 2000
10. G. Stetten, V. Chib, "Magnified Real-Time Tomographic Reflection", MICCAI2001, LNCS 2208, pp.683-690, 2001
11. M. Mitschke, et al., "Interventions under Video-Augmented X-ray Guidance: Application to Needle Placement", MICCAI2000, LNCS 1935, pp.858-868, 2000
12. F. Sauer, A. Khamene, B. Bascle, and G. J. Rubino, "A Head-Mounted Display System for Augmented Reality Image Guidance: Towards Clinical Evaluation for iMRI-guided Neurosurgery", MICCAI2001, LNCS 2208, pp.707-716, 2001
High-Resolution Stereoscopic Surgical Display Using Parallel Integral Videography and Multi-projector
Hongen Liao1, Nobuhiko Hata2, Makoto Iwahara2, Susumu Nakajima3, Ichiro Sakuma4, and Takeyoshi Dohi2
1 Graduate School of Engineering, The University of Tokyo, Tokyo, Japan
2 Graduate School of Information Science and Technology, The University of Tokyo
3 Department of Orthopaedic Surgery, The University of Tokyo Hospital
4 Graduate School of Frontier Sciences, The University of Tokyo
{liao,noby,iwahara}@atre.t.u-tokyo.ac.jp, {nakajima,sakuma,dohi}@miki.pe.u-tokyo.ac.jp
Abstract. This paper proposes a high-resolution stereoscopic surgical display using integral videography (IV) and multiple projectors. IV projects a computer-generated graphical object by multiple rays through a "fly's eye lens", which can display geometrically accurate autostereoscopic images and reproduce motion parallax in three-dimensional (3-D) space. This technique requires neither special glasses nor a sensing device to track the viewer's eyes, and is thus suitable for pre-operative diagnosis and intra-operative use. This paper reports the use of a multiple reduction-projection display system and parallel calculation to achieve a high-resolution IV image. We evaluate the feasibility of this display by developing a 3-D CT stereoscopic image and applying it to surgical planning and intra-operative guidance. The main contribution of this paper is the application and modification, for medical stereoscopic imaging, of techniques originally developed for high-resolution multi-projector display systems.
1 Introduction
Stereoscopic techniques have recently been taking an important role in surgery and diagnosis, with various modes of visualization on offer [1-3]. Most previously reported methods use polarized or shutter glasses originally developed in the augmented reality domain and applied in surgical simulation and diagnosis. However, glasses-based methods do not always give an observer an accurate sense of depth, as the depth perception depends on the spacing of the observer's eyes, which is not always the same as assumed by the fixed lenses. With this kind of stereoscopic display, observers view pre-fixed 2-D images, which create quasi 3-D images with distortion. Subjectively they perceive the result as 3-D, but there may be significant inaccuracies in registration. Motion parallax, an alternative cue for stereoscopic vision, cannot be reproduced without wearing a tracking device. Consequently, systems based on binocular stereoscopic vision are not good enough for medical use. Our previous reports [4] proposed an approach to overcome these issues of binocular stereoscopic vision by using Integral Videography (IV), which is an
animated extension of integral photography (IP), originally proposed by M. G. Lippmann [5] in 1908. IV projects a computer-generated graphical object by multiple rays coming through a "fly's eye lens". Detailed descriptions of IP and IV can be found in [6,7]. We further extended our work to surgical assistance in orthopedic surgery [7] and to intra-operative navigation in trials of clinical application [4]. Though the advantage of IV has been proven in both feasibility studies and clinical applications [4-7], one issue still unsolved is the limitation of pixel density, caused by the inadequate resolution of currently available display devices. Thus, the motivation of this paper is to overcome this limited resolution by using a multiple-projection display system and parallel computing for fast rendering. We evaluate the feasibility of this display by developing a 3-D CT stereoscopic image and applying it to surgical planning. The main engineering contribution of this paper is the use of parallel computing and multiple optical systems for stereoscopic IV image generation, which is thought to be the most effective solution for generating a high-resolution stereoscopic display. This engineering contribution underpins the clinical significance of the study: the newly proposed method enables high-resolution stereoscopic visualization (or even image-overlay) of three-dimensionally reconstructed MR, CT, or US images.
2 Materials and Methods
A schematic presentation of the multi-projector IV display is given in Figure 1. The four major components of the display system are: (1) parallel calculation and parallel display; (2) a high-resolution multi-projection system; (3) the reduction projection technique and image calibration; and (4) spatial image formation.

Fig. 1. Schema of the high-quality IV stereoscopic display: medical 3-D image data (MRI, CT, etc.) are processed by a compute cluster and a display cluster over a system area network, reduction-projected onto the screen behind the fly's eye lens array, and formed into the spatial IV stereoscopic image.
2.1 Parallel Calculation and Parallel Display
We integrated the parallel calculation and display system of the IV stereoscopic display from several components (Fig. 1), including 1) a display cluster with multiple PCs, used for the parallel display of the high-resolution image; 2) a compute cluster with high-performance computers or workstations, used for the parallel calculation; and 3) a console PC that controls execution of the system, including data input and storage. All the PCs are connected by a 100 Base-T Ethernet network. Here we face three general types of research challenges: coordination of PCs and graphics accelerators to create consistent, real-time images; communication among multiple PCs; and resource allocation to achieve good utilization. We integrated the Message Passing Interface (MPI) [8] to increase the rendering performance of IV. The parallelization is based on static partitioning of the rendered image. Each node in the compute cluster is assigned a section of image lines to reconstruct. The nodes elaborate their sections in parallel and, once finished, send the reconstructed sections to a master process. The master process is in charge of I/O and the network communication API; it collects the source images and sends the rendered image to the parallel display cluster, as sketched below.
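As an illustration of this static partitioning (a minimal sketch using mpi4py, not the authors' implementation; the IV ray renderer is replaced by a placeholder):

```python
# Static line partitioning of the elemental IV image across an MPI cluster.
import numpy as np
from mpi4py import MPI

WIDTH, IMAGE_LINES = 2872, 2150          # elemental IV image dimensions

def render_lines(start, stop):
    # Placeholder for the IV renderer: produces one row per image line.
    return np.zeros((stop - start, WIDTH), dtype=np.uint8)

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

block = IMAGE_LINES // size              # one static block per node
start = rank * block
stop = IMAGE_LINES if rank == size - 1 else start + block
section = render_lines(start, stop)      # each node renders its own section

# The master process collects the sections; it would then forward the
# assembled frame to the display cluster over the network API.
sections = comm.gather(section, root=0)
if rank == 0:
    frame = np.vstack(sections)          # reassemble the full elemental image
```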
2.2 High-Resolution Multi-projector Display
Despite much recent progress in the development of new display technologies such as Organic Light Emitting Diodes (OLEDs), the current economical approach to making a large-format, high-resolution display uses an array of projectors [9]. In our system, the multiple projectors are arranged in a display array, producing a high-resolution image across a rear projection screen by utilizing the reduction projection technique described later in this paper. We wanted to maximize the number of pixels displayed, but image blending would sacrifice a large percentage of pixels, especially around the four edges of each center-row image. An important issue here is the coordination of multiple commodity projectors to achieve seamless edge blending and precise alignment. Seamless edge blending removes the visible discontinuities between adjacent projectors. Edge blending techniques overlap the edges of projected, tiled images and blend the overlapped pixels to smooth the luminance and chromaticity transition from one image to another. We employed image-processing techniques to correct the source image before display by misaligned projectors; this requires only the coarsest physical alignment of the projectors. We obtain precise alignment (or misalignment) information with an off-the-shelf camera. We zoom the camera to focus on a relatively small region of the display wall and pan the camera across the wall to obtain broader coverage. The camera measures point correspondences and line matches between neighboring projectors. We then use simulated annealing to minimize the alignment error globally and solve for the projection matrices.
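The following is a hedged sketch of the per-projector software pre-correction, with a direct homography fit (via OpenCV) standing in for the global simulated-annealing solve described above; the correspondences here are synthetic stand-ins for the camera measurements.

```python
# Pre-warp one projector tile so it lands at its ideal position on the wall.
import cv2
import numpy as np

ideal = np.float32([[0, 0], [1024, 0], [1024, 768], [0, 768], [512, 384]])
measured = ideal + np.float32([[3, -2], [5, 1], [2, 4], [-1, 2], [2, 1]])

# Homography mapping camera-measured positions back to ideal positions.
H, _ = cv2.findHomography(measured, ideal, cv2.RANSAC)

# Pre-warp the source tile so that, after passing through the misaligned
# projector, it appears at the ideal position (the mapping direction
# depends on the chosen coordinate conventions).
src = np.zeros((768, 1024, 3), np.uint8)      # stand-in for the image tile
corrected = cv2.warpPerspective(src, H, (1024, 768))
```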
2.3 Reduction Projection
Although we can achieve a high-resolution image by using a normal multi-projection technique, the pixel density of the projected image is not suitable for stereoscopic image creation: the pixels of the projected image used in the IV stereoscopic display must be high-density. In this study, we use a reduction projection technique to achieve a high pixel density. Reduction projection has typically been used in the field of electron optics and exposure techniques. Here, we redesigned the projector by altering the combination and arrangement of lenses to achieve new reduction projection optics. The pixel density of the projected image can exceed 300 dpi.
2.4 Spatial Image Formation and Image Adjustment
By the nature of the IV principle described above, the high-resolution, high-density image projected on the screen must be free from distortion and reflection, so a flat screen with an antireflective, antistatic coating [10] is used to display the image. This screen is placed at the rear of the "fly's eye lens" array. When the rendered elemental IV image is projected on the screen, the stereoscopic image is formed as a spatial image. In addition, we obtain precise alignment (or misalignment) information with a digital camera by measuring the position and color features of the pixels projected on the screen, and then feed the calibration information back to the image-processing hardware to make fine adjustments of the projected image.
3 Clinical Feasibility Studies
To evaluate the feasibility of this study, we applied the display system to image-guided surgery. This kind of stereoscopic image can enhance the surgeon's capability to utilize medical 3-D imagery, decreasing the invasiveness of surgical procedures and increasing their accuracy and safety. The applications include surgical planning, surgical guidance, and surgical guidance with intra-operative updates. Our prior work focused on surgical guidance: presenting the surgeon with an IV image gathered pre-operatively, tracking surgical instruments within the operating field, and displaying them in a real-time IV stereoscopic image [4]. In this study, we perform surgical planning and simulation with the newly developed IV stereoscopic display, as shown in Fig. 2.
Fig. 2. Schema of surgical planning and simulation system of IV stereoscopic display
3.1 Equipment and Results
We developed a multi-projector system with 9 projectors (XGA, U2-1110, PLUS Vision Corp., Tokyo, Japan), arranged in a 3 x 3 array, producing a display of 2872 x 2150 pixels across a 241 x 181 mm (302.38 dpi) rear projection screen by utilizing the reduction projection technique. The pitch of each pixel projected on the screen is 0.084 mm. The micro-lens array placed on the screen has hexagonal micro lenses with a diameter of approximately 1.008 mm, each covering 12 pixels of the reduction projection screen. The image-processing hardware (Model PA99, System Development Laboratory, Hitachi, Ltd.) was developed with a maximum pixel frequency of 65 MHz, which corresponds to 60 frames of XGA image per second. The high-resolution multi-projection system includes the 9 projectors and a corresponding mirror combination, and achieves a high-resolution, high-density image with reduction projection and precise position calibration. The reduction projection optics used in each projector comprise 7 groups with 11 lens elements in total. We obtain precise alignment (or misalignment) information with a camera (Nikon D1X digital camera). In order to observe the 3-D image with correct motion parallax, one should observe from a distance of about 50 cm from the system. The conditions under which the projected 3-D image could be observed continuously were examined (Fig. 3).
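As a quick consistency check of these figures (a worked example added here, not part of the original text):

\[ \frac{241\ \mathrm{mm}}{2872\ \mathrm{px}} \approx 0.0839\ \mathrm{mm/px}, \qquad \frac{25.4\ \mathrm{mm/inch}}{0.084\ \mathrm{mm/px}} \approx 302.4\ \mathrm{dpi}, \qquad \frac{1.008\ \mathrm{mm}}{0.084\ \mathrm{mm/px}} = 12\ \mathrm{px\ per\ lens}. \]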
Fig. 3. Photos of projected stereoscopic skull images (stationary) by IV, taken from various directions; they show the motion parallax. The letter denotes the position of the observer relative to the IV stereoscopic image: A: above, L: left, F: front, R: right, B: below.
3.2 Surgical Planning and Simulation
We evaluated the usefulness of the newly developed system in an operative setting. In the clinical feasibility study, we performed CT scanning of an in-vivo human heart at 5 time points in one cardiac cycle. The volumetric CT images of the human heart (512 x 512 pixels x 180 slices per time point, slice thickness 1.0 mm) were rendered separately for the 5 phases of one heartbeat. The rendered elemental IV images were projected in a continuous loop on the IV stereoscopic display with the same heartbeat period as the patient (Fig. 4). (Since the projected IV stereoscopic image was purely three-dimensional, it was difficult to record in conventional two-dimensional photographs; the quality of the actual image was much better than shown in this figure.)
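A toy sketch of this playback timing (illustrative only; project() is a placeholder for sending an elemental IV frame to the projectors):

```python
# Cycle the five pre-rendered elemental IV frames at the patient's cardiac
# period; with a 0.95 s cycle, each phase is shown for 0.95/5 = 0.19 s,
# matching the time points in Fig. 4.
import time

def project(frame):
    pass  # placeholder: send the elemental IV frame to the projectors

def play_cardiac_loop(frames, period=0.95, cycles=10):
    dt = period / len(frames)          # 0.19 s per phase for five frames
    for _ in range(cycles):
        for frame in frames:
            project(frame)
            time.sleep(dt)
```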
Fig. 4. IV CT stereoscopic animated image (frames at t = 0.0, 0.19, 0.38, 0.57, and 0.76 s): the patient has a rate of 63 beats per minute, i.e. a cardiac cycle of 0.95 s.
3.3 Intra-operative Assistance and Guidance
Intra-operatively, the IV stereoscopic image can help with the navigation of instruments by providing a broader view of the operating field. In combination with robotic surgical instruments, it can even supply guidance by pre-defining the path of a biopsy needle or by preventing the surgical instruments from moving into critical regions. The IV image developed in this study can be segmented: effects such as thresholding, morphological operations (dilation, erosion), island removal, cropping, and free-hand drawing can be applied to the data. One strength of the IV image algorithm in our system is that these effects can be visualized in multiple layers along with the surgical instrument and critical regions (Fig. 5).
Fig. 5. Surgical simulation: IV image visualized in multiple layers along with the surgical instrument and critical regions.
4 Discussion
The IV stereoscopic display system developed in this study has three primary merits: it superimposes a real, intuitive 3-D image for medical diagnosis and surgical guidance; it avoids the need for extra devices such as special glasses, offering geometrically accurate images of the projected objects (especially in depth perspective); and it provides motion parallax over a wide area, allowing simultaneous observation by many people. As accuracy is the most important element in medical imaging, this system has advantages over other 3-D image display methods. Theoretically, when an ideal lens is utilized, IV can provide a three-dimensional display that is free from the discontinuous changes of image that occur with the observer's movement, with the same resolution that conventional two-dimensional displays feature [11]. Measurements of the spatial resolution of the IV stereoscopic display show that the quality of the IV image can still be improved. The IV stereoscopic display systems developed in this study have a pixel density of only about 300 dpi. The spatial resolution of the projected 3-D image is proportional to the ratio of the lens diameter in the lens array to the pixel pitch of the display; thus, both the lens diameter and the projected pixel pitch need to be made much smaller. One possible solution for realizing higher pixel density is the use of more projectors, or projectors with higher resolution (SXGA or more pixels), to create a higher-resolution image. More computation power is also needed to drive such a multi-projector system. In the application test, we found that the display device developed in this study is bulky. The multi-projector system must be made smaller and simpler for use in the operating room, especially for navigation use or image-overlay. This can be addressed by altering the design of the projector and using Digital Micromirror Device (DMD) technology directly.
5 Conclusion
In conclusion, we have developed a high-resolution, high-pixel-density IV stereoscopic display system using multiple projectors and the reduction projection method. The feasibility study indicated that the multi-projector and parallel rendering performance achieved with the proposed method is satisfactory and suitable for surgical diagnosis and planning settings. We evaluated the feasibility of this display by developing a 3-D CT stereoscopic image and applying it to surgical planning. The main contribution of this paper is the application and modification, for medical stereoscopic imaging, of techniques originally developed for high-resolution multi-projector display systems.
Acknowledgements This study was partly supported by the Grant-in-Aid for the Development of Innovative Technology (12113) from the Ministry of Education, Culture, Sports, Science and Technology in Japan. We thank Haruo Takeda, Masami Yamasaki, Tsuyoshi Minakawa, Takafumi Koike, Fujio Tajima, and Yasuyuki Momoi of the System Development Laboratory, Hitachi, Ltd., for their contributions to the seamless technique in the multi-projector display.
References
1. M.A. Guttman, E.R. McVeigh, "Techniques for fast stereoscopic MRI," Magnetic Resonance in Medicine, Vol.46, pp.317-323, 2001.
2. M. Blackwell, C. Nikou, A.M. DiGioia, T. Kanade, "An image overlay system for medical data visualization," Medical Image Analysis, Vol.4, pp.67-72, 2000.
3. R. Boerner, "Three autostereoscopic 1.25 m diagonal real projection systems with tracking features," Proceedings of IDW, pp.835-838, 1997.
4. H. Liao, S. Nakajima, M. Iwahara, E. Kobayashi, I. Sakuma, N. Yahagi, T. Dohi, "Intra-operative Real-Time 3-D Information Display System based on Integral Videography," Medical Image Computing and Computer-Assisted Intervention – MICCAI2001, LNCS 2208, pp.392-400, 2001.
5. M.G. Lippmann, "Epreuves reversibles donnant la sensation du relief," J. de Phys., Vol.7, 4th series, pp.821-825, 1908.
6. Y. Masutani et al., "Development of integral photography-based enhanced reality visualization system for surgical support," Proc. of ISCAS'95, pp.16-17, 1995.
7. S. Nakajima, K. Nakamura, K. Masamune, I. Sakuma, T. Dohi, "Three-dimensional medical display with computer-generated integral photography," Computerized Medical Imaging and Graphics, 25, pp.235-241, 2001.
8. P.S. Pacheco, "Parallel Programming with MPI," Morgan Kaufmann, 1996.
9. K. Li et al., "Building and using a scalable display wall system," IEEE Computer Graphics and Applications, Vol.20, No.4, pp.29-37, Jul/Aug 2000.
10. Y. Endo et al., "A study of antireflective and antistatic coating with ultrafine particles," Advances in Powder Technol., Vol.7, No.2, pp.131-140, 1996.
11. H. Hoshino, F. Okano, H. Isono, I. Yuyama, "Analysis of resolution limitation of integral photography," J. Opt. Soc. Am., Vol.15, No.8, pp.2059-2065, 1998.
Three-Dimensional Display for Multi-sourced Activities and Their Relations in the Human Brain by Information Flow between Estimated Dipoles
Noriyuki Take1,2, Yukio Kosugi1, and Toshimitsu Musha2
1 Interdisciplinary Graduate School of Science and Engineering, Tokyo Institute of Technology, Yokohama, 226-8502 Japan
{take,kosugi}@pms.titech.ac.jp
2 Brain Functions Laboratory, Inc., Kawasaki, 213-0012, Japan
{take,musha}@bfl.co.jp
Abstract. It is important to display brain activities and their relations visually for the elucidation of the processing mechanisms of the human brain and for the diagnosis of its diseases. We developed a three-dimensional displaying tool that shows the activities by estimating dipoles and shows their relations by analyzing the information flow between them. First, we estimate dipoles (via a 3-layered concentric spherical model, 2-dipole estimation) from evoked potentials. Second, using the derived dipole locations and moments as the loci and quantities of brain activities, we apply stationary analysis to the information flow between the two time-series of the 1st and 2nd dipole moments. We thereby obtain bi-directional information flows between the neuronal activities localized in the 3D space of the brain, for somatosensory evoked potentials measured with 21 electrodes arranged according to the international 10-20 standard. Furthermore, we tried non-stationary analysis of the information flow with simulation data.
1 Introduction
Recently, an information flow between scalp potentials measured by multiple electrodes was reported [2]. However, this approach could not reveal the three-dimensional information flow within the brain, because it is based on the assumption that intra-cranial activities can be indirectly estimated from the information flow between two-dimensionally specified points on the scalp. Therefore, it is not sufficient for the elucidation of the information processing mechanism or for the detection of diseases in which deeply seated neuronal activities are involved. In our previous study [1], we successfully developed a method to reveal brain activities and their relations. To quantitatively represent neuronal activities in specified locations of the brain, we performed dipole localization (via a 3-layered concentric spherical model, 2-dipole estimation) of evoked potentials (EPs) measured with multiple electrodes. Then, to derive the relations, we analyzed the
Fig. 1. Information flow between two time-series of 1st and 2nd dipoles by assuming 2 dipoles in 3-layered concentric spherical model.
Fig. 2. Overall flow chart of the analyses and displays of the tool (dipole estimation and alignment; stationary analysis via correlation, AR modeling, and the linear production model; non-stationary analysis via symbolization, Markov-order selection, and an adaptive model; and the display part).
bi-directional information flows between the time-series of equivalent dipoles, as schematically shown in Fig. 1. In the present study, we develop a complete three-dimensional displaying tool that performs these analyses and displays using our previous method. It can estimate the time-series of equivalent dipoles, analyze the information flow between them, and display the results, automatically from beginning to end, with a small number of pre-set parameters adequately adjusted. Furthermore, we try non-stationary analysis of the information flow with simulation data.
2 Three-Dimensional Displaying Tool
The overall flow chart of the analyses and displays of the tool is shown in Fig. 2. In the analysis part (left side of Fig. 2), time-series of equivalent dipoles are first estimated from the EPs. Then, stationary or non-stationary analysis of the information flow is applied to the time-series of estimated dipoles. In the display part (right side of Fig. 2), the time-series of dipole positions is displayed in 3D (Fig. 6) and the time-series of dipolarity and dipole moments are plotted (Fig. 5). Then, the stationary (Fig. 7) or non-stationary (Fig. 8) information flow is plotted in 3D. All analyses and displays are done automatically at once.
3 Methodology
In the dipole estimation, we developed a method to align the time-series of the 1st and the 2nd dipoles, because in the two-dipole estimation the locations of the two dipoles can be exchanged at each time point if no attention is paid to keeping consistency within each series. In our previous study, we applied stationary analysis to the information flow. However, the relations between brain activities are considered to be non-stationary processes over the time course, so we also developed a non-stationary analysis of the information flow over different time ranges.
3.1 Dipole Estimation
The 2-dipole estimation in the 3-layered concentric spherical model [3,4] is carried out for all samples in a time range of the EPs by setting the 3-layered structure and all electrode positions on the 3rd layer of the sphere. The simplex method [5] is used to solve the inverse problem. The estimation is carried out six times for every sample, each from an initial simplex position chosen randomly, and the result giving the best estimation accuracy is chosen from the six estimates. The dipolarity (%) expresses the estimation accuracy as

\[ \mathrm{Dipolarity} = \left( 1 - \frac{\|\vec{p}_{meas} - \vec{p}_{cal}\|^2}{\|\vec{p}_{meas}\|^2} \right) \times 100 \tag{1} \]

where \(\vec{p}_{meas}\) and \(\vec{p}_{cal}\) are the measured and the calculated scalp potentials based on the two-dipole model respectively, both expressed as n-dimensional column vectors composed of the EPs at the n electrodes. The dipolarity ranges from 0 to 100% and indicates that the estimated dipoles are sufficiently accurate as the value approaches 100. As the result of the estimation, two time-series of the 1st and the 2nd dipoles are obtained. However, they become mixed from a time-series point of view, because the order of the estimated 1st and 2nd dipoles at each sample is indefinite. Therefore, we developed a method to align them. An explanatory drawing is shown in Fig. 3. D1 and D2 denote the 1st and the 2nd dipole locations at a sample time. M1 and M2 denote the mean locations of the 1st and the 2nd dipoles respectively, averaged over several samples preceding D1 and D2 that have sufficiently high dipolarity. We define a criterion for the evaluation of "proximity" as

\[ P = \frac{r_1}{r_1 + r_2} \tag{2} \]
where r1 and r2 are the distances between D1 and M1, and between D1 and M2, respectively (or between D2 and M1, and D2 and M2). Let P1 and P2 be this criterion evaluated for D1 and D2 respectively. We perform the dipole alignment according to the following rule:

\[ \text{If } P_1 > P_2:\ \text{exchange D1 and D2};\qquad \text{if } P_1 \le P_2:\ \text{no change.} \tag{3} \]
The alignment is carried out with the above exchange rule from the beginning to the end of the time-series, for the 1st and the 2nd dipoles over all samples. After the alignment, the magnitude of the 1st and the 2nd dipole moments is calculated for all samples of the time-series. The time-series of the magnitudes of the 1st and the 2nd dipole moments derived through our method can be regarded as the time-series of the brain activities approximated by the two equivalent dipoles. A minimal sketch of this alignment procedure is given below.
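The following sketch illustrates the alignment rule (2)-(3), assuming the estimated dipole series are stored as NumPy arrays; the window length and dipolarity threshold are illustrative assumptions, not values from the paper.

```python
# Illustrative alignment of two estimated dipole time-series (Eqs. 2-3).
import numpy as np

def align_dipoles(d1, d2, dipolarity, window=5, thresh=95.0):
    """d1, d2: (T, 3) arrays of 1st/2nd dipole locations; swaps in place."""
    for t in range(1, len(d1)):
        # Mean locations M1, M2 over recent samples with high dipolarity
        good = [s for s in range(max(0, t - window), t)
                if dipolarity[s] >= thresh]
        if not good:
            continue
        m1, m2 = d1[good].mean(axis=0), d2[good].mean(axis=0)
        # Proximity criterion (Eq. 2) evaluated for D1 and for D2
        r1, r2 = np.linalg.norm(d1[t] - m1), np.linalg.norm(d1[t] - m2)
        p1 = r1 / (r1 + r2)
        r1b, r2b = np.linalg.norm(d2[t] - m1), np.linalg.norm(d2[t] - m2)
        p2 = r1b / (r1b + r2b)
        if p1 > p2:                        # exchange rule (Eq. 3)
            d1[t], d2[t] = d2[t].copy(), d1[t].copy()
    return d1, d2
```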
Fig. 3. Explanatory drawing for the alignment of dipoles.
Fig. 4. Result of SEPs used as input for the analysis (21 electrodes of the international 10-20 standard, 2 kHz sampling rate, averaged over 500 trials, positive-up display).

3.2 Information Flow
Stationary Analysis. The stationary information flow is analyzed by the time-series analysis method based on the directed transinformation [6]. First, the correlation function of the two time-series is calculated. Next, signal representation by a two-dimensional AR (autoregressive) model is carried out, after selection of the AR model order by FPE (the final prediction error criterion). Then, the linear production model is constructed from the derived AR coefficients. When the two time-series of the moments, X and Y, are written as

\[ X = X_{k-N} \ldots X_{k-1}\,X_k\,X_{k+1} \ldots X_{k+M} = X^N X_k X^M, \qquad Y = Y_{k-N} \ldots Y_{k-1}\,Y_k\,Y_{k+1} \ldots Y_{k+M} = Y^N Y_k Y^M, \tag{4} \]

the directed transinformation from \(X_k\) to \(Y_{k+m}\) is given by

\[ I(X_k \to Y_{k+m} \mid X^N Y^N Y_k) = \frac{1}{2}\,\log_2\!\left(1 + \frac{a^2_{YX;\,k+m,k}}{\sum_{i=0}^{m-1}\left(a^2_{YX;\,k+m,k+m-i} + a^2_{YY;\,k+m,k+m-i}\right)}\right) \tag{5} \]

where \(a_{YX}\) and \(a_{YY}\) are the linear filter coefficients (the impulse response coefficients) of the linear production model. In (5), \(I(\ast \to \ast)\) denotes the mutual
information, and the direction is regulated automatically because time point k is earlier than time point k + m. Therefore, we obtain the bi-directional information flows by calculating the directed transinformation in both directions for all delays m. Since this method is a steady-state analysis expressed by an AR model (a stationary model), the information flow is the same for any point of time k.

Non-stationary Analysis. The non-stationary information flow is analyzed as an application of the bi-directional communication theory [7]. In that theory, the directed transinformation from a symbolized signal X to Y is given by

\[ T_{YX} = \lim_{n \to \infty} E\!\left[-\log_2 \frac{p(Y \mid Y_n)}{p(Y \mid X_n Y_n)}\right] \tag{6} \]

where \(p(\ast \mid \ast_n)\) is the conditional probability given the occurrence of the previous n samples. First, the two time-series of dipole moments are symbolized into a proper number of symbols. Second, the value of n in (6) is selected. Then (6) is evaluated for all combinations of the symbolized moment series divided into equal time ranges, which yields the bi-directional information flows over different time ranges. As a simulation, we made two symbolized time-series X and Y in which information flows only from X to Y with some delay, and the magnitude of the delay changes partway through the time course. These time-series simulate the information flow between brain activities when the conduction velocity becomes faster through learning of a short-cut transmission route during processing in the brain.
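As an illustration of how (6) can be estimated from finite data, the following is a minimal plug-in estimator over a single time range, using empirical conditional probabilities of Markov order n; this is a sketch under our reading of the method, not the authors' implementation.

```python
# Plug-in estimate of the directed transinformation T_{YX} (bits/sample)
# from two symbolized series, with Markov order n.
import numpy as np
from collections import Counter

def directed_transinformation(x, y, n=1):
    """x, y: 1-D integer (symbol) arrays of equal length."""
    c_y_yn, c_yn = Counter(), Counter()      # counts for p(y_t | y-history)
    c_y_xy, c_xy = Counter(), Counter()      # counts for p(y_t | x,y-history)
    T = len(y)
    for t in range(n, T):
        yn, xn = tuple(y[t-n:t]), tuple(x[t-n:t])
        c_yn[yn] += 1
        c_y_yn[(y[t], yn)] += 1
        c_xy[(xn, yn)] += 1
        c_y_xy[(y[t], xn, yn)] += 1
    total = 0.0
    for t in range(n, T):
        yn, xn = tuple(y[t-n:t]), tuple(x[t-n:t])
        p_marg = c_y_yn[(y[t], yn)] / c_yn[yn]
        p_cond = c_y_xy[(y[t], xn, yn)] / c_xy[(xn, yn)]
        total += np.log2(p_cond / p_marg)    # -log2(p_marg / p_cond)
    return total / (T - n)
```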
4 Results
We analyzed somatosensory evoked potentials (SEPs) evoked by electrical stimulation of the median nerve of the right hand, as shown in Fig. 4, which is a typical example chosen from our experimental results for three subjects. The results of the dipole estimation are shown in Fig. 5 (time-series of dipolarity and dipole moments) and Fig. 6 (time-series of dipole positions). The result of the stationary analysis of the bi-directional information flows from the SEPs is shown in Fig. 7. The AR model order was selected as 30, where the FPE reaches a minimum during the analysis. The positive side of the abscissa is the directed transinformation from the time-series of the 1st dipoles to that of the 2nd dipoles, and the negative side is the directed transinformation from the time-series of the 2nd dipoles to that of the 1st dipoles, derived by (5). The horizontal scale is a time expression of the delay m in (5). The result of the non-stationary analysis of the bi-directional information flows is shown in Fig. 8. X and Y are time-series with 2000 samples, divided into segments of 100 samples. The lower-right triangular part of the X-Y plane shows the information flow from X to Y, and the upper-left triangular part shows the information flow from Y to X.
Fig. 5. Results of dipolarity and dipole moments (left: before alignment, right: after alignment).
Fig. 6. Result of dipole positions (left: before alignment, right: after alignment, upper: from 14.5 to 16.5 ms latency, lower: from 20.5 to 22.5 ms latency. Length of the moment is 1 mm per 1 µA · mm).
Fig. 7. Result of stationary information flow from experimental data of SEPs.
Fig. 8. Result of non-stationary information flow from simulation data.

5 Discussion
Comparing the left and right plots in Fig. 5, the irregularity of Moment1 (time-series of the 1st dipole moments) and Moment2 (time-series of the 2nd dipole moments) seen in the left plots is eliminated in the right plots. Similarly, in Fig. 6 the mixed state of the dipole locations in the left displays is resolved in the right displays for both latencies. These results prove the effectiveness of the alignment procedure we proposed. The result for the stationary information flow in Fig. 7 shows that the information mainly flows from the time-series of the 1st dipoles to that of the 2nd dipoles. According to neurophysiological knowledge of SEPs elicited by electrical stimulation of the median nerve of the hand, the transmission route of the neuronal activities goes through the thalamus at about 15 ms latency and reaches the somatic sensory area at about 20 ms latency. It is thus plausible that the time-series of the 1st dipoles reflects the neuronal activity of the thalamus, because the positions are near the center of the sphere (upper-right display in Fig. 6), and that the time-series of the 2nd dipoles reflects that of the somatic sensory area, because the positions are in the upper-left part of the sphere (lower-right display in Fig. 6). Therefore, the result that the information mainly flows from the 1st dipoles to the 2nd ones agrees with the knowledge that the neuronal activities move from the thalamus to the somatic sensory area. In addition, the result indicates that there might be a slight information flow from the 2nd dipoles to the 1st ones around a latency of 11 to 15 ms (from -11 to -15 ms in Fig. 7), suggesting a feedback from the somatic sensory area to the thalamus. The simulated result of the non-stationary information flow in Fig. 8 shows that information flows from X to Y, with the magnitude of the delay changing from 500 samples to 200 samples during the time course. It proves that
the method is sensitive to the information flow between brain activities even as the delay changes.
6 Conclusion
We proposed a three-dimensional displaying tool, complete with analyses and displays, to reveal relations between brain activities, and we applied it to actual experimental SEP data with the stationary method and to simulation data with the non-stationary method. In the stationary analysis, the causal relation between the two time-series of the 1st and the 2nd dipoles is revealed, and its intensity is obtained as a time-series of information quantity (bit/sec). We confirmed that there are bi-directional information flows between them and that, in the case of the SEPs, the time-series of the 1st dipoles is mainly a cause and that of the 2nd dipoles a result, in agreement with neurophysiological knowledge. In the non-stationary analysis, we confirmed that the method is sensitive to delay changes in the information processing of the brain. Thus, our tool can be applied to the elucidation of the information processing mechanism and to the diagnosis of diseases in the brain.
References 1. N. Take, Y. Kosugi and T. Musha: Estimation of bi-directional information flow in the human brain from evoked potentials by use of dipole tracing method, IEEE EMBC Proceedings, #835 CD-ROM, 2001 2. T. Inouye, K. Shinosaki, A. Iyama and Y. Matsumoto: Localization of activated areas and directional EEG patterns during mental arithmetic, Electroencephalography and clinical Neurophysiology, Vol.86, pp.224–230, 1993 3. T. Musha and Y. Okamoto: Forward and inverse problems of EEG dipole localization, Critical Reviews in Biomedical Engineering, 27 (3-5): pp.189–239, 1999 4. R. N. Kavanagh, T. M. Darcey, D. Lehmann and D. H. Fender: Evaluation of methods for three-dimensional localization of electrical sources in the human brain, IEEE Trans. Biomed. Eng., BME-25, No. 5, pp.421–429, 1978 5. J. A. Nelder and R. Mead: A simplex method for function minimization, Computer Journal, vol. 7, pp.308–313, 1965 6. T. Kamitake, H. Harashima and H. Miyakawa: A time-series analysis method based on the directed transinformation, Electronics and Communications in Japan, Vol. 67-A, No. 6, pp.1–9, 1984 7. H. Marko: The bidirectional communication theory - a generalization of information theory -, IEEE Transactions on Communications, COM-21, No.12, pp.1345–1351, 1973
2D Guide Wire Tracking during Endovascular Interventions S.A.M. Baert and W.J. Niessen Image Sciences Institute, University Medical Center Utrecht Rm E 01.334, P.O.Box 85500, 3508 GA Utrecht, The Netherlands {shirley,wiro}@isi.uu.nl
Abstract. A method to extract and track the position of a guide wire during endovascular interventions under X-ray fluoroscopy is presented and evaluated. The method can be used to improve guide wire visualization in the low-quality fluoroscopic images and to estimate the position of the guide wire in world coordinates. A two-step procedure is utilized to track the guide wire in subsequent frames. First, a rough estimate of the displacement is obtained using a template matching procedure. Subsequently, the position of the guide wire is determined by fitting a spline to a feature image in which line-like structures are enhanced. In the optimization step, the influence of the scale at which the feature is calculated is investigated. Also, the feature image is calculated both on the original image and on a preprocessed image in which coherent structures are enhanced. Finally, the influence of explicit endpoint detection is studied. The method is evaluated on 267 frames from 10 sequences. Using the automatic method, the guide wire could be tracked in 96% of the frames, with greater accuracy than the three observers. Endpoint detection improved the accuracy of the tip assessment to better than 1.3 mm.
1 Introduction
Endovascular interventions are rapidly advancing as an alternative to invasive classical vascular surgery. During these interventions a guide wire is inserted into the groin and advanced under fluoroscopic guidance. Accurate positioning of the guide wire with respect to the vasculature is a prerequisite for a successful procedure. Especially during neuro-interventions, positioning the guide wire correctly is difficult because of the complexity of the vasculature and the narrowness of the blood vessels, causing an increase in intervention time and radiation exposure. In this paper an automated method for guide wire tracking during endovascular interventions is considered. The method can be used to improve visualization of the guide wire, potentially enabling a reduction in radiation exposure. It can also be used to detect the position of the guide wire in world coordinates, which enables registration with preoperatively acquired images, so as to provide a navigation tool for radiologists.
There is relatively little literature on tracking guide wires in 2D fluoroscopy images. The possibility of using guide wire tracking to extract information regarding myocardial function is evaluated in [6]; however, tracking was only performed in a single frame and not over time. Other research has been directed towards active tracking of guide wires and catheters to control their position inside the human body using external devices [9,11], or towards reconstructing 3D catheter paths [2]. There has been a considerable amount of work on the enhancement and extraction of curved line structures. In medical imaging, it is used to extract anatomical features such as (centerlines of) blood vessels, e.g. [4,5,7,10]. In this paper, a multiscale method to extract and track guide wires using a spline minimization approach in a feature image is presented. The influence of the scale at which the feature is determined, the use of coherence enhancing diffusion as a preprocessing step, and specific endpoint detection are studied. The proposed method has been validated by comparing the results to tracings obtained by three observers.
2 Methods
In order to represent the guide wire, a spline parameterization is used. For all experiments in this paper, we used a third order B-spline curve. To determine the position of the spline in frame n + 1 when the position in frame n is known, a two-step procedure is introduced. First, a rigid translation is determined to capture the rough displacement of the spline. Next, a spline optimization procedure is performed in which the spline is allowed to deform, for accurate localization of the guide wire. These steps can be understood as a coarse-to-fine strategy, where the first step ensures a sufficiently good initialization for the spline optimization.
2.1 Rigid Transformation
In order to obtain a first rough estimate of the displacement, a binary template is constructed based on the position in the present frame. The best location of this template in the new frame is obtained by determining the highest cross correlation of the template within a search region of this image (or features derived from it, see Section 2.3). Only rigid translations are considered in this step; a minimal sketch follows.
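A minimal sketch of this step, assuming the feature image of the next frame and the binary template from the current frame are available as arrays; OpenCV's normalized cross-correlation stands in for the correlation measure, and the search-region size is an assumed parameter.

```python
# Rough-displacement estimate via template matching in a restricted region.
import cv2
import numpy as np

def rough_displacement(feature_next, template, prev_pos, search=40):
    """Return the integer translation (dx, dy) maximizing cross-correlation."""
    x, y = prev_pos                        # top-left of template in frame n
    h, w = template.shape
    x0, y0 = max(0, x - search), max(0, y - search)
    region = feature_next[y0:y0 + h + 2 * search, x0:x0 + w + 2 * search]
    score = cv2.matchTemplate(region.astype(np.float32),
                              template.astype(np.float32),
                              cv2.TM_CCORR_NORMED)
    _, _, _, max_loc = cv2.minMaxLoc(score)
    return (x0 + max_loc[0] - x, y0 + max_loc[1] - y)
```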
2.2 Spline Optimization
After performing the rigid translation, the spline is optimized under internal and external forces. The internal constraints are related to the geometry of the curve and influence the length (first derivative of the B-spline) and the bending (second derivative of the spline). The parameters for the curvature are set sufficiently large to avoid strange shapes of the spline and sufficiently small to ensure that the internal forces have only a small influence on the total spline energy. For the external forces, the image intensity or a feature image derived
from it (see Section 2.3) is used. The spline contains four or five control points and one hundred sample points. The spline is optimized using Powell's direction set method [8]. In order to enforce a minimum length of the spline, the energy E is defined cumulatively for lengths smaller than L, and relatively for larger lengths:

\[ E = \frac{\int_0^l E(s)\,ds}{\max(l, L)} \tag{1} \]

Here l is the length of the spline. In all experiments in this paper a minimum length of L = 60 pixels is used.
2.3 External Image Force
Using the original images for the matching and optimization steps, the guide wire cannot be tracked effectively, due to the presence of other objects in the image and/or the low signal-to-noise ratio of the images. Therefore a filter which enhances line-like structures of the correct orientation is considered. Also, the use of coherence enhancing diffusion as a preprocessing step, to reduce noise while maintaining line-like structures, is evaluated.

Coherence-Enhancing Diffusion. To reduce the noise in the fluoroscopic images a nonlinear diffusion technique is used, in which coherent flow-like textures are enhanced. The diffusion equation is given by

\[ \partial_t I(x; t) = \nabla \cdot \left( D \nabla I(x; t) \right) \tag{2} \]

where D denotes a diffusion tensor that can be chosen such that coherent structures are enhanced [12]. This diffusion tensor depends on the structure tensor M, given by

\[ M = \nabla I(x; \tau)\, \nabla I(x; \tau)^T, \tag{3} \]

with eigenvalues \(\mu_1\) and \(\mu_2\) (\(\mu_1 \ge \mu_2\)) and the corresponding orthonormal eigenvectors \(v_1\) and \(v_2\). The gradient is computed at scale \(\sigma_n = \sqrt{2\tau}\), \(\tau > 0\). Using diffusion based on the structure tensor, not only the amount but also the direction of diffusion can be regulated. Smoothing along the coherence direction \(v_2\) with a diffusivity \(\lambda_2\) which increases with the coherence \((\mu_1 - \mu_2)^2\) gives an enhancement of the coherent structures in an image. This is achieved by constructing D from the following system of orthonormal eigenvectors

\[ v_1 \parallel \nabla I(x; \tau), \qquad v_2 \perp \nabla I(x; \tau), \tag{4, 5} \]

and eigenvalues

\[ \lambda_1 = \alpha, \tag{6} \]
\[ \lambda_2 = \begin{cases} \alpha & \text{if } \mu_1 = \mu_2, \\ \alpha + (1 - \alpha)\exp\!\left( \dfrac{-C}{(\mu_1 - \mu_2)^2} \right) & \text{else,} \end{cases} \tag{7} \]
with C ≥ 0 and α ∈ (0, 1) which keeps D uniformly positive definite. Figure 1 shows an example of a frame preprocessed using coherence enhancing diffusion.
Fig. 1. From left to right: The original image and image preprocessed using coherence enhancing diffusion with t = 5, t = 20 and t = 100.
Feature Image. To determine the optimal spline position, a feature image is derived in which line-like structures are enhanced. The feature image is determined both on the original image and on the image preprocessed with coherence enhancing diffusion. Hereto, the eigenvalues \(\lambda_1, \lambda_2\) of the Hessian matrix calculated at scale \(\sigma\) are considered:

\[ \lambda_{1,2}(x, \sigma) = \frac{1}{2} \left( I_{xx} + I_{yy} \pm \sqrt{(I_{xx} - I_{yy})^2 + 4 I_{xy}^2} \right) \tag{8} \]

where \(I_{xy}\) represents the convolution with the scaled Gaussian derivative. On line-like structures the largest absolute eigenvalue \(\lambda_1\) has a large output. Since we are interested in dark elongated structures on a brighter background, only positive values of \(\lambda_1\) are considered; pixels with negative values of \(\lambda_1\) are set to zero. The feature image is subsequently constructed by inverting this image, since the optimization is based on a minimum cost approach. To effectively attract the guide wire only to line structures with similar orientation, we also use directional information in the optimization scheme. Hereto, the inner product between the spline and the orientation of the feature is used, given by

\[ O(\hat{x}_i) = \lambda_1 \left( \hat{e}_2 \cdot \hat{x}_i \right) \tag{9} \]

where \(\hat{e}_2\) is the normalized eigenvector corresponding to \(\lambda_2\) and \(\hat{x}_i\) is the normalized first derivative of the spline at sample point i. To be sensitive to guide wires of different width, and to reduce sensitivity to noise, the feature image can be calculated at multiple scales \(\sigma\). This can also be used to enable a coarse-to-fine optimization strategy.
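As an illustration of the feature computation in (8) at a single scale (a sketch, not the authors' code; it follows the stated convention of keeping only positive λ1 and inverting for minimum-cost optimization):

```python
# Line-enhancement feature from Hessian eigenvalues at scale sigma.
import numpy as np
from scipy.ndimage import gaussian_filter

def line_feature(img, sigma):
    Ixx = gaussian_filter(img, sigma, order=(0, 2))   # d2/dx2 (axis 1)
    Iyy = gaussian_filter(img, sigma, order=(2, 0))   # d2/dy2 (axis 0)
    Ixy = gaussian_filter(img, sigma, order=(1, 1))
    root = np.sqrt((Ixx - Iyy) ** 2 + 4.0 * Ixy ** 2)
    lam1 = 0.5 * (Ixx + Iyy + root)          # largest eigenvalue (Eq. 8)
    lam1 = np.where(lam1 > 0, lam1, 0.0)     # keep dark lines on bright background
    return -lam1                             # invert: minimum-cost feature image
```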
2.4 Explicit Endpoint Detection
After the fitting procedure, the endpoint of the spline is not necessarily positioned on the endpoint of the guide wire. In order to determine the endpoint,
the length of the guide wire is increased at the tip by setting L equal to l + ΔL (see Equation 1), while fixing the tail position. This procedure is carried out iteratively, such that the endpoint of the spline is advanced beyond the endpoint of the guide wire. From the final spline position, a graph is constructed presenting the likeliness P(i) of each sample point i on the spline to represent the guide wire endpoint. Two criteria are used to determine this likeliness, viz. the proximity to the previous endpoint position and the derivative of the feature image at i along the spline:

\[ P(i) = \exp\!\left( -\left( \frac{\|\Delta x_{n(i),n-1}\| - \|\Delta x_{n-1,n-2}\|}{\sigma_{prox}} \right)^{2} \right) \cdot \nabla F(i)(\sigma_{grad}) \tag{10} \]

where the first term compares the displacement \(\Delta x_{n(i),n-1}\) of a candidate endpoint i in the current frame with the displacement \(\Delta x_{n-1,n-2}\) in the previous frame, favoring similar displacements using a Gaussian weighting function with standard deviation \(\sigma_{prox}\). The value of \(\sigma_{prox}\) has been obtained from an analysis of the changes in displacements observed in a large number of image sequences. \(\nabla F\) represents the gradient of the feature image. In order to be robust to noise, a coarse-to-fine approach is used: first, at a large scale \(\sigma_{grad}\), the gradient maximum is determined, whereas precise localization is achieved at smaller scales.
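A sketch of evaluating this likeliness along the spline samples, under the reconstruction of (10) given above (the exact weighting in the original is the authors' choice; inputs are assumed to be precomputed per-sample quantities):

```python
# Endpoint-likeliness along the extended spline (reconstruction of Eq. 10).
import numpy as np

def endpoint_likeliness(disp_cur, disp_prev, grad_feature, sigma_prox=7.0):
    """disp_cur[i]: |dx_{n(i),n-1}| per sample; disp_prev: |dx_{n-1,n-2}|
    (scalar); grad_feature[i]: feature-image derivative along the spline."""
    prox = np.exp(-((disp_cur - disp_prev) / sigma_prox) ** 2)
    return prox * grad_feature

# The endpoint estimate is the sample point maximizing P(i):
# i_end = int(np.argmax(endpoint_likeliness(disp_cur, disp_prev, grad)))
```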
3 Evaluation
The method was applied to ten image sequences, with sequence lengths between 14 and 50 frames and a total of 267 frames. The image series were acquired on H5000, H3000, and BV5000 X-ray fluoroscopy systems (Philips Medical Systems, Best, the Netherlands). Only J-tipped guide wires were used during the interventions. To evaluate the automatic method, a golden standard was constructed for every dataset. To this end, three observers manually outlined the guide wire in every image. This process was repeated after two weeks to limit the dependence between tracings. These six manually obtained paths are averaged to determine the average observer path C ("golden standard"), to which all individual paths are compared. Intra-observer variability can then be measured, and inter-observer variability is defined as the distance between individual tracings of observers and the golden standard. Likewise, to determine the accuracy of the automated method, the distance between the determined spline and the golden standard is measured. A method to construct the average path C and a definition of the distance between two paths are required; more details can be found in [1].
4 Results
The performance of the method is evaluated on feature images calculated on both the original images and images preprocessed with coherence enhancing diffusion.
Also the influence of explicit endpoint detection is investigated. An example of the tracking results is shown in Figure 2. All parameters in the experiments were kept fixed for all image sequences. Coherence enhancing diffusion was applied with C = 1, α = 0.001, and the evolution was stopped at t = 20.
Fig. 2. An image sequence of the thorax with in white the parametrized spline representing the guide wire. The method was applied on the feature image in which the eigenvalues of the Hessian matrix were calculated with σ = 1.5.
The feature enhancement step is carried out at three different scales (σ = 1, 1.5, and 3 pixel units), which implies that a total of twelve possibilities are investigated. For the endpoint detection two scales are used (σgrad = 3 and 1.5 pixels) and σprox is set to 7 pixels. Table 1 shows the intra- and inter-observer variability and the results of the method without coherence enhancing diffusion as preprocessing step. The mean distance is the average distance between the corresponding parts of the splines, as described in the previous section. The tip distance is the distance between the endpoint of the golden standard and the endpoint of the automatically determined spline. Between brackets, the maximum distance is given. It can be observed from Table 1 that the best results are obtained at a scale of σ = 1.5 pixel units. With these settings, the mean error of the automatic method (0.92 pixels) is smaller than the inter-observer variability (1.04 pixels). Moreover, the method only requires initialization in one frame, and reproducibility was better than intra-observer variability. Owing to motion blur, the guide wire can sometimes become invisible in a number of sequences, which causes the spline to be incorrectly placed. At a scale of σ = 1.5 pixel units the number of outliers in our evaluation was ten frames (out of 267), occurring in four sequences (out of ten). These failures appeared mostly in a single frame; without manual intervention, the guide wire was tracked correctly in the subsequent frames. For the tip distance we can observe that the intra- and inter-observer variability (1.09 and 1.46 pixels, respectively) is smaller than the error of the automatic tracking method (4.38 pixels) for σ = 1.5. This error improved to 3.07 pixels by applying explicit endpoint detection; however, the mean distance increased slightly in this case. Since the pixel size is approximately 0.4 mm, the tip error is smaller than 1.3 millimeters.
Table 1. The mean intra- and inter-observer variability and the mean result of the automatic method using the Hessian feature filter, with and without specific endpoint detection.

                 Mean distance   Tip distance    Mean distance    Tip distance
                 [pixels]        [pixels]        with Endpoint    with Endpoint
Intra observer   0.66 [1.46]     1.09 [1.77]     -                -
Inter observer   1.04 [2.19]     1.46 [2.44]     -                -
σ = 1.0          1.26 [3.10]     5.81 [9.35]     1.22 [2.15]      4.20 [8.56]
σ = 1.5          0.92 [1.49]     4.38 [8.20]     1.13 [2.02]      3.07 [4.20]
σ = 3.0          1.05 [1.47]     5.70 [10.92]    1.29 [2.15]      3.80 [5.77]
Table 2 shows the results obtained using the feature image calculated with the coherence enhancing diffusion method as a preprocessing step prior to enhancing line-like structures. We can observe that the distance between the automatically determined spline and the golden standard is smaller than the inter-observer variability for all three scales. The total number of outliers for this method was 10 frames out of 267 for σ = 1.5. The failures appeared mostly in a single frame, for example due to motion blur in the image sequence. Results did not degrade significantly when altering the scale in the guide wire enhancement step. Specific endpoint detection improved the error for the tip distance, but it slightly increased the mean distance.

Table 2. The mean result of the automatic method using the coherence enhancing diffusion filter, with and without specific endpoint detection.

          Mean distance   Tip distance    Mean distance    Tip distance
          [pixels]        [pixels]        with Endpoint    with Endpoint
σ = 1.0   1.04 [1.74]     4.95 [12.50]    1.31 [2.32]      3.60 [5.88]
σ = 1.5   0.96 [1.67]     4.33 [6.18]     1.13 [2.16]      3.26 [5.26]
σ = 3.0   0.92 [1.50]     5.33 [11.89]    1.13 [2.15]      4.01 [7.48]
5 Discussion
A method has been developed to track the guide wire automatically in fluoroscopically guided interventions. During these interventions, 12.5 frames per second are acquired, so manual outlining is not an option. The method is based on a spline optimization in an image in which line-like structures with the correct orientation are enhanced, with or without coherence enhancing diffusion as a preprocessing
step and with or without explicit endpoint detection. In order to assess whether the proposed method is sufficiently accurate, tracings of observers were acquired in 267 frames. Both with and without preprocessing, the accuracy of spline localization was better than the inter-observer variability, and the method detected the guide wire correctly in 96% of the frames. Outliers occurred mainly in one single frame, owing to motion blur. Given the high temporal resolution (12.5 frames/second), missing one frame does not hamper the interventional radiologist; in fact, in most of these frames the guide wire is not visible to the radiologist either. The tip of the guide wire could be localized with an accuracy of approximately 1.3 mm using explicit endpoint localization. Whereas the use of coherence enhancing diffusion did not significantly improve the accuracy if the proper scale was used for enhancing line-like structures, the results were more robust with respect to changes of the scale parameter. This indicates that coherence enhancing diffusion is useful for robustness in clinical practice.
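For reference, the evaluation metrics used above (mean, maximum, and tip distance between the automatic spline and the gold-standard tracing) can be computed along the following lines, assuming both splines have been resampled at corresponding positions; the resampling step and the helper name are our additions, not part of the paper.

```python
import numpy as np

def spline_errors(auto_pts, ref_pts, pixel_size_mm=0.4):
    """Mean, maximum, and tip distance between an automatically
    tracked spline and a gold-standard tracing.

    auto_pts, ref_pts : (N, 2) arrays of corresponding sample points
                        in pixel coordinates, with the tip last.
    """
    d = np.linalg.norm(auto_pts - ref_pts, axis=1)   # per-sample distance
    tip_px = d[-1]                                   # distance at the tip
    return {"mean_px": d.mean(),
            "max_px": d.max(),
            "tip_px": tip_px,
            "tip_mm": tip_px * pixel_size_mm}        # ~0.4 mm/pixel per the paper
```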
Specification Method of Surface Measurement for Surgical Navigation: Ridgeline Based Organ Registration

Naomichi Furushiro¹, Tomoharu Saito², Yoshitaka Masutani³, and Ichiro Sakuma²

¹ Department of Precision Engineering, Graduate School of Engineering, the University of Tokyo, Hongo 7-3-1, 113-8656 Bunkyo-ku, Tokyo, Japan
[email protected]
² Institute of Environmental Studies, Graduate School of Frontier Sciences, the University of Tokyo, Hongo 7-3-1, 113-0033 Bunkyo-ku, Tokyo, Japan
{tomoharu,sakuma}@miki.pe.u-tokyo.ac.jp
³ Department of Radiology, Faculty of Medicine, the University of Tokyo, Hongo 7-3-1, 113-0033 Bunkyo-ku, Tokyo, Japan
[email protected]

Abstract. Surgical navigation for abdominal organs faces difficulties, such as dynamic deformation, that other organs (e.g., brain, bone) do not present to the same degree. Organ deformation prevents surgical navigators from performing accurate navigation based on preoperative information. We are studying a method for deforming preoperative organ models so that the models match intraoperative shapes. The method is based on the ICP (iterative closest point) algorithm and a modal representation of shape deformation. In this paper, we describe preliminary experiments on rigid parameter estimation within the entire registration process, using range data and a surface model reconstructed from X-ray CT of a liver phantom.
1 Introduction
One of the difficulties in surgical navigation based on preoperative image information is intraoperative deformation of organs consisting of soft tissue. In neurosurgery, for instance, "brain-shift" is recognized as a major source of navigation errors. A simulation of brain deformation based on the finite element method (FEM) was reported by Ferrant [1] for the reduction of errors in neurosurgical navigation. In hepatic surgery, however, the liver deforms more dynamically owing to patient respiration, posture changes, and surgical manipulation. Such intraoperative liver deformation involves so-called large displacement, and therefore requires much higher computational cost for numerical simulation based on non-linear FEM. Cotin [2] and Picinbono [3] reported the development of surgical simulators based on fast computation techniques for non-linear FEM to show realistic liver deformation with large displacement. For our purpose of surgical navigation, however, the relationship between force and displacement is not important; what is indispensable for surgical guidance is deformation of preoperative models for registration to the intraoperative shape. Herline [4] reported rigid liver surface registration for such a purpose. In the literature, many methods for describing deformation can be found, including parametric descriptions for non-rigid tracking of objects and for animation [5-8]. In such parametric
descriptions, generally, shape deformation should be represented with fewer parameters for faster operation. Masutani [9] reported a new method of modal representation of liver deformation applied to intra-operative non-rigid registration in image-guided liver surgery. In that work, several experiments with synthetic range data were performed based on error factor analyses. In this paper, we present in detail a new method of rigid-body registration using ridgelines extracted from range images.
2 Materials and Methods

2.1 Liver Phantom and Its Reconstructed 3-D Model
We made a full-scale model of a liver with silicone rubber. Fig. 1 shows the liver phantom model. The size of the phantom is about 210 mm × 180 mm, and its thickness ranges from about 100 mm to 200 mm. We took a series of X-ray CT scans of the phantom and reconstructed the data. Fig. 2 shows the reconstructed 3-D model of the liver phantom.
Fig. 1. A silicone rubber phantom model of a liver. The shape of the liver was segmented from a series of abdominal X-ray CT images
Fig. 2. 3-D reconstructed model of the liver phantom shown in Fig. 1. Shape data of the phantom was acquired by X-ray CT scanning at 0.9 mm slice thickness
2.2 Range Images and Surface Data Acquisition
We use a range sensor based on a space encoding method. The sensor has 0.5 mm accuracy, and its view area is about 200 mm × 200 mm at 300 mm from the sensor. We can capture a photo image and a range image of a view with the same optical axis in one second. Fig. 3 shows a range image of the liver phantom model, and Fig. 4 shows its photo image. The resolution of the sensor is 512 × 240 pixels in the view plane and 8 bits (256 levels) in the depth direction. We also obtain 3-D coordinate values at each point of the view plane. The origin of the coordinate system is 300 mm away from the sensor in the depth direction. Gray values of a range image represent the depth at each point: the origin plane is black, and the nearer a point, the brighter it becomes.
2.3 Ridgeline Extraction from Range Images
Fig. 5 shows the conceptual diagram of our system. Ridgeline extraction proceeds as follows.
Fig. 3. A range image of the phantom obtained with the range sensor at a distance of 220 mm
Fig. 4. A photo image of the phantom taken at the same time as the range image (Fig. 3)
Fig. 5. Conceptual diagram of the system
Preparation of Range Images. We apply a Gaussian filter to the range images to reduce noise and smooth the images:

$$g(x, y; t) = \frac{1}{2\pi t}\exp\left(-\frac{x^2 + y^2}{2t}\right) \qquad (1)$$

where t represents the radius of the Gaussian filter.

Curvature Computation. We compute the curvatures at each point from the gray values of the Gaussian-filtered image. Defining $L_x$ as the first derivative of the gray value L with respect to x, and $L_y$, $L_{xx}$, $L_{xy}$, and $L_{yy}$ in the same manner, the principal curvatures are

$$\kappa_{1,2} = \frac{-(L_{xx} - L_{yy}) \pm \sqrt{(L_{xx} - L_{yy})^2 + 4L_{xy}^2}}{2\left(1 + L_x^2 + L_y^2\right)} \qquad (2)$$
Classification of Shapes. We classify the shape at each point using the principal curvatures. The classification is carried out with the shape index S, which represents the ratio of the principal curvatures and takes values between zero and one:

$$S = \frac{1}{2} + \frac{1}{\pi}\tan^{-1}\frac{\kappa_1 + \kappa_2}{\kappa_1 - \kappa_2} \qquad (3)$$

Table 1 shows how the shape type changes according to the value of S.
Table 1. The classification of shapes by the value of S

| Shape type | S              |
|------------|----------------|
| Pit        | 0 to 0.125     |
| Valley     | 0.125 to 0.375 |
| Saddle     | 0.375 to 0.625 |
| Ridge      | 0.625 to 0.875 |
| Peak       | 0.875 to 1     |
We also compute the curvedness R, which represents the magnitude of the curvature at each point:

$$R = \sqrt{\frac{\kappa_1^2 + \kappa_2^2}{2}} \qquad (4)$$
Ridgeline Extraction. We extract ridge areas by weighting the curvedness R with the shape index S:

$$R' = R \times \exp\left(-\frac{(0.75 - S)^2}{\sigma_r}\right) \qquad (5)$$

Fig. 6 shows the weighting area for R.

Fig. 6. Weighting factor for the curvedness R along with the shape index S
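Equations (1)–(5) translate almost directly into array operations. The sketch below assumes the range image is a 2D array of depth values, treats t in equation (1) as the Gaussian variance, and approximates the derivatives with finite differences on the smoothed image; these discretization choices are ours, not the paper's.

```python
import numpy as np
from scipy import ndimage

def ridge_measure(range_img, t=7.0, sigma_r=0.01):
    """Weighted curvedness R' of eq. (5) computed from a range image."""
    # Eq. (1): Gaussian smoothing; t is treated here as the filter variance
    L = ndimage.gaussian_filter(range_img.astype(float), np.sqrt(t))

    # First and second derivatives by central differences
    Ly, Lx = np.gradient(L)
    Lyy, _ = np.gradient(Ly)
    Lxy, Lxx = np.gradient(Lx)

    # Eq. (2): principal curvatures, kappa1 >= kappa2
    root = np.sqrt((Lxx - Lyy) ** 2 + 4.0 * Lxy ** 2)
    denom = 2.0 * (1.0 + Lx ** 2 + Ly ** 2)
    k1 = (-(Lxx - Lyy) + root) / denom
    k2 = (-(Lxx - Lyy) - root) / denom

    # Eq. (3): shape index in [0, 1]; arctan2 handles the k1 == k2 case
    S = 0.5 + np.arctan2(k1 + k2, k1 - k2) / np.pi

    # Eq. (4): curvedness (magnitude of curvature)
    R = np.sqrt((k1 ** 2 + k2 ** 2) / 2.0)

    # Eq. (5): emphasize ridge-like points, i.e. S near 0.75
    return R * np.exp(-((0.75 - S) ** 2) / sigma_r)
```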
2.4 Registration Method
We use a curve matching method based on the ICP (Iterative Closest Point) algorithm to register the extracted ridgelines. Before registering the measured surface data with the model data, we compute the ridgelines of the model. Fig. 7 shows the ridgelines of the 3-D reconstructed model. After the ridgeline registration, we register the measured surface points to the surface points of the model data, again with the ICP algorithm.
Fig. 7. Ridge areas of 3-D reconstructed model of the phantom
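Both registration stages rest on the same ICP loop: match each measured point to its closest counterpart, solve for the best rigid motion, and iterate. A minimal point-to-point version is sketched below; the SVD-based rigid fit and the k-d tree search are standard implementation choices, not details taken from the paper.

```python
import numpy as np
from scipy.spatial import cKDTree

def rigid_fit(P, Q):
    """Least-squares rotation R and translation t with R @ P_i + t ≈ Q_i."""
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])  # avoid reflection
    R = Vt.T @ D @ U.T
    return R, cq - R @ cp

def icp(src, dst, iters=50, tol=1e-6):
    """Align measured points `src` (N x 3) to model points `dst` (M x 3)."""
    tree = cKDTree(dst)
    cur = src.copy()
    prev_rms = np.inf
    for _ in range(iters):
        d, idx = tree.query(cur)              # closest model point per source point
        R, t = rigid_fit(cur, dst[idx])       # best rigid motion for this matching
        cur = cur @ R.T + t
        rms = np.sqrt(np.mean(d ** 2))
        if abs(prev_rms - rms) < tol:         # converged
            break
        prev_rms = rms
    return cur, rms
```

Run first with the ridgeline points and then with the full surface points to mirror the paper's coarse-to-fine two-stage registration.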
3 Results

3.1 Ridgeline Extraction from Range Images
We extracted a ridgeline from the range image shown in Fig. 3. The ridgeline lies along the front side of the liver phantom model. The extraction parameters are the Gaussian filter radius t in equation (1), set to 7, and the weighting width σr in equation (5), set to 0.01. Fig. 8 shows the extracted ridgeline image.
Fig. 8. A ridgeline extracted from the range image of Fig. 3
Fig. 9. Result of ridgeline registration. (a) Before registration. (b) After registration
3.2 Ridgeline Registration Result
Before the ridgeline registration, the RMS error between the measured surface point data and the surface data of the model was 36.83 mm. After the ridgeline registration, the RMS error became 10.24 mm. Fig. 9 shows the locations of the measured data and the model data.
3.3 Point Based Registration
After the ridgeline registration, we apply the ICP algorithm to the measured point data to register it to the surface of the model. The number of measured points was 1528, and the number of model points was 5479. The algorithm converged at the 11th iteration, and the final RMS error was 1.68 mm. Fig. 10 shows the final location of the measured surface and the model.
Fig. 10. Registered range data of the phantom with 3D reconstructed model of the phantom
4 Discussion
One of the most important properties of our method is that the system is independent of particular landmark points on the organ. To achieve the registration, we first perform ridgeline-based posture estimation. The method is applicable not only to celiotomies but also to endoscopic surgery. Our purposes for developing a surgical navigation system include the guidance of surgical robots. One of the potential advantages of such robotic surgery is that surgical operations can be carried out with minimal deformation of the organs. Therefore, robotic surgery with navigational information based on our registration method is expected to enable more precise and minimally invasive surgery.
5 Summary
For intra-operative rigid registration in image-guided liver surgery, a new method for surface-measurement-based registration was proposed. Using a liver phantom model, the registration error for a frontal displacement was evaluated. Toward a feasibility study in a clinical environment, studies using a deformed phantom and the collection of intraoperative data are currently in progress.
Acknowledgements This work is a part of the research project: development of robotic surgery system in Research for the Future Program of Japan Society for Promotion of Science (JSPS) and is financially supported by JSPS. The authors are grateful to Dr. Makoto Hashidume for clinical advice, and to all the project members for the inspiring discussion, and to Dr. Shigeru Nawano in the eastern hospital of National Institute of Cancer Center – Japan, for providing the patient CT data set used in this study.
References
1. M. Ferrant, et al.: Registration of 3D Intraoperative MR Images of the Brain Using a Finite Element Biomechanical Model. Proc. of MICCAI 2000, pp. 19-27, 2000.
2. S. Cotin, et al.: Real-Time Elastic Deformations of Soft Tissues for Surgery Simulation. IEEE Trans. on Visualization and Computer Graphics, vol. 5, no. 1, pp. 62-73, 1999.
3. G. Picinbono, et al.: Non-Linear Anisotropic Elasticity for Real-Time Surgery Simulation. INRIA Tech. Rep. No. 4028, 2000.
4. J. Herline, et al.: Surface Registration for Use in Interactive Image-Guided Liver Surgery. Proc. of MICCAI 1999, 1999.
5. D. Terzopoulos, et al.: Dynamic 3D Models with Local and Global Deformations: Deformable Superquadrics. IEEE Trans. on PAMI, vol. 13, no. 7, pp. 703-714, 1991.
6. S. Sclaroff, et al.: Modal Matching for Correspondence and Recognition. Boston U. Tech. Rep. TR95-008, 1996.
7. G. Szekely, et al.: Segmentation of 2-D and 3-D objects from MRI volume data using constrained elastic deformations of flexible Fourier contour and surface models. Medical Image Analysis, vol. 1, no. 1, pp. 19-34, 1996.
8. G. C. H. Chuang, et al.: Wavelet Descriptor of Planar Curves: Theory and Applications. IEEE Trans. on Image Processing, vol. 5, no. 1, pp. 56-70, 1996.
9. Y. Masutani and F. Kimura: A New Modal Representation of Liver Deformation for Non-Rigid Registration in Image-Guided Surgery. Proc. of CARS 2001, pp. 19-24, 2001.
An Augmented Reality Navigation System with a Single-Camera Tracker: System Design and Needle Biopsy Phantom Trial

F. Sauer, A. Khamene, and S. Vogt

Imaging & Visualization Dept., Siemens Corporate Research, 755 College Road East, Princeton, NJ 08540, USA
{sauer,khamene,vogt}@scr.siemens.com
Abstract. We extended a system for augmented reality visualization to include the capability for instrument tracking. The original system is based on a video-see-through head-mounted display and features single-camera tracking. The tracking camera is head-mounted, rigidly fixed to a stereo pair of cameras that provide a live video view of a workspace. The tracker camera includes an infrared illuminator and works in conjunction with a set of retroreflective markers that are placed around the workspace. This marker frame configuration delivers excellent pose information for a stable overlay of graphics onto the video images. Using the single camera also for instrument tracking with relatively small marker clusters, however, encounters problems of marker identification and of noise in the pose data. We present a multilevel planar marker design, which we used to build a needle placement phantom. In this phantom, we achieved a stable augmentation; the user can see the location of the hidden target and the needle without perceptible jitter of the overlaid graphics. Piercing the needle through a foam window and hitting the target is then intuitive and comfortable. Over a hundred users have tested the system and were consistently able to correctly place the needle on the 6 mm target without prior training.
1 Introduction
Image guidance systems help the physician to establish a mapping between a patient's medical images and the physical body. In conventional systems, a pointer or an instrument is tracked and its location visualized in the medical images. In contrast, augmented reality (AR) image guidance maps the medical data onto the patient's body. Anatomical structures are perceived in the location where they actually are – the patient becomes transparent to the physician. Augmented reality for medical applications was first suggested in [1], and various groups have since been working on realizing augmented reality systems, based on overlaying graphics onto video streams [2-5], on "injecting" graphics overlays into operating microscopes [6-8], or simply on semitransparent graphics display configurations through which the user observes the real world. Reference [9] compares some of these efforts. We built an AR system [10] that makes use of a video-see-through head-mounted display (HMD) similar to the one described in [1]. Two miniature color video
cameras are mounted on the HMD as the user's artificial eyes. The two live video streams are augmented with computer graphics and displayed on the HMD's two screens in realtime. With the HMD, the user can move around and explore the augmented scene from a variety of viewpoints. The user's spatial perception is based on stereo depth cues, and also on the kinetic depth cues that he receives with the viewpoint variations. Our system has been put into a neurosurgical context [11], adapted to an interventional MRI operating room [12,13], and has also been integrated with an ultrasound scanner [14,15]. Tracking is an essential enabling technology both for conventional and AR navigation systems. Commercial systems employ either optical or magnetic trackers. Optical trackers achieve a higher accuracy, with the requirement of an unobstructed line-of-sight between the tracker camera and the tracked markers. The commercial optical tracking systems are all multi-camera systems: they find the 2D marker locations in the cameras' images and determine their 3D locations by triangulation. The most popular optical tracking system in the medical arena is a stereo camera system. Our AR system's special feature is the use of single-camera tracking with a head-mounted tracking camera, which is rigidly attached to the two cameras that capture the stereo view of the scene. Originally we used this tracking camera only in combination with a set of markers framing a workspace. In the current paper, we describe how we extended our single-camera tracking to include instrument tracking with marker clusters. We achieved stable tracking with a cluster that extends over only a small area in the tracker camera's image. We built a needle placement phantom, where we simultaneously track the phantom with a frame of markers and the needle with a cluster of markers. The tracking works in a very stable manner; targets and needle can be visualized graphically in the augmented view without perceivable jitter. More than one hundred users tried the needle experiment and were consistently able to correctly hit a chosen 6 mm target with the needle. No training was required to succeed with the needle placement. The AR guidance was experienced as very intuitive and comfortable. In section 2, we present technical details of our AR system. Section 3 describes the needle placement phantom. The paper then concludes with a summary in section 4.
2 AR System Details

2.1 System Overview
The centerpiece of the system is a head-mounted display that provides the user with the augmented vision. Figures 1 and 2 show how three miniature cameras are rigidly mounted on top of the HMD. A stereo pair of color cameras captures live images of the scene. They are focused to about arm's-length distance and are tilted downward so that the user can keep his head in a comfortable straight pose. The third camera is used for tracking retroreflective markers in the scene. This black-and-white camera is sensitive only to near-infrared wavelengths. It is equipped with a wide-angle lens and a ring-shaped infrared LED flash. The flash is synchronized with the tracking camera and allows us to use a fast electronic shutter speed. The short exposure time of only 0.36 ms efficiently suppresses background light in the tracker camera's images, even when the scene is lit with strong incandescent or halogen lamps.
Mounting the tracking camera on the user's head helps with the line-of-sight restriction of optical tracking; the user cannot step into the tracker camera's way (though he can, of course, still occlude markers with his hands). Placing a forward-looking tracking camera on the head is optimal for the perceived accuracy of the augmentation, as the tracker camera's sensitivity to registration errors is matched to the user's sensitivity to perceiving these errors. Furthermore, this configuration makes good use of the tracker camera's field of view. Tracking is only required when the user actually looks at the workspace, and then the tracker camera is automatically looking at the workspace markers. For this reason, the markers can extend over a sizeable part of the tracker camera's image, yielding good tracking accuracy.
Fig. 1. Video-see-through HMD with mounted tracking camera
Fig. 2. Camera triplet with a stereo pair of cameras to capture the scene and a dedicated tracking camera with infrared LED flash
Display and cameras are connected to two PCs. One SGI 540 processes the tracker camera images and renders the augmented view for the left eye; an SGI 320 renders the augmented view for the right eye. Both PCs communicate over an Ethernet connection to exchange information concerning camera pose, synchronization, and the choice of graphics objects to be used for augmentation. Table 1 lists the particular hardware components that we are using.

Table 1. Hardware Components

| HMD            | Kaiser Proview XL35, XGA resolution, 35° diagonal FOV  |
|----------------|--------------------------------------------------------|
| Scene cameras  | Panasonic GP-KS1000 with 15 mm lens, 30° diagonal FOV  |
| Tracker camera | Sony XC-77RR with 4.8 mm lens, 90° horizontal FOV      |
| Computers      | SGI 540 and 320 with Windows 2000                      |

2.2 Single Camera Tracking
We want to render a computer generated 3D object onto a real-world video sequence in a way that the 3D graphics object is accurately aligned with respect to some real object seen in the video sequence. For this, we need to know the relative location and orientation of video camera and objects of interest. Or in other words, we need to
know the relationship between two coordinate systems, one attached to the camera, the other attached to the object. Registration initially establishes this relationship in terms of translation and rotation; tracking denotes the process of keeping track of it. Single camera tracking is possible when the geometry of the tracked object is known and the internal camera parameters have been pre-determined in a calibration procedure. We fabricated objects for camera calibration and for tracking with retroreflective disc-shaped markers. We then base our system calibration on 3D-2D point correspondences. We measured the 3D coordinates of the markers with a commercial stereo system made by the German company A.R.T. GmbH; the 2D positions we determine from the images we take with the camera. We follow Tsai's calibration algorithm [16,17], benefiting from an implementation that is available as freeware at http://www.cs.cmu.edu/~cil/v-source.html. The camera calibration object contains over one hundred markers [10], which allows us to estimate the internal camera parameters with sufficient accuracy. The marker sets for tracking then need to provide us with at least seven point correspondences so that we can calculate the external pose, i.e. translation and rotation, given the camera's internal parameters. For the calibration of our camera triplet (Fig. 2), we determine the internal camera parameters of all three cameras and the relative external pose between the tracker camera and the two scene cameras. In the realtime tracking mode, we then deduce the pose parameters of the two scene cameras from the measured pose of the tracking camera, which allows us to augment the scene camera images with correctly registered graphics.
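The paper follows Tsai's algorithm; for readers reproducing the tracking step with current tools, the external pose can equivalently be obtained from the 3D–2D correspondences with a PnP solver such as OpenCV's, as sketched below. The marker coordinates and camera intrinsics here are illustrative placeholders, not the actual system's values.

```python
import numpy as np
import cv2

# 3D marker coordinates in the marker-body frame (illustrative values, metres)
object_pts = np.array([[ 0.00,  0.000,  0.000],
                       [ 0.04,  0.000,  0.012],
                       [ 0.02,  0.035, -0.012],
                       [-0.02,  0.035,  0.018],
                       [-0.04,  0.000,  0.000],
                       [-0.02, -0.035,  0.012],
                       [ 0.02, -0.035, -0.012]], dtype=np.float64)

# Pre-calibrated intrinsics of the tracking camera (illustrative)
K = np.array([[700.0,   0.0, 320.0],
              [  0.0, 700.0, 240.0],
              [  0.0,   0.0,   1.0]])
dist = np.zeros(5)  # assume lens distortion has been corrected beforehand

def camera_pose(image_pts):
    """External pose (rotation, translation) of the marker body relative
    to the tracking camera, from its 2D detections (N x 2, float64)."""
    ok, rvec, tvec = cv2.solvePnP(object_pts, image_pts, K, dist,
                                  flags=cv2.SOLVEPNP_ITERATIVE)
    R, _ = cv2.Rodrigues(rvec)   # rotation vector -> 3x3 rotation matrix
    return ok, R, tvec
```

Seven correspondences, as required by the system described here, comfortably over-determine the six pose parameters, which is what makes the estimate stable against detection noise.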
2.3 Marker Configuration Design
In our original tabletop system [10], we placed seventeen markers around a workspace. The markers all lay in the same plane, framing the workspace in three straight lines on the sides and on the top. This marker configuration provided very stable pose estimation in conjunction with the head-mounted tracking camera; the augmented views did not show any perceivable jitter. One main reason for the good results was that the marker frame extended over a large part of the tracking camera's image, providing very good leverage for precise pose estimation. A subsequent system was designed for a neurosurgical setting, with a curved frame of markers fitted onto a head clamp [12,13]. In the first version, the marker locations on this frame were still all coplanar. The resulting pose estimation was in general also still very good; for some viewpoints, however, some jitter could now be perceived. We assume that the slight performance deterioration was at least partially due to the reduced number of markers. We added two markers on little posts, sticking out of the plane. This increased the number of markers and, at the same time, turned the coplanar marker configuration into a 3D marker configuration. Now the tracking was again "perfect", i.e. we could again not perceive any jitter in the augmented views. We used the same marker frame design for the needle placement phantom that we describe in section 3. Fig. 3 shows a photo of this marker frame. We designed our marker configurations mainly based on heuristic reasoning, not on strict mathematical simulation. One relationship seemed obvious: the larger the extent of the marker body in the camera image, the more precise the result of the pose determination with regard to rotation. Large marker configurations are fine as
workspace frames. For instrument tracking, however, large marker configurations are not practical. We want to use small marker clusters, which do not get in the way when handling the instrument, and which we can keep apart from the markers that frame the workspace. We found that we do not obtain stable pose estimation from small clusters when the markers are distributed in a coplanar fashion. For a reliable estimation of the rotation, we need to distribute the markers in 3D. Fig. 4 shows a biopsy needle, and attached to it a marker cluster design that we found to be efficient: it provides good pose results and is simple to fabricate at the same time. Flat disc-shaped markers are arranged in a multilevel planar fashion. For a given lateral extent of the marker body, there is a trade-off between its depth extent and the range of viewing angles for which the markers are seen as separate entities in the tracking camera's image. Therefore, one wants to spread the markers out evenly. In our design, one marker is placed in the center, and the other markers are arranged on a circle around it. The marker body shown in Fig. 4 measures about 8 cm in diameter and is built from 6 mm thick material. The markers are arranged on several depth levels: the central marker sits two levels (1.2 cm) below the main level; three of the peripheral markers are placed two respectively three levels (1.2 cm and 1.8 cm) above the main level. "High" and "low" markers are mostly alternated in neighboring positions. The tracking camera can reliably locate the individual markers while the marker body is tilted within an angle range of about 45° from the normal direction (i.e. the direction where the marker body directly faces the camera). As can be seen in Fig. 4, we attach the marker body to the needle in a tilted way, so that the markers look towards the head-mounted tracking camera when the user holds the needle in a comfortable standard position.
Fig. 3. Marker frame
Fig. 4. Multilevel planar marker cluster attached to biopsy needle

2.4 System Performance
Our AR video system runs at the full standard video rate of 30 frames per second. We synchronize video and graphics, eliminating any time lag between the real and the virtual objects. The virtual objects do not lag behind, nor does one see them swim or jitter with respect to the real scene. As the augmented view shows the graphics firmly anchored in the real scene, the user can assess the information in a comfortable way. Overall, there is a time delay of about 0.1 seconds between an actual event and its display to the user.
We measured the overlay accuracy of our original system. Evaluating a set of augmented video images, we found the mismatch between calibration marks and their overlaid graphical counterparts to be typically smaller than 1 mm in object space, going up to 2 mm at the edges of the images. We do not have measurements for the needle placement configuration described in the present paper, but expect the accuracy to be in the same range. This is supported by simple visual inspection of the real needle as it appears in the video image and the virtual needle that is overlaid onto it. There is no apparent jitter in the overlay, so that such accuracy estimation can be performed with ease.
3 Needle Placement Phantom

3.1 Design
For a needle placement experiment, we designed a box with a set of mechanical pushbuttons. The pushbuttons are like small pistons (Fig. 5) with a head diameter of 6 mm. Pushing down a piston in turn depresses a key of an underlying USB keypad. The keypad is connected to the computer and allows us to provide feedback to the user when he or she correctly places the needle onto one of the piston targets. The targets are accessible through a round window on the slanted top face of the box (Fig. 6). A foam pad covers the window to hide the targets from the user’s direct view. We chose a 5 cm thickness for the foam pad so that it provides mechanical resistance to the needle insertion. The targets lie about 7 cm below the top surface of the foam pad. Fig. 7 shows the box for the needle placement experiment with the foam pad in place. It is sitting on a platform with a marker frame that contains seven coplanar markers on a half circle plus two additional ones that stick out on little posts. We also put retroreflective markers onto the heads of the piston targets. This allowed us to acquire the location of all the targets with respect to the marker frame coordinate system in a single measurement, using our stereo camera system ARTtrack.
Fig. 5. Piston targets for needle placement
Fig. 6. View through window onto targets
3.2 Visualization
We visualize the top surfaces of the targets as flat discs. We surround each virtual target disc with a ring, rendered as a shaded torus; this torus helps with the 3D perception of the target location. We show the disc-torus target structure in a red color, which switches to green when the needle is pointing towards the target. The needle itself is visualized as a blue wireframe cylinder. A yellow wireframe cylinder marks the extrapolation of the needle path. Observing where the path cylinder intersects the disc target, the user can easily see whether the needle is correctly pointing towards the target; a small geometry sketch of this hit test follows below. Fig. 8 shows an example of an augmented view that guides the user. The needle is already partially inserted through the foam window, positioned about 1 cm above and correctly pointing to one of the five targets shown.
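The hit test behind this color switch can be made concrete as follows: intersect the needle's extrapolated axis with the disc plane and compare the hit point's distance from the disc center to the disc radius. This is a hypothetical helper written for illustration, not the authors' code.

```python
import numpy as np

def needle_hits_target(tip, direction, center, normal, radius=0.003):
    """True if the extrapolated needle path intersects the target disc.

    tip, direction : needle tip position and unit axis direction
    center, normal : disc center and unit plane normal
    radius         : disc radius in metres (6 mm target -> 0.003)
    """
    denom = np.dot(direction, normal)
    if abs(denom) < 1e-9:            # needle parallel to the disc plane
        return False
    s = np.dot(center - tip, normal) / denom
    if s < 0:                        # target lies behind the needle tip
        return False
    hit = tip + s * direction        # intersection with the disc plane
    return np.linalg.norm(hit - center) <= radius
```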
Fig. 7. Phantom box with foam window and frame of markers

Fig. 8. Augmented view for needle guidance

3.3 Needle Placement Experiment
Over one hundred users have tested the needle experiment. We usually slide the cover aside to show the real targets to the user before we hide them again with the foam window. We then explain the needle visualization so that the user understands how to judge the correct orientation and insertion depth of the needle. The user is then asked to perform three needle placements. Initially, three targets are shown. When the user correctly hits one of the targets (i.e. depresses the real piston with the needle), an audio signal sounds and the corresponding virtual target fades away. Consistently, the users were able to hit the targets. In fact, the visualization is so intuitive and the visual feedback so conclusive that one is basically not able to miss once one understands the basic concept. Most people grasped the concept immediately, some after a bit of experimentation; even for the latter group the learning curve was below a minute. A couple of test users with a competitive attitude could successfully and repeatedly perform the three 7 cm deep needle placements at a rate of one per second, after only a few training trials. The most common initial problem the test users had was to hold the needle in a way that the markers face towards the head-mounted tracking camera. In our opinion, the fact that we can track the marker body only over about ±45° away from the
normal does not really represent a practical limitation for the needle placement. The user just needs to be aware not to turn the needle around its axis away from the tracking camera.
4 Summary and Conclusions
We developed an augmented reality system based on a stereoscopic video-see-through head-mounted display. Looking at the patient, the user can perceive medical images in situ, e.g. see a virtual representation of a tumor in the location of the actual tumor. We extended this original system to include instrument tracking with our head-mounted tracking camera. For this, we designed a marker body in a multilevel planar configuration that provides very stable pose estimation results for single-camera tracking. Making use of the new capability of instrument tracking, we designed a phantom box for a needle placement experiment. The user has to insert a needle through a foam pad and hit an underlying mechanical target. He or she is guided by the stereoscopic video view, which is augmented with a graphics overlay showing the hidden target and the needle, including a forward extrapolation of the needle as an aiming aid. The user sees where the needle path intersects the target and can easily bring the needle into correct alignment. The user interface was experienced as very intuitive, and among a group of over one hundred test users, all were able to consistently succeed with the needle placement. Augmented reality guidance may be especially helpful when the user encounters complex anatomy, where vital structures like nerves or blood vessels have to be avoided while the needle is advanced towards a target such as a tumor. Our system not only gives intuitive access to understanding the 3D geometry of the anatomy, it also provides a comfortable and believable augmented reality experience, where the graphical structures appear firmly anchored in the video scene. They do not jitter or swim, nor do they exhibit any time lag relative to the real objects in the video images. Currently, we are working towards testing the system in a clinical context.
References
1. M. Bajura, H. Fuchs, and R. Ohbuchi, "Merging Virtual Objects with the Real World: Seeing Ultrasound Imagery within the Patient," Proceedings of SIGGRAPH '92 (Chicago, IL, July 26-31, 1992), in Computer Graphics 26, #2 (July 1992): 203-210.
2. Andrei State, Mark A. Livingston, Gentaro Hirota, William F. Garrett, Mary C. Whitton, Henry Fuchs, and Etta D. Pisano, "Technologies for Augmented Reality Systems: Realizing Ultrasound-Guided Needle Biopsies," Proceedings of SIGGRAPH '96 (New Orleans, LA, August 4-9, 1996), in Computer Graphics Proceedings, Annual Conference Series 1996, ACM SIGGRAPH, 439-446.
3. Michael Rosenthal, Andrei State, Joohi Lee, Gentaro Hirota, Jeremy Ackerman, Kurtis Keller, Etta D. Pisano, Michael Jiroutek, Keith Muller, and Henry Fuchs, "Augmented Reality Guidance for Needle Biopsies: A Randomized, Controlled Trial in Phantoms," Proceedings of MICCAI 2001 (Utrecht, The Netherlands, October 14-17, 2001), Lecture Notes in Computer Science 2208, W. Niessen and M. Viergever (Eds.), Springer, Berlin, Heidelberg, New York, pages 240-248.
4. Henry Fuchs, Mark A. Livingston, Ramesh Raskar, D'nardo Colucci, Kurtis Keller, Andrei State, Jessica R. Crawford, Paul Rademacher, Samuel H. Drake, and Anthony A. Meyer, MD, "Augmented Reality Visualization for Laparoscopic Surgery," Proceedings of MICCAI '98 (Cambridge, MA, USA, October 11-13, 1998), 934-943.
5. W. Eric L. Grimson, Ron Kikinis, Ferenc A. Jolesz, and Peter McL. Black, "Image-Guided Surgery," Scientific American, June 1999, 62-69.
6. P.J. Edwards, D.J. Hawkes, D.L.G. Hill, D. Jewell, R. Spink, A. Strong, and M. Gleeson, "Augmentation of Reality in the Stereo Operating Microscope for Otolaryngology and Neurosurgical Guidance," Computer Aided Surgery 1:172-178, 1995.
7. King AP, Edwards PJ, Maurer CR, de Cunha DA, Gaston RP, Clarkson M, Hill DLG, Hawkes DJ, Fenlon MR, Strong AJ, Cox TCS, Gleeson MJ, "Stereo augmented reality in the surgical microscope," Presence: Teleoperators and Virtual Environments 9:360-368, 2000.
8. W. Birkfellner, K. Huber, F. Watzinger, M. Figl, F. Wanschitz, R. Hanel, D. Rafolt, R. Ewers, and H. Bergmann, "Development of the Varioscope AR, a See-through HMD for Computer-Aided Surgery," IEEE and ACM Int. Symp. on Augmented Reality – ISAR 2000 (Munich, Germany, October 5-6, 2000), pages 54-59.
9. J.P. Rolland and H. Fuchs, "Optical versus Video See-Through Head-Mounted Displays in Medical Visualization," Presence (Massachusetts Institute of Technology), Vol. 9, No. 3, June 2000, pages 287-309.
10. F. Sauer, F. Wenzel, S. Vogt, Y. Tao, Y. Genc, and A. Bani-Hashemi, "Augmented Workspace: Designing an AR Testbed," IEEE and ACM Int. Symp. on Augmented Reality – ISAR 2000 (Munich, Germany, October 5-6, 2000), pages 47-53.
11. Calvin Maurer, Frank Sauer, Chris Brown, Bo Hu, Benedicte Bascle, Bernhard Geiger, Fabian Wenzel, Robert Maciunas, Robert Bakos, and Ali Bani-Hashemi, "Augmented Reality Visualization of Brain Structures with Stereo and Kinetic Depth Cues: System Description and Initial Evaluation with Head Phantom," talk presented at SPIE's Int. Symp. on Medical Imaging 2001 (San Diego, CA, February 2001).
12. Frank Sauer, Ali Khamene, Benedicte Bascle, and G.J. Rubino, "A Head-Mounted Display System for Augmented Reality Image Guidance: Towards Clinical Evaluation for iMRI-guided Neurosurgery," Proceedings of MICCAI 2001 (Utrecht, The Netherlands, October 14-17, 2001), Lecture Notes in Computer Science 2208, W. Niessen and M. Viergever (Eds.), Springer, Berlin, Heidelberg, New York, pages 707-716.
13. Frank Sauer, Ali Khamene, Benedicte Bascle, Sebastian Vogt, and Gregory J. Rubino, "Augmented Reality Visualization in iMRI Operating Room: System Description and Pre-Clinical Testing," to appear in SPIE Proceedings of Medical Imaging, San Diego, February 2002.
14. Frank Sauer, Ali Khamene, Benedicte Bascle, Lars Schimmang, Fabian Wenzel, and Sebastian Vogt, "Augmented Reality Visualization of Ultrasound Images: System Description, Calibration, and Features," IEEE and ACM Int. Symp. on Augmented Reality – ISAR 2001 (New York, NY, October 29-30, 2001), pages 30-39.
15. Frank Sauer, Ali Khamene, Benedicte Bascle, and Sebastian Vogt, "An Augmented Reality System for Ultrasound Guided Needle Biopsies," Medicine Meets Virtual Reality 02/10 (Newport Beach, CA, January 2002), J.D. Westwood et al. (Eds.), IOS Press, 2002, pages 455-460.
16. Roger Y. Tsai, "A Versatile Camera Calibration Technique for High-Accuracy 3D Machine Vision Metrology Using Off-the-Shelf TV Cameras and Lenses," IEEE Journal of Robotics and Automation, Vol. RA-3, No. 4, August 1987, pages 323-344.
17. http://www.cs.cmu.edu/~cil/v-source.html: freeware implementation of the Tsai algorithm.
A Novel Laser Guidance System for Alignment of Linear Surgical Tools: Its Principles and Performance Evaluation as a Man–Machine System

Toshihiko Sasama¹, Nobuhiko Sugano², Yoshinobu Sato¹, Yasuyuki Momoi³, Tsuyoshi Koyama², Yoshikazu Nakajima¹, Ichiro Sakuma⁴, Masakatsu Fujie⁵, Kazuo Yonenobu⁶, Takahiro Ochi², and Shinichi Tamura¹

¹ Division of Interdisciplinary Image Analysis, Osaka University Graduate School of Medicine
² Department of Orthopaedic Surgery, Osaka University Graduate School of Medicine
³ Mechanical Engineering Research Laboratory, Hitachi Ltd.
⁴ Department of Environmental Studies, Graduate School of Frontier Sciences, The University of Tokyo
⁵ Department of Mechanical Engineering, Waseda University
⁶ Department of Orthopaedic Surgery, Osaka Minami National Hospital
Abstract. A novel laser guidance system that uses dual laser beam shooters for the alignment of linear surgical tools is presented. In the proposed system, the intersection of two laser planes generated by dual laser shooters placed at two fixed locations defines the straight insertion path of a surgical tool. The guidance information is directly projected onto the patient and the surgical tool. Our assumption is that a linear surgical tool has a cylindrical shape or that a cylindrical sleeve is attached to the tool so that the sleeve and tool axes are aligned. The guidance procedure is formulated mainly using the property that the two laser planes are projected as two parallel straight lines onto the cylindrical tool surface if and only if the cylinder axis direction is the same as the direction of the intersection of the two laser planes. Unlike conventional augmented reality systems, the proposed system does not require glasses to be worn or mirrors to be placed between the surgeon and the patient. In our experiments, a surgeon used the system to align wires according to the alignment procedure, and the overall accuracy and alignment time were evaluated. The evaluations were considered to be not simply of a mechanical system but of a man–machine system, since the performance depends on both the system accuracy and the surgeon's perceptual ability. The evaluations showed the system to be highly effective in providing linear alignment assistance.
1 Introduction
Most surgical navigation systems display the graphical images used for navigation on a computer monitor positioned adjacent to the surgical scene, and
operative procedures are performed using a hand-held pointer and instruments with tracking markers [1],[2],[3]. These systems leave the surgeon with the mental task of combining two sources of spatial information: because the images for navigation are supplied on a computer monitor, the surgeon has to look away from the surgical scene to obtain navigational information. Several types of augmented reality (AR) systems have been developed to address this problem, including image overlay displays [4],[5],[6], stereo image injection into binocular operating glasses [7], and a simple laser pointer [8]. However, these systems still have some drawbacks. Image overlay displays and binoculars may be cumbersome for use in some power-demanding orthopaedic procedures like hammering and drilling. A laser pointer is easy for surgeons to follow, but it has to move extensively around the surgical scene in order to point in various directions. Here, we propose a novel laser guidance configuration that overcomes the problems inherent in previous AR systems. In this paper, we describe the principle and evaluate the performance of a laser guidance system that uses dual laser beam shooters fixed to a stand, the aim of which is to achieve the alignment of linear surgical tools such as drills and wires. In the proposed system, the intersection of two laser planes generated by the dual laser shooters defines the straight insertion path of a surgical tool. The two laser planes are directly projected onto the patient and the surgical tool to provide guidance information. Our assumption is that a linear surgical tool has a cylindrical shape, or that a cylindrical sleeve is attached to a wire so that their axes are aligned. Using the properties of the two laser plane projections on the entry surface of the patient and on the cylindrical surface of the tool or attached sleeve, we propose a two-step procedure for position and orientation alignment to the straight insertion path. The procedure for orientation alignment is based on the human ability to perceive parallel lines. We evaluate the performance as a man–machine system, since it depends on both the accuracy of the system and the perceptual ability of the surgeon.
2 System and Guidance Procedure

2.1 Dual Laser System
Figure 1 shows the configuration of the dual laser system and its appearance. The laser guidance system consists of two laser beam shooters that are fixed on stands 100–150 cm apart. Each shoots a 0.25-mW red (635 nm) laser beam with a 1-mm diameter spot. Each beam oscillates within the range of angle α at 50 Hz, resulting in a beam tract in space shaped like a fan (in our system, α = 10 degrees). The beam tract defines a fan-shaped plane segment, which we call the "laser plane". The two laser planes intersect in a line that can be controlled in any direction by changing the angle and direction of the beam oscillation using a galvanometer. Let OL be the origin of the reference frame of the laser beam shooter, and let ex, ey, and ez be the coordinate axes. We assume that ez is the reference direction
Fig. 1. Laser guidance system. (a) System configuration. (b) Appearance. Two laser beam shooters (shown by arrows) are attached to the OPTOTRAK camera.
of laser beam shooting. The laser plane is restricted to pass through OL.¹ The field of projection (FOP) of the laser plane is roughly approximated by the cone-shaped volume whose apex, axis, and apex angle are OL, ez, and α, respectively. Let n = (cos φ cos θ, cos φ sin θ, sin φ) be the laser plane normal represented in the laser shooter frame. The possible parameter ranges of φ and θ are −β/2 ≤ φ ≤ β/2 and −90° ≤ θ ≤ 90°, respectively (in our system, β = 10 degrees). The two laser shooters are arranged in the real 3D space so that their reference directions of laser beam shooting, ez, roughly intersect. The approximated intersection point and the two origins of the reference frames form a triangle, the z-axes of the two shooters being projected to the intersection point. It is desired that this be an isosceles triangle satisfying |OL1 − OI| = |OL2 − OI| = L, where OI is the approximated intersection point and OL1 and OL2 are the origins of the two laser shooter reference frames. With this configuration, a straight line of any direction passing through the spherical volume whose center and diameter are OI and D = 2L tan(β/2), respectively, can be specified by the intersection of the two laser planes. We used L = 150 cm as the length of the isosceles triangle; in this case, the diameter is approximately D = 2 × 150 × tan 5° = 26 cm.

2.2 Guidance Procedure
Given an insertion path described as a straight line, a laser plane that contains the given line and is generated by the laser shooter can be determined. Using the two generated planes, whose intersection line is the insertion path, a guidance procedure is formulated below.

¹ Precisely speaking, OL may deviate slightly from the laser plane depending on the mechanical properties of the galvanometer. This deviation is, however, practically negligible in terms of understanding the basic configuration of the system. In the actual system, we precisely calibrate the deviation to generate the laser plane.
The guidance procedure consists of two steps (Fig. 2). First, the entry point is determined: the two laser planes are projected as crossing lines onto the entry surface, and the tip of the surgical tool is placed at the crossing point (Fig. 2(a)). Second, the orientation is determined after stabilization of the entry point. Two methods were considered for orientation guidance:

– Assuming that a coaxial point can be localized at the tail of the tool, the same method as that used for entry point guidance is employed.
– Assuming that a linear surgical tool has a cylindrical surface whose axis corresponds to the linear insertion path, projections of the laser planes onto the cylindrical surface of the tool are employed for the guidance.

We employed the latter method in the experiments described in this paper. The former method needs a relatively wide surface to receive the laser planes at the tail, as well as a precisely localizable coaxial point, which is not easy to determine in a real situation. On the other hand, many linear surgical tools have a coaxial cylindrical surface, and in the case of wire insertion by drilling it is easy to attach cylindrical guide sleeves to the wire. In general, a laser plane is projected as a quadratic curve onto a cylindrical surface. When the laser plane normal is orthogonal to the axis direction of the cylinder, however, the plane is projected as a straight line. Thus, the following property is derived:

– Two laser planes are projected as two parallel straight lines onto a cylindrical surface if and only if the direction of the cylinder axis is aligned with the direction of the intersection line of the two laser planes.

Using this property, the orientation is adjusted while pivoting at the entry point so that the projections of the two laser planes onto the cylindrical surface are parallel and straight (Fig. 2(b)).
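Given the plane-normal parametrization n(φ, θ) stated above, the commanded insertion axis is simply the line of intersection of the two laser planes. The sketch below makes this concrete, assuming both plane normals and shooter origins have already been transformed into a common world frame (each plane passes through its shooter origin, as stated in the text); the function names are ours.

```python
import numpy as np

def plane_normal(phi, theta):
    """Laser-plane normal n(phi, theta) in the shooter frame (radians)."""
    return np.array([np.cos(phi) * np.cos(theta),
                     np.cos(phi) * np.sin(theta),
                     np.sin(phi)])

def intersection_line(n1, p1, n2, p2):
    """Line of intersection of two planes given in world coordinates.

    Each plane is defined by its unit normal n_i and a point p_i on it
    (here, the shooter origin O_Li). Returns a point on the line and
    the unit direction of the line, i.e. the insertion axis.
    """
    d = np.cross(n1, n2)                       # direction of the intersection
    # Solve n1.x = n1.p1, n2.x = n2.p2, pinning the free degree of
    # freedom along the line with d.x = 0 (valid if n1, n2 not parallel)
    A = np.vstack([n1, n2, d])
    b = np.array([np.dot(n1, p1), np.dot(n2, p2), 0.0])
    x0 = np.linalg.solve(A, b)
    return x0, d / np.linalg.norm(d)
```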
3 Experiments
3.1 Calibration

The proposed system was evaluated using an optical 3D position sensor (OPTOTRAK 3020; Northern Digital Inc., Waterloo, ON, Canada). The laser shooter reference frame was registered with the OPTOTRAK reference frame using the following method. Several laser beam lines (which do not oscillate), whose directions and positions are known in the laser shooter frame, are projected onto a plate at different positions. The projections of the laser beams are imaged as points, and their 3D positions are digitized using an OPTOTRAK pen-probe. From the correspondences between the several laser beams and their projected points, the transformation matrix between the OPTOTRAK and laser shooter frames is determined.

3.2 Method for Measuring Accuracy

Given the insertion path as the intersection of the laser planes in the OPTOTRAK reference frame, the subject conducting the experiments aligned the entry
Fig. 2. Guidance procedure. (a) Positional alignment of entry point. (b) Orientation alignment of tool.
Fig. 3. Experimental materials for system evaluation. (a) Wire with cylindrical sleeve and AdapTrax. (b) Sleeves of different diameters and lengths. (c) Different sleeve positions. (d) Different intersecting angles of two laser planes.
point and direction of the cylindrical guide sleeve attached to the wire according to the guidance procedure described in section 2.2. The subject was an orthopaedic surgeon sufficiently trained to perform the procedure appropriately. A tracking marker with LEDs (AdapTrax Tracker; Traxtal Technologies, Toronto) was fixed to the end of the wire. After calibration of the wire, the position of the entry point and the direction of the wire were measured using the OPTOTRAK sensor (Fig. 3(a)). Repeated measurements were made, and the measured positions and directions were compared with the specified positions and directions. The RMS error of the measurement system itself was confirmed to be about 0.5 mm in position and 0.5 degrees in direction.

3.3 Method for Evaluating Effects of Diameter, Length, and Position of Cylindrical Sleeve
To evaluate the effects of the diameter and length of the sleeve used to receive the laser beam on the accuracy of the wire insertion procedure and its usability by surgeons, twelve wire guide sleeves with different diameters and lengths were
made (Fig. 3(b)). The diameters were 3 (without sleeve), 10, 20, 30, and 40 mm; the lengths were 50, 100, and 150 mm. Gray was chosen as the sleeve color because the red laser beam projected clearly onto it. A Kirschner wire 3 mm in diameter and 300 mm in length was used for the tests. To evaluate the effects of the guide sleeve position and the intersecting angle of the laser planes, sleeves were placed 50, 100, or 150 mm from the wire tip (Fig. 3(c)), and intersecting angles of 10, 20, 30, 60, 90, 120, and 150 degrees were tested (Fig. 3(d)). The intersecting angle depends on both the direction of the specified insertion path and the arrangement of the two laser beam shooters. Measurements were repeated 20 times for each condition. The time spent by the subject on position and orientation alignment was also measured for each condition as an index of usability.

3.4 Results
Fig. 4 shows the experimental results on the angular accuracy of the orientation alignment and the time spent on the alignment under various conditions. The effects of the length of the guide sleeve, the position of the sleeve, and the intersecting angle of the laser planes on angular accuracy are shown in subfigures (a), (b), and (d) of Fig. 4, respectively. The effect of the sleeve diameter on the alignment time is shown in Fig. 4(c). The length and position of the sleeve had little effect on the angular accuracy of the wire direction (Fig. 4(a) and (b)). While the sleeve diameter also had little effect on the angular accuracy (result not shown), there was a tendency for the surgeon to need a longer time to align a wire with a smaller diameter sleeve (Fig. 4(c)). Considering both the accuracy of the wire direction and usability, the minimum sleeve length was judged to be 50 mm and the optimum sleeve diameter was in the range from 20 to 30 mm. With respect to its position, the sleeve should be as distant from the wire tip as possible (Fig. 4(b)). When the sleeve was set within the above ranges, the accuracy of the wire direction was 0.6 degrees with 0.8 degrees of RMS. The intersecting angle of the laser planes affected the accuracy of the wire direction (Fig. 4(d)). The direction of the wire was most accurate when the angle was 60 or 90 degrees, and at these angles it was easiest for the surgeon to align the wire to the laser planes. The positional accuracy of the wire tip placement was not much affected by the above conditions, being around 0.5 mm of bias with 0.9 mm of RMS (results not shown), except that the intersecting angle did have a small effect, similar to but weaker than its effect on the angular accuracy.
4
Discussion and Conclusions
The proposed laser guidance system with dual laser planes was shown to be capable of assisting surgeons in wire insertion with an accuracy of less than 1 mm for the wire tip position at the entry point and of less than 1 degree for the wire direction. The accuracy of the wire direction was not affected by the
Fig. 4. Results of evaluations. (a) Effect of length of guide sleeve on angular accuracy of orientation alignment. (b) Effect of position of guide sleeve on angular accuracy. (c) Effect of diameter of guide sleeve on alignment time. (d) Effect of intersecting angle of laser planes on angular accuracy.
length, position, or diameter of the sleeve, but it became more difficult for the surgeon to align the wire with the laser beam tracts as the length of the sleeve and its distance from the wire tip got shorter, and as the diameter of the sleeve became smaller. Considering the accuracy of the wire direction and the usability of the system by surgeons, the minimum sleeve length was judged to be 50 mm, while the optimum sleeve diameter was in the range from 20 to 30 mm. The optimum intersecting angle of the laser beam tracts was between 60 and 90 degrees. The proposed system has the following advantages. The two laser shooters can be placed on fixed stands, while in a conventional laser system a laser pointer with a robotic arm is needed because it has to move extensively around the surgical scene to point in various directions [8]. Furthermore, since the laser beam direction is often close to the surgeon's viewing direction, in the conventional system the beam can easily be occluded by the surgeon [8]. This does not happen in our system. Unlike in various other AR systems [4],[5],[6],[7], the proposed system does not require the wearing of special glasses or the placement of mirrors between the surgeon and patient, although its function is limited to linear tool guidance. The surgeon who was the subject in our experiments reported that it was mentally quite easy to concentrate on aligning linear surgical tools since the guidance
information was directly projected onto the surgical scene, allowing him to focus on it. In addition, he did not need to worry about occluding the laser beams. It should be noted that the alignment performance was evaluated as a man–machine system. That is, the accuracy and alignment time depend on the overall performance of both the surgeon's perceptual ability (especially in judging whether or not two lines are parallel) and the accuracy of the system (including calibration and the mechanical accuracy of the galvanometer). The system effectively employs the keen ability of humans to judge parallel lines in achieving the alignment task, thereby requiring cooperation between human and machine. The overall accuracy and usability of this man–machine system was confirmed to be sufficiently high for clinical use. We have already tested and reported on the clinical feasibility of the proposed guidance system [9], which was combined with our total hip replacement navigation system [2] and employed for guide wire insertion in acetabular cup placement. We confirmed that the system functioned successfully in the operating room [9]. A future problem to be addressed is guidance for the insertion depth. The current system does not provide guidance on how deep a tool should be inserted. We are now investigating methods for depth guidance, including the use of sound, blinking laser beams, and changing the laser beam color. Acknowledgement: This work was partly supported by JSPS Research for the Future Program JSPS-RFTF99I00903.
References
1. Y Sato, M Nakamoto, Y Tamaki, et al., "Image Guidance of Breast Cancer Surgery Using 3-D Ultrasound Images and Augmented Reality Visualization," IEEE Trans. Med. Imaging, vol.17, no.5, pp.681-693, 1998.
2. Y Sato, T Sasama, N Sugano, et al., "Intraoperative Simulation and Planning Using a Combined Acetabular and Femoral (CAF) Navigation System for Total Hip Replacement," Lecture Notes in Computer Science, vol.1935 (MICCAI2000): pp.1114-1125, 2000.
3. A M DiGioia, B Jaramaz, and B D Colgan, "Computer assisted orthopaedic surgery. Image guided and robotic assistive technologies," Clinical Orthopaedics and Related Research, no.354, September, 1998.
4. M Blackwell, C Nikou, A M DiGioia, et al., "An Image Overlay System for Medical Data Visualization," Lecture Notes in Computer Science, vol.1496 (MICCAI'98): pp.232-240, 1998.
5. K Masamune, Y Masutani, S Nakajima, et al., "Three-Dimensional Slice Image Overlay System with Accurate Depth Perception for Surgery," Lecture Notes in Computer Science, vol.1935 (MICCAI2000): pp.395-402, 2000.
6. S Nakajima, S Orita, K Masamune, et al., "Surgical Navigation System with Intuitive Three-Dimensional Display," Lecture Notes in Computer Science, vol.1935 (MICCAI2000): pp.403-411, 2000.
7. H Fuchs, A State, E D Pisano, et al., "Towards Performing Ultrasound-Guided Needle Biopsies from within a Head-Mounted Display," Lecture Notes in Computer Science, vol.1131 (VBC'96): pp.591-600, 1996.
8. S Lavallée, J Troccaz, P Sautot, et al., "Computer-Assisted Spinal Surgery Using Anatomy-Based Registration," in Russell H. Taylor, Stéphane Lavallée, Grigore C. Burdea, and Ralph Mösges, eds., "Computer-Integrated Surgery: Technology and Clinical Applications," The MIT Press, Cambridge, Massachusetts, pp.425-449, 1996.
9. N Sugano, T Sasama, S Nishihara, et al., "Clinical applications of a laser guidance system with dual laser beam rays as augmented reality of surgical navigation," Computer Assisted Radiology and Surgery, in press, 2002.
Navigation of High Intensity Focused Ultrasound Applicator with an Integrated Three-Dimensional Ultrasound Imaging System Ichiro Sakuma1, Yuichi Takai1, Etsuko Kobayashi1, Hiroshi Inada1, Katsuhiko Fujimoto2, and Takehide Asano3
1 Graduate School of Frontier Sciences and Graduate School of Engineering, The University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656 Japan
[email protected], {takai,estuko,inadah}@miki.pe.u-tokyo.ac.jp 2 Toshiba Medical Systems Company, 1385 Ogami, Ohtawara Tochigi 324-8550 Japan
[email protected] 3 2nd Department of Surgery, Chiba University School of Medicine, 1-8-1, Inohana, Chuo-ku, Chiba City, Chiba 260-8670 Japan
[email protected]
Abstract. A three-dimensional ultrasound imaging system was integrated into a HIFU applicator. This makes it easy to register the focal position of the HIFU applicator in the three-dimensional volume data obtained by the imaging system. The applicator was mounted on a mechanical manipulator with three degrees of freedom. The HIFU probe was positioned according to the volume data of the tissue around the target obtained by the ultrasound imaging system. A phantom study was conducted to evaluate the accuracy of navigation. Navigation errors were within 3 mm. The system could also monitor the change in acoustic properties of tissue due to HIFU application. The history of HIFU application during therapy can be recorded in the system to assist appropriate manipulation of the probe by surgeons. The developed system is a simple, cost-effective, and compact device for minimally invasive liver surgery.
1
Introduction
Recently, several minimally invasive surgical techniques have been studied as alternatives to liver resection. These include chemical ablation by percutaneous injection of ethanol, radio-frequency and microwave thermal ablation, and thermal ablation by high intensity focused ultrasound (HIFU) [1-3]. HIFU has shown promising results as a noninvasive thermal ablation technique for liver tumors. Since ultrasound can penetrate deep into soft tissue, there is no need to implant any medical devices in the patient. MR imaging has been used for guidance and temperature monitoring of HIFU therapy [4-6]. MR imaging is powerful for both guiding and monitoring the HIFU procedure; however, it requires expensive facilities. In this study, as an alternative means of guiding the HIFU procedure, we integrated a three-dimensional ultrasound imaging system into a HIFU applicator. This makes it easy to register the focal position of the HIFU applicator in the three-dimensional volume data obtained by the imaging system. We also mounted the applicator on a mechanical manipulator and positioned it based on the obtained three-dimensional ultrasound images. We evaluated the accuracy of applicator positioning through phantom experiments.
2
Integration of Three-Dimensional Ultrasound Imaging System to HIFU System
2.1
HIFU System [7]
In this study, we used a high intensity focused ultrasound applicator consisting of 12 piezoelectric oscillators placed on a spherical base. The frequency of the ultrasound wave was 1.65 MHz. The opening of the probe was 110 mm and the focal length of the HIFU applicator was 100 mm. It is well known that the formation of micro bubbles during ultrasound application due to cavitation interferes with ultrasound propagation in tissue. It leads to attenuation of the focused ultrasound energy and dislocation of the ablated area towards the HIFU applicator relative to its focal point. In this system, frequency modulation of the applied ultrasound was used to suppress the generation of cavitation. When excessive power is concentrated in a very narrow area of the liver tissue, instantaneous destruction of tissue occurs, leading to hemorrhage and the formation of a void in the tissue. Thus, moderate focusing of the ultrasound power near the focal point is required. In this system, neighboring piezoelectric oscillators were driven in opposite phase. Interference between the ultrasound waves caused moderate ablation in a larger area near the focal point, avoiding destruction of the tissue structure. 2.2
Three-Dimensional Ultrasound Imaging System [8]
A probe of the ultrasound imaging system (TOSHIBA PowerVision4000) was installed at the center of the HIFU applicator as shown in Figure 1. The distance between the ultrasound imaging probe and the focal point of the HIFU applicator was 78 mm. The probe was equipped with a stepping motor with reduction gears. To acquire 3D images, we rotated the probe with the stepping motor. The space under the HIFU applicator should be filled with de-gassed water to enable ultrasound propagation. An O-ring was used to seal the rotating part between the ultrasound imaging system and the HIFU applicator. A set of two-dimensional ultrasound images was obtained at fixed angular intervals while the probe was rotated at a fixed speed. The volume data could be reconstructed from this sequence of 2D images. Because the 2D images are obtained at fixed angular intervals, the dataset is discrete; we therefore interpolated the vacant space between neighboring images to obtain three-dimensional volume information. In this study, we acquired 120 images over 180 degrees; consequently, the resolution of the image was approximately 2 mm at the edge of a circle whose radius is 82 mm. We used volume rendering as the visualization method; since the system displays the 3D image by volume rendering, it does not have to perform boundary extraction or segmentation. In our system, a volume rendering graphic accelerator (VolumePro; Mitsubishi Electronics of America, MA, U.S.A.) enabled real-time volume rendering on a conventional PC.
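To illustrate the scan-conversion step, the sketch below places one pixel of the i-th B-mode slice, acquired at rotation angle i·Δθ about the probe axis, into a Cartesian voxel grid. Nearest-voxel splatting stands in for the inter-slice interpolation described above, and all names and geometry conventions are our own assumptions rather than the authors' implementation.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

struct Volume {
    int nx, ny, nz;           // grid size
    double voxelMm;           // isotropic voxel size
    std::vector<float> data;  // nx*ny*nz voxels, origin on the probe axis
};

// Insert slice i (w x h pixels, pixelMm spacing), acquired at angle
// i * dThetaRad about the probe (z) axis. With 120 slices over 180
// degrees, the arc spacing at radius 82 mm is 82 * (pi/120) ~ 2.1 mm,
// matching the resolution quoted in the text.
void insertSlice(const std::vector<float>& slice, int w, int h, int i,
                 double dThetaRad, double pixelMm, Volume& vol) {
    const double theta = i * dThetaRad;
    for (int v = 0; v < h; ++v) {
        for (int u = 0; u < w; ++u) {
            const double r = (u - w / 2.0) * pixelMm;  // lateral offset (mm)
            const double x = r * std::cos(theta);
            const double y = r * std::sin(theta);
            const double z = v * pixelMm;              // depth (mm)
            const int ix = static_cast<int>(std::lround(x / vol.voxelMm)) + vol.nx / 2;
            const int iy = static_cast<int>(std::lround(y / vol.voxelMm)) + vol.ny / 2;
            const int iz = static_cast<int>(std::lround(z / vol.voxelMm));
            if (ix < 0 || ix >= vol.nx || iy < 0 || iy >= vol.ny ||
                iz < 0 || iz >= vol.nz)
                continue;
            vol.data[(static_cast<std::size_t>(iz) * vol.ny + iy) * vol.nx + ix] =
                slice[static_cast<std::size_t>(v) * w + u];
        }
    }
}
```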
2.3
HIFU Probe Navigation System
The HIFU applicator with the integrated three-dimensional ultrasound imaging system was mounted on a mechanical arm having three degrees of freedom in the X, Y, and Z directions. A stepping motor drove each axis. The total system configuration is shown in Figure 2. In this system, the ultrasound imaging probe and the HIFU applicator are mechanically integrated, so it is easy to register the location of the focal point of the HIFU applicator in the acquired three-dimensional information of the tissue around the target. We developed software to display the focal point of the HIFU applicator in the volume data or in arbitrary cross-sectional data obtained by the three-dimensional ultrasound imaging system, as shown in Figure 3. It can also record the history of applied ultrasound energy locations to assist intra-operative therapeutic planning of thermal ablation.
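Because the imaging probe and the applicator are rigidly coupled, registering the focus reduces to a fixed offset along the probe axis (78 mm from imaging probe to focus, Section 2.2) plus the commanded stage translation. The following sketch, with hypothetical names and a hypothetical volume origin, shows this bookkeeping together with the recording of the application history mentioned above; it is an illustration of the idea, not the authors' software.

```cpp
#include <vector>

struct Vec3 { double x, y, z; };                 // millimetres

// Hypothetical: focal point in volume coordinates, given the manipulator
// stage translation. The focus sits at a fixed offset along the probe
// (z) axis because probe and applicator are rigidly coupled.
Vec3 focalPointInVolume(const Vec3& stageMm, const Vec3& volumeOriginMm) {
    const Vec3 focalOffset = {0.0, 0.0, 78.0};   // assumed along the z axis
    return {stageMm.x - volumeOriginMm.x + focalOffset.x,
            stageMm.y - volumeOriginMm.y + focalOffset.y,
            stageMm.z - volumeOriginMm.z + focalOffset.z};
}

// History of HIFU applications, used for intra-operative planning.
struct Shot { Vec3 focus; double watts; double seconds; };
std::vector<Shot> history;
void recordShot(const Vec3& focus, double watts, double seconds) {
    history.push_back({focus, watts, seconds});
}
```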
[Fig. 1 labels: pipe for water supply; ultrasound imaging probe; O-ring; electronic circuit; piezoelectric oscillators; space filled with water]
Fig. 1. HIFU applicator with an integrated three-dimensional ultrasound imaging system
[Fig. 2 labels: stepping motors for manipulator drive (X, Y, Z axes); pipes for water supply; cable for ultrasound imaging probe; integrated ultrasound imaging probe; HIFU applicator]
Fig. 2. HIFU applicator with an integrated three-dimensional ultrasound imaging system mounted on a manipulator with three degrees of freedom
Fig. 3. An example of the display for HIFU applicator navigation. In the three right panels, arbitrary cross sections can be displayed together with the focal point coordinates. In the left panel, the reconstructed volume is displayed together with the estimated heated region.
3
Results
3.1
Accuracy of the Three-Dimensional Ultrasound Imaging System
To evaluate the accuracy of the three-dimensional ultrasound imaging system, we conducted the following experiment. A silicone phantom (50 mm × 60 mm × 40 mm) was placed in a water-filled container with its upper surface perpendicular to the axis of the ultrasound probe. The distance between the phantom and the probe was 90 mm. An ultrasound image of the upper surface was obtained and the length of each side was measured. The measurement error was within 3 mm. 3.2
Navigation of HIFU Probe
(1) Modeling of Heated Region by the HIFU Applicator. A temperature-sensitive phantom was made of thermally solidified polysaccharide, a thermosensitive dye, vinyl chloride, and water. The color of the phantom was red before heating. When the phantom was heated up to approximately 60 °C, its color changed to white. We could thus determine the localization of the heated area in the phantom by identifying its color change. We placed the phantom (height: 40 mm) under the HIFU applicator at a distance of 80 mm and applied HIFU with a power of
120 and 150 W for 15 seconds, considering the usual conditions in clinical evaluations of the same HIFU system. Using the obtained data, we simply modeled the area heated with HIFU as a cone-shaped region. (2) Phantom Experiment of Navigation of HIFU Applicator with Three-Dimensional Ultrasound Image. Three metal bolts (diameter: 3 mm, length: 40 mm) were placed in a temperature-sensitive phantom in a triangular configuration as markers for HIFU applicator positioning. The distance between the bolts was set to 30 mm. We tried to apply ultrasound energy in the triangular space surrounded by these three bolts. The locations of the three bolts were identified by the three-dimensional ultrasound imaging system. The applicator was positioned using the mechanical manipulator so that the HIFU applicator's focal point was set at a certain distance from the tips of the three markers. The heated area in the phantom was identified by the change in color of the temperature-sensitive phantom. We applied 120 W of ultrasound energy for 15 seconds. We evaluated the distances between the center of gravity of the area whose color changed to white due to the temperature rise and the planned focal position of the HIFU applicator. The result is shown in Figure 4. The distances between the planned focal points and the actual centers of gravity of the heated areas were 1.5, 2.0, and 3.0 mm, respectively.
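The navigation error reported above is the distance between the planned focus and the centroid of the color-changed region. A minimal sketch of that measure, assuming the heated voxels have already been segmented by their color change (all names are ours), is:

```cpp
#include <cmath>
#include <vector>

struct P3 { double x, y, z; };

// Distance between the planned focal position and the center of gravity
// of the segmented heated (color-changed) voxel positions.
double navigationErrorMm(const std::vector<P3>& heated, const P3& planned) {
    if (heated.empty()) return 0.0;
    P3 c = {0, 0, 0};
    for (const P3& p : heated) { c.x += p.x; c.y += p.y; c.z += p.z; }
    const double n = static_cast<double>(heated.size());
    c.x /= n; c.y /= n; c.z /= n;
    const double dx = c.x - planned.x, dy = c.y - planned.y, dz = c.z - planned.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz);
}
```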
Fig. 4. Results of a phantom experiment. The dots near the marker bolts indicate the positions of the markers measured by the three-dimensional ultrasound imaging system. The dots in circles indicate the planned focal positions of HIFU, and the point on the cross in each circle indicates the center of gravity of the heated region in the phantom.
3.3
Ultrasound Imaging of HIFU Ablation in Porcine Liver Sample
We applied HIFU to a porcine liver sample. We obtained three-dimensional ultrasound images before and after application of HIFU. Figure 5 shows the image obtained after application of 120 W HIFU for 15 seconds. The optical image of the heated area is also displayed in the figure. We could identify the heated area as a high-contrast region. However, the contrast decreased as time elapsed after HIFU application.
Fig. 5. Ultrasound images after application of 120 W HIFU for 15 seconds to a porcine liver sample. High-contrast regions are circled in the figures. The actual optical image of the heated area is also shown.
4
Discussion
We integrated a three-dimensional ultrasound imaging system into a HIFU applicator. This made it easy to register the focal point of the HIFU applicator in the volume information of the target tissues. Davies et al. proposed the use of a robot to position the HIFU applicator for neurosurgery [9]; their approach still requires a frame and a position measurement system to localize the HIFU applicator. Since the developed system requires no additional devices for navigation, it is a cost-effective, simple, and compact device for minimally invasive liver surgery. Intra-operative monitoring of the change in acoustic properties of the tissue due to HIFU therapy, together with recording of the history of HIFU application, assists the surgeon in appropriately manipulating the HIFU applicator. The source of error inherent to the system is limited to the mechanical alignment of the HIFU applicator and the ultrasound imaging system. The results of the phantom experiment showed that the navigation error was within 3 mm. This was mainly due to the image resolution of the currently used ultrasound imaging system and to errors in modeling the region heated by HIFU. Various computational models have been proposed for estimating the area ablated by HIFU therapy [10]. We will incorporate these theoretical models into the navigation system to improve navigation accuracy.
References
1. Diederich CJ., Nau WH., Stauffer PR.: Ultrasound Applicators for Interstitial Thermal Coagulation, IEEE Trans. Ultrasonics, Ferroelectrics and Frequency Control, 46 (1999) 1218-1227
2. Arefiev A., Chapelon JY., Tavakkoli J., Cathignol D.: Ultrasound-induced tissue ablation: studies on isolated perfused porcine liver, Ultrasound in Medicine and Biology, 24 (1998) 1033-1043
3. Vaezy S., Martin R., Caps M., Taylor S., Beach K., Carter S., Kaczkowski P., Keilman G., Helton S., Chandler W., Mourad P., Rice M., Roy R., Crum L.: Liver hemostasis using high-intensity focused ultrasound, Ultrasound in Medicine and Biology, 23 (1997) 1413-1420
4. Daum DR., Smith NB., King R., Hynynen K.: In vivo Demonstration of Noninvasive Thermal Surgery of the Liver and Kidney Using an Ultrasonic Phased Array, Ultrasound in Medicine and Biology, 25 (1999) 1087-1098
5. Chen L., Bouley D., Yuh E., D'Arceuil H., Butts K.: Study of Focused Ultrasound Tissue Damage Using MRI and Histology, J. Magn. Reson. Imaging 10 (1999) 146-153
6. Chung A., Hynynen K., Cline HE., Colucci V., Oshio K., Jolesz F.: Optimization of spoiled gradient-echo phase imaging for in vivo localization of a focused ultrasound beam, Magn Reson Med, 36 (1996) 745-752
7. Asano T., Ito T., Sugamoto Y., Kondo S., Takayama W., Kenmochi T., Maruyama M., Miyauchi H., Mitsui Y., Ochiai T., Kainuma O., Tokoro Y., Jingu K., Nakagori T., Yamamoto H., Fujimoto K., Satou K.: Cancer Therapy by focused ultrasound, Geka 62 (2000) 1689-1695
8. Sakuma I., Tanaka Y., Takai Y., Kobayashi E., Dohi T., Schorr O., Hata N., Iseki H., Muragaki Y., Hori T., Takakura K.: Three dimensional digital ultrasound imaging system for surgical navigation, Proc. CARS2001 (2001) 118-123
9. Davies BL., Chauhan S., Lowe MJS.: A Robotic Approach to HIFU Based Neurosurgery, LNCS 1496 (1998) 386-396
10. Botros YY., Volakis JL., VanBaren P., Ebbini ES.: A hybrid computational model for ultrasound phased-array heating in presence of strongly scattering obstacles, IEEE Trans Biomed Eng 44 (1997) 1039-1050
Robust Registration of Multi-modal Images: Towards Real-Time Clinical Applications Sébastien Ourselin1,2, Radu Stefanescu1, and Xavier Pennec1
1 INRIA Sophia-Antipolis, Projet Epidaure, 2004 Route des Lucioles, F-06902 Sophia-Antipolis Cedex {Radu.Stefanescu, Xavier.Pennec}@sophia.inria.fr http://www-sop.inria.fr/epidaure/Epidaure-eng.html 2 CSIRO Telecommunications and Industrial Physics, PO Box 76, Epping NSW 1710 Australia
[email protected]
Abstract. High performance computing has become a key step in introducing computer tools, like real-time registration, into the medical field. To achieve real-time processing, one usually simplifies and adapts algorithms so that they become application and data specific. This involves design and programming work for each application, and reduces the generality and robustness of the method. Our goal in this paper is to show that a general registration algorithm can be parallelized on an inexpensive and standard parallel architecture with a small amount of additional programming work, thus keeping the algorithm's performance intact. For medical applications, we show that a cheap cluster of dual-processor PCs connected by an Ethernet network is a good trade-off between the power and the cost of the parallel platform. Portability, scalability, and safety requirements led us to choose OpenMP to program multi-processor machines and MPI to coordinate the different nodes of the cluster. The resulting computation times are very good on small and medium resolution images, and they are still acceptable on high resolution MR images (respectively 19, 45, and 95 seconds on 5 dual-processor Pentium III 933 MHz machines).
1
Introduction
One major concern in Image-Guided Therapy (IGT) is the simultaneous need for high performance algorithms for planning, targeting, and monitoring, and the time constraints imposed by the operating room [3]. For instance, in neurosurgery, pre-operative guidance using stereotactic systems allows the surgeon to select the best and safest trajectory to penetrate the tissue. This step drastically reduces the surgery time in the operating room. During surgery, the surgeon may use intra-operative guidance in order to control his trajectory. However, these image-guided surgery systems are limited by their static knowledge of the anatomical brain structures, since cerebrospinal fluid (CSF) leaks or tumor removal deform the anatomical structures [14]. Intra-operative (interventional) imaging is being developed to solve these problems, and also to detect complications during surgery, such as bleeding. Typically, fluoroscopic, sonographic and more recently ultrasound images are used. Concurrently, 3D modalities were developed, such as CT [4] or MRI [13,2] guided interventional procedures.
A typical example is a surgical tracking and visualization system developed at the Brigham and Women's Hospital, based on open-MR image guidance, and used in 45 neurosurgical interventions [1]. During each craniotomy, 3 to 5 intra-operative MR datasets were acquired (with a typical resolution of 256 × 128 × 60) and rigidly registered to the pre-operative image to guide the initial approach. The registration maximizes the mutual information and typically takes five minutes. Since each acquisition lasts from 1 to 5 min, one would like to obtain a registration time of under 1 min to add a minimal overhead to the image acquisition and to remain faster than the medical need. To decrease the computation time, Netsch et al. presented a multi-modal registration algorithm [6] based on local correlation (LC) optimized by a Gauss-Newton technique, using only the 10% of image voxels that have the largest local variance. They obtain a registration time of about 1 min for typical MR and CT images. Nevertheless, the optimization procedure used is not robust to outliers since it is a least-squares minimization [12]. Recently, we presented a multi-modal registration algorithm, also based on a local similarity measure, which explicitly takes into account the presence of outliers [8]. We believe that our algorithm could lead to faster and more robust results while considering more image information. Our goal in this paper is to show that such a registration algorithm can be parallelized on a cheap and standard parallel architecture with a reasonably small amount of additional programming work while keeping the algorithm's performance intact. Other important requirements for the parallel environment are portability, scalability, and safety, since the software is intended to be used in a safety-critical environment. Thus, the choice of a mature and well-tested environment is important. We detail in Section 2 the possible hardware platforms for parallel computing and the chosen software environments, namely OpenMP and MPI (Message Passing Interface). Then, we recall in Section 3 the principles of the registration algorithm, a block matching technique detailed in [9,8], and we detail some improvements. Section 4 focuses on successive parallelizations of the main time-consuming steps. Finally, we analyze in Section 5 the gain in computation time with respect to the number of processors and the data size.
2
The Parallel Environment
Hardware Choices. A parallel computer is essentially made of processors, memory, and an interconnection network that connects the processors with each other and with the memory. In shared memory computers, all the processors are connected to the same memory and do not need to explicitly exchange information. However, they are limited to a few processors and in the amount of memory that can be addressed. Distributed memory computers are composed of nodes, containing processors and memory, which interact by sending each other messages through the interconnection network. There is virtually no limit on the number of processors, but synchronization and information exchange take much longer. Global data structures often have to be replicated on every node for performance reasons, which also leads to larger memory needs.
For medical applications, especially IGT, we believe that a cluster of symmetric multi-processors (SMP), typically PCs, connected by an internal network is a good trade-off between the power and the cost of the parallel platform. The system can be used as a research tool or as an embedded system in a clinical environment. In this article, we used up to 10 PCs of our lab (dual-processor Pentium III 933 MHz), connected by a fast Ethernet network (100 Mbps using 2 interconnected 1 Gbps switches). Such a hardware configuration is already present in many labs and hospitals. Software Choices. A parallel programming model that seemed appropriate to adapt a sequential algorithm with minimum reprogramming is the Single-Program-Multiple-Data (SPMD) approach, in which all the processors involved execute the same program, but on different data. We used OpenMP to program multi-processor machines and the Message Passing Interface (MPI) to coordinate the different nodes of the cluster. One of their main advantages is portability over many different kinds of SMP clusters: both are standards and do not depend on the machine architecture, operating system, or network topology. OpenMP is a set of compiler directives and library functions that specifies the behavior of a program when executed on shared memory computers [7]. A large part of the OpenMP C standard is implemented as "pragma" compiler directives. This eases the parallelization of the sequential code and enables a sequential compilation by standard C compilers. The core notions of OpenMP are the parallel sections (a piece of code executed by all the processors with shared or replicated variables), and parallel "for" statements that enable the parallelization of independent iterations of a loop on multiple processors. MPI is a standard for communication libraries between the nodes of a cluster [5]. Among the most powerful functions, we can send a message to a node, receive a message from a node, broadcast a message to all other nodes, scatter subparts of a "list" to different nodes, or gather the subparts of a "list". An important difference between MPI and OpenMP is that each MPI process has its own data and variables, as the memory is not shared. Using MPI, we run a UNIX process on each PC of the cluster. Each process uses OpenMP to start one thread per processor on its machine. To coordinate the different processes, a master process does everything that cannot be done in parallel, such as input/output (I/O) operations and tasks that have to be done sequentially. All the other processes are called slaves. In our case the master is not dedicated, which means that it can also do everything that regular slaves do.
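A minimal SPMD skeleton in this master/slave style might look as follows; it is a sketch of the coordination pattern just described, not the authors' code, and the buffer sizes and payloads are placeholders.

```cpp
#include <mpi.h>
#include <omp.h>
#include <vector>

int main(int argc, char** argv) {
    MPI_Init(&argc, &argv);
    int rank = 0, nprocs = 1;
    MPI_Comm_rank(MPI_COMM_WORLD, &rank);
    MPI_Comm_size(MPI_COMM_WORLD, &nprocs);

    const int perProc = 1024;                     // work items per process
    std::vector<float> all(rank == 0 ? perProc * nprocs : 0, 1.0f);
    std::vector<float> mine(perProc);

    // Master scatters sub-lists of work; every process (master included,
    // since it is not dedicated) receives its share.
    MPI_Scatter(all.data(), perProc, MPI_FLOAT,
                mine.data(), perProc, MPI_FLOAT, 0, MPI_COMM_WORLD);

    // Inside one node, OpenMP shares the loop among both processors.
    #pragma omp parallel for
    for (int i = 0; i < perProc; ++i) mine[i] *= 2.0f;  // placeholder work

    // Master gathers the results and would do the sequential steps here.
    MPI_Gather(mine.data(), perProc, MPI_FLOAT,
               all.data(), perProc, MPI_FLOAT, 0, MPI_COMM_WORLD);

    MPI_Finalize();
    return 0;
}
```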
3
The Sequential Algorithm
The algorithm we parallelize in this article computes a parametric transformation (rigid, similarity, or affine) from correspondences between very similar areas in both images, using a block matching strategy. This procedure is extensively described in [9] for rigid registration of anatomical sections, in [8] for multi-modal rigid registration of medical images, and in [10,11] for computing the mid-sagittal plane of the brain. In order to quickly approach the desired optimum and to extend the capture range of the search, a multi-resolution scheme is used. We previously used a coarse-to-fine strategy for the block sizes. Here, we use a small and constant block size (4³ voxels) and we subsample the original images by (at most) a factor of two along each axis at each level of the pyramid. At each resolution level, the correspondences are searched using a block matching strategy around the current position. This is done by optimizing the simple but efficient local linear correlation criterion, which is well suited to multi-modal registration [10]. Then, a robust transformation is computed by minimizing the distance between matched points using Least Trimmed Squares (LTS) [12]. This process is iterated until convergence. As we will discuss in Section 5, an acceptable registration can be obtained before the highest resolution. To further improve the robustness and speed up the algorithm, we only select points with a high local variance. However, unlike [6], we use all the points at the lowest level, and we halve the number of relevant voxels at each pyramid level, with a lower bound of 20%. Thus, all the image information is taken into account for large displacements, while the algorithm adaptively focuses on relevant image parts when estimating a more precise motion. The accuracy and the high robustness of the algorithm (100% of the CT/MR registrations within the voxel size) have been validated [10] using the Vanderbilt database [15].
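As an illustration of the block matching step, the sketch below scores one candidate displacement with a squared linear correlation coefficient between two blocks; the displacement with the best score over the search neighbourhood yields one correspondence. This is our own rendering of a local linear correlation criterion, not the exact measure of [10].

```cpp
#include <cstddef>

// Squared correlation coefficient between two blocks of n voxels;
// values close to 1 indicate a good (linearly related) match, which is
// what makes the criterion usable across modalities.
double localCorrelation(const float* a, const float* b, std::size_t n) {
    double sa = 0, sb = 0, saa = 0, sbb = 0, sab = 0;
    for (std::size_t k = 0; k < n; ++k) {
        sa += a[k]; sb += b[k];
        saa += a[k] * a[k]; sbb += b[k] * b[k]; sab += a[k] * b[k];
    }
    const double cov = sab - sa * sb / n;  // n * covariance
    const double va = saa - sa * sa / n;   // n * variance of a
    const double vb = sbb - sb * sb / n;   // n * variance of b
    return (va > 0 && vb > 0) ? (cov * cov) / (va * vb) : 0.0;
}
```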
4
Parallel Implementations
A profiling of the sequential version of the algorithm showed that the program spent most of its computation time (about 93%) in the vector field computation. The remaining time was shared between image resampling, LTS minimization, and I/O operations. Hence, our first concern was the vector field computation. MPI / OpenMP Implementation of the Vector Field Computation. At each iteration, one has to compute the block \bar{B}_i that best matches each selected block B_i, compute the new transformation, and resample the image. For the parallel algorithm, each process has a copy of the initial images, and locally resamples it to limit the communication time. The master divides the list of N_blocks blocks into N_procs sub-lists and scatters them to the processes. Each process computes N_blocks / N_procs block displacements, and returns its local sub-list Sub_i of correspondences to the master (gathering step). Then, the master computes the new transformation and broadcasts it to all the slaves. We observed an acceleration of 48% on two processors, which is close to the 50% theoretical gain. We may further reduce the communication time by using OpenMP to drive dual-processor workstations. In practice, this only decreased the necessary amount of memory by avoiding the replication of large floating point images. OpenMP Implementation of Image Resampling. As the number of processors used becomes high, the resampling of images takes a higher percentage of the total computation time (see Fig. 1). An MPI implementation of the resampling would ask each process to take a part of the image, resample it and then send it to every other process, thus requiring much communication time. With an OpenMP implementation, each process (i.e. machine) does the resampling of
Fig. 1. Left: Computation time of the parallel part (vector field computation), the sequential part (remaining), and the total computation time, using 1 to 10 processors on mono- and dual-processor workstations. Right: Speedup of the implementation, normalized by the execution time on a single CPU Pentium III 933 MHz, with the theoretical Amdahl's value.
all the points, but it shares the load among all the processors available on the machine with a "for" loop on all the image points. We expected an acceleration of about 50%, and obtained 40%. The difference between theory and experimental results is essentially due to the bounded memory access speed (in our case 133 MHz for 933 MHz Pentium III dual-processors).
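The shared-memory side of this scheme is a single directive on the resampling loop; a minimal sketch, in which the resampling body itself is only a placeholder, might be:

```cpp
#include <cstddef>
#include <omp.h>
#include <vector>

// Each scanline of the resampled image is independent, so OpenMP can
// share the iterations among the processors of one node.
void resampleImage(const std::vector<float>& src, std::vector<float>& dst,
                   int width, int height) {
    #pragma omp parallel for
    for (int y = 0; y < height; ++y) {
        for (int x = 0; x < width; ++x) {
            // Placeholder: a real implementation would apply the current
            // transformation and interpolate in src.
            dst[static_cast<std::size_t>(y) * width + x] =
                src[static_cast<std::size_t>(y) * width + x];
        }
    }
}
```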
5
Experiments
We illustrate our algorithm with two data sets. The first one is an MR/CT multimodal case from the Vanderbilt database [15]. These low resolution images¹ represent a typical multi-modal registration in a clinical application. To simulate an IGT application, we also used high resolution² sets of pre- and post-operative MR images, acquired by La Pitié Salpêtrière Hospital (Paris) in the context of the treatment of Parkinson's disease by deep brain stimulation (DBS). We used up to 10 PCs of our lab (dual-processor Pentium III 933 MHz), connected by a fast Ethernet network (100 Mbps). We applied our algorithm on mono-processor and dual-processor architectures independently, from one to ten machines. For each case, we made 100 registrations to estimate the mean time of the vector field computation and the mean time for the rest of the program. Computation Times of the Parallel Section. We present the wallclock time (Fig. 1, left) for a typical registration problem, as a percentage of the computation time with one single CPU workstation. The total computation time is drastically reduced with the first machines (82% of gain for 4 dual-processors), and decreases much more slowly with each additional machine (only a 10% gain for the next 6 machines). The computation time of the vector field drops from 93% of the total computation time (for a mono-processor machine) to 51% (for a cluster of 10 dual-processors), reaching the constant computation time of the
¹ T1 MR: 256² × 26 voxels of 1.25² × 4 mm³; CT: 512² × 28 voxels of 0.65² × 4 mm³.
² T1 MR: 256² × 124 voxels of size 0.9375² × 1.4 mm³.
sequential part. This means that we need to investigate the parallelization of the "constant time operations" to further improve the time performance. Performance Analysis. Fig. 1 (right) shows the speedup of the parallel implementation w.r.t. the number of CPUs, normalized by the execution time on a single CPU. This experiment was done using 1 to 10 mono-processor machines, and 1 to 10 dual-processors. To evaluate the quality of the performance, we also plot the theoretical speedup factors provided by Amdahl's law: if a fraction α of the code is sequential, and the remaining code is parallelized on N processors, the maximum speedup factor (with no communication delays) is S = 1 / (α + (1 − α)/N). Using "constant time" values (Fig. 1, right), we estimated that α ≈ 7% for the mono-processor case, and a slightly lower value of 6% for dual-processors, due to the OpenMP parallelization of the image resampling. As expected, the measured speedup is higher for dual-processor machines (the sequential part takes a smaller time) but it is also relatively closer to its theoretical curve. This can be explained by a lower communication overhead due to the shared memory (OpenMP part). Therefore, it is definitely cheaper, faster, and more efficient to use dual-processor machines. Influence of the Data Size. In the previous sections, we were only interested in the speedup factor, but we verified on many datasets that the computation time was directly proportional to the volume of the data being registered. We report in the following table the registration times (in seconds) for different cluster configurations, along with the mean computation time per million voxels. Even though the computation times are excellent for small image volumes with a small cluster, we still need at least 10 machines for comparable computation times on high resolution images. However, extrapolating our measurements to an intermediate image size typical for IGT (256 × 256 × 60 voxels [1]) gives acceptable computation times for only a few machines.

Data type                       | Sequential | One dual-pro | 5 dual-pro | 10 dual-pro
Time for 10⁶ voxels             | 72 s       | 38 s         | 11.5 s     | 8.3 s
Vanderbilt (1.7 × 10⁶ voxels)   | 118 s      | 63 s         | 19 s       | 14 s
Typical IGT (3.9 × 10⁶ voxels)  | 280 s      | 150 s        | 45 s       | 33 s
Salpêtrière (8.1 × 10⁶ voxels)  | 600 s      | 316 s        | 95 s       | 69 s
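Plugging the estimated sequential fraction into Amdahl's law reproduces the saturation visible in Fig. 1 (right); the short sketch below, using α = 0.07 as estimated for the mono-processor case, is only a numerical check of the formula.

```cpp
#include <cstdio>

// Amdahl's law: S = 1 / (alpha + (1 - alpha) / N) for a sequential
// fraction alpha and N processors.
int main() {
    const double alpha = 0.07;  // ~7% sequential code
    const int ns[] = {1, 2, 4, 10, 20};
    for (int n : ns) {
        std::printf("N = %2d -> S = %.2f\n", n,
                    1.0 / (alpha + (1.0 - alpha) / n));
    }
    return 0;  // N = 10 gives S ~ 6.1, close to the measured curve
}
```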
Trade-Off between Precision and Computation Time. Even if we finally obtained registration times of the order of one minute, which was our aim for IGT applications, one could think of applications where we need to register higher resolution images and/or reduce the number of machines. In this case, it seems difficult to further improve the computation times without modifying the parallel algorithmic scheme. However, the pyramidal approach used in the algorithm provides a multi-scale transformation estimation where the last level takes about 73% of the total computation time. This suggests that we could drastically speed up the registration by stopping the algorithm before this level.
To estimate the loss in precision with respect to the pyramid level, we used the transformation T₀ obtained at the highest level as the reference (our "ground truth" for this experiment), and we computed the mean localization error of representative points for the transformation Tᵢ at each level i. We report in the following table this relative precision for the three high resolution datasets. As these are pre- and post-operative MR images, there are some important deformations between the images due to the surgical operation and artifacts (distortion due to air-brain interfaces, presence of electrodes, etc.). In all cases, the relative precision of the transformation at level 1 is still below the voxel size, and corresponds to a relative precision of 0.2 mm at the center of the brain. Thus, an optimal trade-off between precision and computation time seems to be obtained by stopping the algorithm at pyramid level 1; we obtain a one-minute registration with only two dual-processor machines. With low resolution images (the Vanderbilt database), we obtain even more impressive computation times: from 21 seconds for 1 dual-processor PC to 7 seconds for 5 dual-processor PCs.

            | Data set 1      | Data set 2      | Data set 3
Voxel size  | 0.94² × 1.3 mm³ | 0.90² × 1.0 mm³ | 0.98² × 1.4 mm³
RMS(T₃)     | 3.54 mm         | 10.7 mm         | 8.96 mm
RMS(T₂)     | 2.80 mm         | 1.08 mm         | 2.60 mm
RMS(T₁)     | 0.92 mm         | 0.68 mm         | 0.71 mm
RMS(T₀)     | reference       | reference       | reference
6
Discussion and Conclusion
The registration algorithm we chose to parallelize computes at each step a sparse vector field by block matching, which is used to estimate a robust parametric (rigid to affine) transformation using Least Trimmed Squares, all embedded in a multi-scale framework [10]. We proposed in this article parallel MPI and OpenMP implementations of the two main time-consuming steps: the vector field computation, with a resulting acceleration of approximately 48% on two processors, and the image resampling, with a somewhat lower speedup (40%), mainly due to memory access limitations. The speedup results on more than two processors closely follow the theoretical bound given by Amdahl's law. For the same number of processors, dual-processor machines are cheaper than mono-processor ones, and the parallelization is slightly more efficient (thanks to memory sharing in OpenMP). The computation times themselves are excellent on small images (19 seconds for 5 dual-processors on Vanderbilt images), and still very good on typical intra-operative and high resolution MR images: 45 seconds and 1 min 35 s for 5 dual-processors, and 33 and 70 seconds for 10 dual-processor workstations. Thus, we can conclude that a small cluster of dual-processor PCs is an optimal choice for real-time registration in IGT. One way to further accelerate the algorithm (or reduce the number of machines) is to slightly decrease its accuracy by stopping one level before the end of the multi-scale pyramid: the relative precision of the transformation is still below the voxel size, for only 27% of the total computation time. The registration of
high resolution MRI now takes only 1 min on a cluster of only 2 dual-processor PCs. With lower resolution images (the Vanderbilt database), we can even obtain a registration time of 15 seconds. The results obtained in this article, both in terms of computation time for registration and methodology for parallelization, open research avenues for performing huge computational tasks on large medical image databases, such as the quantification of disease evolution over a large number of patients or information retrieval and exploration in large image collections. Acknowledgments: This work was done during the PhD thesis of S. Ourselin in the Epidaure Project, INRIA Sophia-Antipolis. The authors would like to thank Pr. D. Dormont from the Neurological Dpt. of La Pitié Salpêtrière Hospital (Paris) for the high resolution MR images. We would also like to thank Briony Doyle and Bhautik Joshi for proofreading this article.
References
1. D.T. Gering, A. Nabavi, R. Kikinis, N. Hata, L.J. O'Donnell, W.E.L. Grimson, F.A. Jolesz, P.M. Black, and W.M. Wells. An integrated visualization system for surgical planning and guidance using image fusion and open MR. JMRI, 13:967–975, 2001.
2. D.F. Kacher, S.E. Maier, H. Mamata, Y. Mamata, A. Nabavi, and F.A. Jolesz. Motion robust imaging for continuous intraoperative MRI. JMRI, 13:158–161, 2001.
3. R. Kikinis. IGT: Today and tomorrow. Technical Report 192, Surgical Planning Laboratory, Brigham and Women's Hospital, Boston, MA, USA, 2000.
4. L.D. Lunsford, R. Parrish, and L. Albright. Intraoperative imaging with a therapeutic computed tomographic scanner. Neurosurgery, 15:559–561, 1984.
5. Message Passing Interface Forum. MPI: A Message-Passing Interface Standard, May 1995. http://www.mpi-forum.org/.
6. T. Netsch, A. Van Muiswinkel, and J. Weese. Towards real-time multi-modality 3D medical image registration. In Proc. of ICCV'01, pages 718–725, 2001.
7. OpenMP Architecture Review Board. OpenMP C and C++ Application Program Interface Version 1.0, October 1998.
8. S. Ourselin, A. Roche, S. Prima, and N. Ayache. Block Matching: A General Framework to Improve Robustness of Rigid Registration of Medical Images. In Proc. of MICCAI'00, pages 557–566, Pittsburgh, Penn., USA, October 11-14, 2000.
9. S. Ourselin, A. Roche, G. Subsol, X. Pennec, and N. Ayache. Reconstructing a 3D Structure from Serial Histological Sections. Im. Vis. Comp., 19(1-2):25–31, 2001.
10. Sébastien Ourselin. Recalage d'images médicales par appariement de régions. Application à la construction d'atlas histologiques 3D. PhD thesis, Nice, France, Jan. 2002. http://www-sop.inria.fr/epidaure/BIBLIO/Author/OURSELIN-S.html.
11. S. Prima, S. Ourselin, and N. Ayache. Computation of the Mid-Sagittal Plane in 3D Brain Images. IEEE Transactions on Medical Imaging, in press.
12. Peter J. Rousseeuw and Annick M. Leroy. Robust Regression and Outlier Detection. Wiley Series in Probability and Mathematical Statistics, first edition, 1987.
13. J. Schenk, F. Jolesz, and P. Roemer. Superconducting open-configuration MR imaging system for image-guided therapy. Radiology, 195:805–814, 1995.
14. S.K. Warfield, M. Ferrant, X. Gallez, A. Nabavi, F.A. Jolesz, and R. Kikinis. Real-time biomechanical simulation of volumetric brain deformation for image guided neurosurgery. In SC 2000: High Performance Networking and Computing Conf., volume 230, pages 1–16, Dallas, USA, Nov 4–10, 2000. SPL Technical Report 188.
15. J. West et al. Comparison and evaluation of retrospective intermodality brain image registration techniques. J. of Comp. Assist. Tomography, 21:554–566, 1997.
3D Ultrasound System Using a Magneto-optic Hybrid Tracker for Augmented Reality Visualization in Laparoscopic Liver Surgery Masahiko Nakamoto1, Yoshinobu Sato1, Masaki Miyamoto1, Yoshikazu Nakajima1, Kozo Konishi2, Mitsuo Shimada3, Makoto Hashizume2, and Shinichi Tamura1
1 Division of Interdisciplinary Image Analysis, Osaka University Graduate School of Medicine
2 Department of Disaster and Emergency Medicine, Graduate School of Medical Science, Kyushu University
3 Department of Surgery II, Graduate School of Medical Science, Kyushu University
Abstract. We describe a three-dimensional ultrasound (3D-US) system suitable for laparoscopic surgery that uses a novel magneto-optic hybrid tracker configuration. Our aim is to integrate 3D-US into a laparoscopic AR system. A 5D miniature magnetic tracker is combined with a 6D optical tracker outside the body to perform 6D tracking of a flexible US probe tip in the abdominal cavity. The 6D tracking parameters at the tip are obtained by combining the 5D parameters at the tip inside the body, the 6D parameters at the US probe handle outside the body, and the restriction of the tip motion relative to the handle. The system was evaluated in comparison with a conventional 3D ultrasound system. Although the accuracy of the proposed system was somewhat inferior to that of the conventional one, both the accuracy and the sweet spot area were found to be acceptable for clinical use.
1
Introduction
The use of the laparoscope is becoming common as a minimally invasive procedure. However, its restricted views and lack of tactile sensation can limit the surgeon's proficiency as well as make his/her task more stressful. Since laparoscopic surgery is essentially a monitor-based procedure, monitor-based augmented reality (AR) visualization can be naturally integrated into the system so as to both enhance the surgeon's proficiency and reduce stress. For microscopic neurosurgery, which is also naturally combinable with monitor-based AR, AR systems that utilize preoperative CT or MR 3D data have been developed to enhance the surgeon's capability, especially in recognizing spatial relationships between tumors and vessels [1][2]. Unlike neurosurgery, in which rigid registration can be assumed, liver surgery requires nonrigid registration so as to be able to register preoperative CT or MR 3D data with the actual liver intraoperatively. However, accurate nonrigid registration between preoperative 3D data and the intraoperative liver is still considered difficult to achieve [3][4]. Currently, ultrasound is regarded as a useful intraoperative imaging modality that allows the surgeon to
recognize spatial relationships between tumors and vessels of the liver. In this paper, we describe the development of 3D ultrasound (3D-US) for the laparoscope with the aim of integrating 3D-US into an AR system for laparoscopic liver surgery. If we assume that liver motion is negligible between the 3D-US and laparoscopic image acquisitions, nonrigid registration is unnecessary. Such an assumption is clinically valid, since respiratory motion is controllable under anesthesia and breath-holding lasting as long as a minute is attainable without any problem. 3D-US has been used in AR systems, and has been shown to be effective for surgical guidance [5][6]. Unlike in conventional 3D-US, the tip of a laparoscope-compatible ultrasound probe can be flexibly moved inside the abdominal cavity. However, tracking the probe tip poses a particular challenge. Conventional magnetic trackers are too large to be inserted into the abdominal cavity without an additional incision, while optical trackers suffer from the line-of-sight constraint. To circumvent these problems, we employ a miniature magnetic tracker only 1 mm in diameter (Aurora; Northern Digital Inc., Waterloo, ON, Canada), which can be inserted into the abdomen without the need for an additional incision. Although this tracker is suitable for tracking flexible tools inside the body, it has two major restrictions:
– It measures only five degrees of freedom (5D).
– Its sweet spot, within which acceptable accuracy is attainable, is narrow.
To overcome these drawbacks, we employ a novel magneto-optic hybrid configuration [7] in which the reference frames of both the magnetic and optical trackers are registered, with the optical tracker being used to track the mobile field generator of the magnetic tracker. This configuration provides the following advantages:
– Six degrees of freedom (6D) for the position and orientation of the flexible probe tip are measurable by linking it with the optical tracker outside the abdomen.
– Since the laparoscope is rigid, it can be accurately and robustly tracked by the optical tracker outside the abdomen for the superimposition of laparoscopic images and 3D-US.
– The field generator can be arranged so that magnetic tracking is performed within (or near) the sweet spot, which effectively widens the sweet spot of the magnetic tracker.
We previously reported preliminary experimental results [8] for a laparoscopic 3D-US pilot system in which a 5D magnetic tracker was simulated using a conventional 6D magnetic tracker (Fastrak; Polhemus Inc., Colchester, VT) under the simplified assumption that the US probe tip motion relative to the US probe handle outside the abdomen has three degrees of freedom. In this paper, we describe a clinically applicable 3D-US system for laparoscopic AR visualization that utilizes the Aurora miniature magnetic tracker without the above simplified assumption in regard to the probe tip motion. We also evaluate the accuracy of the system.
2
Methods
2.1
Basic Formulation of Magneto-optic Hybrid Tracker
A 6D optical tracker is modeled as a system measuring the transformation T_{ot→or} from the optical tracker frame Σ_ot to the optical rigid-body frame Σ_or. A 6D magnetic tracker is modeled as a system measuring the transformation T_{mt→mr} from the magnetic tracker frame Σ_mt to the magnetic receiver frame Σ_mr. A 6D magneto-optic hybrid tracker centered at the optical tracker frame Σ_ot is modeled as a system measuring both the transformation T_{ot→or} from the optical tracker frame Σ_ot to the optical rigid-body frame Σ_or, and the transformation T_{ot→mr} from the optical tracker frame Σ_ot to the magnetic receiver frame Σ_mr. A special form of magneto-optic hybrid tracking to measure the transformation T_{ot→mr} is formulated as

T_{ot→mr} = T_{ot→or} T_{or→mt} T_{mt→mr},   (1)

where T_{or→mt} represents the transformation from the optical rigid-body frame Σ_or, which is fixed to the magnetic field generator, to the magnetic tracker frame Σ_mt [8]. Note that T_{or→mt} is a static transformation while T_{ot→or} and T_{mt→mr} change dynamically. If T_{or→mt} is known, the magnetic field generator can be placed anywhere so long as it can be optically tracked. We call the process of obtaining T_{or→mt} "magneto-optic calibration."
2.2
Basic Formulation of 3D Ultrasound
Freehand 3D ultrasound using a 6D tracker is modeled as a system measuring the 3D coordinates x_st in some tracker frame from the 2D coordinates x_us in a US image frame, which is formulated as

x_st = T_{st→sr} T_{sr→us} x_us,   (2)

where T_{st→sr} represents the transformation from some tracker frame Σ_st to some reference frame Σ_sr, and T_{sr→us} represents the transformation from some reference frame Σ_sr to a US image frame Σ_us. Σ_st corresponds to Σ_ot (optical tracker), Σ_mt (magnetic tracker), and so on. Σ_sr corresponds to Σ_or (optical rigid-body), Σ_mr (magnetic receiver), and so on. While T_{st→sr} is dynamic, T_{sr→us} is static. The process of obtaining T_{sr→us} is often called "3D ultrasound calibration" [6].
2.3
6D Tracking by Combining 5D Magnetic and Additional Optical Trackers
We consider a 3D-US system described as

x_ot = T_{ot→pt} T_{pt→us} x_us,   (3)

where T_{ot→pt} represents the transformation from Σ_ot to the probe tip "moving" frame Σ_pt (magnetic receiver attached), and T_{pt→us} represents the transformation from Σ_pt to Σ_us, which is obtained by preoperative 3D-US calibration.
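Equations (1)–(3) are compositions of 4×4 homogeneous transforms, so implementing them amounts to matrix products applied in the right order; the sketch below (our own, with ad-hoc types) shows Eq. (1) as code.

```cpp
#include <array>

using Mat4 = std::array<std::array<double, 4>, 4>;  // homogeneous transform

// Product of two 4x4 homogeneous transforms: (A * B) x = A (B x).
Mat4 mul(const Mat4& A, const Mat4& B) {
    Mat4 C{};
    for (int i = 0; i < 4; ++i)
        for (int j = 0; j < 4; ++j)
            for (int k = 0; k < 4; ++k)
                C[i][j] += A[i][k] * B[k][j];
    return C;
}

// Eq. (1): dynamic optical pose, static magneto-optic calibration, and
// dynamic magnetic pose chained into T_ot->mr.
Mat4 hybridPose(const Mat4& T_ot_or,    // measured by the optical tracker
                const Mat4& T_or_mt,    // static, from magneto-optic calibration
                const Mat4& T_mt_mr) {  // measured by the magnetic tracker
    return mul(mul(T_ot_or, T_or_mt), T_mt_mr);
}
```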
Fig. 1. Miniature magnetic tracker (Aurora; Northern Digital Inc.). (a) Field generator. (b) Miniature magnetic receiver 1 mm in diameter.
Fig. 2. Ultrasound probe for laparoscope (Aloka, Tokyo).
When the Aurora miniature magnetic tracker (Fig. 1) is used in magneto-optic hybrid tracking, only 5D parameters (2D rotation, 3D translation) are procured in T_{ot→pt}, the reason being that the 1-mm diameter rod-shaped Aurora magnetic receiver does not provide a rotation parameter around the rod axis. Since 3D parameters are provided with respect to translation, the objective is to obtain the rotation R_{ot→pt} using the available 2D rotational parameters and additional information. The ultrasound probe for the laparoscope (a 7.5-MHz intraoperative electronic linear probe; Aloka, Tokyo) is shown in Fig. 2. The probe tip inside the abdomen is flexible, but its motion is restricted. We define the probe tip "reference" frame Σ_pt0 to describe the restriction of the rotational motion. We describe the probe tip rotation using Pitch–Yaw–Roll angles in Σ_pt0 (Fig. 2). The main component of the tip rotation is the Pitch component θ, which is controllable using the dial attached to the probe handle located outside the abdomen. The Yaw component φ can occur due to external force, while the Roll angle can be assumed to be always zero. Thus, the tip rotation R_{pt0→pt} is described using Pitch θ and Yaw φ. By combining the above rotations, we have

R_{ot→pt} = R_{ot→ph} R_{ph→pt0} R_{pt0→pt},   (4)

where R_{ot→ph} is provided by an optical tracker and R_{ph→pt0} is assumed to be known from the preoperative calibration. Since only the z-axis direction is provided in R_{ot→pt}, and R_{pt0→pt} can be described by θ and φ, we have
Fig. 3. Positional and rotational errors of magnetic trackers: (a) Fastrak; (b) Aurora.
z_{ot→pt} = R_{ot→ph} R_{ph→pt0} R_{pt0→pt}(θ, φ) z_0,   (5)

where z_0 = (0, 0, 1)^T. Solving the above equation, we have θ and φ, and thus the 6D parameters of T_{ot→pt} are determined.
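In code, solving Eq. (5) reduces to rotating the measured tip z-axis back into Σ_pt0 and reading θ and φ off its components. The sketch below assumes the convention R_{pt0→pt}(θ, φ) = R_x(θ) R_y(φ), which is our assumption rather than a convention stated in the paper; under it the tip z-axis becomes (sin φ, −sin θ cos φ, cos θ cos φ), so both angles have closed-form solutions.

```cpp
#include <cmath>

struct Angles { double pitch, yaw; };  // theta, phi in radians

// v is the unit tip z-axis expressed in the tip reference frame, i.e.
// v = (R_ot->ph * R_ph->pt0)^T * z_ot->pt. Under the assumed convention
// R_pt0->pt = Rx(theta) * Ry(phi), we have
//   v = (sin phi, -sin theta cos phi, cos theta cos phi).
Angles solvePitchYaw(double vx, double vy, double vz) {
    const double phi = std::asin(vx);          // yaw (valid for |phi| < 90 deg)
    const double theta = std::atan2(-vy, vz);  // pitch
    return {theta, phi};
}
```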
2.4
Probe Tip Calibration
Probe tip calibration is the process of obtaining T_{ph→pt0}. Firstly, the US probe is put in an arbitrary initial position without external force, and T_{ot→pt} and T_{ot→ph} are obtained. Let t_{ot→pt0} and z_{ot→pt0} be the translational component and the z-axis direction of T_{ot→pt} measured at the initial position, respectively. Secondly, the Pitch component θ is controlled using the dial attached to the probe handle, and the tip positions are obtained at several values of θ. We define the y-axis direction y_{ot→pt0} as the normal direction of the plane fitted to these positions. The x-axis direction is obtained by x_{ot→pt0} = y_{ot→pt0} × z_{ot→pt0}. Thus, we have

T_{ot→pt0} = \begin{pmatrix} x_{ot→pt0} & y_{ot→pt0} & z_{ot→pt0} & t_{ot→pt0} \\ 0 & 0 & 0 & 1 \end{pmatrix},   (6)

where x_{ot→pt0}, y_{ot→pt0}, and z_{ot→pt0} are unit vectors. Finally, by combining T_{ot→ph}, we have

T_{ph→pt0} = T_{ot→ph}^{-1} T_{ot→pt0}.   (7)
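A minimal version of this construction, assuming only three tip positions sampled at different dial settings (the paper fits a plane to several), is sketched below with our own helper names.

```cpp
#include <cmath>

struct V3 { double x, y, z; };

V3 sub(const V3& a, const V3& b) { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
V3 cross(const V3& a, const V3& b) {
    return {a.y * b.z - a.z * b.y, a.z * b.x - a.x * b.z, a.x * b.y - a.y * b.x};
}
V3 normalize(const V3& a) {
    const double n = std::sqrt(a.x * a.x + a.y * a.y + a.z * a.z);
    return {a.x / n, a.y / n, a.z / n};
}

// y-axis of the probe tip reference frame: the normal of the plane swept
// by the tip as the Pitch dial is turned. Three positions are the minimal
// case; with x = y x z this completes the axes of T_ot->pt0 in Eq. (6).
V3 tipYAxis(const V3& p0, const V3& p1, const V3& p2) {
    return normalize(cross(sub(p1, p0), sub(p2, p0)));
}
```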
3
Experiments
Experiments were performed to evaluate the accuracy of the system. Polaris (Northern Digital Inc.) was employed as the optical tracker, and it was combined with either the miniature 5D magnetic tracker (Aurora) or the conventional 6D magnetic tracker (Fastrak) in order to compare the 3D-US accuracy of these two magneto-optic hybrid systems. Figure 3 shows the Fastrak and Aurora positional and rotational RMS errors, which arise only from the magnetic tracker itself, measured by a procedure based on the pivot method. The errors were plotted for different distances between the receiver and the origin of each magnetic tracker frame (the field generator).
Fig. 4. Laboratory experiments for accuracy evaluation. (a) Experimental set-up. (b) Ultrasound image of phantom object. Three pit depths are imaged.
The accuracy of Aurora was more affected by the distance from the field generator than that of Fastrak¹; that is, the sweet spot of Aurora was narrower. Figure 4 shows the experimental set-up (Fig. 4(a)) and a US image of a phantom in a water bath (Fig. 4(b)). The 3D positions at three phantom pit depths in the water bath were measured using the following methods:

1. the Polaris pen-probe digitizer (let x_Polaris be its measurements);
2. the proposed 3D-US system with the Aurora-based hybrid tracker described in Sections 2.3 and 2.4 (x_Aurora);
3. the conventional 3D-US system with the Fastrak-based hybrid tracker (x_Fastrak), in which the Fastrak 6D receiver was attached to the probe tip.

As noted earlier, unlike the proposed 3D-US system, the conventional 3D-US system needs an additional incision to be made in the abdomen for its clinical application. The accuracies of the proposed and conventional 3D-US systems were evaluated by regarding the measurements of the Polaris pen-probe digitizer as the gold standard; that is, the errors were defined as follows:

∆x_Aurora = |x_Polaris − x_Aurora|,   (8)
∆x_Fastrak = |x_Polaris − x_Fastrak|.   (9)
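For reference, the statistics reported below amount to the following computation over Eqs. (8)-(9). This is a minimal sketch, assuming the repeated measurements are stacked as (N, 3) arrays:

```python
import numpy as np

def rms_error(x_gold, x_sys):
    """Per-measurement Euclidean error against the Polaris pen-probe
    gold standard (Eqs. (8)-(9)), summarized as an RMS value."""
    d = np.linalg.norm(np.asarray(x_gold) - np.asarray(x_sys), axis=1)
    return float(np.sqrt(np.mean(d ** 2)))
```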
Two sizes of rigid body, each triangular in shape, were used for the optical tracking of the field generator, with Polaris markers attached at the vertices. The larger rigid body had an arc of 180 mm, while that of the smaller one was 120 mm. 3D-US images were acquired for nine different arrangements of the field generators of both magnetic trackers relative to the Polaris camera frame Σ_ot. The 2D positions, x_us, of the three phantom pits were acquired by manual specification. One hundred measurements of their 3D positions were obtained for each arrangement.
¹ Aurora is still under development. Northern Digital Inc. has just announced that the accuracy of Aurora will be improved.
Fig. 5. Positional errors of the 3D ultrasound system: (a) Fastrak; (b) Aurora. The effects of the distance between the receiver and the origin of each magnetic tracker frame are shown.
Figure 5 shows the RMS errors of the positions estimated by the 3D-US systems for different distances between the receiver and the origin of each magnetic tracker frame. In the conventional (Fastrak-based) system, the error was within 1.5 mm for all three phantom pit depths when the distance from the tracker origin was within 50 cm. In the proposed (Aurora-based) system, the error was within 2.0 mm for 50 and 25 mm depths when the distance from the tracker origin was within 30 cm. In the US images, the effect of depth was thus significant in the proposed method while there was little effect in the conventional system. In both the Fastrak- and Aurora-based systems, the error difference due to the rigid body size for the optical tracking of the field generator was only 0.1 – 0.2 mm between the large and small rigid bodies (results not shown).
4 Discussion and Conclusions
We have described a 3D-US system for laparoscopic surgery using a miniature 5D magnetic tracker combined with an optical tracker to realize 6D tracking inside the abdomen, intended for integration into a laparoscopic AR system. Although the accuracy of the proposed system was somewhat inferior to that of the conventional one, it was acceptable within a 30-cm radius field of view. Since the field generator of the magnetic tracker is mobile inside the field of view of the optical tracker, the sweet spot can be effectively widened. One potential criticism of magneto-optic hybridization is the possibility of error propagation arising from combining two trackers. We confirmed both by simulations and laboratory experiments that the increase in error introduced by magneto-optic hybridization was 0.1 - 0.2 mm for an appropriate rigid body size, as compared with using only a magnetic tracker. Considering that the field generator is mobile, better average accuracy is, in practice, attainable by magneto-optic hybridization. In our accuracy evaluations, the difference in accuracy between the proposed and conventional 3D-US systems was closely related to the RMS errors of the magnetic trackers themselves. Hence, the accuracy of the proposed system is expected to benefit from improvements in the accuracy of Aurora itself. However, a significant difference was observed in depth dependence in the US images; the dependence was relatively large in the proposed system
but negligible in the conventional system. This is considered to be due to the inaccuracy of the 1D rotation parameter, which is not measured by Aurora, since an error in this rotation results in a proportionate 3D-US depth error. Future work will include analyzing the error factors in the calibration process described in Section 2.4 and improving the calibration method based on this analysis. In this paper, we employed two magnetic trackers, Fastrak and Aurora. An alternative choice is miniBird (Ascension Tech Corp., Burlington, VT), a 6D magnetic tracker with a 5 × 5 × 10 mm³ receiver. Although the miniBird receiver is smaller than that of Fastrak, Aurora is superior in terms of receiver size (as well as cable thickness). However, considering the trade-off between size and accuracy, we are planning to evaluate a miniBird-based 3D-US system. Our proposed 3D-US system has already been integrated into an AR configuration that superimposes 3D-US renderings onto laparoscopic images, and its clinical feasibility in laparoscopic surgery has been tested [9]. The laparoscope, which is a rigid endoscope, is optically tracked, while 3D-US images are obtained using the proposed system. The 3D-US images are superimposed onto those of the laparoscope using the method described in [6]. The AR configuration incorporating the proposed 3D-US system was confirmed to function successfully in the operating room environment [9].
Acknowledgements

This work was partly supported by JSPS Research for the Future Program JSPS-RFTF99I00903 and JSPS Grant-in-Aid for Scientific Research (B)(2) 12558033.
References

1. P. J. Edwards, et al. Design and Evaluation of a System for Microscope-Assisted Guided Interventions (MAGI). IEEE Trans. Med. Imaging, 19(11):1082–1093, 2000.
2. Y. Akatsuka, et al. AR Navigation System for Neurosurgery. Lecture Notes in Computer Science, 1935 (MICCAI2000):833–838, 2000.
3. A. J. Herline, et al. Surface Registration for Use in Interactive Image-Guided Liver Surgery. Lecture Notes in Computer Science, 1679 (MICCAI'99):892–899, 1999.
4. Y. Masutani, et al. Modally Controlled Free Form Deformation for Non-rigid Registration in Image-Guided Liver Surgery. Lecture Notes in Computer Science, 2208 (MICCAI2001):1275–1278, 2001.
5. H. Fuchs, et al. Towards Performing Ultrasound-Guided Needle Biopsies from within a Head-Mounted Display. Lecture Notes in Computer Science, 1131 (VBC'96):591–600, 1996.
6. Y. Sato, et al. Image Guidance of Breast Cancer Surgery Using 3-D Ultrasound Images and Augmented Reality Visualization. IEEE Trans. Med. Imaging, 17(5):681–693, 1998.
7. M. Nakamoto, et al. Magneto-Optic Hybrid 3-D Sensor for Surgical Navigation. Lecture Notes in Computer Science, 1935 (MICCAI2000):839–848, 2000.
8. Y. Sato, et al. 3D Ultrasound Image Acquisition Using a Magneto-optic Hybrid Sensor for Laparoscopic Surgery. Lecture Notes in Computer Science, 2208 (MICCAI2001):1151–1153, 2001.
9. K. Konishi, et al. Development of AR Navigation System for Laparoscopic Surgery Using Magneto-optic Hybrid Sensor: Experiences with 3 Cases. CARS2002, 2002, in press.
Interactive Intra-operative 3D Ultrasound Reconstruction and Visualization

David G. Gobbi¹,² and Terry M. Peters¹,²

¹ Imaging Research Laboratories, John P. Robarts Research Institute, University of Western Ontario, London ON N6A 5K8, Canada
[email protected], [email protected]
http://www.imaging.robarts.ca/Faculty/peters.html
² Department of Medical Biophysics, University of Western Ontario, London ON N6A 5C1, Canada
Abstract. The most attractive feature of 2D B-mode ultrasound for intra-operative use is that it is both a real-time and a highly interactive modality. Most 3D freehand reconstruction methods, however, are not fully interactive because they do not allow the display of any part of the 3D ultrasound image until all data collection and reconstruction is finished. We describe a technique whereby the 3D reconstruction occurs in real time as the data is acquired, and where the operator can view the progress of the reconstruction on three orthogonal slice views through the ultrasound volume. Capture of the ultrasound data can be immediately followed by a straightforward, interactive nonlinear registration of a pre-operative MRI volume to match the intra-operative ultrasound. We demonstrate our system on a deformable, multi-modal PVA-cryogel phantom and during a clinical surgery.
1 Introduction
The intra-operative use of 3D ultrasound during neurosurgical procedures is recent and still very rare. While there are significant advantages to using 3D ultrasound over 2D ultrasound, many technical hurdles remain with respect to optimization of the instrumentation and software. A significant amount of development in this area has been performed by SINTEF in Norway [1,2], leading to a commercial product for neuronavigation with 3D ultrasound. The two specific developments in this field that we are investigating are real-time reconstruction of the 3D ultrasound volumes and the use of multi-modal nonlinear registration to warp pre-operative MR images to match intra-operative ultrasound. The non-linear registration of MRI to ultrasound is of particular interest because it allows the pre-operative MRI images (along with any contrast enhancement or functional information) to be used for accurate surgical guidance, even in the presence of the significant degree of brain shift [3,4] that is common in most craniotomy procedures. Our primary focus in this work is to demonstrate how real-time 3D ultrasound reconstruction can be performed on standard computer hardware, and
furthermore how the reconstruction can be integrated into a 3D visualization package to allow visualization of arbitrary slices through the 3D image while data collection is taking place.
2 Materials and Methods
Our software platform uses VTK (http://www.kitware.com/vtk.html) [5] for 3D rendering and a high-level application framework written in Python for the user interface (http://www.atamai.com). We have contributed most of the computational C++ code that we developed for this application to the VTK library, including our code for performing nonlinear transformations on images [6]. For image acquisition, we use an Aloka SSD-1700 ultrasound scanner with a 5.0 MHz curved-array neuro ultrasound probe (Aloka Co., Ltd., Tokyo, Japan). The probe is fitted with a set of infrared LEDs that are tracked by a POLARIS optical measurement system (Northern Digital Inc., http://www.ndigital.com). The 3D image reconstruction and visualization is performed on a dual-CPU 933 MHz Pentium III workstation in real time as the B-mode ultrasound video and POLARIS tracking measurements are acquired. The C++ reconstruction module is integrated into the VTK framework to allow for flexible analysis and visualization of the reconstructed image.
2.1 Reconstruction
Our reconstruction method relies on a splatting technique for high-quality interpolation, where each pixel of a B-mode image is smeared into an N×M×O kernel which is then either compounded or 'alpha-blended' into the 3D reconstruction volume at the appropriate (x, y, z) location. The 2D B-mode images are splatted one by one as they are captured to provide real-time reconstruction. Compounding requires the use of an accumulation buffer which is the same size (i.e. same number of voxels) as the reconstruction volume. For each pixel I_pixel in the B-mode images, the splat kernel coefficients b_k are calculated, and then both the intensity values I_k^voxel for the voxels in the reconstruction volume that are touched by the splat and the corresponding values a_k in the accumulation buffer are updated as follows:

I_k^voxel := (b_k I_pixel + a_k I_k^voxel) / (b_k + a_k),   (1)
a_k := b_k + a_k.   (2)

Our second method, which we refer to as 'alpha blending' because it uses the same equation that is used for image compositing via alpha blending, provides interpolation without the use of an accumulation buffer:

I_k^voxel := b_k I_pixel + (1 − b_k) I_k^voxel.   (3)
An additional heuristic is required here: the first time a voxel I_k^voxel is hit by a splat, I_k^voxel := I_pixel. Otherwise the initial voxel value of I_k^voxel = 0 would be blended into the final voxel value.
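Update rules (1)-(3) translate directly into per-splat voxel updates. The sketch below is ours, not the authors' C++ module; it assumes the touched voxel indices and kernel coefficients have already been computed for the current pixel, and the array layout and names are our own.

```python
import numpy as np

def splat_compound(volume, accum, idx, b, I_pixel):
    """Compounded splat, Eqs. (1)-(2). idx: indices of the voxels touched
    by the kernel; b: kernel coefficients b_k; accum: accumulation buffer.
    A first hit (accum == 0) naturally reduces to I_pixel."""
    a = accum[idx]
    volume[idx] = (b * I_pixel + a * volume[idx]) / (b + a)
    accum[idx] = a + b

def splat_alpha_blend(volume, hit, idx, b, I_pixel):
    """Alpha-blended splat, Eq. (3), with the first-hit heuristic:
    a voxel hit for the first time is set directly to I_pixel."""
    first = ~hit[idx]
    volume[idx] = np.where(first, I_pixel, b * I_pixel + (1 - b) * volume[idx])
    hit[idx] = True
```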
In this method each new splat tends to obscure previous splats which hit the same voxels, whereas compounding provides an optimal averaging of the new splat with the previous splats. In the case where the splat kernel is a single point (b_0 = 1), these equations simplify respectively to those for pixel nearest neighbor (PNN) interpolation with compounding and PNN without compounding [7]. Of greater interest is a 2×2×2 kernel where the b_k are trilinear interpolation coefficients, which corresponds to a splat that is a boxcar function with the same dimensions as a voxel of the reconstruction volume. We refer to reconstruction with this boxcar splat as pixel trilinear (PTL) interpolation, and apply it either with or without compounding. It provides higher quality reconstruction than PNN but is still fast enough to be used for real-time reconstruction. In order to ensure that there are no gaps in the 3D image after reconstruction, the separation between B-mode ultrasound images must be less than the kernel width. This means a separation of less than 1 voxel width for PNN, and less than 2 voxel widths for PTL. In practice we found that maintaining a separation of approximately 1 voxel width (typically 0.5 mm) was not difficult to achieve after a short period of practice. Note that the use of large kernels and radially symmetric splats with compounded reconstruction is equivalent to the radial basis function (RBF) [8] interpolation technique, which is very computationally intensive but produces very high quality reconstructions.
2.2 Data Flow Requirements for Real-Time Reconstruction
Among the significant problems associated with providing interactive visualization of the ultrasound volume during its reconstruction is the fact that the acquisition of tracking information from the POLARIS, the digitization of the ultrasound video stream, and the reconstruction of the 3D ultrasound volume all have to take place simultaneously in the background while the user interface for the application must continue to operate smoothly. We did not want the application interface to freeze while the reconstruction was taking place, but instead required that the user be able to adjust the view of the 3D ultrasound volume while the reconstruction progressed. This interactivity is maintained by breaking the application into several threads that run in parallel, and by ensuring proper synchronization of data transfers between the threads. In total there are five threads that perform the following tasks (Fig. 1): (a) One thread waits for tracking information to arrive from the POLARIS via the serial port, converts the information to a 4×4 matrix, and places the matrix onto a pre-allocated stack along with a timestamp. A spinlock (a “mutex lock” in modern programming terms) is used to ensure that both the tracking thread and the reconstruction thread can safely access the transformation stack without any possibility of data corruption.
Fig. 1. Interactive 3D freehand ultrasound reconstruction/visualization.
(b) A second thread moves each captured ultrasound video frame onto a stack along with a timestamp that describes when the video frame was digitized. Capture of the video at the full rate of 30 fps is guaranteed by the use of hardware interrupts and buffering within the video driver.

(c) Two threads insert the most recent ultrasound video frame into the ultrasound reconstruction volume. Each of the two threads runs on one of the two processors in the computer: one thread performs the splats for the top half of the video frame and the other thread performs the splats for the bottom half. The rate N at which video frames can be inserted into the reconstruction volume depends on the available processor speed and on the interpolation method that is used. The position and orientation of the ultrasound probe for each video frame are interpolated from the stack of timestamped matrices from the tracking system, taking the lag of the video information relative to the tracking information into account.

(d) The main application thread displays the partially reconstructed 3D ultrasound volume on the computer screen. The rate M at which the 3D view is refreshed is generally set to between 3 and 10 refreshes per second (lower than or equal to the rate N at which new video frames are inserted into the reconstruction volume).

Even if the reconstruction rate N is less than the 30 fps of the video feed, the video frames and the POLARIS tracking information are buffered at 30 fps and up to 60 measurements per second, respectively. Once a freehand data-acquisition sweep has been completed, all of the video frames that were not previously used in the reconstruction can be compounded into the volume in order to further improve the final image quality.
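Step (c) implies estimating a pose between two timestamped tracker samples for each video frame. The paper does not give its interpolation scheme; a minimal sketch of one plausible version, assuming linear blending of neighboring matrices with SVD re-orthonormalization of the rotation, could look like this:

```python
import numpy as np

def pose_at(stamps, mats, t_frame, video_lag):
    """Estimate the probe pose at the instant a video frame was digitized.
    stamps: sorted tracker timestamps; mats: matching 4x4 matrices;
    video_lag: measured lag of the video relative to the tracking data."""
    t = t_frame - video_lag                       # align the two clocks
    i = int(np.clip(np.searchsorted(stamps, t), 1, len(stamps) - 1))
    w = (t - stamps[i - 1]) / (stamps[i] - stamps[i - 1])
    M = (1.0 - w) * mats[i - 1] + w * mats[i]     # blend neighboring poses
    U, _, Vt = np.linalg.svd(M[:3, :3])           # re-orthonormalize rotation
    M[:3, :3] = U @ Vt
    return M
```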
2.3 Image Warping
Once the 3D ultrasound volume has been acquired, a manual nonlinear 3D registration can be performed to warp the 3D MRI to match the ultrasound volume. This registration is achieved via a 3D thin-plate spline, where the operator can drop a control point for the spline at any location in 3D space and, by then dragging the point to a new position with the mouse, interactively warp the MRI volume to match it to the ultrasound volume. We have previously reported on this technique [9,10] and will not cover it further here.
2.4 Validation with Deformable Phantoms
For validation of our methods, we have constructed a deformable PVA-cryogel [11] phantom that matches one of our 3D MRI anatomical data sets (see Fig. 2). The phantom is very soft, and we have enclosed it within a thermo-plastic shell that mimics the physical constraint provided by the cranium. A hole was left in the shell to provide access for an ultrasound probe. This phantom is suitable for imaging with MRI, CT or ultrasound. The re-usable mould for the phantom was created via stereo laser lithography from a segmented MRI data set. The only significant difference between the mould and the cortical surface segmented from the data set is that the depths of some of the sulci were reduced before the mould was produced. This was necessary to ensure that the phantom could be removed from the mould after it was produced.
3 Results
Three slices through a reconstructed 3D ultrasound image are shown in Fig. 3. The reconstruction was done using our PTL compounded interpolation method from B-mode data cropped from video frames captured at 320×240 resolution with 0.4 mm pixels. The reconstruction volume size was 256×193×256 and the voxel size was 0.4 mm. During the reconstruction, the display update rate for viewing the reconstructed volume (which is independent of the reconstruction rate) was limited to 5 updates per second due to the high computational load on the computer. Magnified images of slices perpendicular to the original 2D B-mode scans are shown in Fig. 4 for each interpolation mode. The reconstructions were all done in real time, and each image represents a different set of scan data. Because the POLARIS system provides a tracking accuracy of better than 0.4 mm (i.e. less than the voxel size), there are no visible dislocation artifacts from jitter of the measured ultrasound probe position. The noise in the images is noticeably reduced for PTL interpolated reconstruction relative to PNN reconstruction. The use of compounding also reduces the noise, but it must be noted that this effect depends on how slowly the operator moves the ultrasound probe during the scan.
Fig. 2. The deformable PVA-cryogel phantom (left) and the mould (right) that was generated from a high-quality MRI scan of an actual brain.
Fig. 3. Phantom image reconstructed by PTL via compounding, sliced in three orthogonal orientations. The maximum ultrasound scan depth was 8 cm, and one freehand scan sweep was performed to acquire the data. The phantom features that can be seen are the shallow ‘central sulcus’ that partially separates the two ‘hemispheres’ (upper left in first image) and thick strands of strongly echogenic PVA that are placed throughout the phantom (seen both lengthwise and in cross section).
Fig. 4. From left to right: PNN, PNN with compounding, PTL via alpha blending, PTL via compounding. The noise in the image decreases as a result of both interpolation and compounding.
Fig. 5. A clinical 3D ultrasound image that we acquired through a craniotomy just prior to the removal of a small subdural tumor (seen at the top of the image). The image on the right includes an MRI image that has been linearly registered to the ultrasound image.
The use of compounding is not expected to have any impact on quality unless the scan spacing is less than the voxel size, or unless the operator scans through the volume multiple times. The reconstruction rate (i.e. the number of video frames splatted into the reconstruction volume per second) was the maximum possible value of 30 s⁻¹ for the two PNN methods, 20 s⁻¹ for PTL via alpha blending, and 12 s⁻¹ for PTL via compounding. This is a very encouraging result, because 30 s⁻¹ should be possible for all methods with a modern 2.2 GHz dual-CPU PC system. We have used our system clinically for a transdural scan prior to a tumor excision. For this case the craniotomy was very small and no brain shift was noted. The image was captured and reconstructed within our in-house surgical guidance software package, and both the ultrasound and a pre-operative MRI were used for guidance during the surgical procedure.
4 Conclusion
We have demonstrated a technique for providing high-quality 3D freehand ultrasound reconstructions in real time on conventional PC hardware. This reconstruction technique is optimally suited for intra-operative use because the 3D image can be viewed before the scan is completed, and the operator can re-scan areas that were poorly covered on the first pass. The ultrasound volume can immediately be viewed as a multi-modal overlay on the pre-operative 3D MRI, and furthermore a manual nonlinear registration of the MRI to the ultrasound can be performed.
Acknowledgments

The PVA-cryogel phantom used for this project was constructed by Kathleen Surry at The Robarts Research Institute. Our surgical navigation system was developed as a shared effort between the first author, Kirk Finnis, and Dr. Yves Starreveld of the London Health Sciences Centre. Dr. Starreveld also performed the intra-operative ultrasound scan, under the supervision of Dr. Wai Ng. Funding for this research is provided through grants from the Canadian Institutes for Health Research and the Institute for Robotics and Intelligent Systems.
References

1. G. Unsgaard, S. Ommedal, T. Muller, A. Gronningsaeter and T.A. Nagelhus Hernes. Neuronavigation by Intraoperative Three-dimensional Ultrasound: Initial Experience during Brain Tumor Resection. Neurosurgery 50:804–812, 2002.
2. A. Gronningsaeter, A. Kleven, S. Ommedal, T.E. Aerseth, T. Lie, F. Lindseth, T. Langø and G. Unsgård. SonoWand, an Ultrasound-based Neuronavigation System. Neurosurgery 47:1373–1380, 2000.
3. D. L. G. Hill, C. R. Maurer Jr., R. J. Maciunas, J. A. Barwise, J. M. Fitzpatrick and M. Y. Wang. Measurement of Intraoperative Brain Surface Deformation under a Craniotomy. Neurosurgery 43:514–528, 1998.
4. D. W. Roberts, A. Hartov, F. E. Kennedy, M. I. Miga and K. D. Paulsen. Intraoperative Brain Shift and Deformation: A Quantitative Analysis of Cortical Displacement in 28 Cases. Neurosurgery 43:749–760, 1998.
5. W. Schroeder, K. W. Martin and W. Lorensen. The Visualization Toolkit, 2nd Edition. Prentice Hall, Toronto, 1998.
6. D. G. Gobbi and T. M. Peters. Generalized 3D nonlinear transformations for medical imaging: An object-oriented implementation in VTK. Computerized Medical Imaging and Graphics, accepted for publication, 2001.
7. R. Rohling, A. Gee, and L. Berman. A Comparison of Freehand Three-dimensional Ultrasound Reconstruction Techniques. Medical Image Analysis 3:339–359.
8. R. Rohling, A. Gee, L. Berman and G. Treece. Radial Basis Function Interpolation for Freehand 3D Ultrasound. Information Processing in Medical Imaging - IPMI'99. A. Kuba, M. Šámal and A. Todd-Pokropek (eds), Lecture Notes in Computer Science 1613:478–483, Springer-Verlag, Berlin, 1999.
9. D. G. Gobbi, B. K. H. Lee, and T. M. Peters. Correlation of pre-operative MRI and intra-operative 3D ultrasound to measure brain tissue shift. Proceedings of SPIE 4319:264–271, 2001.
10. D. G. Gobbi, R. M. Comeau and T. M. Peters. Ultrasound/MRI Overlay With Image Warping for Neurosurgery. Medical Image Computing and Computer Assisted Intervention - MICCAI 2000. Scott L. Delp, Anthony M. DiGioia, Branislav Jaramaz (eds), Lecture Notes in Computer Science 1935:106–113, Springer-Verlag, Berlin, 2000.
11. I. Mano, H. Goshima, M. Nambu and I. Masahiro. New Polyvinyl Alcohol Gel Material for MRI Phantoms. Magnetic Resonance in Medicine 3:921–926, 1986.
Projection Profile Matching for Intraoperative MRI Registration Embedded in MR Imaging Sequence

Nobuhiko Hata, Junichi Tokuda, Shigeo Morikawa, and Takeyoshi Dohi

Graduate School of Information Science and Technology, The University of Tokyo
Molecular Neuroscience Research Center, Shiga University of Medical Science
{noby,junichi,dohi}@atre.t.u-tokyo.ac.jp
[email protected]
Abstract. Fast image registration for magnetic resonance image (MRI)-guided surgery using projection profile matching embedded in the MR pulse sequence is proposed. The method can perform two-dimensional image registration by matching projection profiles acquired with zero-degree phase encoding. The matching process continuously measures displacement by optimizing a cross-correlation value over profiles acquired 64 times, via a special pulse excitation and echo acquisition, in one imaging cycle. A phantom experiment concluded that the method can perform the registration in 25 ms with an accuracy of 0.50 mm over a 100 mm field of view. The paper also includes an in-vivo experiment registering MR images of an arm in motion. Unlike previously reported image registration by post-processing, the method is suitable in intraoperative settings where fast registration is in great demand.
Introduction

The role of intraoperative Magnetic Resonance Image (MRI)-guided surgery is becoming increasingly important, reflecting the trend toward minimally invasive therapy [1-3]. MRI is suitable for intraoperative use for its oblique imaging capabilities, soft tissue discrimination, and detailed delineation of vascular structures. Additionally, MR imaging has unique potential capabilities for functional, thermal, and physiologic imaging. Despite its obvious advantages in surgical guidance and monitoring, intraoperative MRI has some limitations in image quality in comparison to conventional diagnostic high-tesla MR, since the requirements of interventional use have some impact on intraoperative MR imaging capability. An example of such a requirement is the use of surface coils for surgical access, which causes inhomogeneous sensitivity in intraoperative images. One solution to overcome these limitations in intraoperative MRI is to fuse pre-operative MRI and/or other multi-modality images to provide complementary information [4-8]. Fusion of pre-operative images and intraoperative MRI is possible by registering skin markers [8,9], by intensity-based image matching [4,7], or by taking pre-operative MRI scans just after anesthesia, assuming that the patient does not move thereafter [6,10]. While these previously published registration methods have indicated the significance of intra-operative multi-modal registration, they include
post-processing of intra-operative images or cumbersome marker setting, and thus inherently require a few minutes or more to achieve a registration. Though the other methods [5,6] do not need procedures for registration, they rely on the assumption that the patient or cured lesion does not move throughout the case, which is specific to their targeted clinical applications. The novel image fusion method proposed in this paper performs fast registration on the order of 10 to 100 ms by embedding the image registration process in the MR imaging sequence. The method matches the projection profiles taken before and after the motion, along the phase and frequency encoding directions, by optimizing the cross correlation. The newly proposed method is significant from an engineering perspective since it is, to the authors' knowledge, the first published proposal to include image registration in MR image acquisition, thus notably reducing the computing time. The paper is clinically significant since the newly proposed method, when implemented in a clinical intraoperative imager, will enable physicians' real-time access to preoperative multi-modal images, thus achieving enhanced and detailed assessment of the cured region. The objective of this paper is to present the principle of profile matching and its ease of inclusion in an MR imaging sequence. The method is then evaluated by a phantom study to assess its accuracy and performance, and by in-vivo experiments to investigate its clinical feasibility.
Materials and Methods

MR Projection Profile

In two-dimensional Fourier encoding, the signal received by the Radio Frequency (RF) coil is given by:
S(t) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} M(x, y) exp{−i(γ G_x x t + γ G_y y t_y)} dx dy,   (1)
where M(x, y) is the distribution of magnetization, γ is the gyromagnetic ratio, and G_x, G_y are the magnetic field gradients. Consider the case where the RF coil receives an echo without phase encoding (G_y = 0). The received signal is
S(t) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} M(x, y) exp(−i γ G_x x t) dx dy.   (2)
For notational convenience, we can rewrite this equation as
S(k_x) = ∫_{−∞}^{∞} ∫_{−∞}^{∞} M(x, y) exp(−i k_x x) dx dy,   (3)
where k_x = γ G_x t. By applying the Fourier transform to equation (3), we obtain the projection profile along the x-axis:

p(x) = (1/2π) ∫_{−∞}^{∞} S(k_x) e^{i k_x x} dk_x = ∫_{−∞}^{∞} M(x, y) dy.   (4)

Similarly, the projection onto the y-axis can be obtained by acquiring an echo with G_x = 0, i.e. by exchanging the roles of the two gradients.
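In discrete form, Eq. (4) says that a one-dimensional FFT of the sampled non-phase-encoded echo yields the projection profile. A sketch (our own, with NumPy, the usual fftshift sample ordering assumed, and the magnitude taken to discard residual phase):

```python
import numpy as np

def projection_profile(echo):
    """Compute p(x) from a complex, non-phase-encoded echo S(k_x), Eq. (4).
    The magnitude is taken so that residual phase does not affect matching."""
    return np.abs(np.fft.fftshift(np.fft.fft(np.fft.ifftshift(np.asarray(echo)))))
```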
Profile Matching by Cross Correlation

Under the condition of rigid body motion, a two-dimensional translation of the subject causes a one-dimensional translation of the projection profiles along each axis. The shape of a projection profile is not deformed by the translation. This means that the displacement of the subject can be quantified by matching projections before and after the motion. Now we are given two projection profiles: p_0(x) and p_n(x). p_0(x) is the baseline projection profile along the x-axis, and p_n(x) is the n-th projection profile along the x-axis. Estimating the x-directional translation of the subject between the baseline and the n-th profile acquisition as ∆x, p_n(x) is almost identical to p_0(x + ∆x) when the parameter ∆x takes its actual value. Thus we can determine the translation parameter ∆x by maximizing the similarity between p_n(x) and p_0(x + ∆x). We used cross correlation as this similarity measure. The problem is denoted as follows:

∆x = arg max C(p_0(x + ∆x), p_n(x)),   (5)

where C(p_0(x + ∆x), p_n(x)) is the cross correlation between p_0(x + ∆x) and p_n(x). The method can be applied to the other direction by exchanging the directions of phase encoding and frequency encoding.
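For equally sampled profiles, Eq. (5) reduces to locating the peak of a discrete cross correlation. A minimal sketch follows (integer-sample resolution; the sign convention and normalization are our choices, and sub-sample accuracy would need peak interpolation):

```python
import numpy as np

def estimate_shift_mm(p0, pn, fov_mm):
    """Solve Eq. (5) on sampled profiles: return the shift maximizing the
    normalized cross correlation, converted from samples to millimetres."""
    a = (p0 - p0.mean()) / (p0.std() + 1e-12)
    b = (pn - pn.mean()) / (pn.std() + 1e-12)
    c = np.correlate(a, b, mode="full")        # all integer lags
    shift_samples = int(np.argmax(c)) - (len(pn) - 1)
    return shift_samples * fov_mm / len(p0)    # e.g. 100 mm / 256 samples
```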
Experiments

The method was evaluated by two sets of experiments: first, the combined MR imaging sequence for imaging and matching was implemented on a 2.0 T experimental MRI scanner to compare the matching results with a gold standard measured by an external displacement sensor. In the second set of experiments, an arm was tracked by the newly proposed registration method to assess its feasibility in a clinical setting. In the phantom feasibility study, the goal of the experiment was to evaluate the accuracy of the proposed method on a 2.0 Tesla experimental scanner (CSI Omega System, Bruker, Fremont, CA). A specially designed placement stage was installed in the scanner's bore. The movement of the stage was restricted to the frequency-encoded direction for imaging. The moving phantom was fixed on the stage (Fig. 1). The pulse sequence was a modified gradient echo (TR/TE: 40 ms/7 ms, field of view (FOV) 100 mm, matrix 256×128), which is close to the semi-real-time imaging sequence used in intraoperative MRI. In this pulse sequence, a non-phase-encoded pulse excitation and echo acquisition was inserted after every other echo acquisition for imaging. A total of 64 projection profiles per axis were obtained during one imaging cycle (approx. 10 seconds). The stage was moved manually up to 30-40 mm while scanning, and was tracked by a Charge-Coupled Device (CCD) laser displacement sensor (LK-500, Keyence, Osaka, Japan) with a resolution of 50 µm and a sampling cycle of 1024 µs. The scanner and the laser displacement sensor were synchronized by a shared pulse generator. We also performed an in-vivo study by imaging axial images of the arm of a volunteer (male, 22 years old). The arm was continuously moved along the horizontal and vertical
directions in the scanner's bore at approximately 20 to 30 mm per second. The imaging sequence and scanner were the same as those used in the phantom study. In both the phantom and in-vivo experiments, the projection matching process was executed retrospectively on a workstation (SunBlade 1000, Sun Microsystems, Palo Alto, CA) after all data acquisition had been completed. All matching programs were developed in MATLAB (The MathWorks, Natick, MA).
Fig. 1. A schematic presentation of the configuration of the phantom experiment (top) and the experimental setup in the 2.0 T experimental MR scanner (bottom). The stage was tracked by the laser CCD displacement sensor to measure the actual displacement of the agar phantom, ∆x_d, which was compared to ∆x_p computed by the proposed method.
Results

Eighteen images (64 projection profiles per image for each direction) were acquired for each experiment, and 1140 profile matchings were successfully completed. The standard deviation of the error between the phantom displacement computed from the projection data and the value measured by the CCD laser displacement sensor was 0.3 mm over a 100 mm FOV. Optimization of each projection profile was completed in 36 ms (CPU time) on average. From the in-vivo arm study, we could confirm the re-alignment of the displaced arm to its original position. Figure 2 shows a representative example from the experiment.
Fig. 2. Results from the in-vivo arm scan using the 2.0 T experimental MRI scanner. (a) is an example of a non-registered image, and (b) shows image (a) registered to the reference image. (c) and (d) are the subtractions of image (a) and image (b) from the reference image, respectively. Note that the registered image (b) matches the baseline image well, generating less mismatched area in the subtraction image (d).
Discussions

The objective results from the phantom study were satisfactory, especially the computing time of 36 ms per profile matching. This indicates that, if the matching and image acquisition can be performed in parallel and the TR value in MR imaging is small enough, the matching adds only nominal time to normal imaging. Our experience, not presented in this paper, indicates that the accuracy of matching is affected by the signal-to-noise ratio (SNR). The accuracy is also affected by the inhomogeneity of coil sensitivity: since coil sensitivity is highest at the center of the coil, the larger the displacement of the phantom becomes, the weaker the signal from the phantom is. The matching error reported in this paper (0.3 mm over a 100 mm FOV) is presumably smaller than what can be expected in a clinical low-tesla intraoperative MRI scanner. Our method is based on the assumption that the subject is rigid and not deformable. Therefore, the current implementation of the method does not function well when a part of an anatomical structure in an image moves while the rest is still.
A possible modification to the method to solve this issue is to segment out the moving anatomy from a baseline image and take the profiles from the segmented part. The projection profiles of the moving object during scans would then be calculated by subtracting the baseline profile of the static object from each projection profile.
Conclusion

We have demonstrated a method for fast registration embedded in an MR pulse sequence. The method performed successfully in a phantom study with sufficient accuracy and speed. The method can also be applied to 3-dimensional Fast Fourier Transform (FFT) imaging to estimate the 3-dimensional motion of the subject.
Acknowledgements

This study was funded by Grant-in-Aid for Scientific Research (A-14702070 and B-13558103). NH was supported by the Suzuken Memorial Foundation, Kurata Grants, and the Toyota Physical & Chemical Research Institute.
References 1. Ladd, M. E., Quick, H. H. & Debatin, J. F. Interventional MRA and intravascular imaging. J Magn Reson Imaging 12, 534-46. (2000). 2. Quesson, B., de Zwart, J. A. & Moonen, C. T. Magnetic resonance temperature imaging for guidance of thermotherapy. J Magn Reson Imaging 12, 525-33. (2000). 3. Lewin, J. S., Metzger, A. & Selman, W. R. Intraoperative magnetic resonance image guidance in neurosurgery. J Magn Reson Imaging 12, 512-24. (2000). 4. Gering, D. T. et al. An integrated visualization system for surgical planning and guidance using image fusion and an open MR. J Magn Reson Imaging 13, 967-75. (2001). 5. Hata, N. et al. MR imaging-guided prostate biopsy with surgical navigation software: device validation and feasibility. Radiology 220, 263-8. (2001). 6. Nabavi, A. et al. Serial intraoperative magnetic resonance imaging of brain shift. Neurosurgery 48, 787-97; discussion 797-8. (2001). 7. Roche, A., Pennec, X., Malandain, G. & Ayache, N. Rigid registration of 3-D ultrasound with MR images: a new approach combining intensity and gradient information. IEEE Trans Med Imaging 20, 1038-49. (2001). 8. Samset, E. & Hirschberg, H. Neuronavigation in intraoperative MRI. Comput Aided Surg 4, 200-7 (1999). 9. Flask, C. et al. A method for fast 3D tracking using tuned fiducial markers and a limited projection reconstruction FISP (LPR-FISP) sequence. J Magn Reson Imaging 14, 617-27. (2001). 10. Jolesz, F. A., Nabavi, A. & Kikinis, R. Integration of interventional MRI with computerassisted surgery. J Magn Reson Imaging 13, 69-77. (2001).
A New Tool for Surgical Training in Knee Arthroscopy

Giuseppe Megali¹, Oliver Tonet¹, Marcello Mazzoni¹, Paolo Dario¹, Alberto Vascellari², and Maurilio Marcacci²

¹ CRIM, Scuola Superiore Sant'Anna, Pisa, Italy
{g.megali,m.mazzoni,o.tonet,p.dario}@mail-arts.sssup.it
http://www-mitech.sssup.it
² Biomechanics Lab, Istituti Ortopedici Rizzoli, Bologna, Italy
{M.Marcacci,A.Vascellari}@biomec.ior.it
http://www.ior.it/biomec/
Abstract. This paper presents an educational method for minimally invasive surgery (MIS) and an integrated system to train a priori knowledge and to exercise manual dexterity. The approach is generally suitable for MIS interventions but has been developed specifically for knee arthroscopy. Based on a classification of the knowledge required for performing arthroscopy procedures, the system provides multimedia modules to train and assess anatomical and procedural knowledge, and a virtual-reality-based simulator for training perceptual-motor skills. The system is currently being experimentally evaluated for metrics definition and extended to incorporate networked database management.
1 Introduction
Minimally invasive surgery (MIS) has expanded greatly in the last few years. This kind of procedure introduces great advantages for patients, but also severe limitations for surgeons [3]. Although the complexity of the surgical techniques imposes longer and more difficult training, technological development provides new tools, such as multimedia and virtual reality, that can contribute to solving educational problems in the surgical domain. Surgical skills involved in MIS, at the current state of knowledge, can be divided into cognitive and perceptual-motor skills. The development of tools capable of training each of them is a big challenge, as is the definition of metrics for objective assessment of individual progress and comparison of skills between different users [1]. The generation of exercises capable of reproducing the complex relationships between cognitive, perceptual-motor, and experiential factors, in addition to the experience of assistants and the quality of the equipment used, is a very hard issue. Bench models [6] and virtual environments [9] can be used as tools for understanding the development of perceptual-motor skills and their relationship to higher cognitive abilities and skills in surgery. Multimedia material, based on images and videos illustrating many complex phenomena [8], can play an important role in training cognitive skills, especially in MIS, where the surgeon carries out the
intervention relying mostly on visual feedback. Evaluation methods are a major issue [5,7]: in fact, metrics for evaluating surgical ability are not unequivocal, and even in a non-technological environment, former studies [4] have shown no significant correlation for parameters such as aptitude test scores, duration of surgical experience, or consultant technical skill ratings. Things get even more complex in the case of surgical simulators [2]. In this paper, after introducing our educational approach, we present a training system integrating multimedia and virtual-reality-based modules for separate training of surgical skills.
2 Educational Strategy
Our approach to solving the educational problem starts with the identification of the educational objectives, followed by the definition of suitable methods. Our work has been structured around three main objectives:

1. to conceive and formalize a novel educational route for MIS surgeons;
2. to identify the elementary cognitive and perceptual-motor skills to train, and to develop a technological suite, consisting of an organic set of elementary modules, for training and assessment of specific skills;
3. to carry out the experimentation of the method in order to evaluate its teaching effectiveness and to determine the evaluation metrics.

According to our view of the field, the problem of training junior surgeons in a specific MIS technique can be solved in a three-phase process:

1. theoretical study of the surgical technique, performed by using interactive multimedia material which guides the surgeon through the different steps of the intervention and the understanding of image-based information;
2. training manual dexterity by means of a set of specific exercises focused on single elementary skills: hand-eye coordination, depth perception, movement synchronization, execution of complex paths, etc.;
3. guidance by an expert surgeon in the execution of the intervention on patients in the operating room.

The presented Knee Arthroscopy Training System (KATS) aims at reducing the learning curve in knee arthroscopy interventions, and at preparing the junior surgeon for active participation in the operating room. The identification of methods and the definition of specifications for the educational tools have been approached from a general point of view, so that the learning system can easily be ported to other areas of articular surgery. KATS is divided into three modules:

1. the Anatomical Knowledge Module (AKM) aims at providing a tool for training and assessing the junior surgeon's skills in identifying anatomical structures and pathologies, starting from diagnostic images (CT, MRI, X-ray), direct vision (open surgery, cadaver dissection), and arthroscopic images;
2. the Procedural Knowledge Module (PKM) aims at providing a tool for training and assessing the junior surgeon's knowledge of the single steps involved in performing a specific arthroscopy intervention;
3. the Navigation Training Module (NTM) consists of a simulator, based on dynamic virtual environments and on a realistic man/machine interface, that allows the learner to acquire, improve, and assess the specific elementary skills needed for moving the surgical tools in a dexterous way during arthroscopic interventions.
3 Knowledge Assessing and Training Modules
The AKM and PKM share a common design: both are structured as interactive tests with questions based on interactive image maps (IIM) and multiple-choice questions (MCQ) (Fig. 1). The AKM and PKM provide two running modalities:

1. training: every question has an associated hypertext explaining the correct answer with text, images, and movie clips;
2. test: the aim in this case is to assess knowledge, so during the test phase some parameters are recorded for the evaluation of the performance of the junior surgeon (type and number of correct answers, difficulty level, time used, repetition of errors).

We classified knowledge into a two-dimensional matrix, where each column represents a different type of arthroscopy procedure (meniscal resection, cruciate ligament reconstruction, chondral fracture debridement, chondral fracture perforations, synovial plica resection, patellar lateral release) and each row refers to a different type of information which is presented to the learner (external physiological anatomy, external pathological anatomy, image-based physiological knowledge, image-based pathological knowledge, physiological technique, pathological technique). Tests, i.e. sets of questions on a specific topic, are generated dynamically, according to parameters such as logical sequence, user level, former difficulties encountered, etc., and presented to the learner. The multimedia material is structured in a relational database (implemented in Microsoft Access), comprising theoretical knowledge, medical images, and movie clips. Every image and clip has associated MCQs, each with its relative answers. Moreover, image maps can be associated with still images: the learner is then required to select a point on the image instead of selecting an answer. The value of the pixel in the associated image map then acts as the evaluation of the user's response.
3.1 Anatomical Knowledge Module
Questions in the AKM are knowledge-oriented. The database contains all the questions, the possible answers, and the multimedia material divided by topics. A subset of questions is extracted and presented to the learner, who must demonstrate adequate knowledge of the topic. In the AKM tests, the learner must demonstrate:
Fig. 1. An example of a multiple-choice question.
Fig. 2. An example plot of the learner's performance.
– anatomical knowledge in the physiological case, i.e. the ability to correctly identify anatomical landmarks and structures in a variety of cases and image types;
– diagnostic knowledge, i.e. the ability to correctly identify pathologies starting from suitable medical images;
– arthroscopic localization skills, i.e. the ability to recognize anatomical structures in a given arthroscopic image and to determine the camera orientation and surroundings.
3.2 Procedural Knowledge Module
Questions in the PKM are procedure-oriented, i.e. the learner must demonstrate sufficient knowledge about a specific surgical procedure to be able to complete all the steps required by the procedure, in the correct sequence, and to take the correct decision in routine and emergency cases. The module is based on movies of specific, selected surgical interventions that have been fragmented into elementary, sequential steps. The multimedia material is created by dividing videotapes of whole arthroscopy procedures of topical cases into fragments. Video clips are isolated and catalogued according to the most significant steps. The clips are then associated with questions and presented to the learner in the correct sequential order. The learner must answer the questions and identify the next step of the procedure to be performed. All the material relative to the single procedures, complete with the associated questions and answers, image maps, and evaluation forms, is classified in a similar way as in the AKM. Database queries are also made in the same way as in the AKM. The learner must demonstrate:

– step-by-step knowledge of a specific surgical procedure, assessed through interactive playback of movies.
3.3 Evaluation of Performances and Statistics
In order to have proper feedback on the progress of the junior surgeon in the AKM and PKM, a database of the users is provided. After a learner has performed a
Fig. 3. The NTM module in the experimental setup.
test, his performance is recorded in the database; the score is divided by sections and topics according to our classification of knowledge. The results of the tests can be plotted in order to give visual feedback on the progress made by the learner over time (see Fig. 2).
4 Navigation Training Module
Arthroscopic interventions are performed without directly looking at the site of intervention. The viewpoint of the endoscopic image varies according to the arthroscope position, and differs in position and orientation from the surgeon's direct-vision viewpoint. The perceptual-motor coordination process necessary to operate in such circumstances is known as triangulation. When performing knee arthroscopy, the patient's leg has to be moved (bent, flexed, stretched) in order to allow or facilitate access to certain knee regions. To keep the leg in the correct position, a second operator is needed in the operating room. The objective of the NTM is to train the principal and basic spatial and perceptual-motor skills that underlie the execution of a good arthroscopic intervention. For this purpose we implemented training exercises in which the learner interacts with dynamic virtual worlds, using real arthroscopy tools on a mock-up knee. The tools and the mock-up are sensorized to reproduce their spatial position and orientation in the virtual world and to monitor the parameters that contribute to performance assessment.
4.1 The NTM Setup
An anatomically realistic knee model, with flexible hip and knee joints, is integrated in the NTM, so it is possible to reproduce all the positions of the knee necessary during the real surgical procedure and to train the second operator (who operates under the direct control of the first operator).
Fig. 4. Virtual Environments for exercises based on colored areas.
The virtual model of the knee was constructed from a CT scan of the mock-up, while the models of the surgical tools were realized by means of a CAD program. After the registration step, the movements of the knee mock-up and of the surgical tools are tracked in real time by the localization system (FlashPoint 5000). For this purpose, frames instrumented with infrared emitters are fixed to the femur and tibia, and to the tools. During the training phase, position information is processed and the virtual environment is updated and displayed accordingly. The data required for performance evaluation are recorded and processed.
4.2 The Training Exercises
The virtual environments of the NTM combine a dynamic representation of the virtual models of the mock-up and surgical tools with geometrical shapes (or colored areas) located on anatomical landmarks relevant for the training. The exercises created for the training have three basic structures with different training objectives:

Pointing: point at geometrical shapes located on an anatomical landmark of the knee joint with the arthroscope (Fig. 4.a) (single tool management);
Touching: point at geometrical shapes with the arthroscope and touch them with the second tool (Fig. 4.b) (cooperative tool management);
Sweeping: "sweep" or "clean" (i.e. touch with the second surgical tool) some colored areas in the knee joint (Fig. 4.c) (cooperative tool management and fine movement tuning).

The evaluation of performance is obtained by measuring time consumption, the total path of the surgical tools, the number of target misses, and bad contacts with the anatomical parts.
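Since the final metric is left to the experimental phase (Section 5), any concrete scoring formula is necessarily hypothetical; one simple possibility is a weighted penalty over the four logged quantities, with placeholder weights:

```python
def exercise_penalty(time_s, path_mm, misses, bad_contacts,
                     weights=(1.0, 0.05, 5.0, 10.0)):
    """Hypothetical penalty score for one NTM exercise (lower is better).
    The weights are illustrative placeholders, not values from the paper."""
    w_t, w_p, w_m, w_c = weights
    return w_t * time_s + w_p * path_mm + w_m * misses + w_c * bad_contacts
```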
4.3 The Visual User Interface
Different types of visual user interface (VUI), designed with different training objectives, are implemented. The VUI based on external and projection views
Fig. 5. VUIs for training: global views for training triangulation of the scope and probe (a); endoscope view with assistant views for adaptation to the arthroscopic view (b).
(Fig. 5.a) addresses the training of tool management, whereas the VUI in which the virtual endoscope view is dominant (Fig. 5.b) addresses adaptation to the arthroscopic view. The NTM module runs on a Windows NT 4.0 workstation, an Intergraph TDZ 2000 GX1 (Pentium II Xeon 450 MHz, 512 MB RAM, Intense3D RealiZm II). The software was developed with Microsoft Visual C++ 6.0, using the MFC library for window management and OpenGL Optimizer for 3D visualization.
5 Preliminary Results and Performance Assessment
In the development of a training tool, the experimentation phase is important for validating educational effectiveness and for fine-tuning the parameters that determine the final metric for evaluating the learners' performance. We have preliminarily tested the AKM, PKM, and NTM by having three surgeons (a junior resident, a senior resident, and a certified arthroscopist) perform a series of tests under the same conditions. The results show that performance (evaluated on the basis of time consumption and number of correct answers) increases with the degree of experience in the surgical room. To compare inter-operator performance, we should be able to determine a priori the basic surgical skills of a surgeon. The methods of evaluating the learners' performance and the choice of the parameters to be measured will be refined progressively during the upcoming experimental phase. Several users with different degrees of expertise will test the three modules, and reference values for the final version of the formulas will be computed by means of statistical data analysis.
6 Conclusions and Future Work
In this paper we presented a training system aimed at providing the junior surgeon with increased experience of arthroscopy. The system, developed in the context of the VOEU³ Project, consists of modules for separate training of cognitive
3 Virtual Orthopaedic European University (VOEU), #IST-1999-13079, EU-IST Programme.
and perceptual-motor skills. Preliminary results encourage continuation of the research. Future work will focus on client/server web access for remote generation of tests and management of student profiles, on experimental validation of the system, and on the establishment of suitable metrics for performance evaluation.
Combining Volumetric Soft Tissue Cuts for Interventional Surgery Simulation
Megumi Nakao1, Tomohiro Kuroda2, Hiroshi Oyama2, Masaru Komori3, Tetsuya Matsuda1, and Takashi Takahashi2
1 Graduate School of Informatics, Kyoto University, Yoshida, Sakyo, Kyoto, 606-8501, Japan
[email protected]
2 Department of Medical Informatics, Kyoto University Hospital, Shogoin, Sakyo, Kyoto, 606-8507, Japan
3 Computational Biomedicine, Shiga University of Medical Science, Seta Tsukinowa, Otsu, 520-2192, Japan
Abstract. This paper proposes a framework to simulate soft tissue cuts for interventional surgery simulation. The strained status of soft tissues is modeled as internal tension between adjacent vertices in a particle-based model. Remodeling of the particle system combined with an adaptive tetrahedral subdivision scheme provides volumetric and smooth cuts on large virtual objects. 3D MRI datasets are applied to the developed system together with a force feedback device. Measurement of the calculation time and visualization of the simulation quality confirm that the framework contributes to surgical planning and training with tissue cutting.
1
Introduction
Minimally invasive surgery (MIS) has attracted strong interest in the medical field because small incisions reduce both cosmetic damage and patient risk. Although MIS benefits patients, surgeons have to master demanding procedural skills and accumulate extensive surgical experience. The results of recent procedures, such as minimally invasive cardiac surgery (MICS), depend largely on how the surgical field and local views are established. However, surgeons are currently forced to make decisions on surgical strategies empirically or intuitively using 2D or 3D images. For these issues, advanced virtual reality based simulation can offer a solution to both preoperative planning and training in surgical intervention. The planning system presented by Pflesser et al. is a remarkable example in brain surgery [1]. The KISMET surgery simulator [2] provides an effective environment for learning surgical procedures in MIS. This study aims to construct an advanced system with soft tissue cutting, which helps surgeons to discuss or learn the optimum incision (cutting point and length), surgical path (direction and approach), and surgical field (view and space) in MICS [3]. However, providing accurate soft tissue cuts remains challenging: the cutting model must handle topological change, biomechanical deformation, collision detection, and haptic rendering. To support users with visual and haptic feedback, fast computational schemes are also important. So far, several approaches have been reported. Cutting methods applied to polygons are not effective for representing deep cut surfaces where inner tissues spread apart [4]. Voxel-based models [1] provide volumetric
display, whereas real-time deformation of soft tissues is not achieved due to the large calculation cost. Although cutting frameworks based on a tetrahedral subdivision scheme [5] simulate volumetric cuts, problems remain concerning physical consistency, the shape of cuts, and global manipulation of large datasets. In this scope, further development of advanced frameworks is mandatory to provide surgical realism in real time. This paper presents an advanced framework that simulates interactive soft tissue cutting in surgical intervention. The strained status of soft tissues is modeled as internal tension between adjacent vertices in a particle-based model. Remodeling of the particle system combined with an adaptive tetrahedral subdivision scheme provides volumetric and smooth cuts on large virtual objects. In the following sections, details of the methods are described.
Fig. 1. A simulation framework for soft tissue cutting
2
A Framework for Soft Tissue Cutting
The proposed framework not only deals with large datasets (more than 10000 elements) but also aims to perform both topological change and real-time deformation under global manipulation. In order to provide an interactive update rate, particle-based models [6, 7] are currently our modeling solution, offering adequate accuracy and performance in physical simulation. Fig. 1 shows the developed framework that provides soft tissue cuts. The system has parallel algorithm loops for visual and haptic display. The visual loop is kept at a rate of at least 100 Hz to ensure stability of the deformable model and real-time rendering. The refresh rate of the haptic loop is 1000 Hz in order to satisfy human force perception. The presented cutting model is designed as a package and can be integrated into existing physics-based simulations. The computational architecture, which uses an element hierarchy, performs interactive and stable simulation for large datasets.
3
Physical Balance and Transition of Soft Tissues
The cutting model consists of three basic methods: modeling of internal tension, particle system remodeling and adaptive tetrahedral subdivision.
3.1
Modeling of Tension between Tissues
Cutting simulation must deal with physical and geometrical models, and the algorithms have to be applied to virtual objects reconstructed from volumetric datasets of patients. From the viewpoint of consistent physical approaches, the authors focus on internal tension between soft tissues. In reality, all tension is in equilibrium in the ordinary state, and the physical balance changes into the next stable state after incision. Although such physical behavior is essential for simulation accuracy, it is not visible in the results of foregoing studies [5, 6]. In a particle system, an initial state in which internal tension is acting on the tissues is represented by setting the initial length of an edge shorter than the initial distance between its two vertices. Using a tense parameter, the initial length of an edge is determined by the following simple equation:

\[ l_n = d_n (1 - s_n), \quad 0 \le s_n \le 1 \tag{1} \]
where, for an edge n, l_n is the suggested length, d_n is the initial distance between the two vertices of the edge, and s_n is a tense parameter. Although soft tissues of human bodies and organs have heterogeneous tension, the tense parameter s_n simulates a realistic physical balance between in-vivo tissues.
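A minimal sketch of how equation (1) could be applied when initializing the particle system is given below. The data structures and the uniform tense parameter are illustrative assumptions, not the authors' implementation; with a uniform tense parameter of 0.04 this would correspond to the 4% tension used for the chest wall in Section 5.

    #include <cmath>
    #include <vector>

    struct Vertex { double x, y, z; };

    struct Edge {
        int v0, v1;        // indices of the two vertices
        double restLength; // l_n: spring rest length after applying eq. (1)
    };

    // Set the rest length of every edge to l_n = d_n * (1 - s_n), so that
    // the mesh is under internal tension in its initial configuration.
    void applyTension(const std::vector<Vertex>& verts,
                      std::vector<Edge>& edges,
                      double tense /* s_n in [0,1] */) {
        for (Edge& e : edges) {
            const Vertex& a = verts[e.v0];
            const Vertex& b = verts[e.v1];
            const double dn = std::sqrt((a.x - b.x) * (a.x - b.x) +
                                        (a.y - b.y) * (a.y - b.y) +
                                        (a.z - b.z) * (a.z - b.z));
            e.restLength = dn * (1.0 - tense);
        }
    }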
3.2
Physically Based Remodeling of Particle Systems
This section describes the remodeling of particle systems in order to represent the relaxation of soft tissues. Fig. 2 (a) shows a basic mass-spring system, which for brevity does not include damper elements. In the proposed model, a pair of new vertices is created at the point clipped by the virtual scalpel, and the system is then modified under the condition that it behaves towards neighboring vertices as before, as shown in Fig. 2 (b).
Fig. 2. Remodeling the particle system: initial tense state (a), creation and distribution of new vertices (b) and physical transition after elimination of alternative edges (c)
The mass of the two vertices is distributed to the new vertices. We can then cut into the system by eliminating the alternative spring and damper elements connected to different vertices; for example, in Fig. 2 (b), both edge m3-m5 and edge m4-m6 are eliminated. If the system has the internal tension presented in the previous section, the pair of vertices physically spreads apart to new stable positions (Fig. 2 (c)). The movement of the vertices represents the dynamic spread and provides an interactive display. The mass of
each vertex, which is proportional to the volume of the elements it belongs to, is determined by the equation:

\[ m_i = \frac{1}{4} \sum_{k \in E(i)} \rho_k V_k \tag{2} \]

where m_i is the mass of vertex i, E(i) is the set of elements that vertex i belongs to, ρ_k is the density, and V_k is the volume of element k. The mass of the created vertices is also calculated by equation (2).
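The following fragment sketches equation (2): each tetrahedron contributes a quarter of its mass ρ_k V_k to each of its four vertices. The incidence structure E(i) is represented here by a hypothetical adjacency list.

    #include <vector>

    // One tetrahedral element: density rho_k and volume V_k.
    struct Element { double density, volume; };

    // Mass of vertex i according to eq. (2): a quarter of rho_k * V_k
    // summed over all tetrahedra incident to the vertex. `incident[i]`
    // lists the element indices of E(i) (an assumed layout).
    std::vector<double> vertexMasses(
            const std::vector<Element>& elems,
            const std::vector<std::vector<int>>& incident) {
        std::vector<double> mass(incident.size(), 0.0);
        for (std::size_t i = 0; i < incident.size(); ++i)
            for (int k : incident[i])
                mass[i] += 0.25 * elems[k].density * elems[k].volume;
        return mass;
    }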
4
Adaptive Subdivision of Tetrahedral Objects
According to the internal tension and the elimination of spring/damper terms, physically based incision is represented using particle systems. However, another problem remains concerning the topological transition and the definition of a cut surface for displaying volumetric cuts. To solve this, the authors describe the topological change of 3D meshes based on tetrahedral subdivision. The generalized subdivision divides a tetrahedron into 17 smaller tetrahedra: one vertex is inserted at each edge and each face, so that 10 new vertices are created. However, this radical increase in the number of elements is a serious problem for keeping the update rate stable and small. In order to achieve an interactive refresh rate, D. Bielser et al. have defined a cut surface by inserting between three and five vertices [5], so that a tetrahedron is subdivided into four to six tetrahedra. Going one step further, this paper presents an adaptive tetrahedral subdivision that creates only three or four vertices and yields four or six minimal tetrahedra. Moreover, the adaptive scheme addresses the accurate shape of cuts as well as consistent topological changes, based on the concepts of edge-plane intersection and vertex movement. The implementation of the proposed method is simple because a complex description of subdivision patterns is not required.
4.1
Topological Change
The intersection between a plane and a tetrahedron follows one of the two patterns shown in Fig. 3. The intersection at four edges, shown in Fig. 3 (a), separates the four vertices into two pairs, and the cross section is a quadrilateral. The intersection at three edges, shown in Fig. 3 (b), separates the four vertices into one vertex and three vertices, and the cross section is a triangle. This intersection gives the basic concept for creating a cut surface. During the cutting manipulation, the movement of the virtual scalpel forms a clipping plane C, which clips a cut surface S from a tetrahedral object T. Fig. 4 illustrates the relationship between these elements. Because C is a partial plane, incomplete intersections occur at some tetrahedra; therefore, the two intersection patterns of Fig. 3 alone cannot represent an adequate cut surface. Such an incomplete intersection is shown at the right tetrahedron of T in Fig. 4. In order to remove incomplete intersections, adjacent vertices outside the region of the clipping plane C are moved and placed on its boundary B. The movement vector of a vertex is determined as the average of the edge vectors that intersect the boundary B. The solution of incomplete intersection by vertex movement is illustrated in Fig. 5.
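The vertex relocation just described might be sketched as follows; the vector type and the way the intersecting edge vectors are collected are assumptions for illustration.

    #include <vector>

    struct Vec3 {
        double x, y, z;
        Vec3 operator+(const Vec3& o) const { return {x + o.x, y + o.y, z + o.z}; }
        Vec3 operator*(double s)      const { return {x * s, y * s, z * s}; }
    };

    // Average of the edge vectors intersecting the boundary B; the result
    // is used to move a vertex lying outside the clipping region onto B.
    Vec3 movementVector(const std::vector<Vec3>& intersectingEdgeVectors) {
        Vec3 sum{0.0, 0.0, 0.0};
        for (const Vec3& e : intersectingEdgeVectors) sum = sum + e;
        return intersectingEdgeVectors.empty()
                   ? sum
                   : sum * (1.0 / intersectingEdgeVectors.size());
    }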
Fig. 3. Intersection between a plane and a tetrahedron
Fig. 4. Intersection between a clipping plane and tetrahedra (C: clipping plane, S: cut surface, T: tetrahedral object)
Fig. 5. Vertex movement and removal of incomplete intersection on the clipping plane
After the movement of the vertices, the incomplete intersection between C and T is removed. Consequently, the cut surface S is completely defined by the two intersection patterns of Fig. 3, and the generated cut surface S displays a volumetric cut. The remodeling scheme for particle systems described in Section 3.2 is applied to all edges that intersect C. The length of all edges affected by the vertex movement is also recalculated.
4.2
Topology Optimization
Since a created vertex is initially connected to only one spring term, some edges are inserted into the object in order to simulate realistic cuts. The minimal tetrahedral subdivision in Fig. 6 gives consistent and effective patterns of edge insertion. In the case of an intersection at four edges, six new tetrahedra are generated as in Fig. 6 (a), and an intersection at three edges yields four new tetrahedra as in Fig. 6 (b). Although several subdivision patterns can be described, an effective pattern must be selected carefully in order to achieve fine cuts.
Fig. 6. Minimal tetrahedral subdivision
In the field of cutting simulation using physical models, improving the shape of cuts is a significant issue. Related works have reported that the zigzag cut is a key problem for realistic surgery simulation [8]. In order to provide accurate and smooth cuts based on minimal tetrahedral subdivision, the authors provide an adaptive scheme, which selects the subdivision pattern that maximizes the minimum dihedral angle and minimizes the maximum dihedral angle. An indicator µ of each subdivision pattern is computed by the following equation:

\[ \mu = \sum_{i=0}^{N} \frac{\max_{j,k} \theta_i(j,k)}{\min_{j,k} \theta_i(j,k)} \tag{5} \]

where the four or six new tetrahedra subdivided from a tetrahedron are denoted T_i (i = 0, …, N), and the four faces of a new tetrahedron are denoted P_i(j) (j = 0, …, 3). Here, θ_i(j,k) is the dihedral angle between P_i(j) and P_i(k). If µ is small, the subdivision pattern can be estimated to have good quality.
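A sketch of how the indicator µ of equation (5) could be evaluated for a candidate subdivision pattern is given below. The geometry helpers are our own; the dihedral angle between two faces is computed from their outward normals (cos θ equals the negated dot product of the unit normals), assuming non-degenerate tetrahedra.

    #include <algorithm>
    #include <array>
    #include <cmath>
    #include <vector>

    struct V3 { double x, y, z; };
    static V3 sub(V3 a, V3 b)   { return {a.x - b.x, a.y - b.y, a.z - b.z}; }
    static V3 cross(V3 a, V3 b) { return {a.y*b.z - a.z*b.y, a.z*b.x - a.x*b.z,
                                          a.x*b.y - a.y*b.x}; }
    static double dot(V3 a, V3 b) { return a.x*b.x + a.y*b.y + a.z*b.z; }
    static V3 normalize(V3 a) { double l = std::sqrt(dot(a, a));
                                return {a.x / l, a.y / l, a.z / l}; }

    using Tet = std::array<V3, 4>;

    // Ratio max(theta)/min(theta) over the six dihedral angles of one
    // tetrahedron; theta(j,k) is the angle between faces P(j) and P(k).
    static double dihedralRatio(const Tet& t) {
        V3 n[4]; // outward unit normal of the face opposite vertex f
        for (int f = 0; f < 4; ++f) {
            int v[3], m = 0;
            for (int i = 0; i < 4; ++i) if (i != f) v[m++] = i;
            V3 nf = cross(sub(t[v[1]], t[v[0]]), sub(t[v[2]], t[v[0]]));
            if (dot(nf, sub(t[f], t[v[0]])) > 0.0)  // points toward t[f]?
                nf = {-nf.x, -nf.y, -nf.z};         // flip to outward
            n[f] = normalize(nf);
        }
        double lo = 4.0, hi = 0.0; // dihedral angles lie in (0, pi)
        for (int j = 0; j < 4; ++j)
            for (int k = j + 1; k < 4; ++k) {
                double c = std::clamp(-dot(n[j], n[k]), -1.0, 1.0);
                double theta = std::acos(c); // angle at the shared edge
                lo = std::min(lo, theta);
                hi = std::max(hi, theta);
            }
        return hi / lo;
    }

    // Indicator mu of eq. (5): sum of the ratios over all new tetrahedra
    // of a candidate pattern. Smaller mu means a better pattern.
    double patternIndicator(const std::vector<Tet>& newTets) {
        double mu = 0.0;
        for (const Tet& t : newTets) mu += dihedralRatio(t);
        return mu;
    }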
5
Results
The overall algorithms of the proposed methods are implemented on a standard PC (CPU: dual Pentium III 933 MHz, memory: 2 GB, OS: Windows 2000). A PHANToM (SensAble Technologies) is used as the force feedback device of the simulation system. When a user manipulates the virtual scalpel and cuts into virtual objects, the system displays the simulated soft tissue cuts and returns effective force feedback. To simulate the interaction between the tip of the virtual scalpel and the target objects, a general collision detection scheme [10] is applied. Datasets of human body structures were obtained as volumetric MRI datasets from a normal volunteer. After semiautomatic region labeling, the 3D surfaces of the chest wall and the heart are extracted, and each internal region is divided into tetrahedra. Note that the authors store the basic information of the tetrahedral objects, such as vertices, edges and surfaces, and reduce the computational cost of the overall simulation using relational information such as neighboring, parent and child pointers. In order to balance real-time performance and simulation quality, dynamic localization is performed using the hierarchical structure of the elements. The interactive soft tissue cutting provided by the developed system is shown in Fig. 7. The chest wall has 4% tension, an elastic coefficient of 100 N/m, a damper coefficient of 0.05 Ns/m and a density of 0.90 g/cm3. The simulated volumetric cuts are smooth, and zigzag cuts are reduced thanks to the adaptive scheme. A bright part of the chest wall
near the virtual scalpel indicates the region where the physical simulation is applied. Measurement of the calculation time confirms that the overall algorithms are computed within 10 msec. The force feedback architecture is also strictly managed by parallel computing and valid interpolation.
Fig. 7. Interactive soft tissue cutting on virtual chest wall (1sec, 2sec, 3sec)
Fig. 8 depicts a practical use of the framework for the planning of minimally invasive surgery. A part of the myocardium appears behind the incised chest wall. Using this system, surgeons can discuss and learn strategies based on the surgical field reconstructed from the patient's dataset. These figures demonstrate that the proposed methods can improve the applicability of interventional surgery simulators.
Fig. 8. Application for planning and training of minimally invasive surgery
6
Conclusion
In this paper, the authors presented an advanced framework that simulates soft tissue cutting, one of the basic techniques in surgery simulation. The proposed framework is simple to implement and integrates topological change into deformable models in a physically consistent manner. A 3D MRI dataset acquired from a normal volunteer was applied to the developed system with a force feedback device. Measurement of the calculation time and demonstration of the simulation quality confirmed that the model contributes to surgical planning and training. A future direction of this work is to construct a patient-specific surgery simulator with tissue cutting and tearing in MIS. The authors also aim to improve the surgical realism of the soft tissue models with respect to physiological description and biomechanical deformation between multiple organs.
References
1. B. Pflesser, R. Leuwer, U. Tiede, K. H. Höhne: Planning and Rehearsal of Surgical Interventions in the Volume Model. Proceedings of Medicine Meets Virtual Reality (2000) 259–264
2. U. Kühnapfel, H. K. Cakmak, H. Maass: Endoscopic Surgery Training Using Virtual Reality and Deformable Tissue Simulation. Computers and Graphics, Vol. 24 (2000) 671–682
3. M. Nakao, T. Kuroda, H. Oyama, M. Komori, T. Matsuda, T. Takahashi: Planning and Training of Minimally Invasive Surgery by Integrating Soft Tissue Cuts with Surgical Views Reproduction. Proceedings of Computer Assisted Radiology and Surgery (2002)
4. C. Bruyns, K. Montgomery, S. Wildermuth: A Virtual Environment for Simulated Rat Dissection. Proceedings of Medicine Meets Virtual Reality (2001) 75–81
5. D. Bielser, M. H. Gross: Interactive Simulation of Surgical Cuts. Proceedings of Pacific Graphics (2000) 116–125
6. S. Cotin, H. Delingette, N. Ayache: A Hybrid Elastic Model for Real-Time Cutting, Deformations, and Force Feedback for Surgery Training and Simulation. The Visual Computer, Vol. 16 (2000) 437–452
7. L. P. Nedel, D. Thalmann: Real Time Muscle Deformations Using Mass-Spring Systems. Computer Graphics International (1998) 156–165
8. C. Basdogan, C.-H. Ho, M. A. Srinivasan: Simulation of Tissue Cutting and Bleeding for Laparoscopic Surgery Using Auxiliary Surfaces. Proceedings of Medicine Meets Virtual Reality 8 (1999) 38–44
9. S. Z. Pirzadeh: An Adaptive Unstructured Grid Method by Grid Subdivision, Local Remeshing, and Grid Movement. AIAA Paper 99-3255 (1999)
10. D. Ruspini, K. Kolarov, O. Khatib: The Haptic Display of Complex Graphical Environments. Proceedings of Computer Graphics (1997) 345–352
Virtual Endoscopy Using Cubic QuickTime-VR Panorama Views
Ulf Tiede1, Norman von Sternberg-Gospos1, Paul Steiner2, and Karl Heinz Höhne1
1 Institute of Mathematics and Computer Science in Medicine (IMDM)
2 Dept. of Radiology
University Hospital Hamburg-Eppendorf, Germany
Abstract. Virtual endoscopy requires some precomputation on the data (segmentation, path finding) before the diagnostic process can take place. We propose a method that precomputes multi-node cubic panorama movies using QuickTime-VR. This technique allows almost the same navigation and visualization capabilities as a real endoscopic procedure, achieves a significant reduction of interaction input, and the resulting movie represents a document of the procedure.
1
Introduction
Virtual endoscopy (VE) and especially virtual colonoscopy (VC) have gained much attention in diagnostic radiology recently. After acquisition of an MRI or CT volume data set the procedure involves the following computation steps:
– segmentation of the colon
– determination of the (typically central) path through the colon
– navigation through the colon and visualisation of the colon wall
– documentation of the flight through the colon as a movie
– transfer of the document together with the report to the referring physician
– in a modern environment: archiving and communication of the document in a PACS
In clinical practice it has turned out to be advantageous to do the segmentation in advance and also to compute a path before viewing, because otherwise the navigation is extremely tedious and time consuming and needs a complex user interface. In ideal cases segmentation can be done automatically [1]; however, in some cases manual control is required due to noise in the data or motion artefacts. A common approach for path computation is skeletonization [2]. For noisy data, and due to wrinkles in the colon, highly sophisticated heuristics are needed to remove dead ends and loops in order to end up with a smooth path. Another way to support navigation is described in [3]. Here a potential field is calculated, which is used to compute a force that pushes the virtual viewer back onto the central path. The strength of the force depends on the distance of the current view point to the path. At the colon wall the force is infinite, thus preventing the
user from passing through the wall. Other approaches unfold the inner colon surface to a 2D image map [3]. However, clinicians are unfamiliar with these images, which in addition suffer from geometric distortion problems. The visualization of the colon wall requires a conversion of the colon surface to polygonal meshes in order to utilize standard computer graphics hardware and software, which allow computing 3D views at near real-time speed for modest-size datasets. Available voxel-based accelerators (e.g. Mitsubishi VolumePro) cannot directly perform the perspective transformations that are essential for calculating endoscopic views [4]. Given that navigation is complex (even when a central path is already defined) and that real-time visualization with high quality rendering cannot be achieved on standard PCs, we investigated an approach that also pre-computes a large part of the visualization. The expected advantages are that visualization is restricted to “meaningful” images and that interaction becomes easier.
2
Method and Material
If we tried to include every possible view in a pre-computed movie, the amount of data would be huge and navigation would not be simple at all. If we use only a fixed field of view and a constant viewing direction while moving through the colon along the central path, one might miss important details which are outside the selected field of view (e.g. behind wrinkles). A well-known technique from classical photography is the use of wide-angle (“fish-eye”) lenses, but these yield strong geometric distortions. Another approach, which we propose here, is the generation of panorama images, where a number of overlapping photographs are taken while the camera is rotated around the vertical axis. The single photographs are then stitched together to form a cylindrical image, also known as a panorama view. While this technique needs some effort in photography, it can easily be simulated in computer graphics (fig. 1). However, cylindrical projections do not allow
Fig. 1. Cylindrical panorama view of the colon.
straight views up and down, i.e. 90° upwards or downwards from the horizon, due to geometric distortions that become apparent at viewing angles larger than 45°. A generalization of cylindrical panoramas is the spherical panorama, which allows viewing in all directions. Here an image map is computed which is then
“wrapped” around a sphere. This requires complex geometry calculations, as the inverse problem, namely unfolding a sphere to a 2D map, is a well-known problem in cartography. A simplification of general spherical panoramas is the cubic panorama. Cubic panoramas are computed as follows: from a given viewpoint we compute six 3D projections, which correspond to the six faces of a cube. The viewpoint is assumed to be located at the center of the cube, i.e. the first projection looks in the direction along the central path; then we turn 90° to the right and compute the second image; we turn 90° to the right again, now looking back into the direction we came from; we turn once more; and finally we compute the images looking 90° upwards and downwards. As can easily be seen, the six projections must be calculated with a field of view of exactly 90° so that the images fit together and form the faces of a cube (fig. 2). Recently Apple Computer,
[Figure 2 diagram: a view point on the central path inside the colon wall; the six numbered 90° projections form the cubic panorama, also shown as unfolded cube faces.]
Fig. 2. Cubic panorama generation. At any viewing position 6 projections corresponding to the 6 faces of a cube are computed with a field of view of 90°.
Inc. has incorporated cubic panoramas into their QuickTime-VR technology [5]. The QuickTime player has a built-in distortion correction method for viewing these cubic panoramas. The six 3D projections are stored as a sequence of images together with additional information that allows QuickTime to recognize the images as a cubic panorama. The QuickTime player then allows continuous navigation across the entire panorama in real time using the two degrees of freedom of the mouse. For a continuous motion through the colon we compute such cubic panoramas along the central path every few millimeters with a high resolution rendering algorithm described in [6]. The single panoramas, which are called nodes in QuickTime terminology, must then be connected to enable the player to “jump” from one node to the next. This is accomplished
using so-called hot spots, which are just an additional image layer whose pixel values correspond to the node number the player may move to. These hot spot images are very simple and can be calculated automatically for colonoscopy, because there are no bifurcations. Thus, if we are at node n, the pixel value of the hot spot for all possible forward-looking directions, i.e. a 180° field of view from the initial viewing direction, is n+1, and the value for all backward-looking directions is n-1. The result is a multi-node cubic panorama movie. In contrast to other pre-computed movies, the amount of data is thus decisively reduced. In addition, the interaction with QTVR movies is very simple. As each panorama node has its own initial viewing parameters, i.e. pan and tilt angles and field of view, it is important for a smooth transition from one node to the next not to use these initial values but to preserve and transfer the parameters of the current node. We modified the behaviour of the player to handle this requirement, so that it becomes possible to move through the colon while inspecting the colon wall laterally. For diagnostic purposes it is not sufficient to provide endoscopic views only. For the radiologist, who is familiar with interpreting CT and MR images, it is a necessity to have access to the original data at any stage of the viewing process. Therefore we store not only the panorama images but also the corresponding z-buffers and the related viewing transformation matrices (fig. 3). With this
Fig. 3. Cubic panorama projections. Right: Image map of the 6 unfolded cube faces (front) and their corresponding z-buffers (rear). The distortion correction for viewing a panorama is performed automatically by the QuickTime player.
additional information available we can easily map any surface location back into the original gray-scale volume. Furthermore, this allows the simultaneous
display of different views such as 3D outside views, in which the current virtual camera position and orientation can be marked to facilitate orientation.
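The following sketch summarizes the two automatic constructions described in this section: the six 90° cube-face cameras computed at each node, and the forward/backward hot spot rule. The camera representation is an illustrative assumption and does not reproduce the actual rendering code.

    #include <vector>

    struct V3 { double x, y, z; };

    struct Camera {
        V3 position;   // node position on the central path
        V3 forward;    // viewing direction; field of view fixed at 90 deg
        V3 up;
    };

    // Six cube-face cameras for one panorama node: front, right, back,
    // left (rotations about the up axis), then straight up and straight
    // down. `right` is assumed to be the unit vector fwd x up.
    std::vector<Camera> cubeFaces(V3 pos, V3 fwd, V3 up, V3 right) {
        return {
            {pos, fwd, up},                              // 1: along path
            {pos, right, up},                            // 2: 90 deg right
            {pos, {-fwd.x, -fwd.y, -fwd.z}, up},         // 3: looking back
            {pos, {-right.x, -right.y, -right.z}, up},   // 4: 90 deg left
            {pos, up, {-fwd.x, -fwd.y, -fwd.z}},         // 5: straight up
            {pos, {-up.x, -up.y, -up.z}, fwd},           // 6: straight down
        };
    }

    // Hot spot value for node n: all forward-looking directions (within
    // 180 degrees of the path direction) link to node n+1, all
    // backward-looking directions to node n-1. Unit vectors assumed.
    int hotSpotTarget(int n, V3 viewDir, V3 pathDir) {
        double d = viewDir.x * pathDir.x + viewDir.y * pathDir.y +
                   viewDir.z * pathDir.z;
        return (d >= 0.0) ? n + 1 : n - 1;
    }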
3
Results
We applied the described method to about 20 contrast-enhanced MRI datasets of the colon. The datasets were segmented by thresholding, and the central path as well as up to 200 high resolution cubic panoramas, together with their corresponding z-buffers, were computed and stored in the QuickTime-VR movie format. Additionally, a matrix of 3D overview images that show the colon from outside was created to facilitate orientation. These images are stored as a QuickTime-VR object movie, which is very similar to a panorama movie. The radiologist can then access the final movie over the intranet on his desktop PC. He can “move” through the colon in real time while looking around in all directions searching for abnormalities. On the 3D overview he can verify where the virtual endoscope is currently located. Also, just by clicking on the colon surface at any location, the corresponding position is marked on three orthogonal cross-sections showing the original MR values. This helps the radiologist, who is familiar with 2D cross-sections, in the assessment of suspected lesions. Fig. 4 shows the simple graphical interface we have developed for virtual colonoscopy. Our radiologists
Fig. 4. User interface showing an endoscopic view (right), a 3D overview image (middle bottom) and three orthogonal MR cross-sections. The marked position on the colon wall (white cross) corresponds to the positions on the cross-sections and in the overview.
feel comfortable with the user interface. The entire procedure takes a couple of minutes to establish a diagnosis.
4
Conclusion
Using pre-computed cubic panorama images for virtual endoscopy allows quasi-real-time viewing and easy navigation on a standard PC. Especially the very simple user interface has proved helpful for the acceptance of the procedure. The method can also be applied to structures with bifurcations such as blood vessels or the bronchi. Fig. 5 shows as an example the bifurcation of the bronchi computed from the Visible Human data set [7]. As a decisive advantage, the panorama movie
Fig. 5. Panorama view of the main bifurcation of the bronchi computed from the Visible Human Dataset.
represents a document similar to classical video recordings of real endoscopic examinations, with additional functionality, that can be viewed with standard software on any PC or with browser technology in an intranet. Because computation time is not an issue, more sophisticated rendering methods such as automatic color labeling of suspicious regions can also easily be incorporated. If connected to a knowledge base [8], additional features such as symbolic descriptions and manipulation capabilities become possible, which allow generating even more comprehensive teaching and learning material [9]. Currently the number of panoramas in a single QuickTime-VR movie is limited to 255, which is not a real restriction for the application shown here. Further research should also take stereoscopic viewing into account.
References
1. M. Sato, S. Lakare, M. Wan, A. Kaufman, Z. Liang, and M. Wax. An automatic colon segmentation for 3D virtual colonoscopy. IEICE Transactions on Information and Systems, E84-D(1):201–208, January 2001.
2. Y. Zhou and A. W. Toga. Efficient Skeletonization of Volumetric Objects. IEEE Transactions on Visualization and Computer Graphics, 5(3):196–209, 1999.
3. A. Vilanova i Bartroli, R. Wegenkittl, A. König, and E. Gröller. Nonlinear Virtual Colon Unfolding. In Mike Bailey and Charles Hanson, editors, Proc. IEEE Visualization 2001, pages 91–98, San Diego, CA, October 2001.
4. M. Wan, W. Li, K. Kreeger, I. Bitter, A. Kaufman, Z. Liang, D. Chen, and M. Wax. 3D Virtual Colonoscopy with Real-time Volume Rendering. In Chin-Tu Chen and Anne V. Clough, editors, Proc. SPIE Medical Imaging 2000, volume 3978, pages 165–171, San Diego, CA, February 2000.
5. Apple Computer, Inc. Interactive Movies: QuickTime VR. http://developer.apple.com/techpubs/quicktime/qtdevdocs/PDF/insideqt_qtvr.pdf, 2001.
6. Ulf Tiede, Thomas Schiemann, and Karl Heinz Höhne. High Quality Rendering of Attributed Volume Data. In David Ebert, Hans Hagen, and Holly Rushmeier, editors, Proc. IEEE Visualization 1998, pages 255–262, Research Triangle Park, NC, 1998. (ISBN 0-8186-9176-X).
7. A. Pommert, K. H. Höhne, B. Pflesser, E. Richter, M. Riemer, T. Schiemann, R. Schubert, U. Schumacher, and U. Tiede. Creating a high-resolution spatial/symbolic model of the inner organs based on the Visible Human. Medical Image Analysis, 5(3):221–228, 2001.
8. Rainer Schubert, Bernhard Pflesser, Andreas Pommert, Kay Priesmeyer, Martin Riemer, Thomas Schiemann, Ulf Tiede, P. Steiner, and Karl Heinz Höhne. Interactive volume visualization using “intelligent movies”. In James D. Westwood, Helene M. Hoffman, Richard A. Robb, and Don Stredney, editors, Medicine meets Virtual Reality, Proc. MMVR ’99, volume 62 of Studies in Health Technology and Informatics, pages 321–327. IOS Press, Amsterdam, 1999.
9. K. H. Höhne, B. Pflesser, A. Pommert, K. Priesmeyer, M. Riemer, T. Schiemann, R. Schubert, U. Tiede, H.-C. Frederking, S. Gehrmann, S. Noster, and U. Schumacher. VOXEL-MAN 3D-Navigator: Inner Organs. Regional, Systemic and Radiological Anatomy. Springer-Verlag Electronic Media, Heidelberg, 2000. (3 CD-ROMs, ISBN 3-540-14759-4).
High Level Simulation & Modeling for Medical Applications – Ultrasound Case
A. Chihoub
Siemens Corporate Research Inc., 755 College Road East, Princeton, NJ 08540, USA
chihoub@scr.siemens.com
Abstract. In this paper we present the results of mapping and simulating the B-Mode (echo) and Doppler (flow) algorithms of ultrasound processing onto 1D and 2D based architectures. A scalable parallel architecture using commercial off-the-shelf DSP processors was simulated and evaluated for performance and cost feasibility as a replacement for an existing dedicated hardware solution. The results showed that such an architecture is both feasible and cost effective.
1
Introduction
Ultrasound processing generally has very high computational requirements. Developing the hardware necessary to meet such requirements in real time can be quite expensive. Until recently, meeting such demands required the development of custom-designed hardware. Recent advances in processor technology, combined with the drop in the price of digital signal processors (DSPs), have made it possible to develop cost-efficient parallel processing systems using off-the-shelf commercial processors for ultrasound applications. In this paper we present the mapping of the algorithms used to perform B-Mode (echo) and Doppler (flow) processing in a typical ultrasound machine onto a linear and a 2D mesh array of digital signal processors. We also present the simulation results from running the parallelized algorithms on the target architectures using a high level simulation tool. The goal is to verify the functionality and extract the performance measures of the parallelized algorithms on the target architectures. The rest of the paper is organized as follows. In the next section we briefly describe the algorithms used in B-Mode and Doppler processing in a typical ultrasound machine. In section three we describe the mapping of the algorithms to the linear and 2D mesh arrays. In section four we briefly describe the SES/Workbench simulation model we used to validate the approach. In section five we present the results of the simulation. Finally, in section six we give some concluding remarks.
2
Overview of Ultrasound Processing
In this section we give a brief overview of ultrasound processing and a summary of the processing steps encountered in B-Mode and Doppler processing.
[Figure 1 block diagram: Transducer → Beam Vector Formation → Routing → B-Mode Processing / Doppler Processing → Merge and Display.]
This overview is included to help the reader better understand the algorithm mapping discussion given in the next section. Figure 1 gives the flow diagram of the overall process. More details about ultrasound processing can be found in [1], [2] and [3]. Ultrasound data is typically gathered by sending acoustic pulses along selected directions. The reflected pulses are then received and conditioned into broadband RF beam vectors for further processing. Once the RF vectors are formed they are passed through the digital receivers to perform the coherent down-conversion and decimation filtering steps. The coherent down-conversion step down-converts the RF vectors into baseband signals, each with an in-phase and a quadrature component. The decimation filtering step is then used to independently low-pass filter and sub-sample the in-phase and quadrature components of each signal. The low-pass filtering step removes the unwanted frequency components generated by the coherent down-conversion step. The sub-sampling step reduces the size of each beam vector to the desired number of samples. Once the vectors pass through the digital receivers they are routed to the proper processing segment to perform the processing steps dictated by the selected mode of operation of the ultrasound machine. The major steps in typical B-Mode and Doppler processing pipelines are described next.
Fig. 1. Overview of ultrasound processing
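As an illustration of the digital receiver stage just described, the sketch below performs complex down-conversion of an RF vector to baseband, followed by low-pass FIR filtering with sub-sampling. Mixing frequency, sampling rate, and filter taps are placeholders, not the parameters of any particular scanner.

    #include <cmath>
    #include <complex>
    #include <vector>

    // Coherent down-conversion: multiply the real RF vector by a complex
    // exponential at the mixing frequency f0 (fs = sampling rate),
    // producing baseband samples with in-phase (real) and quadrature
    // (imaginary) components.
    std::vector<std::complex<double>> downConvert(
            const std::vector<double>& rf, double f0, double fs) {
        const double kPi = 3.14159265358979323846;
        std::vector<std::complex<double>> iq(rf.size());
        for (std::size_t n = 0; n < rf.size(); ++n) {
            double phi = -2.0 * kPi * f0 * static_cast<double>(n) / fs;
            iq[n] = rf[n] * std::complex<double>(std::cos(phi),
                                                 std::sin(phi));
        }
        return iq;
    }

    // Decimation filtering: low-pass FIR filter applied independently to
    // I and Q (here via complex arithmetic), keeping every `factor`-th
    // output sample to reduce the beam vector to the desired length.
    std::vector<std::complex<double>> decimate(
            const std::vector<std::complex<double>>& iq,
            const std::vector<double>& taps, int factor) {
        std::vector<std::complex<double>> out;
        for (std::size_t n = 0; n + taps.size() <= iq.size(); n += factor) {
            std::complex<double> acc(0.0, 0.0);
            for (std::size_t k = 0; k < taps.size(); ++k)
                acc += taps[k] * iq[n + k];
            out.push_back(acc);
        }
        return out;
    }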
2.1
B-Mode Processing
The first step in our B-Mode processing pipeline is magnitude calculation. In this step the filtered in-phase and quadrature (I, Q) components of the echo signal are used to calculate the magnitude of each vector for all samples. The next step is focal zone blending. In this step data from several beam vectors focused at different depths along the same direction are combined into a single vector to increase the spatial resolution. The focal zone blending is followed by a look-up table step, which compresses the dynamic range of the data using a logarithmic conversion function. An FIR filter with up to sixteen taps is then used to enhance the data along the axial (beam) direction in a step called axial edge enhancement. In systems with parallel beam formation features, the RF data may be down-converted with several different frequencies. The resulting vectors have the same direction but different, possibly overlapping, frequency bands. In such systems a step called frequency compounding is used to sum these sets of vectors into a single vector. The frequency compounding step is followed by the dynamic reject step, which removes low-level signals. A threshold value calculated from a neighborhood of samples around the sample of interest is used to accept or reject the sample. If a sample is
rejected, its new value is set to zero. A three-tap FIR filter is then used to provide additional edge enhancement in the lateral direction; this step is called lateral edge enhancement. The next step, black hole filling, compensates for the null regions caused by the coherent addition of individual scatterers, which appear as black spots (holes) in the image. The black hole filling step calculates a threshold for a given sample based on its two nearest neighbors in the lateral direction. If the difference between the sample’s value and either of its neighbors is less than or equal to the threshold, the sample is passed unchanged to the next processing stage. Otherwise, its new value is calculated as the average of its original value and that of one of its neighbors. The black hole filling step is followed by the variable persistence step, which applies an IIR filter between succeeding frames to determine how much the values in the current frame are allowed to decay with respect to the previous frame. User-selected persistence and dynamic range values are used to determine the filter coefficients. Finally, another look-up table implementing a logarithmic conversion is used to further compress the dynamic range of the data. The resulting data is passed to the scan converter for display; this step is not part of the echo processor and is not discussed here.
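The black hole filling rule might be sketched as follows. The text does not specify how the threshold is derived from the two lateral neighbors, so the fixed fraction used here is a placeholder assumption.

    #include <cmath>
    #include <vector>

    // Black hole filling along the lateral direction. `row` holds the
    // samples of one depth across adjacent beam vectors. If a sample is
    // within the threshold of either lateral neighbor it passes
    // unchanged; otherwise it is replaced by the average of itself and
    // one neighbor. The threshold (a fraction of the neighbor mean) is
    // an illustrative choice.
    std::vector<double> blackHoleFill(const std::vector<double>& row,
                                      double fraction = 0.5) {
        std::vector<double> out = row;
        for (std::size_t i = 1; i + 1 < row.size(); ++i) {
            double left = row[i - 1], right = row[i + 1];
            double threshold = fraction * 0.5 * (left + right);
            bool nearLeft  = std::abs(row[i] - left)  <= threshold;
            bool nearRight = std::abs(row[i] - right) <= threshold;
            if (!(nearLeft || nearRight))
                out[i] = 0.5 * (row[i] + left); // average with a neighbor
        }
        return out;
    }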
2.2
Doppler Processing
The first step in our Doppler processing pipeline is corner turning. In this step the I and Q flow vectors are re-arranged for ease of processing in later steps: the vectors come out of the digital receivers in depth order, and the corner turning step re-arranges them in time order. This step does not require any explicit processing; it can be achieved by writing and reading the data to and from memory in the proper sequence. The adaptive wall filter step removes the clutter component of the Doppler signal (the Doppler signal due to probe and tissue movement) while keeping the Doppler signal due to the flowing blood. This is done by passing the time-ordered I and Q vectors through a complex mixer so that the clutter component of the signal falls within the notch of two FIR filters used to remove the clutter from the I and Q vectors. The filtered I and Q may then be shifted back to the original frequency by passing them through the conjugate of the complex mixer used to do the original shift. This shifting step is application dependent and may be skipped. The frequency location of the clutter signal is calculated by the clutter estimator, and the calculated estimate is used to select the coefficients of the FIR filters that remove the clutter. The FIR filters may operate in one of three modes: linear time invariant (LTI) mode, circular mode, or linear time varying (LTV) mode. Once the wall filter step is completed, the filtered vectors of each ensemble are used to calculate a velocity estimate using autocorrelation. For each flow ensemble a triplet consisting of the power estimate P(n), the numerator N(n), and the denominator D(n) is calculated. The power P(n) is calculated as the averaged zero-lag autocorrelation of the signal. N(n) and D(n) are calculated as the components of the single-lag autocorrelation. The two are used later to estimate the mean frequency of the signal, which is calculated as the arctangent of the ratio N(n)/D(n).
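For a single depth position, the lag-0/lag-1 autocorrelation estimator just described might look as follows. Following the usual Kasai convention, the numerator N is taken as the imaginary part and the denominator D as the real part of the single-lag autocorrelation, so that atan2(N, D) yields the mean frequency; the ensemble layout is an assumed structure.

    #include <cmath>
    #include <complex>
    #include <vector>

    struct Triplet { double P, N, D; };

    // Autocorrelation velocity estimation for one depth position. `ens`
    // holds the wall-filtered complex (I,Q) samples of one flow ensemble
    // in time order (assumed non-empty). P is the averaged zero-lag
    // autocorrelation; N and D are the imaginary and real parts of the
    // single-lag autocorrelation.
    Triplet autocorrelate(const std::vector<std::complex<double>>& ens) {
        double P = 0.0;
        std::complex<double> r1(0.0, 0.0);
        for (std::size_t t = 0; t < ens.size(); ++t) {
            P += std::norm(ens[t]);                     // |x(t)|^2
            if (t + 1 < ens.size())
                r1 += std::conj(ens[t]) * ens[t + 1];   // lag-1 term
        }
        return {P / ens.size(), r1.imag(), r1.real()};
    }

    // Mean Doppler frequency (radians per pulse interval): atan of N/D.
    double meanFrequency(const Triplet& t) { return std::atan2(t.N, t.D); }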
The generated triplet vectors P(n), N(n) and D(n) are then filtered with an equi-weight FIR filter with up to sixteen taps. This step, called adaptive axial filtering, provides spatial smoothing along the flow direction. The size of the FIR filter may be adaptively changed in the depth direction as a function of an estimate calculated by filtering the power signal using samples eight positions ahead of those used by the equi-weight FIR filter. The goal is to increase the signal-to-noise ratio by increasing the smoothing in the axial direction as the flow signal amplitude decreases. The calculated estimate is used to look up the proper filter coefficients from a pre-computed table. The adaptive filter may also be used, when needed, to decimate the vectors to the required upper limit of 512 samples (to satisfy scan conversion flow requirements). Once the adaptive axial filter function is finished, a table look-up step is performed to reduce the dynamic range of the data for later processing steps. In addition, the N(n) and D(n) vectors are transformed from Cartesian to polar coordinates, generating the angle and magnitude vectors Φ(n) and R(n), respectively. The next step, the key hole filter, removes (sets to zero) the values of Φ(n), R(n) and P(n) where noise due to weak signals has corrupted the measured Doppler shift. This filter is implemented as a look-up table: the values of Φ(n) and R(n) are used to read an on-off bit that either passes the values of Φ(n), R(n) and P(n) unchanged to the next stage (if on) or sets them to zero (if off). Once the Φ(n), R(n) and P(n) vectors are key hole filtered, a 3x3 median filter is used to smooth the discontinuities in the flow data; the sorting step in the median filter is based on the phase data. The median filter step is followed by the persistence calculation, in which the phase Φ(n), magnitude R(n) and power P(n) are temporally filtered using an IIR filter acting as a decaying peak hold filter. The phase values from the current frame are compared with the phase values from the previous frame multiplied by a decay constant. If the phase value from the current frame is higher than the weighted one from the previous frame, the current phase, magnitude and power are passed to the next stage; otherwise the old values from the previous frame are passed instead. Finally, the phase and magnitude vectors are converted back into rectangular coordinates (back to N(n) and D(n)). In addition, the P(n), N(n) and D(n) values are compressed further using an additional table look-up step.
3
Mapping Methodology
The idea behind the method described here is to divide the input frame into segments as close in size as possible and then distribute them among the processing elements of the array to balance the computational load among all processors [4][5][6]. A graphical illustration of the load balancing scheme used to allocate the B-Mode and Doppler frames among processors for both 1D and 2D arrays is shown in figures 2 and 3 below. Once each processor receives its data segment, the processing steps described in the previous section can be started. Since these processing steps are now done in parallel, additional communication steps are required to transfer the vectors located at the boundaries of
each processor’s data segment to its neighbors. In the remainder of this section we describe the parallel implementation of the processing steps outlined in the previous section, along with the data transfers needed at each step. We concentrate, however, only on the steps that require communication of data between processors.
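A minimal sketch of such an even distribution, splitting n beam vectors among p processors with segment sizes differing by at most one, is shown below; it is a generic balanced partition, not the exact scheme of figures 2 and 3.

    #include <vector>

    // Split n vectors among p processors as evenly as possible: each
    // processor gets n/p vectors and the first n%p processors get one
    // extra. Returns the number of vectors assigned to each processor.
    std::vector<int> balanceLoad(int n, int p) {
        std::vector<int> counts(p, n / p);
        for (int i = 0; i < n % p; ++i)
            ++counts[i];
        return counts;
    }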
Fig. 2. Load balancing scheme for B-Mode processing for 1D and 2D arrays
Fig. 3. Load balancing scheme for Doppler processing for 1D and 2D arrays
3.1
B-Mode Processing
In this section we outline how the various processing steps were mapped onto the 1D and 2D architectures. The magnitude calculation step is local in our mapping approach: the load balancing procedure gives each processor an even number of I and Q vectors, so no data transfer between processors is needed for this step. In the focal zone blending step each processor performs the focal zone blending function on its local vectors. At the end of this step a processor may hold partially blended vectors, because the vectors needed to blend some zones together may reside on different (adjacent) processors. In such cases additional data transfer steps are needed to complete the blending. Two data communication steps (east and west) are used to transfer the partially blended vectors to the processor holding the smallest number of local vectors overall [5]; a vector addition step then finishes the zone blending. In the worst case each processor, except the ones on the boundary of the array, transfers up to two full vectors (one east and one west) in the case of 1D arrays. In the case of 2D arrays the size of the vectors transferred by each processor is divided by the size of the processor array in the vertical direction (assuming equal distribution). To perform axial edge enhancement a data transfer step along the axial (vertical) direction is needed so that local filtering of samples at the boundaries can be done. If the number of filter taps is M, each processor except the ones at the boundary of the array needs (M-1) data samples sent to it from its north and south neighbors respectively. These two communication steps are needed only in the case of 2D arrays; in the case of 1D arrays each processor holds full vectors and therefore no south/north
communication is needed. Once the north/south communication steps are finished, all needed data is local and the filtering step is done locally on each processor. The parallel frequency compounding function is performed in a way similar to the focal zone blending function discussed earlier. Essentially, each processor adds together all vectors assigned to it that were down-converted from the same RF vector. Some vectors belonging to the same RF vector may be split among neighboring processors; in such cases the partially summed vectors may need to be transferred to the appropriate east/west neighbor following the same criteria used in the focal zone blending, and the number of vectors transferred is also the same. A final vector addition step then adds together the partially compounded vectors, which now reside at the same node, for each processor in the array. The dynamic reject function also needs data transfer steps between neighboring processors along the vertical direction at the start of the computation; again, these transfers are only needed in the case of 2D arrays. If L is the neighborhood window length and M is the guard band length, the number of samples transferred by each non-boundary processor to its north and south neighbors is (L-M)/2. Once the north-south communication steps are done, the dynamic reject function is performed by all processors in parallel, with each processor operating on its local data. To do the lateral edge enhancement in parallel we need to transfer data along the horizontal direction: each processor, except the boundary ones, performs two data communication steps, to its east and west neighbors. For a three-tap FIR filter based enhancement, each processor would transfer two full vectors, one to its east and one to its west neighbor, in the case of 1D arrays. In the case of 2D arrays the number of vectors transferred by each processor is the same, but the size of each transferred vector is divided by the size of the processing array in the vertical direction. The data transfer requirements of the black hole filling step are exactly the same as those of the lateral edge enhancement step: each processor transfers one full vector to its east and west neighbors in the case of 1D, while in the case of 2D the size of the transferred vectors is divided by the dimension of the array in the vertical direction.
3.2
Doppler Processing
In the current implementation the load balancing method used for flow processing assigns each processor complete sets of ensembles. With this data assignment the adaptive wall filtering step becomes local to each processor, with no data transfers. The autocorrelation step is also performed on locally available full data ensembles, so no data transfer is needed. In the axial edge enhancement step the power, numerator and denominator vectors are filtered in the axial direction using an FIR filter (the current implementation uses a fixed kernel of length 16). To perform filtering in the axial direction a data transfer step along the axial (vertical) direction is needed to allow for the local filtering of samples at the boundaries. With 16 filter taps, each processor except the ones at the boundary of the array needs 15 samples sent to it from its north and
south neighbors respectively. These two communication steps are needed only in the case of 2D arrays; in the case of 1D arrays each processor holds full vectors and no data transfer is needed. The key hole filter function is implemented as a table look-up, so no data transfer between processors is needed for this step. To perform the median filter step on the vector triplets (Φ,R,P), data transfers in both the axial and lateral directions are needed. Since the filter uses a 3x3 window, each non-boundary processor needs three data samples (Φ,R,P) from each of its neighboring processors (east, west, north and south) in the 2D array case; in the 1D array case each non-boundary processor needs the three data samples (Φ,R,P) from its east and west neighbors. Once all the needed data is transferred, each processor performs the median filtering step locally. The persistence step does not require any data communication, as the needed data from the previous frame is stored locally on each processing node.
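The phase-keyed 3x3 median filter might be sketched as follows: the nine window samples are ordered by phase and the complete triplet holding the median phase is returned, keeping Φ, R and P consistent. This selection rule is our reading of the statement that the sorting is done on the phase data.

    #include <algorithm>
    #include <array>

    struct FlowSample { double phi, r, p; }; // phase, magnitude, power

    // 3x3 median filter keyed on phase: partially sort the nine window
    // samples by phi and return the sample whose phase is the median,
    // keeping its (phi, r, p) triplet together.
    FlowSample medianByPhase(std::array<FlowSample, 9> window) {
        std::nth_element(window.begin(), window.begin() + 4, window.end(),
                         [](const FlowSample& a, const FlowSample& b) {
                             return a.phi < b.phi;
                         });
        return window[4];
    }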
4
SES/Workbench Simulation Model
To verify our mapping approach and evaluate the performance of the mapped algorithms on the parallel architectures, we built a high level simulation model using the SES/Workbench tool [7]. The block diagram of the implemented 2D model is shown in figure 4. The SES/Workbench implementation of the model has several hierarchy levels; at each level the functionality of one of the model blocks is refined, and at the lowest levels the function to be performed is specified in C. The topology implemented in the model is dynamically reconfigurable to allow the implementation of both 1D and 2D topologies of various sizes. Timing models of different DSP processors were developed and attached to the functions to obtain accurate timing information.
5
Results
Using the SES/Workbench model described in the previous section, we performed the simulation using post-digital-receiver B-Mode and Doppler frames as input. The goal was to verify the functionality of the approach and to determine the performance characteristics of the parallelized algorithms on the two target architectures. In addition, we wanted to determine the scalability of the mapped algorithms on the two architectures [8][9][10][11]. The frames processed using the parallel approach were collected from the local memory of each processor at the end of each run and then displayed. Figures 5 and 6 show examples of a processed B-Mode and a processed Doppler frame; the B-Mode frame shows a liver example while the Doppler one shows a carotid artery. A few selected output frames from the parallel implementation were also compared on a pixel-by-pixel basis with the results obtained from the original (serial) approach to validate the parallel approach. The C code used in the SES/Workbench model was also compiled using the cross compilers of several candidate DSPs. Timing models were then generated to estimate
the performance of the system using such processors. The performance (scalability) curves for TI’s TMS320C40 DSP [12] are shown in figures 7 and 8.
Fig. 4. SES Simulation model.
Fig. 5. A parallel based B-Mode liver frame
Fig. 6. A parallel based Doppler carotid artery frame
The generated performance metrics, namely the speed-up curves, indicate that the approach is scalable to a high number of processors. This means that a machine based on such an architecture could be scaled to meet the performance requirements of market segments ranging from the low to the high end. The cost projections for such a machine, using the number of processors estimated from the simulation, indicate that the approach is also cost efficient. In our parallel approach, as the graphs show, the Doppler mode gives higher speed-up factors than the B-Mode. This reflects the fact that, in our parallel mapping and implementation, the communication overhead as a function of the number of processors is lower for Doppler than for B-Mode in the range of interest. Both speed-up curves are linear over a fairly wide range, though.
Fig. 7. Scalability of B-Mode algorithms on 1D and 2D arrays.
Fig. 8. Scalability of Doppler algorithms on 1D and 2D arrays.
6
Conclusions
In this paper, we have presented an overview of the ultrasound echo and flow processing pipelines. We also presented the parallelization/mapping of the algorithms used in the two pipelines onto architectures based on 1D and 2D processing arrays. The
goal was to determine the scalability of the parallel version of the algorithms on the two target architectures. We then built a high level simulation model to validate the approach. The results of the simulation verify the correctness and feasibility of the approach. The functional correctness of the approach was verified by comparing the frames processed using the parallel approach with those obtained from the uniprocessor implementation. The scalability of the approach was determined to be linear in the number of processors over a wide range.
References
1. J. A. Jensen, Estimation of Blood Velocities Using Ultrasound: A Signal Processing Approach, Cambridge University Press, Jan. 1996.
2. J. A. Zagzebski, Essentials of Ultrasound Physics, Mosby-Year Book, Jan. 1996.
3. L. R. Tarbox, "Description of Processing Algorithms in Ultrasound Scanners", Internal Report, SCR, Sept. 1993.
4. K. Hwang, Advanced Computer Architectures: Parallelism, Scalability, Programmability, McGraw-Hill, 1993.
5. F. T. Leighton, Introduction to Parallel Algorithms and Architectures: Arrays, Trees, Hypercubes, Morgan Kaufmann, San Mateo, CA, 1992.
6. S. Y. Kung, VLSI Array Processors, Prentice Hall, Englewood Cliffs, NJ, 1988.
7. SES Inc., SES/Workbench User's Manual, Release 2.1, Scientific and Engineering Software, Inc., 1992.
8. D. Nussbaum and A. Agarwal, "Scalability of Parallel Machines," Communications of the ACM, vol. 34, no. 3, pp. 57-61, 1991.
9. V. Kumar and A. Gupta, "Analyzing Scalability of Parallel Algorithms and Architectures," TR 91-18, Department of Computer Science, University of Minnesota, Minneapolis, MN 55455, June 1991.
10. M. D. Hill, "What is Scalability?," in Scalable Shared Memory Multiprocessors, M. Dubois and S. S. Thakkar (eds.), Kluwer Academic Publishers, 1991.
11. J. R. Zorbas, D. I. Rebele and R. E. VanKooten, "Measuring the Scalability of Parallel Computer Systems," Proceedings of Supercomputing '89, pp. 832-841, 1989.
12. Parallel Processing With the TMS320C40, Texas Instruments, 1994.
Generation of Pathologies for Surgical Training Simulators
Raimundo Sierra1, Gábor Székely1, and Michael Bajka2
1 Computer Vision Group, ETH Zürich, Switzerland, {rsierra,szekely}@vision.ee.ethz.ch
2 Clinic of Gynecology, Dept. OB/GYN, University Hospital of Zürich, Switzerland
Abstract. In the past few years virtual reality based systems have been proposed and realized for many medical interventions. These simulators have the potential to provide training on a wide variety of pathologies. So far, the realistic generation of anatomical variance and pathologies has not been treated as a specific issue. We report on a cellular automaton, specially developed to generate macroscopic findings fulfilling the requirements of a sophisticated simulation. The specific pathologies investigated are leiomyomas protruding to different extents into the uterine cavity. The automaton presented is part of a virtual reality based hysteroscopy simulator which is currently under development.
1
Introduction
The rapid development of complex minimally invasive surgery creates a strong demand for risk-free training environments. Therefore, surgical simulators play an increasingly important role in medical training and education. Increasing computational power, as well as recent achievements in the fields of interactive computer graphics and virtual reality, has already led to the rapid development of more or less sophisticated surgical simulators during the past years. These simulators demonstrate the potential power of virtual reality based training; however, they exhibit many unresolved problems in providing the fidelity needed for effective training. One key issue is the generation of anatomical models for the simulation. Today's simulators use single static organ models to build surgical scenes. Such an anatomical model is usually derived from an exemplary anatomy, such as MRI datasets of a volunteer, from specially acquired high-resolution datasets, like the Visible Human Project, or is artificially created with CAD [11,19]. To acquire surgical skills, it is highly desirable that the training scene differ from session to session. The configuration of a surgical scene entails both the anatomy of the healthy organ and the incorporation of pathologies. The goal is to generate anatomical models considering the natural variability of the healthy anatomy and to seamlessly integrate a wide spectrum of different pathologies according to specifications from physicians. Our current research aims at the development of a hysteroscopy simulator. Hysteroscopy is the visualization of the inner surface of the uterus by performing
a distension of the cavum uteri, realized through a single hull for manipulation and visualization. It makes minimally invasive surgery on the uterus possible and allows the physician to perform a specific treatment under organ-saving conditions. Hysteroscopy is the second most often performed endoscopic procedure after laparoscopy in gynecology. Because of the lack of alternatives, training is usually performed during actual interventions, with the assistance of an experienced gynecologist. A single organ model is inherently unable to represent the everyday situation of the operating site, thus obstructing the training effect of the simulator. The organs of any two patients will never be alike. Statistical anatomical models, such as the ones used for the incorporation of prior anatomical knowledge into the segmentation process [10,4], offer an appealing way to handle the variability of healthy human anatomy within the organ models used for simulation. The other requirement for a reasonably realistic surgical simulator is the ability to provide training on a wide variety of pathological cases. The large number of possible pathologies as well as the enormous range of their manifestations makes a similar statistical approach unreasonable, if not impossible. We therefore propose to model the pathologies by their genesis. In this work, the generation of pathologies for surgical simulators is addressed. The focus is on myomas that are visible from within the uterine cavity, so-called submucosal leiomyomas. Their clinical relevance as well as their relatively well-defined appearance makes them the best candidate for gaining a better understanding of the artificial generation of pathologies for surgical simulators.
2
Myoma Formation
Uterine (leio-)myomas are found in up to 25-40% of women in their childbearing years, being the most common benign tumors of the uterus in women over 35 years old. They are composed of smooth muscle and a variable amount of fibrous tissue. Blood supply is provided by one or two large vessels [8]. Most often myomas are classified into four types, depending on their position relative to the uterine wall: the intramural myoma, which is confined to the myometrium; the submucosal myoma, which protrudes into the uterine cavity; the subserosal myoma, which projects off the peritoneal surface of the uterus [14]; and the intraligamentary myoma, which protrudes into the surrounding ligaments. Both the subserosal and the submucosal myoma may be sessile or pedunculated, and the latter can become prolapsed through the cervix into the vagina. All myomas start growing as intramural myomas and are referred to as such as long as they do not vault the endometrium or the serosa. If the growing direction is towards the uterine cavity and the endometrium is vaulted, the myoma becomes a submucosal myoma. With continued growth the myoma can protrude through the uterine wall into the uterine cavity. The latter two cases
are visible and treatable by hysteroscopy and therefore of main interest for the simulator. For hysteroscopy, more detailed categorizations have been proposed [17]. Three types of myomas are discerned, which differ in the size of the intramural portion. Pedunculated myomas are classified as type 0. Myomas forming an acute angle with the surrounding uterine wall are predominantly intracavitary (type I, intracavitary portion > 50%), whereas the larger portion of a type II myoma is in an intramural location and the angle is obtuse. The type of myoma has implications for the hysteroscopy, as only type 0 and some type I myomas are regarded as safely resectable in one session, since the resection should never extend further than the inner border of the myometrium [3]. Leiomyomas may be solitary or multiple, and over 90% are found in the uterine corpus. Five percent arise in the cervix, and a smaller number are found in the broad ligament. The size of a myoma can vary from that of a pearl to that of a melon. A uterus with multiple myomas may even give the impression of a sack filled with different-sized potatoes [14]. Despite the amount of research in this area, the exact etiology of myomas is not known. It is assumed that the genesis is initiated by regular muscle cells with increased growth potential and that the growth of myomas is driven by estrogen. They are thus related to the function of the ovaries. Therefore myomas do not appear before puberty and do not emerge after menopause, when already existing myomas even tend to shrink. In general they grow slowly but continuously until the beginning of menopause [6]. An increase in volume by a factor of two usually takes several months or years. Slow-growing myomas tend to be squeezed out by the healthy surrounding muscular meshes. Therefore, they seem to migrate over months or years towards the inner surface (endometrium, submucosal) or towards the outer surface (serosa, subserosal). Fast-growing myomas, which are potentially malignant, tend to overwhelm this process by stretching out and thinning the healthy surrounding myometrium. They are able to completely deform the organ's appearance. A myoma has a much stronger tendency to keep its shape than any of the tissues surrounding it, as it is composed of very dense fibrotic tissue. Therefore, the myoma will be able to grow almost independently from its surroundings, keeping a spherical shape. This holds for both the intramural and the submucosal myoma. The surrounding tissue consists of clustered myometrium. There is no actual capsule around the myoma. The tissue of the myoma as well as the surrounding tissue of the myometrium has a layered structure. This often simplifies the resection of myomas, as they can be peeled off the myometrium [14]. The endometrium is a highly reactive tissue covering the whole uterine cavity as well as protruding myomas of any degree. Therefore the endometrium defines the myomas' visual appearance. The myometrium is an active muscular mesh which exhibits slow waves of contractions. This mechanism extrudes any tumor or foreign body affecting the uterine cavity and finally leads to pedunculated myomas.
Table 1. List of required and neglected features
Required:
- realistic shape
- fully automatic generation
- randomness
- provide information for: texturing, blood perfusion, biomechanical modeling
- incorporation in organ model

Neglected:
- exact cellular interaction [15,7]
- stability of growth [1]
- patient-specific modeling [18]
- biomechanical deformation of surrounding tissue [12]
- observation of the growing process [9]
For a diagnostic description of a myoma the physician will specify: its location within the uterus (fundal, corporal, or cervical), the degree of protrusion into the cavity (type 0, type I, or type II), and the lengths of the three main axes [2].
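For concreteness, such a diagnostic description could be encoded as simulator input along the following lines; the type and field names below are our own illustrative assumptions and are not taken from the authors' implementation.

```c
/* Hypothetical encoding of the physician's myoma specification used to
 * initialize the generation process; names and units are illustrative. */
typedef enum { FUNDAL, CORPORAL, CERVICAL } MyomaLocation;
typedef enum { TYPE_0, TYPE_I, TYPE_II }    MyomaProtrusion;  /* cf. [17] */

typedef struct {
    MyomaLocation   location;     /* position within the uterus           */
    MyomaProtrusion protrusion;   /* degree of protrusion into the cavity */
    float           axes_mm[3];   /* lengths of the three main axes in mm */
} MyomaSpec;
```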
3
Modeling Requirements
The generation of pathologies for surgical simulators has to fulfill a number of application-specific requirements, which are summarized in Table 1. Tumor growth has previously been modeled with different objectives in mind, but most of the features of these models can be neglected in this application. The main property needed in a surgical simulator is a realistic appearance of the pathology. A surgical training scenario will be configured by a physician and should not need any additional interaction with a simulator expert. Therefore the generation process has to be fully automatic after initialization. The physician has to be able to specify a desired pathology in medical terminology, i.e. the specification of the pathology type and optionally the definition of size and position. An alternative to the definition of the size is the specification of the tumor's age. The actual generation procedure can be computed off-line, and no modifications of the tumor size or position are needed during simulation. That is, the pathology itself does not change during the intervention, but of course it might be altered by the trainee, for example by cutting. With respect to the implementation, the goal is to incorporate a wide range of variations in size, geometry and position within one framework. This demands a nondeterministic model. Once the shape of the pathology has been generated, additional properties that are needed by the simulator have to be added, such as the texturing, the blood perfusion and the biomechanical properties. At the end of the generation process, the tumor has to be seamlessly incorporated into the organ model.
4
Implementation
Cellular automata are predestined for the modeling of pathologies, as they imitate the development simply by applying the same set of rules multiple times.
Once a cellular automaton is able to generate a pathology in one of its most developed stages, any of the intermediate stages can be obtained with no additional effort. The main advantages of using a cellular automaton are the simplicity and extendibility of the implementation. Rules can easily be added, removed or modified. Computational stability is intrinsically a part of a cellular automaton [5]. The synthesis of a cellular automaton is equivalent to seeking a minimal set of rules that allows one to model a certain behavior. There will always be a trade-off between realism and tractability of the model. Real myomas consist of millions of cells, which is more than can reasonably be modeled in a computer simulation. An exact modeling of single cells also implies the use of a non-regular mesh [9]. This approach may be used to model exactly the behavior of a tumor in its very early stage, but it will not be manageable at the orders of magnitude considered. Thus, the actual value in a node of the cellular automaton can be regarded as a cell conglomeration rather than as a single cell. The implemented cellular automaton comprises two cell types and the background or cell-free space. The first cell type is the tissue, which consists of the muscle cells of the myometrium; the second cell type represents the tumor cells. The background describes the uterine cavity. Additional factors can be added, e.g. to model the influence of the contraction of the uterine muscles. A single node with a tumor component is enough to initiate the growing process of a myoma. The cellular automaton is defined by a regular, three-dimensional, cubic lattice $L$, an interaction neighborhood template $N_b$, the set of elementary states $E$, and the local space- and time-independent transition rules $R_i$. The local rules are either probabilistic or deterministic. As the cell of the automaton does not model a biological cell, the term node is used to avoid misinterpretation. A node is specified by its position $p = (x, y, z)$ in the lattice. The neighborhood template $N_b$ specifies the nodes that influence the state of the node under scrutiny. In the automaton described, $N_b$ is rule-dependent and can be either a 6-neighborhood ($N_6$, von Neumann neighborhood) or a 26-neighborhood ($N_{26}$, Moore neighborhood). Whenever possible the smaller neighborhood was selected. The set of elementary states $E_{tumor}$ for the tumor is upper bounded by 1 and defined as a multiple of the predefined step $\Delta = 1/n$, $n \in \mathbb{N}$, whereas the set $E_{tissue}$ is represented by the floating point values in the range [0, 1]. With each node, a tumor and a tissue channel can be associated, $c_{tumor}(p)$ and $c_{tissue}(p)$; thus tumor and tissue do not exclusively occupy a node. The idea is that only nodes with a value of $c_{tumor}(p) = 1$ are considered to be part of the tumor, while any value smaller than 1 indicates a reactive shell around it. The tissue is initialized with values around 0.5, with decreasing values towards the surface. At least one node needs a tumor component with a value $t_0 \geq \Delta$ for the growing process to start. The strict concept of a cellular automaton is relaxed to allow the integration of global knowledge in two aspects: global cost functions can be evaluated and
a global rule $R_{global}$ controlling the application of the single rules $R_i$ is introduced. The different rules $R_i$ are applied sequentially in a loop, but each rule is applied synchronously on every node. The application of a rule $R_i$ is the transition from time step $t$ to $t+1$ with $R : E^\nu \to E$, where $\nu = |N_b|$. A first rule $R_{grow}$ determines the growing process of the tumor:
$R_{grow}: \quad c_{tumor}(p)^{t+1} = \min\left(1,\; c_{tumor}(p)^t + \Delta\right)$
A node will receive a tumor component with a certain probability if one of its neighbors in $N_{26}$ has a tumor component. If the neighbor is a direct one, i.e. in $N_6$, the probability of becoming a tumor node is close to one; otherwise the probability is much smaller. Once a tumor component is in a node, it is continuously incremented by the same rule. Three objectives are modeled with this rule: the spherical shape of the myoma, the inhomogeneity of the surface, and the reactive shell around the tumor. A global cost function for the current tumor position is computed. For each node with a tumor component, the amount of tissue as well as the gradient value for its neighbors in $N_6$ are separately added and summed into six overall costs $C_i$. The rule $R_{moving}$ then moves the tumor in the direction $d$ corresponding to the optimal cost $\min(C_i)$:
$R_{moving}: \quad c_{tumor}(p)^{t+1} = c_{tumor}(p - d)^t$
A third rule $R_{adaption}$ models the adaptation of the surrounding tissue to the new situation. In a first pass, the displacement of the tissue introduced by the growing tumor is modeled. In this rule the incremental property of $R_{grow}$ is used:
$R_{adaption}: \quad c_{tissue}(p)^{t+1} = \min\left(1,\; c_{tissue}\big(\arg\max_{q \in N_{26}} c_{tumor}(q)\big)^t + c_{tissue}(p)^t\right)$
$c_{tissue}\big(\arg\max_{q \in N_{26}} c_{tumor}(q)\big)^{t+1} = 0$
In a second pass, the displacement is propagated into the surrounding area by smoothing the tissue state. This could be represented by an additional rule, but for simplicity Gaussian filters of variable length are used in the actual implementation. A final rule $R_{close}$ closes the covering hull of the myoma so that no node with a tumor component ever touches the background in $N_{26}$. This rule is introduced to ensure that the endometrium always covers the tumor, but it can be skipped if the relaxation area is large enough, i.e. by applying $R_{adaption}$ several times. The global rule $R_{global}$ defines which rule $R_i$ is applied. This rule is time-dependent, so that the sequence of rules $R_i$ applied changes during evolution. This allows for modeling a faster movement of the tumor while it is small. As the tumor size increases, the rule $R_{moving}$ is applied less often to model a slower motion. As soon as the tumor is pedunculated, this rule can once again be applied more frequently.
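To make the update scheme concrete, the sketch below shows one synchronous application of $R_{grow}$ on a double-buffered cubic lattice. The node layout, the acquisition probabilities p6 and p26, and all helper names are our own illustrative assumptions; this is not the authors' code.

```c
/* One synchronous step of the growth rule R_grow on a cubic lattice.
 * cur/next implement the synchronous update; delta = 1/n is the state
 * step; p6 (close to one) and p26 (much smaller) are the probabilities
 * of acquiring a tumor component from a direct / diagonal neighbor. */
#include <stdlib.h>

#define NDIM 100
#define IDX(x, y, z) (((z) * NDIM + (y)) * NDIM + (x))

typedef struct { float tumor, tissue; } Node;   /* c_tumor, c_tissue */

static float frand(void) { return (float)rand() / (float)RAND_MAX; }

void apply_grow(const Node *cur, Node *next,
                float delta, float p6, float p26)
{
    for (int z = 1; z < NDIM - 1; z++)
    for (int y = 1; y < NDIM - 1; y++)
    for (int x = 1; x < NDIM - 1; x++) {
        Node *n = &next[IDX(x, y, z)];
        *n = cur[IDX(x, y, z)];
        if (n->tumor > 0.0f) {                  /* increment and clamp */
            n->tumor += delta;
            if (n->tumor > 1.0f) n->tumor = 1.0f;
            continue;
        }
        int near6 = 0, near26 = 0;              /* scan N26 neighborhood */
        for (int dz = -1; dz <= 1; dz++)
        for (int dy = -1; dy <= 1; dy++)
        for (int dx = -1; dx <= 1; dx++) {
            if (!dx && !dy && !dz) continue;
            if (cur[IDX(x + dx, y + dy, z + dz)].tumor > 0.0f) {
                near26 = 1;
                if (abs(dx) + abs(dy) + abs(dz) == 1) near6 = 1;
            }
        }
        if (near6)       { if (frand() < p6)  n->tumor = delta; }
        else if (near26) { if (frand() < p26) n->tumor = delta; }
    }
}
```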
Fig. 1. Artificial myoma after 15, 25, 35 and 43 iterations.
Fig. 2. Comparison of real (left) and two artificial myomas.
As stated in Section 2, the physician will specify the myoma by its location, protrusion and size, so that an explicit time-volume relation cannot be used. Therefore, the Gompertz model for tumor growth, which has been proposed on different occasions [9,15], is not suitable for this application. Through adaptation of the global rule $R_{global}$ the desired myoma can be generated. By counting the number of applications of rule $R_{grow}$ and multiplying this number by the respective probability for $N_6$, one can keep track of the volume. Once the desired value is reached, the procedure exits the main loop.
5
Results
This very simple model is sufficient to produce highly satisfactory results and fulfill the requirements of a simulator. Figure 1 shows an exemplary sequence of the growing myoma. The procedure is fully automatic and does not need any interaction with a physician. The volume where the tumor was grown consisted of a cubic lattice with $100^3$ nodes. For the conversion to a surface model the marching cubes algorithm was used [13]. No additional factors were used, and the global rule states that the moving rule is applied less often as the tumor grows. Figure 2 shows an image of a real myoma on the left, which was taken during hysteroscopy. The two following images are synthetic myomas generated with the cellular automaton described. Texture images were taken from real hysteroscopies and mapped on the artificial surfaces. The visual inspection shows a high degree of realism and proves the approach to be suitable for the random
generation of myomas for hysteroscopy simulation. This and more examples are available for download at http://www.vision.ee.ethz.ch/~rsierra/miccai. The differentiation between tumor and normal tissue is needed both for the incorporation of the vascularization and for the biomechanical properties. The vascularization is confined to the healthy tissue around the myoma. Different biomechanical properties can be assigned to the tumor and the tissue, so that the pathology consists of a stiff inner sphere surrounded by a softer tissue layer. The structures that can be generated by the proposed cellular automaton are not limited to myomas in the uterine cavity. The automaton can easily be tuned to produce other structures where a growing object is more rigid than the surrounding media. Validation of the resulting myomas is a major task, as it is in general for any training system. To our knowledge there are currently no other systems that generate artificial pathologies for surgical training devices which could serve as a reference. The structures described have been subjectively analyzed by experienced gynecologists, who attested to their very high visual resemblance to actual cases. In the future, validation will be twofold. On the one hand, the training system will be evaluated, which also entails the behavioral and visual aspects of the myoma. Defining useful metrics for measuring resemblance as well as training efficiency is a preliminary task towards objective measurements. On the other hand, the growing process of tumors will be investigated more deeply and compared with the cellular automaton described.
6
Conclusion and Future Research
A cellular automaton that is able to generate submucosal myomas has been described. The generated tumors show a high resemblance to real cases and can be seen as a step forward in the creation of high-fidelity simulators. Future research will, on the one hand, address the generation of other pathologies and, on the other hand, incorporate the existing tumor generation model in more complex situations. In the next step, the possible pathologies will be extended to incorporate surgically relevant degenerations of the uterus. The incorporation of the existing model in the fundus and corpus of the uterus is straightforward. Close to the fallopian tubes, where the myometrium is much thinner, a myoma will deform more than one surface at a time. In such cases, the incorporation into the organ model has to be further investigated. The generation of a vascular system for the pathology is closely related to the pathology itself. In the future we plan to merge the generation of the pathology with the generation of the vascular system [16].
Acknowledgments. This work has been performed within the frame of the Swiss National Center of Competence in Research on Computer Aided and Image Guided Medical Interventions (NCCR CO-ME) supported by the Swiss National Science Foundation.
References
1. J. Adam. A simplified mathematical model of tumor growth. Math. Biosci., 81:229-244, 1986.
2. M. Bajka. Empfehlungen zur Gynäkologischen Sonographie. Schweizerische Gesellschaft für Ultraschall in der Medizin, 2001.
3. P. Brandner, K. Neis, and P. Diebold. Hysteroscopic resection of submucous myomas. Contrib Gynecol Obstet., 20:81-90, 2000.
4. Cootes et al. Active shape models - their training and application. Computer Vision and Image Understanding, 61(1):38-59, 1995.
5. S. Dormann. Pattern Formation in Cellular Automaton Models. PhD thesis, Universität Osnabrück, August 2000.
6. F. H. Netter. Farbatlanten der Medizin, Band 3: Genitalorgane. Georg Thieme Verlag, Stuttgart, New York, second edition, 1987.
7. H. Greenspan. On the growth and stability of cell cultures and solid tumors. J. theor. Biol., 56:229-242, 1976.
8. A. Heuck and M. Reiser. Abdominal and Pelvic MRI. Springer, 2000.
9. A. Kansal et al. Simulated brain tumor growth dynamics using a three-dimensional cellular automaton. J. theor. Biol., 203:367-382, 2000.
10. Kelemen et al. Elastic model-based segmentation of 3-D neuroradiological data sets. IEEE Transactions on Medical Imaging, 18(10):828-839, 1999.
11. C. Kuhn. Modellbildung und Echtzeitsimulation deformierbarer Objekte zur Entwicklung einer interaktiven Trainingsumgebung für Minimal-Invasive Chirurgie. Forschungszentrum Karlsruhe GmbH, Karlsruhe, 1997.
12. S. Kyriacou et al. Nonlinear elastic registration of brain images with tumor pathology using a biomechanical model. IEEE Transactions on Medical Imaging, 18(7):580-592, 1999.
13. W. Lorensen and H. Cline. Marching cubes: A high resolution 3D surface construction algorithm. Computer Graphics, 21(4):163-170, July 1987.
14. Pschyrembel, Strauss, and Petri. Praktische Gynäkologie für Studium, Klinik und Praxis. de Gruyter, Berlin, New York, fifth edition, 1990.
15. A. Qi et al. A cellular automaton model of cancerous growth. J. theor. Biol., 161:1-12, 1993.
16. D. Szczerba. Macroscopic modelling of vascular systems. Submitted to MICCAI, 2002.
17. K. Wamsteker, M. Emanuel, and J. de Kruif. Transcervical hysteroscopic resection of submucous fibroids for abnormal uterine bleeding: Results regarding the degree of intramural extension. Obstet Gynecol, 82:736-740, 1993.
18. R. Wasserman and R. Acharya. A patient-specific in vivo tumor model. Math. Biosci., 136:110-140, 1996.
19. LASSO Project. http://www.vision.ee.ethz.ch/projects/Lasso/start.html, 2001.
Collision Detection Algorithm for Deformable Objects Using OpenGL
Shmuel Aharon and Christophe Lenglet
Imaging and Visualization Department, Siemens Corporate Research, Princeton, NJ
[email protected]
Abstract. This paper describes a collision detection method for polygonal deformable objects using OpenGL, which is suitable for surgery simulations. The method relies on the OpenGL selection mode, which can be used to find out which objects or geometrical primitives (such as polygons) in the scene are drawn inside a specified region, called the viewing volume. We achieve a significant reduction in the detection time by using a data structure based on an AABB tree. The strength of our method is that it does not require the AABB hierarchy tree to be updated from bottom to top. We use only a limited set of bounding volumes, much smaller than the object's number of polygons. This enables us to perform a fast update of our structure when objects deform. Therefore, our approach appears to be a reasonable choice for collision detection of deformable objects.
1
Introduction
Many interactive virtual environments, such as surgery simulations, need to determine if two or more surfaces are colliding, that is, if there are surfaces that are touching and/or intersecting each other. Finding the exact locations of these areas is a key process in this kind of application. For realistic interactions/simulations these calculations require good timing performance and accuracy. Collision detection algorithms have been published extensively. The most general and versatile algorithms are based on bounding volume hierarchies to detect collisions between polygonal models. These algorithms are primarily categorized by the type of bounding volume that is used at each node of the hierarchy tree: axis aligned bounding boxes (AABB) [1], object oriented bounding boxes (OOBB) [2], or bounding spheres [3], [4], [5]. The main limitation of these algorithms is that for deformable objects, one needs to update (or re-build) the hierarchy trees at every step of the simulation. This is a time-consuming step that significantly reduces the efficiency of these algorithms. As far as we can tell, there are only two algorithms known today that use graphics hardware acceleration for collision detection. One of them is the approach of Hoff et al. [6], which is limited to collisions between two-dimensional objects, or to some specialized three-dimensional scenes, such as those whose objects collide only in a two-dimensional plane. The second approach, suggested by Lombardo et al. [7], uses the OpenGL selection mode to identify collisions between polygonal surfaces. However, it is limited to collisions between a deformable polygonal surface and an object with a very simple
shape, such as a cylinder or a box. Furthermore, the performance of this algorithm decreases significantly as the object's number of polygons increases. Therefore, it is limited to objects with a relatively small number of polygons. This paper presents a new approach for collision detection using OpenGL which allows a fast and accurate way to detect the collisions between the individual polygons of deformable objects.
2
The Collision Detection Algorithm
The collision detection algorithm suggested here detects collisions between a specified object, the reference object, and all other objects in the scene. We assume that each object is built from a set of polygons, which are either triangles or quadrangles. The following is a description of the various steps needed to complete a collision query with our algorithm. The algorithm is based on rendering the scene in selection mode, using an orthographic camera, with OpenGL graphics. Selection is a mode of operation for OpenGL which automatically tells which objects in the scene are drawn inside a specified region, called the viewing volume. Before rendering, it is necessary to provide a "name" for each object/primitive in the scene. After rendering in selection mode, OpenGL returns the list of all the "names" of the objects/primitives that are drawn inside the viewing volume. Further details about the OpenGL selection mode can be found in [7], [9].
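As a concrete illustration, the fragment below shows the selection-mode pattern the algorithm relies on: an orthographic viewing volume is set to the query box, named candidates are rendered, and the returned hit records are parsed. This is a minimal sketch using standard OpenGL 1.x calls; buffer sizing, error handling and the caller-supplied drawing routine are simplified.

```c
/* Minimal OpenGL selection-mode query: returns how many named
 * primitives were drawn inside the axis-aligned box [lo, hi]. The
 * caller's draw function is expected to glLoadName() each primitive. */
#include <GL/gl.h>

#define SEL_BUF_SIZE 4096

int select_in_box(const float lo[3], const float hi[3],
                  void (*draw_named)(void))
{
    GLuint buf[SEL_BUF_SIZE];
    glSelectBuffer(SEL_BUF_SIZE, buf);
    glRenderMode(GL_SELECT);                   /* enter selection mode */

    glMatrixMode(GL_PROJECTION);
    glPushMatrix();
    glLoadIdentity();
    /* Orthographic viewing volume = the query box (identity modelview). */
    glOrtho(lo[0], hi[0], lo[1], hi[1], -hi[2], -lo[2]);

    glInitNames();
    glPushName(0);
    draw_named();                              /* render the candidates */

    glPopMatrix();
    GLint hits = glRenderMode(GL_RENDER);      /* leave selection mode  */

    /* Hit records are {name-stack depth, z-min, z-max, names...}. */
    const GLuint *p = buf;
    for (GLint i = 0; i < hits; i++) {
        GLuint depth = *p++;
        p += 2;                                /* skip z-min / z-max    */
        /* p[0 .. depth-1] identify one primitive inside the volume.    */
        p += depth;
    }
    return (int)hits;
}
```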
2.1 Bounding Volumes Hierarchy Creation
As a pre-processing step, it is necessary to divide the surface of each object into a set of Axis Aligned Bounding Boxes (AABB), in such a way that each bounding box contains no more than a specified number of polygons, and each polygon belongs to one and only one bounding box. We start building an AABB tree using the method suggested in [1], with the following modifications. Only the root and the leaves of the tree are used, ignoring all its internal nodes. This allows us to perform very fast updates, since there is no need to consider the complete tree structure. We stop subdividing a bounding box when the number of polygons it contains is below a specified threshold. It may happen that the final bounding box has large empty spaces. This will be the case if it contains polygons that are not continuously connected in space. To prevent this, we further subdivide it into two bounding boxes as described in [1]. If the resulting two bounding boxes are completely separated, we keep them; otherwise, we keep their parent bounding box.
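The pre-processing step can be pictured as the following recursion; the splitting and fitting helpers stand in for the procedures of [1], and the threshold is a rule-of-thumb value, so this is a sketch of the structure rather than the authors' exact code.

```c
/* Build only the leaves of the AABB hierarchy: split along the box's
 * longest axis (as in [1]) until each leaf holds at most MAX_POLYS
 * polygons; internal nodes are discarded, keeping root + leaves only. */
#define MAX_POLYS 300

typedef struct { float lo[3], hi[3]; } AABB;

extern AABB fit_box(const int *polys, int n);           /* refit helper */
extern int  split_longest_axis(int *polys, int n);      /* as in [1]    */
extern void emit_leaf(AABB box, const int *polys, int n);

void build_leaves(int *polys, int n)
{
    AABB box = fit_box(polys, n);
    if (n <= MAX_POLYS) {            /* small enough: keep as a leaf    */
        emit_leaf(box, polys, n);
        return;
    }
    int k = split_longest_axis(polys, n);
    build_leaves(polys, k);          /* recurse; intermediate boxes are */
    build_leaves(polys + k, n - k);  /* never stored                    */
}
```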
2.2 The Collision Query
Step 1: Find the objects’ bounding boxes that intersect with the global bounding box of the reference object. This is done by defining the global bounding box of the reference object as the OpenGL viewing volume, and rendering all the bounding boxes of all other objects, defined as triangle strips, in selection mode, using a different
"name" for each one. We use triangle strips, which are the most optimized OpenGL primitive, to ensure high performance [8]. Collisions can occur only in the area of the bounding boxes that intersect with the global bounding box of the reference object. Therefore, only these bounding boxes will be processed in the next step. If there is no bounding box that intersects the global bounding box of the reference object, then there is no collision and the algorithm stops.
Step 2: Find the bounding boxes of the reference object that intersect with the bounding boxes found in the previous step. This is done, similarly to step 1, by defining the OpenGL viewing volume as one of the bounding boxes found in the previous step, and rendering all the bounding boxes of the reference object, defined as triangle strips, in selection mode, using a different "name" for each one. Repeat this process for all of the bounding boxes found in the previous step. The goal of this step is to have, for each bounding box from an object that is potentially involved in a collision, the list of all bounding boxes from the reference object that intersect with it, if any.
Step 3: Find the list of all the reference object's polygons that have potential collisions with an object (or objects) in the scene. For a given object bounding box B, found in step 1, render in selection mode all the polygons contained in the bounding boxes from the reference object found in step 2 to intersect B. This allows a major computational saving, since it can greatly reduce the final number of polygon-polygon intersection checks. In order to minimize the number of viewing volume definitions, this step can be combined with step 2.
Step 4: Find all the polygons, from all objects, that intersect with the bounding boxes of the reference object. These polygons have potential collisions with the reference object. To do this, define, as was done in step 3, the OpenGL viewing volume as one of the reference object bounding boxes that has known intersections with one or more objects' bounding boxes. Then render in selection mode all the polygons from the objects' bounding boxes that were found to intersect with this reference object bounding box, using a different "name" for each polygon. Repeat this procedure for all the reference object bounding boxes. The result of this processing provides the following for each reference object bounding box:
• The list $L_r^i$ of the potentially colliding polygons inside this bounding box (i) of the reference object (found in step 3).
• The list $L_{r_i}^{jk}$ of the polygons potentially colliding with polygons of $L_r^i$ inside the bounding boxes (k) from object (j) of the scene (found in step 4), where j goes from 1 to the number of the scene's objects (excluding the reference object), and k goes from 1 to the number of bounding boxes for object j.
This limits the number of polygons that have possible collisions, and hence significantly reduces the number of polygon-polygon intersection tests.
Step 5: Find the polygons of the reference object that are colliding with polygons from an object, or objects, and the list of these polygons. For every polygon P in an $L_r^i$ list, with i going from 1 to the number of reference object bounding boxes, find whether or not this polygon really intersects the polygons from the $L_{r_i}^{jk}$ lists. To do so, define the polygon's viewing volume to be a tightly fitting volume around the polygon P (see section 2.3 for details). Then render in selection mode all the polygons in an $L_{r_i}^{jk}$ list, giving a different name to each of them. Every polygon that is found inside the specified polygon's viewing volume actually intersects the given polygon P. The accuracy of this detection algorithm is limited by how accurately the polygon's viewing volume limits the region that this polygon occupies in the world, and can be made as good as needed with no additional cost. Repeat this step for all the non-empty $L_{r_i}^{jk}$ lists, and for all the $L_r^i$ lists.
This step provides the list of all the polygons from the reference object and the polygons from the scene's object(s) that intersect. This is the desired result of the collision detection algorithm.
2.3 Defining the Polygon's Viewing Volume
The goal is to define a small region tightly covering a given polygon, P, as the OpenGL viewing volume, and then render all other polygons of interest in selection mode to find out if they are drawn inside the specified polygon viewing volume, which means that they are intersecting with it. The accuracy of this detection method is defined by the accuracy of the viewing volume definition, and how well it really describes the region that this polygon occupies in the world. The rendering is done using an orthographic camera. In this case the OpenGL viewing volume is a rectangular parallelepiped (or, more informally, a box), defined by 6 values representing 6 planes named left, right, bottom, top, near and far. Below are the steps to define a polygon's viewing volume. First, find the polygon's two-dimensional bounding box that resides in its plane. Then specify the viewing volume around this bounding box; that is, define the left, right, bottom, and top planes of the viewing volume as the four edges of this bounding box. Next, define the depth of the volume to be a very small number, ε, which specifies the required accuracy (we found that ε = 0.001 gives good results). That is, specify the near and far planes to be
$-\varepsilon/2$ and $+\varepsilon/2$ from the polygon's plane along the polygon's normal. If the polygon P under consideration is a triangular polygon, one needs to add two clipping planes to limit the viewing volume to a pyramid tightly fitted around the polygon, as shown in Fig. 1. This method can be easily adapted to quadrangular polygons; in this case, three clipping planes might be needed to limit the viewing volume to the quadrangle edges.
2.4 Updating the AABB Structure for Object's Deformations
When dealing with deformable objects, it is necessary to update the Axis Aligned Bounding Boxes (AABB) structure after every step of the simulation. As mentioned before, we only keep the root of an AABB tree (the global bounding box of an
object), and its leaves. Due to the relatively small number of bounding boxes that need to be updated, this task is performed rather quickly. Every bounding box is re-fitted around all the polygons that it contains. We further optimized this step by using the Streaming SIMD Extensions (SSE) provided by Intel processors since the release of the Pentium III processor (see [10] for details). This allows us to perform the update of the AABB structure twice as fast.
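The refit itself reduces to a component-wise min/max over the vertices of each leaf, which maps directly onto SSE. The sketch below assumes vertex positions padded to xyz0 quadruples and 16-byte aligned; it illustrates the idea rather than reproducing the authors' optimized routine.

```c
/* SSE-accelerated refit of one leaf box around its vertices, using the
 * Streaming SIMD Extensions intrinsics [10]. Vertices are stored as
 * aligned xyz0 quadruples; the fourth lane is ignored. */
#include <xmmintrin.h>

typedef struct { float lo[4], hi[4]; } AABB4;  /* padded, 16-byte aligned */

void refit_box(AABB4 *box, const float *xyz0, int nverts)
{
    __m128 lo = _mm_load_ps(xyz0);             /* start from first vertex */
    __m128 hi = lo;
    for (int i = 1; i < nverts; i++) {
        __m128 v = _mm_load_ps(xyz0 + 4 * i);
        lo = _mm_min_ps(lo, v);                /* four mins in parallel   */
        hi = _mm_max_ps(hi, v);
    }
    _mm_store_ps(box->lo, lo);
    _mm_store_ps(box->hi, hi);
}
```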
Fig. 1. Viewing Volume fitting a triangular polygon (shaded). For clarity only one clipping plane is shown
3
Results
We performed several tests of our collision detection algorithm in order to evaluate its performance and the effect of both the maximum number of polygons allowed in a bounding box and the object's total number of polygons on the detection time. The tests were done on an Intel Pentium III 930 MHz processor with a Matrox Millennium G450 32 MB graphics card. All reported values are the average of a set of about 2000 collisions with approximately 10-15 colliding polygons. We tested the algorithm using a model of a scalpel consisting of 128 triangles, and one of the following polygonal models: Face (1252 triangles), Teapot (3752 triangles), Airways (14436 triangles), Colon (32375 triangles), and Spinal Column with Hips (51910 triangles). The last three models were generated from clinical CT data.
3.1 Effect of Number of Polygons per Bounding Box on the Performance
The effect of the number of polygons allowed in a bounding box on the performance of our collision detection method is shown in Fig. 2, for objects with different numbers of polygons. As expected, increasing the number of polygons in a bounding box decreases the collision detection performance, since it requires rendering a large number of polygons for every box that has potentially colliding polygons and performing many polygon-polygon intersection tests. On the other hand, having a small number of polygons in a bounding box leads to too many boxes for each mesh, which increases the number of boxes that have to be tested and also decreases the efficiency of the detection algorithm.
Fig. 2. Effect of the number of polygons in a bounding box (BBox) on the performance of the collision detection (upper-left), the update of the AABB tree structure (upper-right), and the total iteration time (bottom) for objects with different numbers of triangular polygons.
However, as can be seen in Fig. 2, the performance of the collision detection is not very sensitive to the selection of the number of polygons in a bounding box. Therefore, a simple rule of thumb can be used to specify this number: about 300-600 polygons per bounding box gives the best performance for large objects (more than 5000 polygons), and 50-150 polygons per bounding box for small objects (fewer than 5000 polygons). It is worth mentioning that changing the number of polygons in a bounding box, and hence the number of bounding boxes, hardly affects the performance of updating the AABB structure. This is not surprising, since no matter how many bounding boxes we have, we need to process all the polygons within them, that is, all the object's polygons.
3.2 Effect of the Object's Number of Polygons on the Performance
The effect of the object’s number of polygons on the collision detection performance is shown in Fig. 3.
Fig. 3. Effect of the object’s number of polygons on the collision detection performance
As can be seen in Fig. 3, the collision detection time increases with the total number of polygons. However, the increase rate is far below a linear one (i.e., O(n)). This means that our algorithm can efficiently handle deformable models with a large number of polygons without a huge penalty in performance. This is the main strength of the algorithm presented here.
3.3 Performance Evaluation
Our goal was to compare the performance of our method to others. However, not all of the implementations were available to us. Therefore we performed a ballpark evaluation using public benchmark information [11]. Using the benchmark information, we were able to estimate the difference in performance between the machines used to provide the timing information of the various collision detection methods. Although this gives only ballpark estimates, it is sufficient to provide an idea of how well our algorithm performs in comparison to others. The Object Oriented Bounding Box (OOBB) method [2], used by RAPID, is very efficient in the collision query. However, it is necessary to rebuild or refit the OOBB tree at every step when objects deform. This is a time-consuming task (on the order of tens of milliseconds for objects with a couple of thousand polygons, see [1]) that makes this method unsuitable for use with deformable objects. The AABB method is also a very fast method for collision queries, and it can be updated for object deformations [1]. However, updating its tree structure is still the bottleneck of this method. As reported in [1] (and estimated for our machine), it takes about 1.5 milliseconds to update the AABB tree for an object with 3752 polygons, while our modified AABB structure can be updated in 0.39 milliseconds. The cost of the tree update increases significantly with the number of polygons using the AABB method. Therefore, although the AABB tree performs a collision query much faster than our method, the overall performance for each iteration is slower than ours, in particular for large objects. Finally, Brown et al. [5] suggested a method to update the bounding sphere tree reported by Quinlan [4]. They reported a time of about 0.02 milliseconds per triangle for the tree structure updates; that is, on the order of 20 milliseconds for an object with 1000 triangles, while with our method we can update 50000 triangles in about 5 milliseconds. This again implies that our method is much more efficient for large deformable objects.
4
Conclusions
We have presented an algorithm for collision detection between polygonal deformable objects using OpenGL. The algorithm is based on the OpenGL selection mode combined with a structure of axis aligned bounding boxes. It performs well on deformable objects with a large number of polygons, with a relatively small cost in performance when increasing the number of polygons. This method is particularly suitable for surgery simulations, where fast interaction is essential.
References
1. G. van den Bergen: Efficient collision detection of complex deformable models using AABB trees. Journal of Graphics Tools, vol. 2, no. 4, pp. 1-13, 1997.
2. S. Gottschalk, M. C. Lin, D. Manocha: OBB Tree: a hierarchical structure for rapid interference detection. Proceedings of the 23rd International Conference on Computer Graphics and Interactive Techniques (SIGGRAPH '96), New Orleans, LA, USA, 4-9 Aug. 1996.
3. E. Larsen, S. Gottschalk, M. C. Lin, and D. Manocha: Fast distance queries with rectangular swept sphere volumes. Proceedings 2000 ICRA, IEEE International Conference on Robotics and Automation, vol. 4, San Francisco, CA, USA, 24-28 April 2000.
4. S. Quinlan: Efficient distance computation between non-convex objects. Proceedings of the 1994 IEEE International Conference on Robotics and Automation, vol. 4, San Diego, CA, USA, 8-13 May 1994.
5. J. Brown, S. Sorkin, C. Bruyns, J. C. Latombe, K. Montgomery, and M. Stephanides: Real-Time Simulation of Deformable Objects: Tools and Application. Computer Animation, Seoul, Korea, November 7-8, 2001.
6. K. E. Hoff, A. Zaferakis, M. C. Lin, and D. Manocha: Fast and simple 2D geometric proximity queries using graphics hardware. Proceedings of the 2001 Symposium on Interactive 3D Graphics, pp. 145-148, ACM Press, New York, NY, USA, 2001.
7. J. C. Lombardo, M. P. Cani, and F. Neyret: Real-time collision detection for virtual surgery. Proceedings Computer Animation, Geneva, Switzerland, pp. 82-90, 26-29 May 1999.
8. F. Evans, S. Skiena, and A. Varshney: Optimizing Triangle Strips for Fast Rendering. Proceedings of IEEE Visualization '96, pp. 316-326, 27 October - 1 November 1996.
9. M. Woo, J. Neider, T. Davis, and D. Shreiner: OpenGL Programming Guide, Third Edition. Addison-Wesley, Massachusetts, USA, 2000.
10. Intel Corporation: Data Alignment and Programming Issues for the Streaming SIMD Extensions with the Intel C/C++ Compiler. January 1999.
11. SPEC CPU95 Benchmark. Standard Performance Evaluation Corporation. www.spec.org.
Online Multiresolution Volumetric Mass Spring Model for Real Time Soft Tissue Deformation
Celine Paloc1, Fernando Bello2, Richard I. Kitney1, and Ara Darzi2
1 Dept. of Bioengineering, Imperial College, London, UK
2 Dept. of Surgical Oncology and Technology, Imperial College, London, UK
[email protected]
Abstract. Recent years have seen an increase in the acceptance of and demand for Virtual Reality surgical simulators. Although significant advances have been made in the area, real-time accurate simulation of soft tissue deformation is still a major obstacle when developing simulators with haptic feedback. In this paper we present a new multi-resolution volumetric mass-spring model that offers high visual and haptic resolution in and around the region of interaction and other critical regions. Visual and haptic resolution decreases in proportion to the distance from such regions, making it possible to distribute the computational workload optimally in order to achieve real-time haptic simulation.
1
Introduction
Surgical simulation is an extremely challenging area of research combining medical imagery, computer graphics and mathematical modelling. Recent advances make it possible to represent complex tissue structures and perform virtual flythrough operations, but a great deal of research in soft tissue modelling is still needed to develop the next generation of surgical simulators.
1.1 Physically-Based Deformable Models
Various approaches founded on the laws governing the dynamics of non-rigid bodies have been proposed for simulating deformable soft tissue. The Finite Element Method (FEM) is a common and accurate way to compute complex deformations of soft tissue, but conventional FEM has a high computational cost and large storage requirements. Hybrid models based on global parameterized deformations and local deformations based on FEM have been introduced [1,2,3] to tackle this problem. Large-scale multi-processor computers have also been employed to obtain soft tissue deformation at interactive rates [4], while in [5,6] pre-computed elementary deformations and speed-up algorithms were used. Most of these methods, however, are only applicable to linear deformations and valid for small displacements. Furthermore, they tend to rely on pre-computing the complete matrix system and are therefore unable to cope with topological changes occurring during cutting or tearing. Mass Spring Systems (MSS) have been widely used in soft tissue simulation [7,8,9,10] because of their ability to generate dynamic behaviors that allow real time deformation and topological changes. The main limitation of MSS is
computing the mass and spring parameters in order to set up a homogeneous material. Since damped springs are positioned along the edges of a given mesh, the geometrical and topological structure of this mesh strongly influences the material behavior and may generate undesired anisotropy. This tends to disappear as the density of the mesh increases, but using an extremely dense mesh reduces efficiency. For these reasons, a trade-off between accuracy and computational time is typically required in MSS.
1.2 Multiresolution in Physically-Based Simulation
Multiresolution is a very active field of research in computer graphics and image processing. It consists of using representations of a geometric object at different levels of accuracy and complexity. This concept can be extended to physically-based simulation by dynamically and locally adapting the density of the mesh in regions of interest, depending on the desired accuracy. Most of the work in deformable modelling has used fixed space discretization. However, there have been some attempts at using the idea of locally refining a model in and around regions of interest. Hutchinson et al. [11] simulated a piece of draped cloth with an MSS which can be refined in regions of high curvature, using a multi-level hierarchical mesh to represent varying levels of granularity. More recently, a model based on a multiresolution triangulation stored with a pre-processed DAG was used to refine a volumetric mass-spring network near user-controlled cutting lines [12]. Debunne et al. [13] combined a linear finite-volume based mechanical model with a non-hierarchical refinement technique, and Wu et al. [14] proposed a scheme for mesh adaptation based on an extension of the progressive mesh concept for simulation with non-linear FEM. The problem with these approaches is the use of a pre-processing phase. Such pre-processing fixes the range and accuracy of the stored resolutions and limits the flexibility of the method. Moreover, the pre-processed resolutions depend on the topology of the object and thus prohibit topological modifications.
1.3 Refinement for Unstructured Mesh Generation
Implementation of a flexible multiresolution soft tissue model requires the online refinement/simplification of an unstructured mesh. Starting with a coarse mesh, a refinement procedure based on traditional unstructured mesh generation algorithms can be applied until the desired nodal density has been achieved. One of the existing approaches to element refinement is to divide an element into several ones by inserting a single node inside or on the boundary, depending on its location in the mesh [15,16]. The quality of the resulting elements can be improved by deleting the local elements and connecting the nodes to the triangulation using the Delaunay criterion. Our refinement scheme is based on Shewchuk's Delaunay refinement algorithm for 3D quality mesh generation [16]. The remainder of this paper is organized as follows: Section 2 explains in detail the proposed online multiresolution approach. In Section 3 we present and discuss the results of applying the new approach, making a comparison with standard single-resolution techniques. Lastly, in Section 4 we formulate our conclusions and comment on our future work.
2
Methodology
Recently, we presented a volumetric MSS that offers topological and geometric flexibility for the efficient modelling of complex anatomical structures and simulation of interactions such as cutting or suturing [17]. Building on our model, we now introduce a flexible and truly dynamic multiresolution volumetric mass-spring representation offering high visual and haptic resolution in and around the region of interaction and other critical regions.
2.1 Behavior Consistency
One of the main problems in multiresolution models is ensuring that the deformable model stays self-consistent despite changes of resolution. Since there has never been any underlying physical model to refer to in order to find what parameter changes will guarantee the most consistent behavior at different resolutions, it has been commonly assumed that it is difficult to change the density of the mesh during the simulation while maintaining the same global mechanical properties. To address this problem, we studied the oscillations of a deformable tissue block under gravity at several resolutions using different parameter definitions as described below. Some frames of the simulations are presented in Figure 1. Figures 2 and 3 show the results of the simulation at three different resolutions and at combined resolutions.
Fig. 1. Deformable tissue block (a) under gravity forces using different parameter definitions. (b) single resolution. (c) combined resolutions using $k = 1/l$ and constant damping. (d) combined resolutions using (2) and (3).
Point mass. In our volumetric MSS, masses are allocated at the vertices of the tetrahedral mesh and damped springs along the edges. To accurately distribute the total mass of the mesh, we compute the mass $m_i$ of each vertex i according to the volumes $V_j$ of its adjacent tetrahedra j. If D is the material density, then:

$m_i = \frac{D \sum_j V_j}{4}$ (1)

Spring stiffness. The easiest way is to use a constant value for the stiffness k. More commonly, k is computed as $k = 1/l$, where l is the length of the spring at rest. In [18], Van Gelder suggested a formula to compute spring stiffness for a 3D mesh that is the closest to an elastic continuous representation. Let E be the material elastic modulus, then:
Fig. 2. Comparison of the vertical position of an oscillating cube at several and combined resolutions using different k: (a) constant. (b) k = 1l . (c) Using (2).
k=
E
j l2
Vj
(2)
Figure 2 shows the behavior of the tissue block using (1) to obtain the mass of each element and the above stiffness definitions. It clearly illustrates that using a constant $k$ or $k = 1/l$ fails to ensure the same amplitude and frequency of oscillations at different resolutions, while using (2) results in a consistent physical behavior at different and combined resolutions.

Spring damping. The question of how to assign different damping ($c$) values to the various springs in a MSS has been largely ignored in the literature. Traditionally, $c$ is treated as a constant throughout the system. If we assume that our multi degree-of-freedom (DOF) system can be transformed into a set of uncoupled single-DOF systems, the damping ratio $d_i$ of each spring taken separately can be expressed as

$$d_i = \frac{c}{2\sqrt{kM}}$$

where $k$ is the spring stiffness and $M$ the effective end mass $m_i + m_j$. To limit the oscillations of the system without overdamping it, we let $d_i = 1$, such that $c = 2\sqrt{kM}$. We performed the same simulation as before using (1) and (2) to calculate $m$ and $k$, first letting $c$ be a constant and then defining it as $c = 2\sqrt{kM}$. Figure 3(b) shows that the proposed formula guarantees the best frequency consistency for different resolutions, but fails to ensure the same amplitude of oscillation, which tends to increase with the resolution. To compensate for this effect, we adjust $c$ to be inversely proportional to $l$, the length of the spring at rest:

$$c = \frac{2\sqrt{kM}}{l} \quad (3)$$

As shown in Figure 3(c), this new formula ensures the best behavior consistency for different and combined resolutions. We have thus demonstrated that it is possible to ensure a coherent physical behavior of a volumetric mass-spring system at different and combined resolutions by dynamically updating the parameters using equations (1), (2) and (3).
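As a rough illustration of how equations (1)–(3) might be applied in practice, the following Python sketch recomputes the point masses and the per-spring stiffness and damping after a mesh update. The mesh representation used here (a vertex array, tetrahedra as index 4-tuples, springs as index pairs) is an assumption made for illustration, not the authors' implementation.

```python
import numpy as np

def tet_volume(p0, p1, p2, p3):
    # Volume of a tetrahedron from its four corner positions.
    return abs(np.dot(np.cross(p1 - p0, p2 - p0), p3 - p0)) / 6.0

def update_parameters(vertices, tets, springs, D, E):
    """Recompute mass-spring parameters following eqs. (1)-(3).
    vertices: (n, 3) array; tets: index 4-tuples; springs: index pairs;
    D: material density; E: elastic modulus."""
    m = np.zeros(len(vertices))
    vol_around_edge = {}  # sum of volumes of tetrahedra adjacent to each edge
    for tet in tets:
        V = tet_volume(*(vertices[i] for i in tet))
        for i in tet:                       # eq. (1): m_i = (D/4) * sum_j V_j
            m[i] += D * V / 4.0
        for a in range(4):                  # accumulate volumes per edge
            for b in range(a + 1, 4):
                e = tuple(sorted((tet[a], tet[b])))
                vol_around_edge[e] = vol_around_edge.get(e, 0.0) + V
    k, c = {}, {}
    for (i, j) in springs:
        e = tuple(sorted((i, j)))
        l = np.linalg.norm(vertices[i] - vertices[j])  # rest length
        k[e] = E * vol_around_edge[e] / l**2           # eq. (2)
        M = m[i] + m[j]                                # effective end mass
        c[e] = 2.0 * np.sqrt(k[e] * M) / l             # eq. (3)
    return m, k, c
```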
2.2 Online Tetrahedral Refinement
In order to refine a given mesh at a desired location, we extend the concept of tetrahedral Delaunay refinement. A coarse mesh is created offline by forming an initial boundary-constrained Delaunay tetrahedralization of the input vertices and triangles.
Fig. 3. Comparison of the vertical position (y displacement vs. number of iterations) of an oscillating cube at several and combined resolutions using different c: (a) constant; (b) c = 2√(kM); (c) c = 2√(kM)/l. Curves: single resolution with 87, 187, and 553 nodes; combined resolution with 269 nodes.
The input vertices are used as a reference and cannot be deleted. During online simulation, the given mesh can be locally refined/simplified by inserting/deleting vertices. In this section, we present the details of our method.

Addition of mass points. During the simulation, additional mass points can be inserted in a tetrahedron of the previous triangulation in its resting configuration. Using the barycentric coordinates of the point within the tetrahedron, we interpolate its position and velocity in the deformed mesh.

Delaunay refinement. The resting configuration of the mesh is then updated using the Bowyer/Watson algorithm [19,20] to maintain the Delaunay property (a sketch of this insertion step is given at the end of this subsection). Once all the new points have been inserted, the tetrahedra which might have appeared in the concavities are removed.

Vertex removal. If a vertex is removed, all tetrahedra incident at the vertex are also removed, leaving a "hole" in the mesh. The re-triangulation of the hole is done by building the Delaunay triangulation of the adjacent vertices [21].

Parameters update. Each time a tetrahedron is created or removed, all relevant parameters (m, k, c) are updated using equations (1), (2) and (3).

Data structures. The efficiency of the online tetrahedral refinement depends heavily on the data structures. We use dynamic memory reallocation and two mesh data structures for mesh manipulation: the tetrahedron-based data structure [22] and the triangle-edge data structure for face classification and mesh manipulation [23]. The data structures have been extended by adding various flags to quickly trace mesh changes after an update, and specific attributes related to the mesh deformation.
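For reference, the Bowyer/Watson insertion step mentioned above can be summarized as in the following sketch. The mesh API (circumsphere test, face/neighbor adjacency, add/remove methods) is hypothetical and only serves to make the three phases of the algorithm explicit; the actual implementation relies on the dedicated data structures of [22,23].

```python
def insert_point_bowyer_watson(mesh, p):
    """Insert point p into a Delaunay tetrahedralization (Bowyer/Watson)."""
    # 1. Find all tetrahedra whose circumsphere contains p (the "cavity").
    cavity = {t for t in mesh.tetrahedra if t.circumsphere_contains(p)}
    # 2. Collect the boundary faces of the cavity: faces shared with
    #    tetrahedra outside the cavity (or lying on the mesh surface).
    boundary = []
    for t in cavity:
        for face, neighbor in zip(t.faces, t.neighbors):
            if neighbor is None or neighbor not in cavity:
                boundary.append(face)
    # 3. Delete the cavity and re-triangulate: connect p to every boundary
    #    face, which restores the Delaunay property locally.
    for t in cavity:
        mesh.remove(t)
    for face in boundary:
        mesh.add_tetrahedron(face, p)
```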
2.3 Refinement Locations
An essential step in the implementation of our model is to determine where to refine the mesh. Our refinement locations are defined based on the region of interaction and on the topology of the mesh.

Region of interaction. The interactions between the surgical instruments and the soft tissue structures are handled by our collision detection algorithm. We approximate the geometric models of the instruments with a set of oriented
bounding boxes (OBBs) and have further optimized the collision detection by also using axis-aligned bounding box (AABB) trees for both the surgical instruments and the soft tissue models. AABB trees can be updated very quickly as the shape of the model changes [24], allowing us to compute in real time the exact points of intersection between the surgical instrument and the soft tissue model. New points are then inserted into the soft tissue mesh at the points of intersection and the mesh is refined using the online tetrahedral refinement method described above. Our collision detection also allows us to quickly compute the distance between the new points inserted into the original mesh and the OBBs of the surgical instrument. If the distance is larger than a specified threshold, the point is removed from the mesh using the vertex removal algorithm.

Mesh topology. We use a quality refinement procedure to incrementally insert new points in the areas of high strain in the mesh. Such areas are normally represented by low-quality tetrahedra in the initial coarse mesh input. One possible measure for analyzing the quality of a tetrahedron is its circumradius-to-shortest-edge ratio (computed as sketched below). Any tetrahedron whose circumradius-to-shortest-edge ratio is larger than a bound B is split by inserting a vertex at its circumcenter. By using different values for the threshold B, it is possible to make the degree of refinement dependent on local strain.
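The quality measure can be evaluated directly from the four corner positions of a tetrahedron. The following sketch is one way to do this, assuming NumPy; the function names are ours, not from the paper.

```python
import numpy as np

def circumradius_to_shortest_edge(p0, p1, p2, p3):
    """Quality of a tetrahedron: circumradius divided by shortest edge
    length. Lower is better (about 0.612 for a regular tetrahedron)."""
    pts = [p0, p1, p2, p3]
    # Circumcenter x satisfies |x - p0|^2 = |x - pi|^2 for i = 1..3,
    # which linearizes to 2 (pi - p0) . x = |pi|^2 - |p0|^2.
    A = np.array([2.0 * (pts[i] - p0) for i in (1, 2, 3)])
    b = np.array([np.dot(pts[i], pts[i]) - np.dot(p0, p0) for i in (1, 2, 3)])
    center = np.linalg.solve(A, b)
    R = np.linalg.norm(center - p0)          # circumradius
    shortest = min(np.linalg.norm(pts[i] - pts[j])
                   for i in range(4) for j in range(i + 1, 4))
    return R / shortest

def needs_split(tet_points, B):
    # Split (insert a vertex at the circumcenter) if the quality exceeds B.
    return circumradius_to_shortest_edge(*tet_points) > B
```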
3 Results and Discussion
To demonstrate the accuracy and efficiency of our refinement method, we simulated the deformation of an organ under the interaction of a surgical instrument using single (low/high) resolution and multiresolution models. The multiresolution model incorporated refinement at the regions of interaction as described in Section 2. Every simulation lasted 20 seconds, including the time during which the instrument approaches and leaves the organ. Figure 4 shows three frames taken during each simulation as the instrument is deforming the organ. They clearly illustrate the lack of deformation accuracy of the low-resolution model, whereas the high-resolution and multiresolution models show similar behavior. The underlying physical model described in Section 2.1 allows the refined mesh to stay perfectly consistent despite the combination of several resolutions. Unlike continuous models, which introduce diverging instabilities when different resolutions are combined [13], our refinement model behaves smoothly, without any visual artifacts. In order to quantify the performance of our model, we plotted for each simulation the elapsed time of every iteration, broken down into its different steps. The results are shown in Figure 4. The relaxation time of the multiresolution model, being proportional to the density of the mesh, is kept as small as that of the low-resolution model. Figure 4(c) also shows that the time required for the refinement is small enough to keep the simulation at a rate much faster than that of the high-resolution model. In fact, the iteration rate of the multiresolution model varies between 2 and 20 times that of the high-resolution model.
Fig. 4. Performance comparison between different models: elapsed time (ms) per iteration, broken down into collision, relaxation, and refinement steps. (a) Single low resolution. (b) Single high resolution. (c) Multiresolution.
4 Conclusions and Future Work
We have developed a soft tissue model that allows localized online refinement, offering high visual and haptic resolution in the regions of interest. Our results show that the refinement is completely transparent to the user, as the model stays self-consistent despite the changes of resolution. Our method for 3D mesh refinement is truly dynamic, offering full flexibility even in the case of topological changes. In fact, we believe that our multiresolution model will facilitate the implementation of topology-modifying interactions such as cutting, since we are able to refine the model along the path of a surgical instrument. As part of our future work, we will implement an original method for accurate soft tissue cutting. We also plan to develop a multifrequency relaxation for our model. The integration time step of a classical MSS is closely dependent on the spring parameters. Since our parameters are directly proportional to the resolution, we shall optimize the relaxation time in relation to the density of the mesh. This will enable us to distribute the computational workload and help us towards achieving real-time haptic simulation.
References

1. Delingette H., Cotin S., and Ayache N.A. Hybrid elastic model allowing real-time cutting, deformations and force-feedback for surgery training and simulation. In Computer Animation, 1999.
2. Ramanathan R. and Metaxas D. Dynamic deformable models for enhanced haptic rendering in virtual environments. In IEEE Virtual Reality, pages 31–35, 2000.
3. Jianyun C., Jian S., and Zesheng T. Hybrid FEM for deformation of soft tissues in surgery simulation. In IEEE Medical Imaging and Augmented Reality, pages 298–303, 2001.
4. Szekely G., Brechbuhler C., Hutter R., Rhomberg A., Ironmonger N., and Schmid P. Modelling of soft tissue deformation for laparoscopic surgery simulation. Medical Image Analysis, 4, March 2000.
5. Cotin S., Delingette H., and Ayache N.A. Real-time elastic deformations of soft tissues for surgery simulation. IEEE Transactions on Visualization and Computer Graphics, 5(1):62–73, 1999.
6. James D.L. and Pai D.K. ArtDefo – accurate real time deformable objects. In Siggraph 1999, Computer Graphics Proceedings, pages 65–72, Los Angeles, 1999.
7. Joukhadar A. and Laugier C. Fast dynamic simulation of rigid and deformable objects. In IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), August 1995.
8. Desbrun M., Schröder P., and Barr A. Interactive animation of structured deformable objects. In Graphics Interface, pages 1–8, 1999.
9. Kühnapfel U., Çakmak H., and Maaß H. Endoscopic surgery training using virtual reality and deformable tissue simulation. Computers and Graphics, 24:671–682, 2000.
10. Brown J., Montgomery K., Latombe J.C., and Stephanides M. A microsurgery simulation system. In MICCAI, pages 137–144, 2001.
11. Hutchinson D., Preston M., and Hewitt T. Adaptive refinement for mass/spring simulations. In Computer Animation and Simulation '96, pages 31–45.
12. Cignoni P., Ganovelli F., and Scopigno R. Introducing multiresolution representation in deformable modeling. In SCCG, pages 149–158, April 1999.
13. Debunne G., Desbrun M., Cani M.P., and Barr A.H. Adaptive simulation of soft bodies in real-time. In CA, pages 15–20, 2000.
14. Wu X.M. Adaptive nonlinear finite elements for deformable body simulation using dynamic progressive meshes. Eurographics, pages 439–448, 2001.
15. Ruppert J. A Delaunay refinement algorithm for quality 2-dimensional mesh generation. J. Algorithms, 18(3):548–585, 1995.
16. Shewchuk J.R. Tetrahedral mesh generation by Delaunay refinement. In Symposium on Computational Geometry, pages 86–95, 1998.
17. Paloc C., Kitney R.I., Bello F., and Darzi A. Virtual reality surgical training and assessment system. In Computer Assisted Radiology and Surgery, pages 207–212, June 2001.
18. Van Gelder A. Approximate simulation of elastic membranes by triangulated spring meshes. Journal of Graphics Tools, 3(2):21–41, 1998.
19. Bowyer A. Computing Dirichlet tessellations. Computer Journal, 24:162–166, 1981.
20. Watson D. Computing the n-dimensional Delaunay tessellation with application to Voronoi polytopes. Computer Journal, 24:167–172, 1981.
21. Renze K. and Oliver J. Generalized surface and volume decimation for unstructured tessellated domains. In VRAIS, pages 111–121, March 1996.
22. Shewchuk J.R. Delaunay Refinement Mesh Generation. PhD thesis, School of Computer Science, Carnegie Mellon University, Pittsburgh, Pennsylvania, May 1997.
23. Mücke E.P. Shapes and Implementations in Three-Dimensional Geometry. PhD thesis, Department of Computer Science, University of Illinois at Urbana-Champaign, Urbana, Illinois, 1993.
24. Van Den Bergen G. Efficient collision detection of complex deformable models using AABB trees. Journal of Graphics Tools, 2(4):1–13, 1997.
Orthosis Design System for Malformed Ears Based on Spline Approximation

Akihiko Hanafusa¹, Tsuneshi Isomura¹, Yukio Sekiguchi¹, Hajime Takahashi², and Takeyoshi Dohi³

¹ Department of Rehabilitation Engineering, Polytechnic University, 4-1-1 Hashimotodai, Sagamihara, Kanagawa 229-1196, Japan
[email protected]
² Department of Plastic Surgery, Tokyo Metropolitan Toshima Hospital, 33-1 Sakaemachi, Itabashi-ku, Tokyo 173-0015, Japan
[email protected]
³ Graduate School of Information Science and Technology, The University of Tokyo, 7-3-1 Hongou, Bunkyou-ku, Tokyo 113-8654, Japan
[email protected]

Abstract. Malformed ears of neonates can be effectively treated by employing an orthosis of suitable shape. Currently, we use orthoses made of nitinol shape memory alloy wire and have developed a computer-assisted design system to manufacture the orthosis. Using this method, extracted contours of the helix and auriculotemporal sulcus are approximated by splines, and the orthosis shape can be designed by moving the control points of the splines with reference to the control points of the target auricular shape. The system also provides a function to evaluate the contact force between the orthosis and the auricle. Using this system, orthoses were designed and manufactured for 16 patients with malformed ears. Treatment was more effective in cases where it was necessary to extend the helix.
1 Introduction
In Japan, approximately 20% of neonates are born with an auricular deformity that will not heal spontaneously. Most cases can be treated by mounting an appropriately shaped orthosis in the auricle [1]. Currently, the authors use orthoses made of nitinol shape memory alloy wire covered with an expanded polytetrafluoroethylene tube. An example of the treatment of a folded helix using an orthosis is illustrated in Fig.1. To manufacture the orthosis, the wire is fixed in the appropriate shape and the shape is memorized by heating to 500 °C for 30 minutes. An iron plate is grooved in the shape of the orthosis and the wire is inserted into the groove to fix the shape. We have recently developed a computer-assisted design system [2] for use in constructing the orthosis. Previously, there have been only limited attempts to employ such a system for this purpose. One example is a system that constructs a wax auricular model using a three-dimensional shape measuring system and a numerically controlled machine tool; this has been used for planning of a microtia operation [3]. Here, we describe a newly developed orthosis design method that focuses on the post-therapeutic auricular shape, in which the orthosis shape is generated based on a spline-approximated curve. The system also permits an estimation of the contact force between the auricle and orthosis by finite element analysis. This system was developed using MATLAB (The MathWorks Inc.), and we have introduced several clinical
applications of the system. In each case, treatment was performed only after thorough explanation of the procedure by the physician, and with the agreement of the patients' parents.

Fig. 1. A case of folded helix, and treatment using an orthosis made of nitinol shape memory wire: (a) before treatment; (b) orthosis mounted in the auricle during treatment.
2 Spline Approximation of Auricular Shape [4]
Spline curves are widely used in the fields of CAD/CAM and computer graphics, with B-splines being the most popular in these applications. To approximate auricular shape in the current investigation, a B-spline function of order four with four internal knots is used. An initial step is to convert the co-ordinates of the auricular contour points. This conversion is based on the ear base line, i.e. the line connecting the Otobasion superius (obs) and the Otobasion inferius (obi), as illustrated in Fig.2(a). In addition, the distance is normalized based on the length of the line between O and obs, thus permitting the comparison of various auricle shapes. The X and Y co-ordinates of the contour points are defined as functions of the parameter θ, and the approximated co-ordinates are defined by the B-spline functions shown in equation (1):

$$x(\theta) = \sum_{i=1}^{8} \alpha_i B_{i,4}(\theta), \qquad y(\theta) = \sum_{i=1}^{8} \beta_i B_{i,4}(\theta) \quad (1)$$
The coefficients $(\alpha_i, \beta_i)$ are calculated using the least squares method and plotted on the XY plane as control points (CP$_i$). The position of the control points can then be used as an indicator of auricular shape. Using photographs, we examined the auricular shape of 550 ears of Japanese neonates under the age of 7 days, and classified them as normal or abnormal (five categories). Fig.2(b) illustrates the distribution of control points for 130 normal ears. The large X's are the average normal control points, i.e. the average positions of the control points for normal ears, and the thick curve represents the average shape of a normal ear as determined by the positions of the average normal control points. When an individual control point is specified, the Mahalanobis generalized distance and the probability of belonging to each group can be calculated from the distribution of control points. We derive the normal rate as the ratio between the probability of belonging to the normal group and that of belonging to the abnormal groups.
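A possible realization of this least-squares fit is sketched below with SciPy; the knot placement (equally spaced internal knots) and function names are our illustrative assumptions, as the paper does not specify the fitting procedure beyond least squares.

```python
import numpy as np
from scipy.interpolate import make_lsq_spline

def fit_contour_spline(theta, x, y, n_coef=8, order=4):
    """Least-squares B-spline approximation of an auricular contour,
    as in eq. (1): order-4 B-splines with four internal knots, giving
    eight coefficients (control points CP1..CP8) per coordinate.
    theta must be sorted; x, y are the converted contour co-ordinates."""
    k = order - 1                               # order 4 -> cubic (degree 3)
    n_internal = n_coef - order                 # = 4 internal knots
    internal = np.linspace(theta[0], theta[-1], n_internal + 2)[1:-1]
    # Clamped knot vector: boundary knots repeated 'order' times.
    knots = np.r_[[theta[0]] * order, internal, [theta[-1]] * order]
    sx = make_lsq_spline(theta, x, knots, k=k)  # alpha_i coefficients
    sy = make_lsq_spline(theta, y, knots, k=k)  # beta_i coefficients
    return np.column_stack([sx.c, sy.c])        # 8 control points CP_i
```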
Fig. 2. Approximated spline of an auricle: (a) axes of co-ordinates (unit: 0.1 mm); (b) distribution of control points CP1–CP8 and average shape of a normal ear.
3 Generation of Orthosis Shape
An orthosis shape is generated as follows: 1) an auricular three-dimensional model is composed; 2) the contours of the helix and auriculotemporal sulcus are obtained; 3) the contours are approximated by splines, the positions of the control points are compared with those of the target post-therapeutic auricular shape, and the control points are moved toward the target positions; and 4) the orthosis shape is generated and the manufacturing data are output.
3.1 Composition of the Auricular Three-Dimensional Model
The three-dimensional auricular shape is measured using a non-contact laser measurement system, which can acquire not only three-dimensional position data but also RGB color data. Since it is not possible to acquire data from both the frontal side of the auricle and the rear side (including the auriculotemporal sulcus) at the same time, it is necessary to compose a three-dimensional image using data obtained from measurements made from various directions. In order to improve the accuracy of auricular matching, we use a composite method that can match not only the surface contour but also the color. This method is based on the Iterative Closest Point (ICP) algorithm [5], with improvements to permit handling of both three-dimensional and RGB color co-ordinate distances. Equation (2) defines the united distance ($d$) combining the three-dimensional distance and the color distance. Here, $\vec{P}_a$, $\vec{P}_b$, $\vec{C}_a$, $\vec{C}_b$ are the three-dimensional co-ordinates and color co-ordinates of points $a$ and $b$ respectively, and $k_p$ and $k_c$ are weighting coefficients.
$$d^2 = k_p^2 \left\| \vec{P}_a - \vec{P}_b \right\|^2 + k_c^2 \left\| \vec{C}_a - \vec{C}_b \right\|^2 \quad (2)$$

Usually, a plaster cast of the auricle, colored in a striped pattern, is used to compose the three-dimensional model, and the color distance is used as an auxiliary cue by setting the coefficients to $k_p = 0.9$ and $k_c = 0.1$.
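A minimal sketch of this combined metric, assuming NumPy arrays for positions and colors, could look as follows; the function name is ours.

```python
import numpy as np

def united_distance(Pa, Pb, Ca, Cb, kp=0.9, kc=0.1):
    """United distance of eq. (2): weighted combination of the 3D position
    distance and the RGB color distance used in the color-extended ICP
    matching, with the weights kp = 0.9, kc = 0.1 given in the paper."""
    d2 = kp**2 * np.sum((Pa - Pb)**2) + kc**2 * np.sum((Ca - Cb)**2)
    return np.sqrt(d2)
```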
3.2 Extraction of Contours of the Helix and Auriculotemporal Sulcus
Fig.3 shows the triangular element modification and line segment extraction system. The system includes a function to trace to the next point, chosen as the point for which the difference in curvature and normal vector direction from the current point is smallest. Contours can be obtained by selecting the Otobasion superius (obs) and tracing the helix or auriculotemporal sulcus points, automatically or manually, to the Otobasion inferius (obi).
Fig. 3. Overlaid display of the generated orthosis shape, the extracted contour, and the auricular model
3.3 Approximation of Contours by Spline and Comparison of Control Points
To fix and memorize the orthosis shape in the shape memory alloy wire, the helix-side and auriculotemporal-sulcus-side plates are grooved separately. The approximating plane is first calculated and the co-ordinates of points on the contour are projected onto the plane. The approximated spline is then calculated using the method described in the previous section. Subsequently, the current positions of the control points and the positions of the control points for the post-therapeutic target shape are compared. As a post-therapeutic target shape, the spline approximation of a normal ear shape, obtained from a
normal ear on the opposite side or from a sibling or parent, can be used. Alternatively, the average positions of normal-ear control points can be employed. By moving the positions of the control points, the shape of the contours can be modified.
3.4 Generation of the Orthosis Shape
In accordance with the modified shape of the contours, the orthosis shape is generated. Fig.3 illustrates an overlaid display of a generated orthosis and the auricular model. Finally, the shape is converted to Numerical Control (NC) data for the groove process using a machine tool.
4 Finite Element Analysis of Contact Force [6]
It is important to evaluate the contact force between the orthosis and the auricle, to ensure that it is sufficient to correct the auricle shape while not being excessive to the degree that it may cause decubitus-like inflammation on the auricle. To evaluate the contact force, a finite element analysis that can handle the material non-linearity of the auricle and the contact deformation of the auricle and orthosis has been developed. To contend with the material non-linearity, an incremental method is employed and the displacement is increased by inserting the orthosis gradually. Moreover, we have also performed a tensile test using pig auricular cartilage, and applied the resultant strain-stress diagram to the material property. Multiple-point constraints are also used to represent the contact deformation of the auricle and orthosis. The constraint condition should be updated in every insertion step of the incremental method.
Fig. 4. Simulation results to demonstrate the effect of moving control points on the contact force distribution on the auricle: (a) control points lie halfway between their original position and that of the average control points in a normal situation; (b) control points are moved to the position of the average control points in a normal situation.
Where an orthosis was used in the case of an upper folded helix, as illustrated in Fig.1, the contact force was compared for different positions of the control points. Figs.4(a) and 5(a) illustrate the outcome of moving control points CP2 to CP4 (see Fig.2(b)) halfway towards the positions of the average normal control points, as illustrated in
Fig.6(a) (middle line). Figs.4(b) and 5(b) illustrate the effect of moving CP2 to CP4 to the positions of the average normal control points. The helix-side orthosis is inserted in 8 steps of 0.5 mm each, and the contact force is evaluated. The start edge of the orthosis and the auriculotemporal sulcus of the auricle are clipped. The number of elements of the auricle is 630, and that of the orthosis is 28. Fig.4 shows the distribution of the force on the auricle. The brighter area, where more force is applied, is larger when the average normal position is used. Fig.5 illustrates the orthosis deformation and the force produced by the contact. When the control points are midway between the current and the average normal positions, the force at the clip point is 0.63 times as large and the maximum contact force is 0.67 times as large.
Fig. 5. Simulation results to demonstrate the effect of moving control points on orthosis deformation and contact force on the orthosis: (a) control points lie halfway between their original position and that of the average control points in a normal situation; (b) control points are moved to the position of the average control points in a normal situation.
Fig. 6. Treatment of the folded helix case shown in Fig.1: (a) control points used to design the orthosis; (b) clinical result after four months of treatment.
5 Clinical Applications
The clinical application of nitinol orthoses, designed and manufactured using the recently developed system, was evaluated in 16 cases. More specifically, this study set included 6 cases of cryptotia, 5 cases of folded helix, 3 cases of folded lobulus auriculae, one case of Stahl's ear, and one case of protruded auricle. Currently 11 of these cases are under treatment, and the effectiveness of treatment has been confirmed in 9 cases. Where treatment required extension of the helix, employing this method generally resulted in improvement. However, where it was necessary to form the antihelix as part of treatment, it was impossible to treat the deformity effectively using only the developed orthosis.
5.1 Treatment of a Folded Helix
Fig.1(a) shows a folded helix in a one-month-old baby. Fig.6(a) shows the spline data used to generate the orthosis for this case. From the center of the figure outwards, the three lines represent the approximated spline of the auricle before treatment, the orthosis shape, and the average shape of a normal ear, respectively. Control points CP2, CP3 and CP4 are moved halfway between the current position and the position of the average normal points. Fig.1(b) shows the orthosis mounted in the auricle and, as illustrated in Fig.6(b), after 4 months of treatment the degree of folding was improved.
5.2 Treatment of a Folded Lobulus-Auriculae
Figs.7 and 8 illustrate the use of the system to correct a folded lobulus-auriculae in a three-month-old baby (Fig.7(a)). The orthosis shape shown in Fig.7(b) is generated by moving control points CP4, CP7 and CP8 toward the normal average position, as demonstrated in Fig.8(a). After one month, the degree of folding was improved (Fig.8(b)).
Fig. 7. A case of folded lobulus-auriculae, and treatment using an orthosis: (a) before treatment; (b) orthosis mounted in the auricle during treatment.
Fig. 8. Treatment of the folded lobulus-auriculae case shown in Fig.7: (a) control points used to design the orthosis; (b) clinical result after one month of treatment.
6 Conclusion
Using the orthosis design system, a three-dimensional auricular model can be produced that considers both surface contour and color. Extracted contours of the helix and auriculotemporal sulcus are approximated by splines, and the orthosis shape can be designed by moving the control points of the splines with reference to the control points of the target auricular shape. The system also permits evaluation of the contact force between the orthosis and the auricle. By moving the control points halfway towards the positions of the average normal control points, the contact force is kept lower than that associated with moving the control points all the way to the average normal positions. Orthoses for 16 cases of malformed ears were designed and manufactured using the system, and treatment was effective in the 9 cases in which the helix had to be extended.
References

1. Matsuo K, Hirose T, Tomono T, Iwasawa M, Katohda S, Takahashi N, Koh B: Nonsurgical Correction of Congenital Auricular Deformities in the Early Neonate. Plast. Reconstr. Surg. 73 (1984) 38–50.
2. Hanafusa A, Takahashi H, Akagi K, Isomura T: Development of Computer Assisted Orthosis Design and Manufacturing System for Malformed Ears. Computer Aided Surgery 2 (1997) 276–285.
3. Kaneko T: A System for Three-Dimensional Shape Measurement and its Application in Microtia Ear Reconstruction. Keio J. Med. 42(1) (1993) 22–40.
4. Hanafusa A, Takahashi H, Isomura T, Dohi T: Analyses of Japanese Neonates' Auricular Shape Using Spline Approximation. Proc. of CAR'98 (1998) 951.
5. Besl PJ, McKay ND: A Method for Registration of 3-D Shapes. IEEE Trans. Pattern Analysis and Machine Intelligence 14(2) (1992) 239–256.
6. Hanafusa A, Isomura T, Sekiguchi Y, Takahashi H, Dohi T: Computer Assisted Orthosis Design System for Malformed Ears – Automatic Shape Modification Method for Preventing Excessive Corrective Force. Proc. of the World Congress on Medical Physics and Biomedical Engineering, Chicago (2000) 1–3.
Cutting Simulation of Manifold Volumetric Meshes

C. Forest, H. Delingette, and N. Ayache

Epidaure Research Project, INRIA Sophia Antipolis, 2004 route des Lucioles, 06902 Sophia Antipolis, France
Abstract. One of the most difficult problems in surgical simulation is simulating the removal of soft tissue. This paper proposes an efficient method for locally refining and removing tetrahedra in a real-time surgical simulator. One of the key features of this algorithm is that the tetrahedral mesh remains a 3-manifold volume during the cutting simulation. Furthermore, our approach minimizes the number of generated tetrahedra while trying to optimize their shape quality. The removal of tetrahedra is performed with a strategy which combines local refinement with the removal of neighboring tetrahedra when a topological singularity is found.
1 Introduction
The simulation of cutting soft tissue is one of the major components of a surgical simulator. In fact, in a surgical simulation procedure, the word cutting may be used to describe two different actions: incising, which can be performed with a scalpel, and removing soft tissue material, which is performed with an ultrasound cautery. Despite their different nature, the simulations of these two actions face a common problem, which is the topology modification of a volumetric mesh. Incising algorithms have been the most commonly studied approaches in the literature. Most of them are based on subdivision algorithms whose principle is to divide each tetrahedron across a virtual surface defined by the path of the edge of a cutting tool [BG00,MK00]. These algorithms create a smooth and accurate surface of cut, but they suffer from two limitations. First, they tend to generate a large number of small tetrahedra of low shape quality. The number of new tetrahedra may be reduced by moving mesh vertices along the surface of cut [NvdS01], but this tends to worsen the quality of the tetrahedra. Second, the simulation of incising supposes that the cut surface generated by the motion of a surgical tool is very smooth or even, as found in some articles, locally planar. However, in most real-time surgical simulation systems, there are no constraints on the motion of surgical tools and most of the time the cut surface is not even close to a smooth surface. Furthermore, surgical cutting gestures do not usually consist of a single large gesture, but are made of repeated small incisions. Therefore, these algorithms are not of practical use for cutting volumetric meshes, for instance when simulating a hepatectomy.
In this paper, we focus on the simulation of the removal of soft tissue material as performed with an ultrasound cautery. The targeted application is the simulation of hepatectomy, i.e. the resection of a functional segment of the liver. We suppose that all volumetric anatomical structures (for instance a liver) are represented by tetrahedral meshes which meet certain criteria (see section 2.1). To remove soft tissue, we need to perform two distinct tasks: remove tetrahedral elements and control the element size such that the material cavities have the same size as the cautery device. The former task may seem trivial since it could reduce to a self-evident “remove this tetrahedron from the list of tetrahedra”. However, due to additional constraints on the mesh topology (which has to remain a manifold), it actually turns out to be a difficult problem to solve. The latter task consists in locally refining the mesh around the cut path. Indeed, in order to speed up the deformation of soft tissue, it is preferable to use meshes with as few tetrahedra as possible. To obtain a realistic cut, it is therefore necessary to refine the mesh dynamically, as opposed to previous approaches [CDA00,PDA01] where the mesh was refined beforehand in regions where the cutting could occur. In our approach, the tasks of removing and refining tetrahedra are both devised in order to satisfy two constraints. First, they must minimize computation time and therefore cannot use sophisticated remeshing algorithms, because each cutting operation should be performed in a few tens of milliseconds (typically 50-100 milliseconds). Second, they should produce tetrahedra with a high shape quality.
2 Tetrahedral Meshes for Surgery Simulation

2.1 Topological Constraints
For the simulation of volumetric soft tissue, we use tetrahedral meshes as a geometric model. These meshes are built from triangulated surfaces with a dedicated mesh generation software. For our application, we chose to restrict the set of possible tetrahedral meshes to be both conformal and manifold. Conformality is required since we are using finite element modeling for the spatial discretization of the elastic energy (see [PDA01] for more details). In a conformal mesh, the intersection of two tetrahedra is either empty, or a common vertex, edge or triangle. Furthermore, the tetrahedral mesh is a 3-manifold [BY98], which implies that the neighborhood of a vertex is homeomorphic to a topological sphere for inner vertices and homeomorphic to a half-sphere for surface vertices. More precisely, a tetrahedral mesh is a 3-manifold if the shell of a vertex (resp. an edge), i.e. the set of tetrahedra adjacent to that vertex (resp. edge), has only one connected component. In Figure 1, we show two examples of non-manifold volumetric objects. When the neighborhood of a vertex or an edge is not singly connected, we say that there exists a topological singularity at that vertex or edge. Having a manifold tetrahedral mesh is not mandatory for a finite element algorithm.
Fig. 1. Examples of non-manifold objects: (left) edge singularity; (right) vertex singularity
However, because it allows a normal to be computed for each surface vertex, this feature is useful for the rendering of the mesh, for example when using Gouraud shading or PN triangles [VPBM01], and becomes necessary for computing the reaction force to be sent to a force-feedback device. Furthermore, it simplifies the computation of edge and vertex neighborhoods, decreasing the redundancy of the data structure. Removing tetrahedra from conformal tetrahedral meshes is a trivial task, but it proves to be much more difficult for manifold meshes. Thus, at least two authors have reported problems in removing topological singularities during cutting simulation [NvdS01,MLBdC01] without providing practical solutions.
2.2 Data Structure
We found the design of a data structure for a tetrahedral mesh to have a strong impact on the computational efficiency of the cutting simulation and on the ease of implementation. As a detailed description of this data structure would fall outside the scope of this paper, we only describe its main features below. Our data structure relies on the notion of a manifold mesh, and the topological information is mainly stored in vertices and tetrahedra. These two objects are stored in lists, and each tetrahedron points towards its 4 vertices, 6 edges, 4 triangles and 4 neighboring tetrahedra. We also have edge (resp. triangle) objects which are stored in a hash table indexed by the references of their two (resp. three) vertices. To each vertex (resp. edge), we add a pointer to a neighboring tetrahedron (resp. triangle) in order to build its neighborhood. We topologically “close” the volumetric mesh by adding virtual vertices, edges, triangles and tetrahedra for each surface vertex.
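The essential layout of such a data structure might be sketched as follows. This is only an outline of the relationships described above (the paper's actual C++-style structures, flags, and virtual elements are omitted), with class and field names of our own choosing.

```python
from dataclasses import dataclass, field

@dataclass
class Vertex:
    position: tuple                    # 3D coordinates
    tetra: "Tetrahedron" = None        # one adjacent tetrahedron, used as a
                                       # seed to reconstruct the neighborhood

@dataclass
class Tetrahedron:
    vertices: list                     # 4 Vertex references
    edges: list = field(default_factory=list)       # 6 Edge references
    triangles: list = field(default_factory=list)   # 4 Triangle references
    neighbors: list = field(default_factory=lambda: [None] * 4)

class Mesh:
    def __init__(self):
        self.vertices = []
        self.tetrahedra = []
        # Edges and triangles stored in hash tables indexed by the sorted
        # ids of their vertices, as described in the text.
        self.edges = {}      # (i, j) -> Edge
        self.triangles = {}  # (i, j, k) -> Triangle
```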
3 Cutting Algorithm

3.1 Problem Position
In our surgical simulator, the user manipulates a force-feedback device which has the same kinematics as a real surgical instrument in a laparoscopic procedure. The position and orientation of this virtual instrument are read periodically and collisions with the different virtual organs are detected with an efficient method [LCN99] based on standard OpenGL hardware acceleration. The input of the cutting algorithm is therefore a set of surface triangles and consequently
a set of tetrahedra Tinitial (adjacent to these triangles). When simulating an ultrasound cautery, all tetrahedra in the neighborhood of the tool extremity are simply removed. We proceed in two stages: first the neighboring tetrahedra are refined, and then a subset of these tetrahedra is removed. We first describe the refinement stage before detailing the deletion stage.
3.2 Local Mesh Refinement
We have chosen, for the sake of efficiency and simplicity, to locally refine a mesh by splitting edges into two edges. More precisely, the input of the refinement algorithm is a set of edges: these edges may be inside the mesh or on its surface. Then, we compute the set of tetrahedra that are adjacent to any edge in the set. Finally, we split each tetrahedron in this set in a systematic manner which depends on the number and the adjacency of the split edges. There are 10 distinct configurations for refinement, which are displayed in Figure 2. In order to get continuity across two neighboring tetrahedra (conformality), it is required in some configurations to take into account the lexicographic order of vertices (any other order between vertices could be used).
Fig. 2. The 10 configurations for splitting a tetrahedron.
When a single tetrahedron is split, we not only update the adjacency information between elements, but also the rendering information (texture coordinates) and the mechanical information. The local refinement is controlled by a single distance threshold Dref whose value depends on the diameter of the virtual tool (in our implementation Dref = 1.5 × ∅tool). Given the set of tetrahedra Tinitial hit by the virtual tool, we build the set of edges Esplit to be split. For each tetrahedron in Tinitial, we scan each edge and add it to Esplit if its length is greater than Dref. We do not use this set directly for the refinement because it would create elongated tetrahedra of bad geometrical quality. Thus, we additionally scan each edge e in Esplit and look at all edges located in the shell of e (edges opposite to e inside each adjacent tetrahedron): if one of these edges has a length greater than p times
the length of e, then it is added to Esplit. The purpose of this extra stage is to limit the ratio between the minimum and maximum edge lengths of the newly created tetrahedra. The edge set is then fed into the refinement procedure, which produces as output the list of newly created tetrahedra Tnew. The refinement procedure is iterated until the edge set Esplit is empty, by replacing the set of initial tetrahedra Tinitial with the list of new tetrahedra Tnew. The value p controls the mesh quality after refinement but also the extent of the refinement. If p is set to a value close to 1.0, then the refinement procedure may create a large set of new high-quality tetrahedra after many iterations. On the contrary, if p is set to a large value (2.0 for instance), then no extra refinement will take place and the refined tetrahedra may be of poor quality. As a trade-off we are currently using p = 1.5 which, in practice, leads to no more than 3 refinement iterations.
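The control flow of this iterative refinement could be summarized as in the following sketch. The mesh and edge methods (`edges`, `length`, `shell_opposite_edges`, `split_edges`) are hypothetical placeholders for the operations described in the text.

```python
def refine_around_tool(mesh, T_initial, D_ref, p=1.5):
    """Iterative local refinement: split edges longer than D_ref in the
    tetrahedra hit by the tool, propagate splits to much longer shell
    edges, and iterate on the newly created tetrahedra."""
    T_current = set(T_initial)
    while T_current:
        E_split = set()
        # Edges longer than D_ref in the current set of tetrahedra.
        for tet in T_current:
            for e in tet.edges:
                if e.length() > D_ref:
                    E_split.add(e)
        # Extra stage: shell edges more than p times longer than a split
        # edge are split too, limiting the min/max edge length ratio.
        for e in list(E_split):
            for f in e.shell_opposite_edges():
                if f.length() > p * e.length():
                    E_split.add(f)
        if not E_split:
            break
        # Split the edges; continue with the newly created tetrahedra.
        T_current = mesh.split_edges(E_split)   # returns T_new
    return mesh
```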
Fig. 3. Iterative refinement of tetrahedra located in the vicinity of a virtual surgical tool
3.3 Tetrahedra Removal
The second stage consists in removing tetrahedra. The list of initial tetrahedra intersected by the cautery device, Tinitial, has been potentially extended by the refinement procedure. From this list, we discard any tetrahedron which is too far away from the tool or which has an edge with a length greater than Dref, and call Tremoval the resulting set. The removal algorithm then consists in removing one-by-one all tetrahedra in Tremoval. For a given tetrahedron T, we test if it can be removed without creating topological singularities at a vertex or edge. This is done according to the number of faces (triangles) of that tetrahedron T that are lying on the mesh surface:

Case 1: If T has no face in the mesh surface (T is then inside the mesh), it can be removed iff none of its four vertices belongs to the surface.
Case 2: If T has exactly one face belonging to the surface, it can be removed iff the vertex opposite to that face does not belong to the surface.
Case 3: If T has exactly two faces belonging to the surface, it can be removed iff the edge opposite to these two faces does not belong to the surface.
Case 4: If T has three or four faces belonging to the surface, it can always be removed.
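The four cases above can be captured compactly. The following sketch assumes hypothetical surface-membership and opposite-element queries on the mesh data structure of Section 2.2; it illustrates the test, it is not the authors' code.

```python
def is_removable(T):
    """Can tetrahedron T be removed without creating a vertex or edge
    singularity? Implements Cases 1-4 of the removal test."""
    surface_faces = [f for f in T.triangles if f.on_surface()]
    n = len(surface_faces)
    if n == 0:   # Case 1: interior tetrahedron
        return not any(v.on_surface() for v in T.vertices)
    if n == 1:   # Case 2: the vertex opposite the surface face must be interior
        return not T.vertex_opposite(surface_faces[0]).on_surface()
    if n == 2:   # Case 3: the edge opposite the two surface faces must be interior
        return not T.edge_opposite(surface_faces[0], surface_faces[1]).on_surface()
    return True  # Case 4: three or four surface faces
```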
Table 1. Estimation of the occurrence of vertex or edge singularities when removing a tetrahedron on the mesh surface

| Mesh | nb of operations | % of tetra removable | % of pbs. due to vertices | % of pbs. due to edges |
|------|------------------|----------------------|---------------------------|------------------------|
| 1 | 130 | 89.3 | 9.2 | 1.5 |
| 2 | 265 | 62.3 | 31.7 | 6.0 |
| 3 | 460 | 84.1 | 13.5 | 2.4 |
| 4 | 1448 | 83.8 | 14.4 | 1.8 |
| 5 | 2268 | 86.1 | 13.3 | 0.6 |
In fact, the probability that the suppression of a tetrahedron creates a topological singularity is fairly high. From the set of experiments described in section 3.6, we found (see Table 1) that this probability varies between 0.1 and 0.4 depending on the mesh topology and the mesh geometry. Also, singularities at vertices (cases 1 and 2) occur nearly 10 times more often than singularities at edges (case 3). There are different possible strategies for removing topological singularities. We have explored a variety of approaches and finally decided to use a combination of two strategies: the first one is based on local mesh refinement and the second one on the removal of neighboring tetrahedra. All approaches were evaluated with respect to two criteria: the overall number and the shape quality of the tetrahedra. The local refinement strategy can always be used to resolve topological singularities, but it tends to create many small tetrahedra (see section 3.4). Removing neighboring tetrahedra does not have this drawback but cannot be applied in all configurations (see section 3.5). We propose to combine the two previous strategies to get an optimal result (see section 3.7).
3.4 Basic Strategy Based on Refinement
This strategy is quite simple: we create new tetrahedra surrounding the singularity in order to “thicken” the mesh at that location. If a topological singularity occurs at a vertex (resp. edge), then all edges adjacent to that vertex (resp. edge) are split (see section 3.2) and we then remove all refined tetrahedra adjacent to that vertex (resp. edge), together with the refined tetrahedra corresponding to the original tetrahedron. We can formally prove that this set of tetrahedra can always be removed without creating new singularities. This strategy works well in all cases, but it creates many small tetrahedra of poor shape quality.
3.5 Strategy Based on Tetrahedra Removal
Let T be the tetrahedron to be removed. The strategy consists in removing the smallest set of neighboring tetrahedra of T whose removal suppresses the vertex or edge singularity without creating new ones. This is typically a search problem; however, there is no guarantee that such a set exists. This approach is well suited for simulating surgical cuts since the neighboring tetrahedra are also guaranteed to be small
enough due to the local refinement stage. Furthermore, it is likely that the neighboring tetrahedra of T also belong to Tremoval, which implies that they would have been removed anyway. The algorithm differs significantly depending on whether a vertex or an edge singularity is considered. For a vertex singularity, we can better understand the search problem by looking at the shell of the singular vertex. In Figure 4, we display the shell of that singular vertex: it is equivalent to a half-sphere since it lies on the surface. The triangle opposite this vertex in tetrahedron T is drawn in solid line (leftmost figure) and dark grey (rightmost figure). To suppress the vertex singularity, we must find a set of adjacent tetrahedra (drawn as patterned triangles) that connects T to the border of the shell. The search is performed in a breadth-first manner in order to find the shortest possible path, and is further arbitrarily limited to a depth of 10 in order to restrict the time spent in the search. When a path is found, its removability is tested and, when successful, the set Tremovable is removed.
Fig. 4. (left) The shell of a vertex that exhibits a topological singularity; (right) a path (patterned triangles) connecting tetrahedron T to the shell border
If the test fails, we use an additional heuristic. It appears that the path we found often divides the adjacency of one of the tetrahedron vertices into two connected components. Therefore, we also test the removability of the given path once extended with each of the two connected components. If one of those two sets appears to be actually removable, it is temporarily stored and will eventually be used if no smaller removable set is determined. If neither paths nor extended paths are found to be removable, then we output an empty set Tremovable = {}. For an edge singularity, we look at all tetrahedra adjacent to that edge. Because this edge is lying on the surface, we can divide these tetrahedra into two sets, the rightmost and leftmost tetrahedra. We first test each of these two sets for removal and, in case of failure, we also test if the whole set of adjacent tetrahedra can be removed. If not, we output an empty set. This algorithm is very simple and can be implemented in a very efficient manner. We could add more heuristics in case of failure to try to solve more configurations. However, edge singularities do not occur very often and therefore we have decided to keep it simple.
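A breadth-first search of the kind described for vertex singularities might look as follows. The shell-navigation methods are hypothetical, and the extended-path heuristic is omitted for brevity; this is only an outline of the search, not the authors' implementation.

```python
from collections import deque

def find_removal_path(T, singular_vertex, max_depth=10):
    """Breadth-first search for a set of tetrahedra in the shell of a
    singular vertex connecting T to the shell border; removing this set
    suppresses the vertex singularity. Returns None on failure."""
    queue = deque([(T, (T,))])
    visited = {T}
    while queue:
        tet, path = queue.popleft()
        if len(path) > max_depth:       # BFS: later paths are never shorter
            break
        if tet.touches_shell_border(singular_vertex):
            return path                 # candidate set Tremovable
        for nb in tet.shell_neighbors(singular_vertex):
            if nb not in visited:
                visited.add(nb)
                queue.append((nb, path + (nb,)))
    return None
```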
Fig. 5. The meshes used for the tests
3.6 Strategy Efficiency
In order to estimate the efficiency of this algorithm, we used the electric lancet of our simulator to perform a cut into several meshes. The meshes that were used are:

1. A simple cylinder
2. A thin sheet with an average thickness of two tetrahedra
3. A rough mesh of a liver (3970 tetrahedra) obtained with a 3D mesher
4. A locally refined mesh of the same liver (8454 tetrahedra), obtained with a 3D mesher
5. A locally refined mesh of a liver (9933 tetrahedra for a third of the liver) obtained with a home-made edge-splitting method

Table 2 shows the percentage of successful operations and the average cardinality of the removed set, for the whole cutting process, and restricted to the subsets of tetrahedra that caused a problem at a vertex and at an edge. To be more significant, if a non-removable tetrahedron is encountered several times, it is only counted once. The results appear to be quite stable from one mesh to the other and independent of the number of tetrahedra removed, of the geometry, and of the shape quality of the meshing. The only differences appear with the cylinder and the thin sheet, which is understandable because of their topological particularities.

Table 2. Analysis of the tetrahedra removal strategy

| Mesh | Vertex sing.: % unresolved | avg. cardinality | Edge sing.: % unresolved | avg. cardinality | Global: % unresolved | avg. cardinality |
|------|------|------|------|------|------|------|
| 1 | 8.3 | 3.1 | 50.0 | 6.0 | 1.5 | 1.2 |
| 2 | 14.3 | 3.1 | 43.8 | 4.7 | 7.1 | 1.8 |
| 3 | 6.4 | 2.9 | 54.5 | 5.0 | 2.2 | 1.3 |
| 4 | 10.1 | 3.2 | 34.6 | 4.0 | 2.1 | 1.3 |
| 5 | 15.6 | 3.3 | 35.7 | 3.8 | 2.3 | 1.3 |
On average, the strategy of removing neighboring tetrahedra solves nearly 90% of vertex singularities and 60% of edge singularities. The total number of tetrahedra removed is nearly 3 for a vertex singularity and 4 for an edge singularity. Considering that vertex (resp. edge) singularities only occur 15% (resp. 3%) of the time when removing a tetrahedron, this strategy of removing neighboring tetrahedra makes it possible to remove nearly 97% of all tetrahedra that should be removed. Furthermore, on the whole, the average number of tetrahedra removed when a single tetrahedron should be removed is 1.3, which is only a 30% increase with respect to the ideal minimal value.
3.7 Combined Strategy: Optimized Local Refinement
To remove the remaining 3% of tetrahedra that cannot be removed by the previous strategy, we use an optimized local refinement strategy. Instead of refining (splitting) all edges adjacent to a vertex or edge singularity as described in section 3.4, we propose to refine only a subset of these edges. For instance, for a vertex singularity, it suffices to refine a set of tetrahedra that creates a path from T to the border of the vertex shell (see Figure 4). Thus, when searching for a suitable set of removable tetrahedra as described in the previous section, we also store the shortest path connecting the tetrahedron T to the shell border. This shortest path is eventually used for the optimized local refinement. A similar approach is used for an edge singularity, where only a subset of adjacent edges is used, corresponding to either the rightmost or the leftmost set of adjacent tetrahedra.
Fig. 6. Removal of an edge (left) and of a vertex (right) singularity after an optimized local refinement
4 Conclusion
We have proposed an algorithm which is suitable for the interactive simulation of soft tissue removal. The local refinement stage makes it possible to simulate fine cuts even on a coarse tetrahedral mesh. The trade-off between mesh quality and computational efficiency is governed by a single parameter p > 1 whose value can be set intuitively. For removing tetrahedra, our approach combines the deletion of neighboring tetrahedra (in 97% of cases) with optimized refinement around topological singularities. A dedicated data structure for tetrahedral meshes has been an important component of its efficient implementation within a hepatic surgical simulator (see Figure 7).
Fig. 7. Sequence of a simulated liver resection with extrication of the portal vein
References

BG00. Daniel Bielser and Markus H. Gross. Interactive simulation of surgical cuts. In Proc. Pacific Graphics 2000, pages 116–125. IEEE Computer Society Press, October 2000.
BY98. Jean-Daniel Boissonnat and Mariette Yvinec. Algorithmic Geometry. Cambridge University Press, UK, 1998. Translated by Hervé Brönnimann.
CDA00. S. Cotin, H. Delingette, and N. Ayache. A hybrid elastic model allowing real-time cutting, deformations and force-feedback for surgery training and simulation. The Visual Computer, 16(8):437–452, 2000.
LCN99. J.-C. Lombardo, M.P. Cani, and F. Neyret. Real-time collision detection for virtual surgery. In Computer Animation, Geneva, Switzerland, May 26-28, 1999.
MK00. Andrew B. Mor and Takeo Kanade. Modifying soft tissue models: Progressive cutting with minimal new element creation. In MICCAI, pages 598–607, 2000.
MLBdC01. C. Mendoza, C. Laugier, and F. Boux de Casson. Virtual reality cutting phenomena using force feedback for surgery simulations. In Proc. of the IMIVA workshop of MICCAI, Utrecht (NL), 2001.
NvdS01. Han-Wen Nienhuys and A. Frank van der Stappen. Supporting cuts and finite element deformation in interactive surgery simulation. In W.J. Niessen and M.A. Viergever, editors, Proc. of the Fourth International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI'01), pages 145–152. Springer Verlag, October 2001.
PDA01. G. Picinbono, H. Delingette, and N. Ayache. Non-linear and anisotropic elastic soft tissue models for medical simulation. In ICRA 2001: IEEE International Conference on Robotics and Automation, Seoul, Korea, May 2001. Best conference paper award.
VPBM01. A. Vlachos, J. Peters, C. Boyd, and J. Mitchell. Curved PN triangles. In Proc. 17th Annu. ACM Sympos. Comput. Geom., August 2001.
Simulation of Guide Wire Propagation for Minimally Invasive Vascular Interventions

Tanja Alderliesten¹, Maurits K. Konings², and Wiro J. Niessen¹

¹ Image Sciences Institute, University Medical Center Utrecht,
² Dept. of Biomedical Engineering, University Medical Center Utrecht,
P.O. Box 85.500, 3508 GA Utrecht, The Netherlands
{tanja,wiro}@isi.uu.nl, [email protected]
Abstract. In order to simulate intravascular interventions, a discretized representation of a guide wire is introduced which allows the modeling of guide wires with different physical properties. An algorithm for simulating the propagation of a guide wire within a vascular system, on the basis of the principle of energy minimization, has been developed. Both longitudinal translation and rotation are incorporated as possibilities to manipulate the guide wire. The algorithm is based on quasi-static mechanics. Two types of energy are introduced: internal energy, related to the bending energy of the guide wire, and external energy, resulting from the elastic deformation of the vessel wall. Compared to existing work, the novelty of our approach lies in the fact that an analytical solution is achieved. The algorithm is tested on phantom data. Results indicate plausible behavior of the simulation.
1 Introduction
Minimally invasive vascular interventions are performed through catheterization. During these interventions, visual feedback is provided by intra-operative X-ray images. The intervention is performed by manipulating the parts of the instruments that are outside the body. In order to successfully carry out these procedures, proper training is required. Providing a training possibility by simulation has a number of attractive properties: no radiation is required and the simulation can be made patient specific. In order to simulate intravascular interventions, the instruments and the patient need to be modeled. To this end, tissue types with different elastic properties have to be segmented and modeled. The interaction between the instruments and the patient also needs to be modeled. Finally, simulation of the visual feedback during the intervention is required. The simulation of surgical procedures has been the interest of a variety of research groups, and progress in the field of simulating minimally invasive vascular interventions has been made in the last few years [1,4,7,6]. In this paper we focus on modeling the guide wire, which is the main instrument used during a minimally invasive vascular intervention. The guide wire allows the radiologist to navigate inside the vasculature.
To model the behavior of a deformable object, the Finite Element Method (FEM) approach [9] and mass-spring models are both suitable methods; mass-spring models are better suited for modeling soft tissue deformation [3]. Furthermore, the FEM approach requires a large amount of computation time, which makes it less appropriate for real-time applications. Therefore, we introduce a different approach: an analytical solution of a discretized model of the guide wire. This model contains the required information to add force feedback, supplying a good basis for a realistic simulation of a catheterization procedure. The outline of the paper is as follows. The aspects concerned with the modeling of the guide wire and the vasculature are discussed in Section 2. Section 3 is dedicated to the relaxation algorithm, which determines the guide wire position using energy minimization. The initialization of the relaxation algorithm is the topic of Section 4. Section 5 is dedicated to simulation results on phantom data. Finally, details concerning future research are discussed and some conclusions are drawn in Section 6.
2 Modeling the Guide Wire and Vasculature
Two types of energy are associated with the guide wire: the internal energy of the guide wire, which is related to the bending of the guide wire, and the external energy of the guide wire, which is related to the elastic deformation of the vessel wall. The representation of the guide wire and the bending energy associated with the guide wire are discussed in Section 2.1. Subsequently, the modeling of the elastic deformation energy of the vessel wall is discussed in Section 2.2.

2.1 Guide Wire Modeling
The guide wire is modeled by a discrete parametrization: a set of equally long segments connected at so-called joints. The segments are straight, neither bendable nor compressible, and have a fixed length (λ). The segments have to be small to approximate reality. The guide wire can be represented by storing an array with the 3D position vectors of the joints: x0, ..., xk. The joint position closest to the insertion tube is called x0 (Figure 1). When a new part of the guide wire is inserted into the vessel, the current representation is adapted by adding segments and joints and by computing a new guide wire configuration with the relaxation algorithm. Bending energy is associated with the angular orientation between two segments, and increases as the difference in angular orientation between the two segments increases. The bending energy (Ub) per joint is given by:

$$U_b(x_i) = \frac{1}{2} c_i \theta_i^2 \qquad (1)$$

where θi is the angle between the two segments connected by joint i (Figure 1) and ci is a spring constant, related to the stiffness of the joint.
Simulation of Guide Wire Propagation
x5
x4
x3
247
θ3
x2
x1
x0
Fig. 1. Representation of a guide wire defined by joint positions x0 , . . . , x5 . The angle (θ) between two segments as used in equation 1 is illustrated for joint 3. For joint 4 and 5 the sample points on the outer hull of the guide wire, which allows the modeling of guide wires with different thickness, are drawn.
By storing a separate spring constant for each joint, guide wires with different physical properties, i.e. guide wires with a variable flexibility along the guide wire body, can be modeled. The total bending energy (UTb) of the guide wire equals the sum of the bending energy present in each joint:

$$U_{T_b} = \sum_{i=0}^{k-1} U_b(x_i) \qquad (2)$$
No bending energy is defined for the tip joint (k), since only one segment is connected to this joint. Guide wires are available in a variety of shapes; a distinction can be made between completely straight guide wires and guide wires with an intrinsically curved tip. In Equation 1, zero bending energy for joint i of a completely straight guide wire is achieved if two adjacent segments lie parallel:

$$(x_i - x_{i-1}) \parallel (x_{i+1} - x_i) \qquad (3)$$

The intrinsic curvature of curved-tip guide wires is included in our model by associating with each joint a vector (ω) that represents a bias towards this curvature. The vector indicates the deviation from the straight line the segment would have formed with the previous segment if no intrinsic curvature had been present. Zero bending energy for joint i belonging to an intrinsically curved part of the guide wire is then achieved if:

$$(x_i - x_{i-1} + \omega_i) \parallel (x_{i+1} - x_i) \qquad (4)$$
Generally, when one side of a longitudinal object is subjected to a torque (τ) while resistive forces are present at the other side of the object, the object will show some torsion (φtorsion). The amount of torsion depends on the torsion constant (κtorsion) of the object:

$$\tau = \kappa_{\mathrm{torsion}} \, \varphi_{\mathrm{torsion}} \qquad (5)$$
In practice, with the tip as an exception, guide wires have excellent torque control. In the current version of our model, the assumption has been made that guide wires have ideal torque control, i.e., the torsion constant of a guide wire approaches infinity. To model the interaction between the guide wire and the vasculature, the outer hull of the guide wire is modeled by a set of points. This allows for modeling the non–zero thickness of guide wires (Figure 1).
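The bending-energy model of Eqs. (1)-(4) can be summarized in a few lines of code. The following minimal sketch is ours (Python with NumPy, a language the paper does not prescribe); all names are illustrative. The sum of Eq. (2) formally starts at joint 0; here the angle is evaluated only at joints connecting two modeled segments.

import numpy as np

def total_bending_energy(x, c, omega=None):
    # x     : (k+1, 3) array of joint positions x0 ... xk
    # c     : (k,) per-joint spring constants (no bending energy at tip k)
    # omega : optional (k, 3) intrinsic-curvature bias vectors (Eq. 4);
    #         zeros recover the straight-wire condition of Eq. (3)
    k = len(x) - 1
    if omega is None:
        omega = np.zeros((k, 3))
    energy = 0.0
    for i in range(1, k):                # joints with two modeled segments
        a = x[i] - x[i - 1] + omega[i]   # preferred direction at joint i
        b = x[i + 1] - x[i]
        cos_t = np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b))
        theta = np.arccos(np.clip(cos_t, -1.0, 1.0))
        energy += 0.5 * c[i] * theta**2  # Eq. (1)
    return energy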
2.2 Modeling the Vasculature
To model and test the motion and deformation of a guide wire, the vasculature in which the guide wire moves needs to be modeled. Hereto we segmented 3D X-ray data sets of phantom data, using the fuzzy connectedness algorithm introduced by Udupa et al. [8]. The segmentations form the basis for modeling the vasculature. The deformation properties of the vessel wall are determined by the tissue characteristics of the vessel wall and the tissue types surrounding the vessel. Incorporating all these aspects would most likely give the most realistic representation of the deformation of the vessel wall. For now, a simplified model is used for the vasculature and the vessel wall energy. Hooke's law [2] is used, which states that the deformation of an elastic material is proportional to the applied stress up to a certain point, called the elastic limit, beyond which additional stress will deform it permanently. Mathematically, Hooke's law can be described as follows:

$$F = kd, \qquad U = \frac{1}{2} k d^2 \qquad (6)$$

where F is the applied force, k the spring constant, and d the deformation of the elastic body subjected to the force F. Integrating the force with respect to d gives the expression for the energy (U). The change in energy due to a displacement (δx) can be expressed in terms of the gradient of the energy at the newly arrived position (x, y, z): $\delta U = \nabla U \cdot \delta x$, where $\nabla U = (\frac{\partial U}{\partial x}, \frac{\partial U}{\partial y}, \frac{\partial U}{\partial z})$. The energy formula in Equation 6 is used to define a potential distribution which represents the elastic vessel wall energy expressed in gradients. Within the lumen of the vessel, the potential distribution equals zero. Elsewhere, the potential distribution has non-zero values according to the energy formula mentioned above, with increasing values when moving further away from the vessel wall. The total vessel wall energy (UTvw) the guide wire is subjected to equals the sum of the wall energy every single joint is subjected to (Uvw):

$$U_{T_{vw}} = \sum_{i=1}^{k} U_{vw}(x_i) \qquad (7)$$
The first joint (0) has no contact with the vessel wall and is therefore not considered. The vessel wall energy for every joint is the average of the vessel wall energy present in the sampled points on the outer hull of the guide wire.
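One possible realization of this potential distribution is a lookup volume computed once from the lumen segmentation. The sketch below is ours (SciPy assumed; quadratic Hooke-type growth outside the lumen per Eq. (6)), not necessarily the authors' implementation:

import numpy as np
from scipy.ndimage import distance_transform_edt

def wall_potential(lumen_mask, k=1.0, voxel_size=1.0):
    # lumen_mask: boolean 3D array, True inside the vessel lumen.
    # d: distance from each outside voxel to the nearest lumen voxel;
    # it is zero inside the lumen, so U vanishes there as required.
    d = distance_transform_edt(~lumen_mask, sampling=voxel_size)
    return 0.5 * k * d**2   # U = (1/2) k d^2 outside the lumen

The gradient of this volume (e.g. via np.gradient) then provides the vessel wall forces acting on the sampled hull points of each joint.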
3 Relaxation Algorithm: Minimization of the Total Energy
The developed algorithm, based on quasi-static mechanics, handles the relaxation of a guide wire within a vascular system after a forced translation or rotation of the proximal end of the guide wire has taken place. The determination of the input for the relaxation algorithm, i.e. the initial configuration of the guide wire after a forced translation or rotation, is explained in Section 4.
Fig. 2. A change in αi has influence on the bending energy of joints i and i + 1 and on the positions of joints i + 1 to k.
The relaxation process can be defined as finding a new configuration of joint positions on the basis of the minimization of the total energy of the guide wire. The bending energy of the guide wire and the vessel wall energy of the vasculature are both functions of the positions of all joints. Since a displacement of any joint will entail displacements of other joints as well, there is a high degree of interdependence between the joint positions in the total energy function. Therefore, a different set of variables which uniquely describes the positions of the joints is introduced:

$$\xi_i = x_i' - x_i, \qquad \alpha_i = \xi_{i+1} - \xi_i \qquad (8)$$
The vector ξi represents the difference between the new position of joint i (xi') and the previous position (xi), i.e. the movement of joint i. The vector αi denotes the difference between the movements of two adjacent joints (Figure 2). A change in αi affects the positions of joints i + 1 to the tip, but solely the bending energy of joints i and i + 1. This constitutes a significant reduction of the interdependency of the variables, and hence of the complexity of the algorithm. After a forced translation over a fixed interval (ξ0) or a rotation (ξ0 = 0) of the proximal end of the guide wire, the guide wire responds with a translation ξi at each joint i that may deviate from the fixed translation ξ0 according to:

$$\xi_i = \xi_0 + \sum_{j=0}^{i-1} \alpha_j \qquad (9)$$
The purpose of the algorithm can now be defined as finding a specific set of α vectors such that the resulting ξi corresponds to the situation in which the total energy is minimal. Since the distance between two adjacent joints is constant, the 3D α vectors can be represented using a 2D parametrization, viz. an angle ψi around λi (= xi+1 − xi) and a scalar ai which represents the length of αi. Minimizing the sum of the vessel wall energy and the bending energy can be performed by finding the set of ai and ψi values for which the total energy is an extremum. This gives an analytical expression for ψi and ai:

$$a_i = \frac{|\tilde{G} - c_i \tilde{\Upsilon}|\,\lambda}{2(G_P - c_i)}, \qquad \psi_i = \beta_i + \pi \qquad (10)$$
For a full derivation we refer the interested reader to a technical report [5]. Here, we only describe the different terms intuitively. The terms basically represent trade-offs between relevant gradients of the vessel wall energy (which can be interpreted as forces acting on the guide wire) and the bending energy associated with joint i. $\tilde{G}$ denotes the transversal component of the sum of the gradients of the vessel wall energy at all joints from i + 1 to the tip, and GP expresses the longitudinal component of this term. The longitudinal component of the bending energy associated with joint i is represented by the term ci, whereas $c_i \tilde{\Upsilon}$ denotes the transversal component of this energy. In Equation 10, βi is the angle, expressed in polar coordinates, of the vector $\tilde{G} - c_i \tilde{\Upsilon}$, which denotes the trade-off in the transversal plane between the relevant vessel wall energy gradients and the bending energy associated with joint i. The solutions βi and βi + π denote the values of ψi at which the energy has an extremum. The minimum energy state is obtained when ψi equals βi + π. Clearly, when an equilibrium between the local bending energy and the relevant vessel wall gradients has been achieved, i.e. $\tilde{G} = c_i \tilde{\Upsilon}$, the expression for ai equals zero, which means that the i-th joint is in a steady state.
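In code, the mapping from a candidate set of α vectors back to joint movements and positions, per Eqs. (8) and (9), is a simple cumulative sum. The sketch below is ours (Python/NumPy; illustrative names) and would be applied once per relaxation iteration, after the analytical update of Eq. (10) has produced the α vectors:

import numpy as np

def apply_relaxation_step(x_prev, xi0, alpha):
    # x_prev : (k+1, 3) joint positions before the step
    # xi0    : (3,) forced movement of joint 0 (zero for a pure rotation)
    # alpha  : (k, 3) difference vectors alpha_0 ... alpha_{k-1}
    # xi_i = xi_0 + sum over j < i of alpha_j   (Eq. 9)
    xi = xi0 + np.vstack([np.zeros(3), np.cumsum(alpha, axis=0)])
    return x_prev + xi   # new positions x_i' = x_i + xi_i   (Eq. 8)

Note that the fixed segment length is preserved only if the α vectors respect the 2D parametrization (ψi, ai) described above.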
4 Initialization of the Relaxation Algorithm
In the following subsections it is explained how, after a forced translation or rotation of the guide wire, an initial configuration of the guide wire is calculated that can serve as a starting point for the relaxation algorithm.

4.1 Translation
Given the current position of the guide wire (x0, ..., xk) and a change of x0 as a result of a forced translation of the proximal guide wire body into the introducer sheath, the initial configuration of the guide wire is calculated by adding the movement of joint 0 (ξ0) to every single joint position, see Figure 3.

4.2 Rotation
Given the current position of the guide wire (x0, ..., xk) and a forced rotation of the proximal end of the guide wire, the initial configuration of the guide wire is calculated not by translating the joints, but by rotating local (P, Q, S) coordinate systems that are associated with each joint and initialized as follows:

$$\hat{e}_i^P = \frac{1}{\lambda}\,\lambda_i, \qquad \hat{e}_i^Q = \hat{n}_L, \qquad \hat{e}_i^S = \hat{e}_i^P \times \hat{e}_i^Q \qquad (11)$$
where $\hat{n}_L$ is the unit vector perpendicular to the plane L that contains all joints when the guide wire is outside the vasculature and no external forces are applied to it. By rotating the local coordinate systems, the local characteristics associated with each joint, such as the ω vector, are rotated as well (Figure 3). After rotation, the local semantics of each ωi remains identical, but globally the orientation has changed, and thus the joints connected to joint i require a new position in order to ensure that the internal energy stays minimized (Equation 4). The new desired configuration of the guide wire is found when this initial configuration is given to the relaxation algorithm.
Fig. 3. Determining the initial configuration after a forced translation or rotation, which will serve as input for the relaxation algorithm: a,c) current guide wire configuration, b) the initial configuration after a forced translation, d) the initial configuration after a forced rotation. e) For clarity only one local coordinate system has been visualized. After forced rotation, the global semantics of the ω vectors has changed. In the relaxation process, the guide wire will deform so as to minimize the total energy, taking into account the preference for the predefined intrinsic curvature.
5 Experiments
To demonstrate the performance of the relaxation algorithm used in the simulation of the propagation of a guide wire, 3D X-ray data from an intra-cranial anthropomorphic vascular phantom, including the circle of Willis and an aneurysm (Figure 4a), have been acquired. Phantom data are very useful since during interventions in patients only projection images are acquired. In phantom experiments, 3D X-ray images can be acquired, providing the advantage that the actual position of the guide wire in experiments can be visualized for comparison with the simulation. To obtain a realistic simulation, the correct vessel wall elasticity and guide wire characteristics (spring constants, ω's) need to be incorporated in the model. During an experiment, a guide wire with a curved tip is propagated in the intra-cranial vascular phantom. A time sequence of 30 3D X-ray images has been acquired during this experiment. The position of the guide wire is also modeled for different insertion depths. Figure 4c shows the position of the guide wire inside the phantom after propagation of the guide wire until it has reached the aneurysm. Figure 4b represents the simulation of the propagation of a guide wire until it has reached the same position. Comparing Figures 4b and 4c, a strong similarity can be observed, indicating plausible behavior of the simulation.
6 Conclusions
An algorithm for simulating the propagation (translation and rotation) of a guide wire for intravascular interventions has been presented. An analytical solution to the minimization of the energy is used in this algorithm. The algorithm is based on quasi-static mechanics.
Fig. 4. a) An intra–cranial anthropomorphic vascular phantom including the circle of Willis and an aneurysm. Visualization of a simulated (b) and a real (c) propagation of a guide wire in the intra–cranial phantom until it reaches the aneurysm.
The discretized guide wire representation we introduced allows guide wires with different physical characteristics to be modeled, by changing the segment length, by defining different values for the bending-energy constants per joint, and by adjusting the ω vector per joint. Results indicate plausible behavior of guide wire propagation in a phantom data set, where the actual position could be validated visually using 3D X-ray imaging. This verifies that the methodology we proposed, based on a relaxation algorithm, gives realistic results. However, for a more realistic modeling of motion and forces in patients, additional information needs to be incorporated. This can readily be achieved with the proposed paradigm. Furthermore, for extensive validation, a simulation device is currently being constructed which allows controlled translation and rotation of the guide wire in experiments.
References

1. G. Abdoulaev, S. Cadeddu, G. Delussu, et al. ViVa: The Virtual Vascular Project. IEEE Trans. on Inform. Techn. in Biomedicine, 2(4):268–274, 1998.
2. G. Arfken. Mathematical Methods for Physicists.
3. S. Cotin, H. Delingette, and N. Ayache. Real-Time Elastic Deformations of Soft Tissues for Surgery Simulation. IEEE Transactions on Visualization and Computer Graphics, 5(1):62–73, January–March 1999.
4. J.K. Hahn, R. Kaufman, et al. Training Environment for Inferior Vena Caval Filter Placement. Studies in Health Techn. and Inf., 50:291–297, 1998.
5. M.K. Konings and E.B. van de Kraats. Discretized Analytical Guide Wire Movement Algorithm. University Medical Center Utrecht, Technical Report 015319, 2000.
6. H.L. Lim, B.R. Shetty, C.K. Chui, Y.P. Wang, and Y.Y. Cai. Real-Time Interactive Surgical Simulator for Catheter Navigation. Proceedings of SPIE Biomedical Optics 1998, SPIE vol. 3262, San Jose, USA, January, pages 4–14, 1998.
7. Z. Li, C.K. Chui, et al. Computer Environment for Interventional Neuroradiology Procedures. Simulation and Gaming, 32(3):404–419, 2001.
8. J.K. Udupa and S. Samarasekera. Fuzzy Connectedness and Object Definition: Theory, Algorithms, and Applications in Image Segmentation. Graphical Models and Image Processing, 58(3):246–261, 1996.
9. O. Zienkewickz and R. Taylor. The Finite Element Method. McGraw Hill, 1987.
Needle Insertion Modelling for the Interactive Simulation of Percutaneous Procedures

S.P. DiMaio and S.E. Salcudean

Department of Electrical and Computer Engineering, University of British Columbia, Vancouver, Canada
{simond,tims}@ece.ubc.ca
Abstract. A novel interactive virtual needle insertion simulation is presented. The insertion model simulates three-degree-of-freedom needle motion, physically-based needle forces, linear elastostatic tissue deformation and needle flexibility for the planning and training of percutaneous therapies and procedures. To validate the approach, an experimental system for measuring planar tissue deformation during needle insertions has been developed and is presented. A real-time simulation algorithm allows users to manipulate the virtual needle as it penetrates a tissue model, while experiencing steering torques and lateral needle forces through a planar haptic interface. Efficient numerical computation techniques permit fast simulation of relatively complex two-dimensional and three-dimensional environments at haptic control rates.
1 Introduction
One of the most common procedures employed in modern clinical practice is the subcutaneous insertion of needles and catheters. In many cases, such procedures are difficult to plan and to perform, and can lead to significant complications if performed incorrectly [1, 2, 3, 4]. Physically-based virtual planning and training environments are being developed [2, 5, 6, 7, 8, 9, 10]; however, the majority of these systems use largely phenomenological and heuristic models that have not been validated, and that are not generalizable. While perhaps effective for the simulation of predominantly 1-DOF problems [6], these approaches may not be suitable for problems involving more complex soft tissue anatomy, needle placement optimisation, trajectory planning and automatic control, where more detailed verifiable knowledge of the biomechanical interaction between surgical needles and soft tissues is required. In prior work, needle insertion forces have been determined for gelatine [11], ex vivo porcine and bovine tissues [5, 12]. In each case, only the resultant force acting at the proximal end of the needle was measured, while in fact penetration forces are distributed along the entire length of the needle axis, resulting from physical phenomena such as cutting/fracture, sliding, friction, stick-slip friction, tissue deformation, tissue displacement and peeling [5]. The needle driving forces measured previously are the integration of this force distribution along the needle shaft.
Fig. 1. The complete experimental setup: a robotic manipulator with instrumented epidural needle mounted, a tissue phantom and a CCD camera.
This paper presents a new methodology that has been developed to experimentally determine needle forces during soft tissue puncture, as well as a simulation algorithm for needle insertion mechanics, and is organised as follows. In Section 2, an experimental system for measuring planar tissue phantom deformations during probing and needle insertion is described. Soft tissue modelling and parameterization using a linear elastostatic model is used for estimating the force distribution that occurs along a needle shaft during insertion, as is outlined in Section 3. A numerical simulation of a needle insertion based upon estimated needle force distributions is discussed in Section 4. Issues of real-time performance and haptics are also addressed. Conclusions and discussion of future work are provided in Section 5.
2 Experimental System to Measure Planar Tissue Deformations
An experimental setup, shown in Figure 1, has been developed in order to measure the relationship between needle force and 2-D tissue phantom deformation during insertion. A 17-gauge Tuohy needle is instrumented with a 6-DOF force/torque sensor (ATI Nano-17 SI-12-0.12), and is manipulated by a 3-DOF planar device [13]. The planar motion of a soft tissue phantom (constructed using a polyvinyl chloride compound) is measured by means of a single CCD camera that is mounted above the needle insertion workspace. Images from this camera are used to track the motion of a set of markings that are applied to the top surface of the phantom, thereby measuring the deformation of the sample.
3 Needle Insertion Force Model
The relationship between measured insertion force and tissue deformation is characterised by a material model. Forces that occur along the needle shaft are estimated based upon this model, since the direct measurement of needle forces by an instrumentation technique is a challenging problem.
Fig. 2. (a) Estimated forces at material mesh nodes. (b) Estimated needle force distribution.
If the relationship between tissue force and displacement is known, then the distribution of force applied along the needle shaft can be computed given only the tissue motion resulting from needle penetration. Tissue deformation is complex and is still the subject of much research (e.g., [14, 15, 16] and many others). In general, tissue modelling is complex because of inhomogeneous, non-linear, anisotropic elastic and viscous behaviour. As a first approximation, this study focuses on linear elastostatic models that are discretised using the Finite Element Method, yielding a set of 2n linear equations that describe tissue deformations in two dimensions:

$$K_{(2n \times 2n)}\, u = f, \qquad (1)$$
where u and f are displacement and force vectors for nodes lying on the mesh discretisation [17]. Such models are characterised by two parameters, namely Young's Modulus and the Poisson Ratio, which are identified from boundary probing experiments [18]. An example of tissue phantom forces, derived using experimental measurements and the linear elastostatic material model, is shown in Figure 2(a). Figure 2(b) illustrates the force distribution that is found to occur along the needle during insertion. This distribution is taken from the estimated tissue phantom forces that lie at nodes along the needle, and indicates that axial friction between the needle and the tissue phantom is relatively uniform along the needle shaft. A force peak located immediately behind the needle tip rises approximately 30% above the friction force, and may be attributable to material cutting. Needle force distributions were determined using experimental measurements taken for a single, fixed needle insertion rate of 1 mm/s, which is typical in clinical practice.
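The force-estimation step of Eq. (1) then reduces to a matrix-vector product. A minimal sketch (Python/NumPy, our choice of language; the names and the interleaved ordering of displacement components are assumptions), with K assembled for the identified Young's modulus and Poisson ratio:

import numpy as np

def nodal_forces(K, u):
    # K: (2n, 2n) stiffness matrix from the FEM discretisation
    # u: (2n,) measured nodal displacements (x, y interleaved per node)
    return K @ u   # f = K u, Eq. (1)

def shaft_force_profile(f, needle_node_ids):
    # Per-node force magnitudes for the nodes lying along the needle,
    # i.e. the distribution plotted in Figure 2(b).
    f_xy = f.reshape(-1, 2)
    return np.linalg.norm(f_xy[needle_node_ids], axis=1)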
4 Needle Insertion Simulation
Virtual needle insertions are simulated using a numerical material model and the needle shaft force distribution that has been derived.
Fig. 3. Simulated needle intercept of a small target embedded within elastic tissue.

Fig. 4. Simulated needle steering with needle flexibility. Lateral needle motion causes needle flexion, steering the needle toward the target (shown as a disk embedded in the tissue).
A virtual needle is advanced into a linear elastostatic model that is discretised using the Finite Element Method [18], while needle shaft forces, distributed as shown in Figure 2(b), are applied to model mesh nodes that lie in the path of the needle. Simulated needle insertions that are based upon the estimated needle forces are shown to reproduce results similar to those observed experimentally. Figure 3 shows a simulated needle insertion into the side of a rectangular tissue model that is rigidly fixed along one edge. The needle axis is initially coincident with a "virtual biopsy target", but fails to intercept the target (shown as a disk), due to tissue deformation. Figure 4 illustrates needle steering (to correct an off-target needle) with a model that includes needle flexibility. Note the non-minimum-phase type of response as the base of the needle is moved away from the target, laterally. The potential of physically-based needle insertion simulations for planning and training purposes is thus illustrated. A real-time implementation of the needle insertion simulator allows users to experience both visual and kinesthetic feedback while executing a virtual planar needle insertion. The haptic simulation system is described in detail in [17]. Real-time computation of needle insertion into soft tissue is complicated by the "curse of dimensionality" that is established by the large number of linear equations required to describe even small models. The behaviour of a continuum model discretised by the finite element method is observed through the behaviour of a finite set of mesh nodes.
Fig. 5. (a) Mesh nodes lying along the needle are constrained along the 1x-axis, and either slip or stick along the 1y-axis. (b) New intercept nodes are identified by searching within a small neighbourhood centred at the most distal needle node.
For large volumes of tissue that are finely discretised, the size of matrix K in Equation (1) becomes large due to the large number of material nodes at which force or displacement need to be solved. For needle insertion simulation, it is not necessary to consider the motion of nodes that are not visible (e.g. interior nodes), or the forces applied at nodes that are not in direct contact with the needle shaft. In Figure 3 it is evident that the large majority of mesh nodes are neither visible nor palpable; therefore, the system of linear equations can be reduced to K_W u_W = f_W, where only the behaviour at a small subset W of mesh nodes (called working nodes) is explicitly considered. At run-time the subset of working nodes W is selected and the system matrix reduced to K_W. As the needle penetrates the tissue surface, it intercepts hidden nodes that need to be re-introduced into the reduced system by simply adding new matrix rows and columns that are derived directly from K, which is precomputed. The matrix reduction approach is similar to the condensation techniques discussed in [15], and the Boundary Element Method selected by [14]; however, in this work access to the interior of tissue volumes is retained for quick inclusion when needle penetration occurs.
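One standard way to realize this reduction, sketched here under our own reading (NumPy assumed; names illustrative) and consistent with the remark that rows and columns come directly from the precomputed K: when every non-working node carries zero external force, the working nodes obey u_W = C_W f_W, where C_W is the (W, W) submatrix of the precomputed inverse of K, and a newly intercepted node contributes just its rows and columns.

import numpy as np

def working_compliance(K_inv, working_dofs):
    # K_inv: (2n, 2n) precomputed inverse stiffness matrix
    # working_dofs: degree-of-freedom indices of the working subset W
    return K_inv[np.ix_(working_dofs, working_dofs)]

def add_intercepted_node(K_inv, working_dofs, new_dofs):
    # Grow the reduced system when the needle intercepts a hidden node.
    working_dofs = list(working_dofs) + list(new_dofs)
    return working_compliance(K_inv, working_dofs), working_dofs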
4.1 Boundary Conditions and Needle Constraint
Mesh nodes that are in contact with the needle are constrained by the needle as shown in Figure 5(a). If the needle is rigid, then the lateral position of the node is fixed along the 1x-axis, which constitutes a displacement boundary condition. Along the needle shaft, the node force or node displacement may be constrained, depending upon its state of contact with the needle (i.e. sticking to the needle, or slipping) [18]. If the node is free to slide along the needle shaft, then a force boundary condition is applied along the 1y-axis, and a constant force consistent with the force distribution is applied to the slipping node. If it is in the stuck state, then the node is constrained to lie at a fixed point on the needle, along the 1y-axis. The system of equations in K_W is rearranged in order to reflect the resulting inhomogeneous collection of boundary conditions:

$$K_W u_W = f_W \;\rightarrow\; K_W x_W = y_W, \qquad (2)$$

where x_W and y_W are formed by exchanging elements between u_W and f_W.
Needle node boundary conditions change frequently during simulation, depending upon the commanded motion of the needle. A single boundary condition change can be expressed as an inexpensive low-rank update:

$$(K_W^{-1})' = K_W^{-1} - \frac{c_i\, r_i}{p_i}, \qquad (3)$$

where p_i is the i-th pivot of K_W^{-1}; c_i and r_i are the i-th column and i-th row of K_W^{-1}, with the exception of their i-th coordinates, which are set to (p_i + 1) and (p_i − 1), respectively. (K_W^{-1})' is the new system matrix, and the vectors x_W and y_W must be adjusted accordingly (i.e. by exchanging displacement and force variables). This approach to boundary condition changes results in an O(N^2) computation rather than the O(N^3) operation required to re-invert the stiffness matrix K_W, in a way that is similar to the capacitance matrix strategy presented by James and Pai in [14].
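Equation (3) is an instance of a rank-one update of a precomputed inverse. For reference, a generic Sherman-Morrison update, of which such O(N^2) boundary-condition changes are an instance, can be written as follows; this sketch (NumPy) is ours, not the paper's exact formula:

import numpy as np

def sherman_morrison_update(A_inv, u, v):
    # Returns (A + u v^T)^{-1} given A^{-1}, in O(N^2) time.
    Au = A_inv @ u
    vA = v @ A_inv
    denom = 1.0 + v @ Au   # must be nonzero for the updated inverse to exist
    return A_inv - np.outer(Au, vA) / denom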
4.2 Local Coordinate Changes
The coordinate system shown in Figure 5(a) is fixed to the needle; therefore, as its orientation changes it is necessary to effect local coordinate changes in K_W^{-1}. When the boundary conditions are uniform, this results in a simple affine transformation:

$${}^0 u_W = K_W^{-1}\, {}^0\! f_W \;\Rightarrow\; {}^1 u_W = A^T K_W^{-1} A\, {}^1\! f_W$$

where ⁰u_W and ⁰f_W are displacement and force vectors in a nominal system coordinate frame, while ¹u_W and ¹f_W are the vectors after rotating the coordinate frame at the i-th node by an angle θ. Matrix A is composed of (2 × 2) rotation submatrices on its diagonal [18]. If node i has different boundary conditions along its two coordinate axes, then such a transformation is not possible, due to the mixed force and displacement variables in x_W and y_W. The local coordinate transformation for a node that is sliding along the needle axis has the following form:

$${}^1 x_W = (M - K_W^{-1} N)^{-1} (K_W^{-1} M - N)\, {}^1 y_W$$

where M and N are sparse transformation matrices. Due to the properties of M and N, (M − K_W^{-1} N) is shown to be inexpensively inverted, and the new system computed [18]. Node coordinate frames must be updated incrementally from one simulation sample period to the next, according to the change in needle orientation angle ∆θ. If the needle is curved or flexible, then the local coordinate system transformations will vary along the length of the needle shaft.
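For the uniform-boundary-condition case, the block-diagonal matrix A can be assembled directly from the per-node rotation angles. A small sketch (NumPy; ours, with illustrative names):

import numpy as np

def rotate_system(K_inv, node_angles):
    # Build the block-diagonal matrix A of 2x2 rotation submatrices
    # (one block per node) and apply the congruence transform A^T K^-1 A.
    n = len(node_angles)
    A = np.zeros((2 * n, 2 * n))
    for i, th in enumerate(node_angles):
        c, s = np.cos(th), np.sin(th)
        A[2 * i:2 * i + 2, 2 * i:2 * i + 2] = [[c, -s], [s, c]]
    return A.T @ K_inv @ A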
4.3 System Solution
Fig. 6. Interactive virtual needle insertion in a planar environment.

The reduced system matrix K_W^{-1} evolves from sample period to sample period, due to boundary condition and local coordinate system updates, and is used to solve for x_W, the vector of unknown node forces and displacements. Node positions are updated for graphical rendering of the scene and needle node forces are integrated for feedback via the haptic interface. The interactive real-time model shown in Figure 6 consists of 361 nodes and is computed at a rate of 500 Hz by a 450-MHz P-III PC, without any particular effort to optimise code.
5 Conclusion
This paper presents a system for interactively simulating virtual needle insertions that is based upon experimentally determined needle insertion mechanics. A novel approach for estimating needle shaft forces and tissue behaviour using a measurement system and soft tissue deformation models has been developed. It is based on the established Finite Element Method and parameters identified during experiments. The haptically-enabled virtual insertion environment allows users to manipulate a three-degree-of-freedom needle as it penetrates a discretised inhomogeneous linear elastostatic tissue model. Unlike existing single-axis simulations, steering torques and lateral needle forces can be felt, while tissue model deformation is observed. Real-time simulation of this model is challenging due to the large system of equations involved, as well as the frequent topological and boundary condition changes that occur as the needle moves into the tissue model. We have developed a fast algorithm for interactive needle insertion with force feedback, without loss of model detail or degradation in global response. The haptic simulation, described in [17], achieves a sample rate of 500 Hz for a 2-D virtual tissue model and planar haptic interface, using a desktop PC. While it was developed for modelling 2D tissue phantoms, the method can be generalised to 3D. Needle mechanics measurements and simulations are of interest for the development of physically-based virtual planning and training systems that are aimed at reducing the incidence of complications in clinical practice. Current and future work will explore 3-D modelling techniques, further biomechanics experiments, non-linear material models, the effects of material inhomogeneities and dynamics (including needle feed rate dependence), as well as model-based planning and control of needle insertion procedures.
References

1. Datta, S.: Complications of regional analgesia and anaesthesia. In: Proceedings of the 17th Annual European Society of Regional Anaesthesia Congress. (1998)
2. Azar, F.S., Metaxas, D.N., Schnall, M.D.: A Finite Element Model of the Breast for Predicting Mechanical Deformations during Biopsy Procedures. In: Proc. of the IEEE Workshop on Math. Methods in Biomedical Image Analysis. (2000) 38–45
3. Nath, S., Chen, Z., Yue, N., Trumpore, S., Peschel, R.: Dosimetric effects of needle divergence in prostate seed implant using 125I and 103Pd radioactive seeds. In: Medical Physics, American Institute of Physics (2000) 1058–1066
4. Fiducane, B.: Complications of brachial plexus anaesthesia. In: Proceedings of the 17th Annual European Society of Regional Anaesthesia Congress. (1998)
5. Brett, P.N., Parker, T.J., Harrison, A.J., Thomas, T.A., Carr, A.: Simulation of resistance forces acting on surgical needles. In: Proceedings of the Inst. of Mech. Engineers. Part H, Journal of Engineering in Medicine. Volume 211. (1997) 335–347
6. Hiemenz, L., McDonald, D.J., Stredney, D., Sessanna, D.: A Physiologically Valid Simulator for Training Residents to Perform an Epidural Block. In: Proceedings of the 15th Southern Biomedical Engineering Conference. (1996)
7. Kwon, D.S., Kyung, J.U., Kwon, S.M., Ra, J.B., Park, H.W., Kang, H.S., Zeng, J., Cleary, K.R.: Realistic Force Reflection in a Spine Biopsy Simulator. In: Proc. of the IEEE Int. Conf. on Robotics and Automation. (2001) 1358–1363
8. Shimoga, K.B., Khosla, P.K.: Visual and Force Feedback to Aid Neurosurgical Probe Insertion. In: Proceedings of the 16th Annual International Conference of the IEEE Engineering in Medicine and Biology Society. Engineering Advances: New Opportunities for Biomedical Engineers. Volume 2. (1994) 1051–1052
9. Miller, S., Jeffrey, C., Bews, J., Kinsner, W.: Advances in the Virtual Reality Interstitial BrachyTherapy System. In: Proceedings of the Canadian Conference on Electrical and Computer Engineering. (1999) 349–354
10. Zeng, J., Kaplan, C., Bauer, J., Xuan, J., Sesterhenn, I.A., Lynch, J.H., Freedman, M.T., Mun, S.K.: Optimizing prostate needle biopsy through 3-D simulation. In: Proceedings of SPIE Medical Imaging. (1998)
11. Hiemenz, L., Litsky, A., Schmalbrock, P.: Puncture Mechanics for the Insertion of an Epidural Needle. In: Proceedings of the Twenty-First Annual Meeting of the American Society of Biomechanics. (1997)
12. Simone, C., Okamura, A.: Haptic Modeling of Needle Insertion for Robot-Assisted Percutaneous Therapy. In: IEEE Int. Conf. on Robotics and Automation. (2002)
13. Sirouspour, M.R., DiMaio, S.P., Salcudean, S.E., Abolmaesumi, P., Jones, C.: Haptic Interface Control – Design Issues and Experiments with a Planar Device. In: Proceedings of the IEEE Int. Conf. on Robotics and Automation. (1999)
14. James, D.L., Pai, D.K.: ArtDefo, Accurate Real Time Deformable Objects. In: Computer Graphics – Proceedings of SIGGRAPH '99. (1999)
15. Bro-Nielsen, M.: Finite Element Modeling in Surgery Simulation. In: Proceedings of the IEEE. Volume 86. (1998) 490–503
16. Hagemann, A., Rohr, K., Stiehl, H.S., Spetzger, U., Gilsbach, J.M.: Nonrigid Matching of Tomographic Images Based on a Biomechanical Model of the Human Head. In: Medical Imaging 1999 – Image Processing (MI '99). (1999) 583–592
17. DiMaio, S.P., Salcudean, S.E.: Simulated Interactive Needle Insertion. In: Proc. of the 10th Symposium on Haptic Interfaces for Virtual Environments and Teleoperator Systems, IEEE Virtual Reality. (2002)
18. DiMaio, S.P., Salcudean, S.E.: Needle Insertion Modelling and Simulation. In: IEEE International Conference on Robotics and Automation. (2002)
3D Analysis of the Alignment of the Lower Extremity in High Tibial Osteotomy

Hideo Kawakami¹, Nobuhiko Sugano², Takashi Nagaoka⁵, Keisuke Hagio¹, Kazuo Yonenobu³, Hideki Yoshikawa², Takahiro Ochi¹, Asaki Hattori⁴, and Naoki Suzuki⁴

¹ Department of Computer Integrated Orthopaedics, Osaka Univ. Graduate School of Medicine, 2-2 Yamadaoka, Suita-shi, 565-0871, Osaka, Japan
² Department of Orthopaedic Surgery, Osaka Univ. Graduate School of Medicine, 2-2 Yamadaoka, Suita-shi, 565-0871, Osaka, Japan
³ Department of Orthopaedic Surgery, Osaka-Minami National Hospital, 2-1 Kidohigashimachi, Kawachinagano-shi, 586-0008, Osaka, Japan
⁴ Institute for High Dimensional Medical Imaging, Jikei Univ. School of Medicine, 4-11-1 Izumi Honcho, Komae-shi, Tokyo, Japan
⁵ Graduate School of Science and Engineering, Waseda University, 3-4-1 Okubo, Shinjuku-ku, Tokyo, Japan
Abstract. The alignment of the lower extremities after high tibial osteotomy (HTO) varies widely on radiographs, and it is difficult to assess the rotation change on radiographs. We developed computer software to calculate the alignment of the lower extremities and to simulate HTO. The purpose of this study is to compare and evaluate the variance in the femoro-tibial angle (FTA) and hip-knee-ankle angle (HKA) in relation to rotational shift of the lower extremity on radiographs, and to clarify how the rotational shift affects the FTA and HKA by using three-dimensional (3D) CT simulation software for HTO. The mean absolute rotation angle of the lower extremity position on radiographs was 7.8 degrees, ranging from an external rotation of 8 degrees to an internal rotation of 14 degrees. Within the same range of rotation, the mean change in the FTA was 3.7 degrees and the mean change in the HKA was 1.5 degrees.
1 Introduction
High tibial osteotomy (HTO) is a treatment option for medial osteoarthritis of the knee. The goal of this operation is to reduce the abnormal loading stresses on the knee joint through realignment of the lower extremity by osteotomy of the proximal tibia. Researchers have recognized the importance of precise realignment after HTO, and it has been reported that poor alignment yields poor results 3)5)12). However, overall alignment after HTO varies widely despite improvements in surgical techniques and operative instruments 1)6)7)9)10). Imprecise preoperative radiographic measurements and rotational shift of the osteotomy plane may be the factors which cause inconsistency of the postoperative alignment. The femoro-tibial angle (FTA) and hip-knee-ankle angle (HKA) are often used to estimate the axial alignment 1)11). FTA is an angle formed by the femoral axis and the tibial axis. HKA is an angle formed by the line from the
center of the femoral head to the center of the knee joint and the line from the center of the knee joint to the center of the ankle joint. Preoperative planning of realignment involves measuring full-length weight-bearing radiographs of the lower extremity, but full-length radiographs may lack standardization in the positioning of the subjects and may be distorted by parallax. It is quite difficult to estimate axial alignment error due to rotational shift of the lower extremity on plain radiographs. In a closed wedge type of HTO, the proximal tibia is cut in a plane parallel to the knee joint surface, a wedged bone is resected from the proximal tibia on the lateral side, and each surface of the fragments is reattached. This procedure may lead to a shift in rotation between the proximal tibia and the distal tibia, and such a rotational shift causes a change in the axial alignment of the lower extremity. The purpose of this study is to evaluate and compare the changes of FTA and HKA in relation to rotational shift of the lower extremity on radiographs and to clarify how the rotational shift in osteotomy affects the FTA and HKA by using 3D CT simulation software for HTO.
2 Materials and Methods
Twenty-two knees of 13 patients with medial osteoarthritis of the knee were the subjects of this study. There were 4 men and 9 women, and their ages ranged from 31 to 84 years (mean 64 years). An anteroposterior radiograph of the lower extremity was taken by positioning the extremity as much as possible so that the patella was facing forward, and CT images were acquired from the proximal end of the femur to the distal end of the tibia with the knee joint in full extension and the ankle joint at 90 degrees in the supine position. 3D surface models of the femur, tibia and patella were reconstructed from the CT images by surface rendering. Six landmark points were plotted on the 3D surface models (Fig.1).
Fig. 1. Landmark points on the 3D surface model
Fig. 2. Axis of the lower extremities (a) Mechanical axis for measuring HKA (b) Anatomical axis for measuring FTA
Fig. 3. The measurement of the axial rotation
Fig. 4. The measurement of the alignment (FTA, HKA)
The first point was the center of the femoral head, which was determined by the least-squares method; the second and third points were the lateral and medial epicondyles of the femur, which were determined by selecting the most distant points among the 1-100 points around both; the fourth and fifth points represented the medial and lateral joint surfaces of the proximal tibia, and they were determined by selecting the centers of gravity of multiple points on each joint surface; the last point represented the joint surface center of the distal tibia, which was determined by the same method as the fourth and fifth points. Based on these points, the following coordinates were defined. In the coordinates defining the front of the lower extremity, the Z-axis is a line through the center of the femoral head and the center of gravity of the distal tibia joint surface, and the X-axis is a line through the medial and lateral epicondyles of the femur. In the anatomical axis for measuring the FTA, the femoral axis is the line through the center of the femoral cross-section at 15% of the femoral length from the proximal end and the center of the femoral cross-section at 15% of the femoral length from the distal end, and the tibial axis is the line through the corresponding tibial cross-section centers at 15% of the tibial length from the proximal and distal ends (Fig.2-a). In the mechanical axis for measuring the HKA, the femoral axis is the line from the center of the femoral head to the mid-point of the medial and lateral epicondyles, and the tibial axis is the line from the center of gravity of the proximal tibia joint surface to the center of gravity of the distal tibia joint surface (Fig.2-b). We developed computer software that calculates the FTA and HKA on a 2D projection of the 3D bone models, and also developed a bone-cutting program which can cut the bone models in any plane, remove the wedge bone and reattach the surfaces for the operative simulation of HTO. To measure the axial rotation of the lower extremity on anteroposterior plane radiographs, the positions of the patella and femur on the 3D surface model were matched to those of the patella and femur on the radiographs (Fig.3). The maximum external rotation and the maximum internal rotation of the lower extremity on the radiographs were picked out over all subjects, the 3D surface model was rotated between the assumed maximum internal and external rotations, and the FTA and HKA were calculated on the projection of the 3D models (Fig.4). Differences of the HKA and FTA between the two assumed positions of the lower extremity were compared statistically.
Using the operative simulation program, the FTA and HKA were measured in the hypothetical case of a 10-degree internal or external rotation error in cutting the proximal tibia, and in another hypothetical case of a 10-degree internal or external rotation error in reattaching the proximal cutting surface to the distal cutting surface (Fig.5). Similarly, differences of the FTA and HKA between the two types of HTO simulations were compared statistically.
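For reference, the projected-angle measurement can be sketched in a few lines. The following is our own illustration (Python/NumPy); the choice of projection plane and the 180-degree-based sign convention are assumptions, not the authors' stated conventions:

import numpy as np

def projected_axis_angle(p_prox, p_mid, p_dist, ap_axis=1):
    # p_prox, p_mid, p_dist: (3,) arrays; e.g. femoral-head center,
    # knee-joint center and ankle-joint center for the HKA.
    # ap_axis: index of the antero-posterior coordinate that the
    # frontal-plane projection discards.
    keep = [i for i in range(3) if i != ap_axis]
    a = (p_mid - p_prox)[keep]   # proximal axis, pointing distally
    b = (p_dist - p_mid)[keep]   # distal axis, pointing distally
    cross = a[0] * b[1] - a[1] * b[0]
    ang = np.degrees(np.arctan2(cross, np.dot(a, b)))
    return 180.0 + ang           # 180 degrees when the axes are collinear

Rotating the landmark set about the limb axis before projecting reproduces the rotation experiments summarized in Figs. 6-8.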
Fig. 5. Operative simulation of HTO (a) Rotational error in cutting the proximal tibia (b) Rotational error in reattaching the bone cutting surface
Fig. 6. Rotation angle of lower extremities on the radiographs (N=22)
3 Results
The mean absolute rotation angle of the lower extremity position on radiographs was 7.8 degrees (N=22), ranging from an external rotation of 8 degrees to an internal rotation of 14 degrees (Fig.6). Within this range of rotation, the FTA tended to decrease with internal rotation, and the HKA tended to increase with internal rotation (Fig.7). Within the same range of rotation, the mean change in the FTA was 3.7 degrees and the mean change in the HKA was 1.5 degrees, and the differences of the FTA and HKA were significant (P<0.05) (Fig.8).
Fig. 7. Alignment of the lower extremities ranging from an external rotation of 8 degrees to an internal rotation of 14 degrees (N=22)
Fig. 8. Change in the HKA and FTA ranging from an external rotation of 8 degrees to an internal rotation of 14 degrees (N=22)
When the rotational shift was assumed to be an internal or external rotation in the range of 10 degrees, the change in the axial alignment during the simulation of a rotational error in cutting the proximal tibia involved a mean of 0.4 degrees in the FTA and a mean of 0.3 degrees in the HKA, and the change in the axial alignment during the simulation of a rotational error in reattaching the proximal and distal bone surfaces involved a mean of 3.4 degrees in the FTA and a mean of 2.1 degrees in the HKA. The FTA tended to decrease with internal rotation of the distal tibia reattachment, and the HKA tended to increase with internal rotation of the distal tibia reattachment. Differences of the FTA and HKA during the simulation of a rotational error in cutting the proximal tibia were significant (P<0.05). Differences of the FTA and HKA during the simulation of a rotational error in reattaching the proximal bone cutting surface to the distal bone surface were significant (P<0.05) (Fig.9).
Fig. 9. Change in the HKA and FTA of HTO simulation (N=22)
4 Discussion
In previous studies, the alignment of the lower extremities was calculated on 2D radiographs, so that the influence of rotation could not be evaluated. In a study by Wright et al., axial rotation of the lower extremity of 10° internally and externally did not have a statistically significant effect on the FTA in radiographs of amputated lower extremities 14). In a study by Swanson et al., axial rotation of the lower extremity of 10° internally and externally had a statistically significant effect on the FTA in radiographs of a Sawbones extremity model with severe valgus or varus deformity 13). Krackow et al. analyzed the effect of flexion and rotation on varus and valgus deformity by calculation on an assumed leg deformity 8). Their results differed from each other, and their materials were not patients with medial osteoarthritis of the knee. In this study, the axial alignment of patients with medial osteoarthritis of the knee was calculated and analyzed, which is useful for clinical analysis. In a previous study of a computer approach to HTO by Chao et al., a 2D rigid-body spring model was used to simulate the forces across the articular surfaces 2). Ellis et al. developed a 3D pre-surgical planner and an intraoperative guidance system for HTO 4). Both were used for the preoperative planning of HTO. However, the 2D rigid-body spring model is a 2D model, and the 3D pre-surgical planner models only the region around the knee joint, so the alignment of the whole leg under rotation cannot be analyzed exactly. We developed computer software that calculates the alignment exactly from 3D bone models of the whole leg and simulates the operation of HTO. In this study, the pre- and postoperative alignment of the lower extremities was calculated on 3D models of the whole leg, and the differences of the FTA and HKA under rotation in HTO were analyzed. In the case of patients with medial osteoarthritis of the knee, the change of the HKA under rotation was smaller than that of the FTA; the FTA was presumed to be more influenced by rotation than the HKA.
5 Conclusion
The rotation of the lower extremity position on the preoperative radiograph influenced the measured value of the axial alignment of the lower extremity. Differences of the HKA were statistically smaller than those of the FTA within the range of preoperative lower extremity rotation on radiographs. In the HTO simulation, errors in cutting the proximal tibial bone surface had no influence on the measured value of the axial alignment of the lower extremities within the range of 10-degree external and internal rotation. However, errors in reattaching the proximal bone cutting surface to the distal bone surface within the range of 10-degree external and internal rotation had a significant influence on the measured value of the axial alignment of the lower extremity. Differences of the HKA were statistically smaller than those of the FTA as measured in the HTO simulations of rotational errors in cutting the proximal tibia and in reattaching the bone cutting surface.
References

1. Bauer GC, et al. Tibial Osteotomy in Gonarthrosis. J. Bone and Joint Surg. 51-A:1545-1563, 1969.
2. Chao EYS, et al. Computer-aided Preoperative Planning in Knee Osteotomy. Iowa Orthop J. 15:4-18, 1995.
3. Coventry MB. Proximal Tibial Osteotomy. J. Bone and Joint Surg. 75-A:196-201, 1993.
4. Ellis RE, et al. A Surgical Planning and Guidance System for High Tibial Osteotomy. Computer Aided Surgery. 4:264-274, 1999.
5. Hernigou P, et al. Proximal Tibial Osteotomy for Osteoarthritis with Varus Deformity. J. Bone and Joint Surg. 69-A:332-354, 1987.
6. Insall JN, et al. High Tibial Osteotomy. J. Bone and Joint Surg. 56-A:1397-1405, 1974.
7. Kettelkamp DB, et al. Results of Proximal Tibial Osteotomy. J. Bone and Joint Surg. 58-A:952-960, 1976.
8. Krackow KA, et al. A Mathematical Analysis of the Effect of Flexion and Rotation on Apparent Varus/Valgus Alignment at the Knee. Orthopedics; 13:861-868, 1990.
9. Krackow KA, et al. AAOS Instructional 47:429-436, 1998.
10. Macintosh DL and Welsh RP. Joint Debridement - A Complement to High Tibial Osteotomy in the Treatment of Degenerative Arthritis of the Knee. J. Bone and Joint Surg. 59-A:1094-1097, 1977.
11. Moreland JR, et al. Radiographic Analysis of the Axial Alignment of the Lower Extremity. J. Bone and Joint Surg. 69-A:745-749, 1987.
12. Rudan JF, et al. High Tibial Osteotomy. Clin. Orthop. 268:157-160, 1990.
13. Swanson KE, et al. Does Axial Limb Rotation Affect the Alignment Measurements in a Deformed Limb? Clin. Orthop. 371:246-252, 2000.
14. Wright JG, et al. Measurement of Lower Limb Alignment Using Long Radiographs. J. Bone and Joint Surg. 73-B:721-723, 1991.
Simulation of Intra-operative 3D Coronary Angiography for Enhanced Minimally Invasive Robotic Cardiac Intervention

G. Lehmann, D. Habets, D.W. Holdsworth, T. Peters, and M. Drangova

The John P. Robarts Research Institute, The University of Western Ontario, and the London Health Sciences Centre, London, Ontario, Canada
Abstract. A simulation environment has been developed to aid the development of three-dimensional (3D) angiographic imaging of the coronary arteries for use during minimally invasive robotic cardiac surgery. We have previously developed a dynamic model of the coronary arteries by non-linearly deforming a high-resolution 3D image of the coronaries of an excised human heart, based on motion information from cine bi-plane angiograms. The result was a sequence of volumetric images representing the motion of the coronary arteries throughout the cardiac cycle. To simulate different acquisition and gating strategies, we implemented an algorithm to forward project through the volume data sets. Thus, radiographic projections corresponding to any view-angle can be produced for any time-point throughout the cardiac cycle. Combining re-projections from selected time-points and view angles enables the evaluation of various gating strategies. This approach will allow us to determine the optimum image acquisition parameters to produce 3D coronary angiograms for planning and guidance of minimally invasive robotic cardiac surgery.
1 Introduction
Traditional coronary artery bypass (CAB) procedures create an alternate route of blood supply, bypassing an occluded artery by grafting a vessel from another part of the body. CAB procedures require a full sternotomy and cardio-pulmonary bypass, each of which inflict significant trauma to the patient and require a lengthy recovery period. To decrease the associated trauma of CAB procedures, minimally invasive direct CAB (MIDCAB) and more recently, minimally invasive robotic CAB (MIRCAB) have been implemented. MIDCAB techniques eliminate the need for full sternotomy and cardio-pulmonary bypass and are all performed on the beating heart with endoscopic port-access inserted through a small incision in the chest wall. A major limitation of MIDCAB is that the long-handled instruments magnify hand tremors, which can make precise suturing difficult and tiring. Robot-assisted surgical systems were developed for MIRCAB to avoid the restrictions of conventional endoscopic port-access instruments, to remove surgeon tremor,
provide a minification factor between the operator's movements and the tools, and permit the surgeon to perform the procedure from a comfortable position. MIRCAB procedures have several technical limitations, including the lack of guidance from conventional two-dimensional (2D) images of patients, possible improper port placement, and the limited field of view of the operative site from the endoscope. These problems are being addressed by a virtual cardiac surgical planning (VCSP) platform [1,2] being developed at the John P. Robarts Research Institute. The VCSP will ultimately provide the surgeon with a dynamic, virtual representation of the patient's thorax in the operating room, where the patient's heart motion and position are synchronized with the virtual environment. With such a system, the surgeon would always maintain a global view of the operative site and not be constrained by the small field of view of the endoscope. We believe that further improvements in the MIRCAB procedure could result from pre- and intra-procedure three-dimensional (3D) angiograms of the coronary arteries. Since the heart is beating during an interventional procedure, intra-operative 3D coronary angiograms could be used to update preoperative images in the VCSP to verify the location of the surgical instruments with respect to the surgical target, as well as to verify, post-operatively, the success of a bypass graft. Recent advances in cone-beam CT, including computed rotational angiography (CRA, also known as 3D DSA, cone-beam C-arm CT, etc.) [3,4], electrocardiographic (ECG) gating strategies, and the potential for fast CCD-equipped x-ray image intensifiers (XRII), warrant a feasibility study into the implementation of 3D coronary angiography in the operating room. We have begun a preliminary investigation aimed at providing intra-operative 3D coronary angiography by modifying a clinical C-arm angiography system in the operating room. The purpose of this paper is to demonstrate the capability of a numerical modeling environment to simulate acquisition and gating strategies, and to show how they will be used to investigate the feasibility of intra-operative 3D coronary angiography.
2 Methods
The human heart is relatively still during the diastolic phase of the cardiac cycle, making it possible to select projections during diastole and reconstruct a 3D volume with an acceptable level of artifacts arising from cardiac motion. The simulation environment uses a dynamic 3D model of the coronary circulation to investigate which projections can appropriately be selected from diastole. To allow a comparison of image quality, the numerical simulation environment mimics the imaging parameters of a CRA system, thus allowing simulated images to be compared to images acquired with the CRA system.
2.1 Dynamic Model of the Coronary Circulation
We have previously developed a realistic dynamic model of the coronary circulation [5] and briefly outline the essential steps in the following section. The dynamic coronary artery model was developed from high-quality 3D CT images of static human coronary arteries and cine bi-plane angiograms from a patient with similar coronary anatomy, as described below.
270
G. Lehmann et al.
2.1.1 High-Quality 3D CT Image of Human Coronary Arteries
A modified clinical angiography system [3,4] that was developed for cerebrovascular procedures was used to obtain a 3D CT image of the coronary arteries in a human cadaver heart that was clamped at the aortic root and cannulated. To equalize the x-ray attenuation path throughout the myocardium and its environment, the heart was suspended in a saline bath and perfused with saline solution. Iodinated contrast agent was injected manually into the aortic root, providing adequate contrast for imaging the coronary arteries. Acquisition of 2D projections over 200° (30-Hz acquisition rate at 90-kVp and 2-mAs with a nominal field of view of 28-cm) commenced when the coronary arteries were filled with contrast agent, and the contrast-agent injection continued throughout the 4.5-s acquisition. From the 129 acquired projections, a 400×400×400 volumetric image of the coronary arteries, with 400-µm isotropic voxels, was reconstructed.
2.1.2 3D Motion Information from 2D Bi-plane Angiograms
Based on the 3D CT image of the coronary arteries, a cardiac patient with coronary anatomy similar to that of the excised heart was selected. The patient was imaged using a clinical bi-plane angiography system with the standard right anterior oblique (RAO) and left anterior oblique (LAO) geometries for imaging the coronaries. To determine the motion of the coronary arteries in 2D, arterial bifurcations were used as landmarks that could be followed throughout the cardiac cycle. To find the 3D distribution of the bifurcations identified in the LAO and RAO images, the imaging system was calibrated using a phantom containing eleven 1.5-mm diameter steel spheres. The 3D coordinates of the vascular landmarks were then determined using standard least-squares techniques. This procedure was repeated at successive intervals throughout the cardiac cycle, resulting in a series of volumes that tracked the dynamically changing 3D coordinates of the landmarks. From these coordinates, a dynamic set of vectors describing the motion of the vascular landmarks throughout the cardiac cycle was constructed.
2.1.3 Non-linear Deformation
This set of dynamic vectors was then used to drive a thin-plate-spline [6] based non-linear warping algorithm to deform the 3D static image of the coronary arteries between time points in the cardiac cycle. For the purposes of the work presented here, the point constraints used in the non-linear warping algorithm are the bifurcation landmarks identified on the 3D CT image and the corresponding landmarks (typically 18) determined from the cine bi-plane images. The implementation utilized C++ classes and Python [7,8] applications based on the Visualization Toolkit (VTK) [9] libraries; the non-linear deformation algorithm contained in VTK was driven by the 3D motion of the landmarks. Each deformation was performed with respect to the original 3D CT image to minimize the image degradation associated with successive deformations. The resulting dynamic model consisted of 26 temporal volumes at a frame rate of 30 Hz; the model therefore represents a patient with a 69-bpm heart rate. The methods described in the following sections were also implemented in VTK.
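As a concrete illustration of this warping step, the sketch below shows how a landmark-driven thin-plate-spline deformation of a volume can be expressed with VTK's Python bindings, matching the toolkit named above. It should be read as a minimal sketch only: the landmark arrays, input volume and parameter choices are illustrative assumptions, not the authors' published pipeline.

import vtk

def tps_warp_volume(volume, src_xyz, dst_xyz):
    """Warp `volume` so that landmarks src_xyz map onto dst_xyz (lists of (x, y, z))."""
    src, dst = vtk.vtkPoints(), vtk.vtkPoints()
    for p, q in zip(src_xyz, dst_xyz):
        src.InsertNextPoint(p)
        dst.InsertNextPoint(q)
    tps = vtk.vtkThinPlateSplineTransform()
    tps.SetSourceLandmarks(src)
    tps.SetTargetLandmarks(dst)
    tps.SetBasisToR()  # the r basis gives the 3D thin-plate spline of [6]
    reslice = vtk.vtkImageReslice()
    reslice.SetInputData(volume)
    # Image resampling needs the inverse mapping (output voxel -> input voxel).
    reslice.SetResliceTransform(tps.GetInverse())
    reslice.SetInterpolationModeToLinear()
    reslice.Update()
    return reslice.GetOutput()

Warping the original static volume once per time point, as in the paper, avoids accumulating interpolation error across successive deformations.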
2.2 3D Coronary Angiography Simulation Environment
To be able to simulate different gating strategies within the 3D coronary angiography simulation environment, we implemented a ray-driven projection method to generate angiographic views at arbitrary angles. For each ray passing through the model, the values are summed to create a simulated radiographic projection. Note that, unlike the projections obtained using the XRII-based detector of the CRA system, which represent variations in x-ray intensity, our algorithm calculates the sum of the linear attenuation coefficients along the path of each ray. This allows the calculated projections to be reconstructed without further image processing. The forward-projection algorithm used is based on a method [10] that considers the CT data to consist of the intersection volumes of three orthogonal sets of equally spaced, parallel planes. This algorithm scales as 3N with the number of planes, rather than as N³ like previously proposed projection algorithms, where N is the number of voxels along each dimension of the 3D CT data set. The forward-projection algorithm simulated the geometric parameters of the CRA system. Projection views within a 200° span are simulated from volumes representing the different phases of the cardiac cycle and combined to produce complete sets of raw projection data. The projections were reconstructed using a modified Feldkamp cone-beam algorithm [11].
2.3 Preliminary Simulation of Prospectively Gated Acquisition
For cardiac imaging, the acquisition must be synchronized to the cardiac cycle. Since the heart moves relatively little during diastole, it is possible to combine projections obtained at any time point during diastole as if the heart were stationary throughout that time period. To demonstrate the utility of the simulation environment, we performed a preliminary study focused on determining the effect of reducing the number of views used in the reconstruction of a 3D coronary artery image and the amount of motion that can be tolerated during image acquisition. First, the effect of a reduced number of views was investigated by reconstructing a 3D image from 129 (the number of views used to reconstruct the original CT image of the excised human heart), 65, and 33 views. For this test all views were obtained from the original static CT image of the excised human heart, thereby not introducing artifacts due to cardiac motion.

Fig. 1. Comparison of the original CT image to simulated CT. (a) MIP through the original volume. (b) MIP through the simulated volume. Note that the heart is viewed at slightly different angles in (a) and (b).

The numerical environment was then used to simulate different acquisition and gating strategies. In these simulations, it was assumed that projections are collected at known view angles over a predetermined fraction of the cardiac cycle. A 30-Hz acquisition rate was assumed, mimicking the current acquisition rate of the CRA. In the simulation environment this parameter can be increased to model systems equipped with faster CCD cameras. Our preliminary study investigated two acquisition strategies: (I) consecutive views are acquired over a 200-ms acquisition window during a pre-selected time point in the cardiac cycle, and multiple cardiac cycles are used to complete the acquisition of views over 200°; and (II) the same strategy implemented with a 100-ms acquisition window. For each of these strategies the number of views obtained per cardiac cycle depends on the length of the acquisition window and the frame rate of the imaging system; thus for strategy (I) 6 views are acquired per cycle and for strategy (II) 3 views are obtained. Both of these strategies were evaluated with a varying number of views, ranging between 65 and 129. Finally, the time point in the cardiac cycle about which the acquisition window is centered was varied from early to mid diastole.
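To make the projection step of Sect. 2.2 explicit, the following Python sketch computes the sum of linear attenuation coefficients along each source-to-pixel ray for one cone-beam view. It is a naive ray-marching stand-in for the exact, plane-based Siddon projector [10] actually used, and the geometry arguments and array names are illustrative assumptions.

import numpy as np

def cone_beam_view(mu, origin, spacing, src, det_center, det_u, det_v,
                   nu=128, nv=128, pixel=1.0, n_samples=512):
    """Line integrals of the attenuation volume `mu` for a single view.
    src is the x-ray focal spot; det_center, det_u, det_v define the detector."""
    proj = np.zeros((nv, nu))
    shape = np.array(mu.shape)
    for iv in range(nv):
        for iu in range(nu):
            pix = (det_center
                   + (iu - (nu - 1) / 2.0) * pixel * det_u
                   + (iv - (nv - 1) / 2.0) * pixel * det_v)
            ray = pix - src
            length = np.linalg.norm(ray)
            t = np.linspace(0.0, 1.0, n_samples)[:, None]
            pts = src + t * ray                       # samples along the ray
            idx = np.round((pts - origin) / spacing).astype(int)
            ok = np.all((idx >= 0) & (idx < shape), axis=1)
            # Approximate line integral: sum of in-volume samples times step length.
            proj[iv, iu] = mu[idx[ok, 0], idx[ok, 1], idx[ok, 2]].sum() * length / n_samples
    return proj

Because the output is already a set of line integrals rather than detected intensities, it can be fed to a Feldkamp-type reconstruction without the logarithmic conversion needed for real XRII data, as noted in Sect. 2.2.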
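The bookkeeping behind the two gating strategies is simple enough to state directly. The sketch below, using the stated 30-Hz frame rate and the 69-bpm heart rate of the dynamic model, assigns each of the equally spaced view angles to a heartbeat and a time within it; window_center_ms is a hypothetical parameter standing in for the early-to-mid-diastole shift.

import math

def gating_schedule(n_views, window_ms, span_deg=200.0,
                    frame_rate_hz=30.0, heart_rate_bpm=69.0,
                    window_center_ms=600.0):
    """Prospective gating: a burst of consecutive views per cardiac cycle."""
    rr_ms = 60000.0 / heart_rate_bpm                            # cycle length
    per_cycle = int(round(window_ms / 1000.0 * frame_rate_hz))  # 6 (I) or 3 (II)
    n_cycles = math.ceil(n_views / per_cycle)
    schedule = []
    for i in range(n_views):
        cycle, slot = divmod(i, per_cycle)
        angle = i * span_deg / max(n_views - 1, 1)              # spread over 200 deg
        t_ms = (cycle * rr_ms + window_center_ms - window_ms / 2.0
                + slot * 1000.0 / frame_rate_hz)
        schedule.append((angle, t_ms))
    return schedule, n_cycles

# Consistency check against the cycle counts quoted in Sect. 3:
# gating_schedule(129, 200.0)[1] == 22, gating_schedule(65, 200.0)[1] == 11,
# gating_schedule(129, 100.0)[1] == 43, gating_schedule(65, 100.0)[1] == 22.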
3 Results
To demonstrate numerical simulation of 3D coronary angiography, we show a maximum intensity projection (MIP) of an excised human heart acquired with a CRA system (Fig. 1a) and a MIP of a simulated CT image (Fig. 1b). The simulated CT image was created by the numerical environment using the volume of Fig. 1a as the input, re-projecting 129 view angles over 200° through the volume while mimicking the CRA parameters, and reconstructing the simulated projections. The effect of a decreased number of views on 3D coronary angiography was investigated, and the results are shown in Fig. 2. Figure 2a is identical to Fig. 1b, the simulated 3D coronary angiogram reconstructed using 129 views. To demonstrate the effect of a decreased number of views, simulated angiograms were reconstructed using 65 views (Fig. 2b) and 33 views (Fig. 2c). The arrows in Fig. 2 show the decrease in detail of the smaller coronary vessels going from (a) to (c). The decreased contrast-to-noise ratio in the images is due to the decreased number of views. Gating strategies (I) and (II) were implemented using the dynamic model during the (i) early and (ii) mid diastole phases of the cardiac cycle.
Fig. 2. The effect of a decreased number of views on 3D coronary angiography, with arrows indicating the decreased detail in smaller coronary vessels. Shown are (a) MIP of the simulated volume reconstructed using 129 views, (b) MIP reconstructed using 65 views, and (c) MIP reconstructed using 33 views.
Figure 3 shows the result of gating strategy (I), acquired using a 200-ms acquisition window. Figures 3a,b show MIPs through the images created in early diastole, and similarly Figs. 3c,d show MIPs through volumes reconstructed from projections acquired during mid diastole. Figures 3a,c and Figs. 3b,d were created using 129 views (22 cardiac cycles × 6 views per cardiac cycle) and 65 views (11 cardiac cycles), respectively.

Fig. 3. MIPs through the volumes created using gating strategy (I), a 200-ms acquisition window. Shown are (a) 129 views and (b) 65 views acquired during early diastole, and (c) 129 views and (d) 65 views acquired during mid diastole.

The effects of cardiac motion are seen in Figs. 3a,b: the heart was moving too rapidly at the end of systole and the beginning of diastole to acquire 3D coronary angiograms of sufficient quality. Figures 3c,d, acquired during mid diastole, show improved image quality. The heart was relatively stationary during the mid-diastole acquisition, showing that a reasonable 3D coronary angiogram can be acquired in 11 cardiac cycles (Fig. 3d). Figure 4 shows the result of gating strategy (II), acquired using a 100-ms acquisition window. Figures 4a,b show MIPs through the images created in early diastole, and similarly Figs. 4c,d show the mid-diastolic phase of the cardiac cycle. Figures 4a,c and Figs. 4b,d were created using 129 views and 65 views, corresponding to 43 cardiac cycles and 22 cardiac cycles, respectively. The 100-ms acquisition window reduces the motion effects seen in Figs. 4a,b compared to Figs. 3a,b, acquired with the longer 200-ms window. However, Figs. 4a,b and Figs. 4c,d require 43 and 22 cardiac cycles, respectively, each of which is longer than the 11 cardiac cycles required for sufficient image quality (Fig. 3d).
4 Summary and Conclusion
We have demonstrated the capability of a numerical simulation environment to simulate different acquisition and gating strategies. This numerical environment will assist in the development of an intra-operative 3D coronary angiography system for use during minimally invasive CAB procedures. To simulate 3D coronary angiography
we have implemented a forward-projection algorithm, in combination with a dynamic model, to create 2D projections using different gating and acquisition strategies. The 2D projections are then reconstructed to create a simulated 3D CT image. Results from this preliminary study indicate that acquisition windows of 100-200 ms produce good-quality 3D angiograms. This acquisition window is similar to those used with retrospectively gated multi-slice spiral CT scanners. The data presented here were based on a specific dynamic model of the coronary circulation.

Fig. 4. MIPs through the volumes created using gating strategy (II), a 100-ms acquisition window. Shown are (a) 129 views and (b) 65 views acquired during early diastole, and (c) 129 views and (d) 65 views acquired during mid diastole.

However, there is great diversity among the patient population undergoing minimally invasive CAB procedures. To better simulate 3D coronary angiography, our future work will include extracting the motion of different human hearts, to allow a variety of different gating strategies to be tested for both healthy and diseased hearts. Further studies will also include quantifying image quality, as well as ROC analysis, to determine the optimum method for obtaining 3D coronary angiograms using intra-arterial injections during intra-vascular therapy procedures. These encouraging studies have indicated that it should be feasible to develop a gated acquisition strategy that can be used to acquire intra-operative 3D coronary angiograms. These angiograms can be incorporated within the VCSP to assist initial placement of the ports and instruments, intra-procedure verification of port and instrument placement, and verification of the success of a bypass graft in MIRCAB procedures.
Acknowledgments We acknowledge Drs. A.J. Dick and M. Quantz for their clinical assistance, Chris Norley and Hristo Nikolov for their help acquiring the CRA images, as well as David Gobbi for the helpful discussions. Partial financial support for this work from the Canadian Institutes for Health Research and the Heart and Stroke Foundation of Canada is gratefully acknowledged.
References
1. Chiu AM, Dey D, Drangova M, et al. "3-D Image Guidance for Minimally Invasive Robotic Coronary Artery Bypass." Heart Surg Forum 3, 224-31 (2000).
2. Lehmann G, Chiu A, Gobbi D, et al. "Towards dynamic planning and guidance of minimally invasive robotic cardiac bypass surgical procedures." MICCAI (2001).
3. Fahrig R, Fox AJ, Lownie S, et al. "Use of a C-arm system to generate true three-dimensional computed rotational angiograms: preliminary in vitro and in vivo results." AJNR Am J Neuroradiol 18, 1507-14 (1997).
4. Fahrig R and Holdsworth DW. "Three-dimensional computed tomographic reconstruction using a C-arm mounted XRII: image-based correction of gantry motion nonidealities." Med Phys 27, 30-8 (2000).
5. Lehmann GC, Gobbi DG, Dick AJ, et al. "Dynamic Three-Dimensional Model of the Coronary Circulation." Proc. SPIE Med. Imaging (2001).
6. Bookstein FL. "Principal warps: Thin-plate splines and the decomposition of deformations." IEEE Trans Pattern Analysis and Machine Intelligence 11, 567-85 (1989).
7. Python programming environment. www.python.org.
8. Atamai Inc. www.atamai.com.
9. The Visualization Toolkit (VTK). www.kitware.com.
10. Siddon RL. "Fast calculation of the exact radiological path for a three-dimensional CT array." Med. Phys. 12, 252-5 (1985).
11. Feldkamp LA, Davis LC, and Kress JW. "Practical cone-beam algorithm." J. Opt. Soc. Am. A 1, 612-9 (1984).
Computer Investigation into the Anatomical Location of the Axes of Rotation in the Normal Knee S. Martelli and A. Visani Istituti Ortopedici Rizzoli, Lab. Biomechanics, Via di Barbiano 1/10, I-40139 Bologna, Italy
[email protected] http://www.ior.it/biomec/
Abstract. The purpose of this paper was to investigate the anatomical location of the main axes of rotation of the knee. The study was performed as follows: joint motion registration by an electrogoniometer; joint digitization by the same electrogoniometer; computer reconstruction of motion and bone geometry; and comparison of four hypotheses on the correlation between the axis of flexion–extension and the femoral anatomy, and between the axis of longitudinal rotation and the tibial anatomy, in 6 human knees using an original geometrical interpolation. Our results suggest that: (a) the axis of flexion–extension lies in a cone spanned by the transepicondylar line and the so-called FFc line [1]; (b) the axis of longitudinal rotation can be represented by a line parallel to the tibial anatomical axis intersecting the flexion axis in the medial compartment. Such a simplified frame for the representation of knee rotations may be useful and easily computed in computer-assisted knee reconstruction. Keywords: knee, axes, rotation, anatomy, kinematics, motion, PROM
1 Introduction
The increasing number of computer-assisted surgical procedures has led surgeons to demand more information about knee biomechanics and kinematic predictions. Therefore, new methods to evaluate knee motion are necessary, based on quantitative evidence and functional acquisitions. Despite agreement in the literature about the identification of the two main components of knee kinematics (flexion–extension and longitudinal rotation), there is still no agreement about the anatomical location of the axes of rotation, and therefore about the method to identify them. Many authors have identified the axis of flexion–extension in relationship to the femoral anatomy. [1], [2], [3], [4], [5], [6], [7], [8], [9] Similarly, many authors have identified the axis of longitudinal rotation in relationship to the tibial anatomy, most reporting its orientation to be parallel to the tibial anatomical axis [2], [3], [5], [7] in the medial compartment, [1], [4], [5], [6], [7], [8], [10], [11], [12], [13] sometimes proposing an anterior-posterior slope [10], [11] or a complex pattern of motion in the tibial plateaux. [1], [2], [11], [14] At present most scientists represent the 3D nature, or the 6 degrees of freedom, of the knee in a reference frame made of the flexion–extension axis fixed to the femur, the longitudinal axis fixed to the tibial anatomy, and a third axis to complete the definition of a suitable joint reference frame. [2], [7], [15], [16], [17] Therefore the anatomical location of the axes of the knee can improve not only geometrical measurements but also the kinematic evaluation in computer-assisted applications.
The goal of this work is to verify some hypotheses on the location of the main axes of rotation of the knee during passive motion.
2 Material and Methods
2.1 Experimental Data Acquisition
Six normal knee specimens were examined. A 6 degrees-of-freedom electrogoniometer, the FARO Arm (FARO Technologies, Lake Mary, FL, USA), was used to record passive motion and digitize the articular surfaces (0.3 mm / 0.3° accuracy within 1.8 m³), as follows (protocol in patent-pending state):
1. The tibia was rigidly fixed to the experimental desk by screws, while the FARO Arm was secured to the experimental desk and rigidly connected to the mobile femur (as in the set-up of robot-assisted knee surgery in [18]). The mobile bone was held consecutively by two expert orthopaedic surgeons, who performed a passive range of motion and recorded the neutral position of the intact knee at 0°, 10°, 30°, 60°, 90° and 120°. The internal–external rotation at 45° and 90° was also recorded, with the knee in the neutral position at 45° of flexion, in the maximal internal rotation possible at 45°, in the maximal external rotation at 45°, and at the 90° neutral position, 90° internal rotation and 90° external rotation. The motions were repeated twice for each knee.
2. Then the FARO Arm was detached from the mobile bone and equipped with a point probe; three landmarks were implanted on the femur and on the tibia, and the joint was dissected. One surgeon digitized the shape of the femur, the tibia (on the cartilage surface) and the ligament insertions (on the external border of the insertion areas).
3. Data about knee anatomy and motion were processed off-line by dedicated software, allowing the 3D reconstruction and display of the bone shape and of the relative position of the two segments during the recorded trajectories in an anatomical coordinate system. Surfaces were represented as clouds of points, and motion was represented by tracking the positions of all structures in the 3D space or in 2D projections or sections corresponding to the standard sagittal, frontal and transversal planes.
2.2 Computer Evaluation of the Axes of Rotation
Four different definitions of the axis of flexion–extension were compared (Fig. 1):
• F1, the line joining the most posterior points of the posterior condyles [2], chosen on a central section of the medial and lateral femoral condyles at full extension;
• F2, the line joining the most distal points on the femoral condyles in extension [3], chosen on a central section of the medial and lateral femoral condyles at full extension;
• F3 (transepicondylar line), the line joining the centres of the femoral insertion areas of the medial and lateral collateral ligaments [4], [5], [6], [7];
• F4, the line joining the centres of the posterior femoral condyles, computed as the centres of the circles fitting the posterior part of the femoral profile in a central section of the medial and lateral compartments at extension (the so-called "flexion facet centres" (FFc) in [1]). [4], [8], [9]
Fig. 1.
Similarly, four definitions of the axis of longitudinal rotation were compared. All axes have the same orientation (parallel to the tibial shaft), but different locations and patterns of motion with respect to the joint:
• T1, the vertical line passing through the centre of the medial posterior condyle, i.e. intersecting F1;
• T2, the vertical line passing through the centre of the medial anterior condyle, i.e. intersecting F2;
• T3, the vertical line passing through the centre of the medial collateral ligament, i.e. intersecting F3;
• T4, the vertical line passing through the centre of the medial femoral flexion facet, i.e. intersecting F4.
To verify the hypothesis that the knee rotates around one of the known axes, we used a geometrical method inspired by the one presented in [9], [11], [19], which we realised through computer elaboration and statistical evaluation. As all points in a mobile rigid body rotate around the instantaneous axis of rotation, the projections of their positions during the motion onto a plane perpendicular to the axis will track a circle or part of a circle. In this study we chose 10 points (named P1 to P10) distributed in the two compartments at increasing distances from the presumed axis of rotation; we tracked their projections during the recorded motion onto the plane perpendicular to the axis of rotation; we then computed the circle with its centre on the projection of the axis and radius equal to the mean distance from the tracked positions of the point (Fig. 2). The residual of this fitting was considered an indication of how correct the identification of the axis of rotation is (the smaller the residual, the better the circularity of the motion of the tracked point).
We used this method to verify the hypothesis that the knee rotates around the axes F1/T1, F2/T2, F3/T3 or F4/T4, during passive range of motion (PROM) and maximal internal – external rotation at 45° and 90°.
Fig. 2.
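A minimal sketch of this fixed-centre circular fit is given below; Python/NumPy is used purely for illustration, since the original processing software is not specified. The track of one landmark is projected onto the plane perpendicular to the candidate axis, the radius is the mean distance from the projected axis, and the residual is the mean deviation from that radius.

import numpy as np

def circularity_residual(track, axis_point, axis_dir):
    """Residual of the fixed-centre circle fit for one tracked point.
    track: (T, 3) positions over the motion; the axis is a point plus a direction."""
    track = np.asarray(track, float)
    d = np.asarray(axis_dir, float)
    d /= np.linalg.norm(d)
    # Orthonormal basis (u, v) of the plane perpendicular to the axis.
    u = np.cross(d, [1.0, 0.0, 0.0])
    if np.linalg.norm(u) < 1e-8:          # the axis happened to lie along x
        u = np.cross(d, [0.0, 1.0, 0.0])
    u /= np.linalg.norm(u)
    v = np.cross(d, u)
    rel = track - np.asarray(axis_point, float)
    pts = np.stack([rel @ u, rel @ v], axis=1)   # projection onto the plane
    radii = np.linalg.norm(pts, axis=1)          # centre = projected axis
    return np.abs(radii - radii.mean()).mean()   # mean residual, in mm

Averaging this residual over the 10 sample points and all recorded motions yields entries like those of Table 3.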
3 Results
The relative locations of the examined axes were compared by computing the 3D angle made by pairs of the F1, F2, F3, F4 axes (Table 1) and the distance between pairs of T1, T2, T3, T4 projected onto the horizontal plane (Table 2). All tracked points appeared to move around the mentioned axes in a 120° arc of a path looking quite circular. However, a more detailed analysis of the circularity of these paths (reported in Table 3) showed some statistical differences in the behaviour of F1, F2, F3 and F4.

Table 1. Angles (in degrees) between the couples of axes reported in the heading of each column, computed as the arccosine of the dot product between the unit vectors of the axes

        F4-F3      F4-F2       F4-F1      F3-F2      F3-F1      F2-F1
Knee 1  11.6       10.5        3.7        4.6        8.2        8.2
Knee 2  5.6        15.1        5.0        9.8        4.6        13.8
Knee 3  4.0        5.2         2.5        3.3        5.9        5.8
Knee 4  8.1        8.8         3.0        5.6        6.2        8.9
Knee 5  8.4        9.9         0.9        5.8        9.0        10.7
Knee 6  5.6        14.6        2.1        11.1       4.3        15.0
MEAN    7.2 (2.7)  10.7 (3.7)  2.9 (1.4)  6.7 (3.0)  6.4 (1.9)  10.4 (3.5)
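For reference, the angle entries of Table 1 reduce to a one-line computation on the two unit direction vectors; in the sketch below the absolute value is our own assumption, compensating for the arbitrary sign of an undirected axis.

import numpy as np

def axis_angle_deg(a, b):
    """3D angle between two axis directions, as in the Table 1 caption."""
    a = np.asarray(a, float) / np.linalg.norm(a)
    b = np.asarray(b, float) / np.linalg.norm(b)
    return np.degrees(np.arccos(np.clip(abs(a @ b), 0.0, 1.0)))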
Table 2. Mean distance (in millimetres) during PROM (average of the recorded PROMs) of the projection of T1, T2, T3, T4 axis onto the horizontal plane. Each column reports the distance between the specified couple Knee 1 Knee 2 Knee 3 Knee 4 Knee 5 Knee 6 MEAN
T4-T3 16.8 18.4 20 21.2 19.3 21.1 19.5 (1.7)
T4-T2 18 19.9 22.2 18.5 19.2 22.1 20.0 (1.8)
T4-T1 13.6 12.5 16.1 13 12.4 13.7 13.6 (1.4)
T3-T2 24.6 26.4 29.2 25.1 27.7 31.9 27.5 (2.7)
T3-T1 26.1 25.4 30.7 25.3 24.1 27.8 26.6 (2.4)
T2-T1 27.3 29.3 33.9 27.6 28.8 31.4 29.7 (2.5)
Table 3. Residuals (in millimetres) of the circular fitting for each knee in planes perpendicular to F1, F2, F3, F4. Each value is computed as the mean (and standard deviation) over all sample points (P1-P10) and all recorded passive ranges of motion

        F1          F2          F3          F4
Knee 1  0.1 (0.07)  0.3 (0.11)  0.1 (0.06)  0.1 (0.07)
Knee 2  0.2 (0.20)  0.4 (0.34)  0.1 (0.13)  0.2 (0.18)
Knee 3  1.2 (0.44)  1.5 (0.39)  0.6 (0.37)  0.6 (0.35)
Knee 4  0.9 (0.32)  1.1 (0.33)  0.4 (0.25)  0.7 (0.39)
Knee 5  1.2 (0.35)  0.8 (0.27)  0.3 (0.21)  0.6 (0.34)
Knee 6  0.4 (0.11)  0.5 (0.14)  0.1 (0.09)  0.2 (0.16)
MEAN    0.67        0.77        0.27        0.40
To identify the optimal axis of longitudinal rotation during IE rotation at 45° and 90°, we considered the tracks of points during internal–external rotations and during PROM. All points appeared to move around T1, T2, T3 and T4 in a 29° ± 2° arc of a circular path. However, no significant differences could be found among the compared axes, as all tracks were fitted by a circle with a mean residual (over the six knees and the 10 sample points) of 0.45 (0.12) mm at 90° and 0.85 (0.15) mm at 45°. To identify the optimal axis of longitudinal rotation during PROM (i.e. the screw-home mechanism), we must consider that IE rotation is a secondary component of this movement; it can therefore be identified only by separating it from the simultaneous rotation around the flexion–extension axis. Therefore, during PROM we verified the coupled IE rotation around T1, T2, T3 and T4 by tracking 10 points on each flexion–extension axis (F1, F2, F3, F4). These points, fixed during pure flexion, revealed a circular pattern around the tibial axis during the PROM, without the artefacts present in the tracking of random points, but a detailed analysis of the circularity of these paths showed no significant statistical differences in the behaviour of T1, T2, T3 and T4.
4 Discussion and Conclusion
This study confirms that the axis of flexion–extension of the knee lies in a cone spanned by the transepicondylar line and the line joining the femoral flexion-facet centres. Table 3 shows that there is no significant difference between these two axes if we consider an error on the anatomical data of around 1.5 mm (i.e. a residual of 0.5 mm with a 99.7% probability), as mostly assessed in in-vivo applications. The equivalence between the transepicondylar line and the FFc line, and therefore all lines in between, may be explained by the uncertainty about the anatomical data (bone shapes, ligament insertions), by an equivalent contribution of ligaments and bone shapes in guiding PROM, or by small changes in the position of the instantaneous axis in different sub-ranges of flexion. It is interesting to notice that the orientations of F1, F2, F3, F4 are very similar (differing by less than 11°, Table 1), but positioning the axis through the condyles' centres instead of on the surface appears more satisfactory. Therefore our study suggests that a kinematic frame of the knee with reference axes aligned with the transepicondylar or FFc line, as proposed by Pennock [7], Hollister [11] or Pinskerova [1], may be more correct than the reference frames proposed by Grood and Suntay [2] or Lafortune [3], and may reduce cross-talk errors in kinematic computations due to axis misalignment. [20], [21] This study also confirms that longitudinal rotation occurs around a vertical axis located in the medial compartment, both during forced rotation at 45° and 90° and during PROM. It can be noticed that the residuals (and standard deviations) measuring the circular paths around T1, T2, T3, T4 in both motions are higher than the residuals measuring the circular paths around F1, F2, F3, F4. This is because the amount of longitudinal rotation (and thus the arc fitted) is much smaller than the amount of flexion (30° versus 120°), and the computation of the circle is therefore numerically less stable. The higher residuals obtained during forced internal–external rotation with respect to PROM, and at 90° with respect to 45°, can be due to the fact that this movement is performed manually and may be affected by a non-negligible flexion during the manoeuvre, producing artefacts in the circular tracks. Probably for these reasons this method was not able to discriminate the exact location of the longitudinal axis within the medial compartment (for example, a central position with respect to a location on the medial tibial spine). However, it showed that the axis lies in the medial compartment, in the central or posterior area of the tibial plateau (the T2 residual is higher than the others in Table 2), and moves with the femur while keeping its orientation fixed with respect to the tibia. We can conclude that our results are compatible with a kinematic model of the knee rotating around the transepicondylar line and an axis parallel to the tibial one through the centre of the MCL, or around the FFc line and an axis parallel to the tibial one through the centre of the medial femoral flexion facet. The former (F3/T3) is similar to the model proposed by Hollister [11], but simplifies the description of the axis of longitudinal rotation. The latter (F4/T4) is similar to the model suggested by Pinskerova [1], who deduced the location of the FE axis from anatomical observations and measured a small displacement of the medial flexion facet centre (i.e. the intersection of F4 and T4), but applies both in PROM and in IE rotations.
In both models the correlation between the longitudinal axis and the flexion–extension axis is simpler than in previous models in the literature.
This model simplifies the computation of the knee reference frame from anatomical data, provides a unique kinematic description in passive motion and forced IE rotations, and provides a fixed and predictable correlation between the main axes of rotation. This representation of the knee motion could easily be applied in navigated surgery and computer-assisted procedures. Moreover, the similarity of our results to Churchill's study [4] of the loaded motion of the knee (although obtained with a different technique) could suggest a similar behaviour in active flexion as well. A finer validation of a 2 degrees-of-freedom knee kinematic model, and maybe a better discrimination between F3/T3 and F4/T4, if any, is not possible with the reported geometrical method. A mathematical representation of the proposed model of the knee motion, as a double rotation around quasi-perpendicular axes intersecting in the medial femoral condyle, or the implied presence of an origin behaving as a fixed point during motion, will be investigated by the authors in the future, to find a simple, clinically interpretable and complete representation of the knee kinematics for use in computer evaluation of the joint.
Acknowledgements
The author wishes to acknowledge Dr. V. Pinskerova (Charles University, Prague, Czech Republic) and Dr. M. Freeman (University of London, UK) for their help in the acquisition of the experimental data and for the comparison with their previous data.
References
1. Pinskerova, V., Iwaki, H., Freeman, M.: The Shape and Relative Movements of the Femur and Tibia in the Unloaded Cadaveric Knee: a Study Using MRI as an Anatomical Tool. In: Surgery of the Knee, 3rd edition, J.N. Insall and W.N. Scott, Philadelphia: Saunders Inc.
2. Grood, E.S., Suntay, W.J.: A Joint Coordinate System for the Clinical Description of Three-Dimensional Motions: Application to the Knee. Journal of Biomechanical Engineering, Vol. 105 (1983) 136-144.
3. Lafortune, M.A.: The Use of Intra-Cortical Pins to Measure the Motion of the Knee During Walking. Ph.D. Thesis, Pennsylvania State University (1984).
4. Churchill, D.L., Incavo, S.J., Johnson, C.C., Beynnon, B.D.: The Transepicondylar Axis Approximates the Optimal Flexion Axis of the Knee. Clin Orthop, Vol. 356 (1998) 111-118.
5. Fischer, O.: Kinematik Organischer Gelenke. F. Vieweg und Sohn, Braunschweig (1907).
6. Kaplan, E.B.: Some Aspects of Functional Anatomy of the Human Joint. Clin Orthop, Vol. 23 (1962) 18-29.
7. Pennock, G.R., Clark, K.J.: An Anatomy-Based Coordinate System for the Description of the Kinematic Displacements in the Human Knee. Journal of Biomechanics, Vol. 23 (1990) 1209-1218.
8. Langer, K.: Das Kniegelenk des Menschen. In: Sitzungsberichte der Akademie der Wissenschaften, Mathematisch-Naturwissenschaftliche Classe, Bde 2, 3. Karl Gerolds Sohn, Wien (1858) 99.
9. Todo, S., Yoshinori, K., Teemu, M. et al.: Anteroposterior and Rotational Movement of Femur During Knee Flexion. Clin Orthop, Vol. 362 (1999) 162-170.
10. Fick, R.: Mechanik des Gelenkes. In: Handbuch der Anatomie und Mechanik der Gelenke. Gustav Fischer, Jena (1911).
11. Hollister, A.M., Jatana, S., Singh, A.K., Sullivan, W.W., Lupichuck, A.G.: The Axes of Rotation of the Knee. Clinical Orthopaedics and Related Research, Vol. 290 (1993) 259-268.
12. Barnett, C.H.: Locking at the Knee Joint. J Anat, Vol. 87 (1953) 91-95.
13. Shaw, J.A., Murray, D.G.: The Longitudinal Axis of the Knee and the Role of the Cruciate Ligaments in Controlling Transverse Rotation. J Bone Joint Surg Am, Vol. 56(8) (1974) 1603-1609.
14. Bugnon, E.: Le Mécanisme du Genou. Viret-Genton, Lausanne, CH (1892).
15. Chao, E.Y.S.: Justification of Triaxial Goniometer for the Measurement of Joint Rotation. Journal of Biomechanics, Vol. 13 (1980) 989-1006.
16. Martelli, S., Zaffagnini, S., Falcioni, B., Marcacci, M.: Intraoperative Kinematic Protocol for Knee Joint Evaluation. Computer Methods and Programs in Biomedicine, Vol. 62 (2000) 77-86.
17. Wu, G., Cavanagh, P.R.: ISB Recommendations for Standardization in the Reporting of Kinematic Data. Journal of Biomechanics, Vol. 28 (1995) 1257-1261.
18. Martelli, M., Marcacci, M., Nofrini, L., La Palombara, P.F., Malvisi, A., Iacono, F., Vendruscolo, P., Pierantoni, M.: Computer- and Robot-Assisted Total Knee Replacement: Analysis of a New Surgical Procedure. Annals of Biomedical Engineering, Vol. 28 (2000) 1146-1153.
19. Walker, P.S., Shoji, H., Erkman, M.J.: The Rotational Axis of the Knee and its Significance to Prosthesis Design. Clin Orthop, Vol. 89 (1972) 160-170.
20. MacWilliams, B.A., Des Jardins, J.D., Wilson, D.R., Romero, J., Chao, E.Y.S.: A Repeatable Alignment Method and Local Coordinate Description for Knee Joint Testing and Kinematic Measurement. J Biomech, Vol. 31 (1998) 947-950.
21. Piazza, S.J., Cavanagh, P.R.: Measurement of the Screw-Home Motion of the Knee is Sensitive to Errors in Axis Alignment. J Biomech, Vol. 33 (2000) 1029-1034.
Macroscopic Modeling of Vascular Systems
Dominik Szczerba and Gábor Székely
Computer Vision Lab, ETH, CH-8092 Zürich, Switzerland
[email protected] http://www.vision.ee.ethz.ch/∼domi
Abstract. Angiogenesis, the growth of vascular structures, is an extremely complex biological process which has long puzzled scientists. A better physiological understanding of this phenomenon could enable many useful medical applications, from virtual surgery simulators for medical interventions to cancer therapy, where, e.g., the influence of certain factors on the system could be simulated. Although there is a lot of research being done on blood circulatory systems, and many models with a high level of mathematical sophistication have already been proposed, most of them offer very modest visual quality for the resulting vascularities. This report proposes a macroscopic model allowing the generation of various vascular systems with high graphical fidelity for simulation purposes. Keywords: computer model, numerical simulation, angiogenesis, vascular system, capillary plexus, blood vessels, remodeling, hemodynamics.
1 Introduction
With the rapid progress in the field of computer technologies, it is now becoming possible to build more and more realistic virtual simulators for medical training or surgery planning purposes. Modern graphic display devices offer new opportunities for medical visualization, and constantly increasing computational power allows for the heavy mathematical modeling needed by real-time simulations of the human body. Virtual-reality-based surgical simulators require efficient biomechanical and physiological models of the organs, as well as algorithms capable of providing convincing visualization of the anatomy. Realistic modeling of the vascularization is one of the most important requirements. Abdominal organs in particular are highly perfused by blood vessels, which become crucial for virtual simulations of any laparoscopic surgical intervention. Vascular systems not only strongly influence organ appearance as part of their visible surface texture, but also behave like deformable organs with certain mechanical properties. They will also lead to bleeding when cut through. The final goal is therefore to provide methods which, given an intuitive set of parameters, will generate vascular structures in an arbitrary abdominal organ. Such systems should not only carry geometrical information but also provide data on the elasto-mechanical properties of the vascular system and the related blood flow.
There has been a lot of activity in the modeling of vascular systems. A group of models based on physiology has been proposed by A. Anderson and M. Chaplain [1]. Angiogenesis, as the formation of blood vessels, is modeled there as a process where capillary sprouts are formed in response to externally supplied chemical stimuli. The sprouts then grow, develop and organize themselves into branching structures by means of endothelial cell proliferation and migration. The models take into account essential endothelial cell-extracellular matrix interactions via the inclusion of the macromolecule fibronectin. The models consist of a system of nonlinear partial differential equations describing the response of endothelial cells to growth factors and the fibronectin in the extracellular matrix. Although these models are very advanced from the biological modeling point of view, they still do not fully address all the biophysical issues, such as flow-induced shear stress and its influence on the endothelial cells. Even if they offer a good qualitative understanding of the biological process, the models are not capable of producing visually convincing results. The outcome is mostly in the form of concentration contours or density maps and gives very poor, non-intuitive graphical output.

An interesting model has recently been proposed by R. Gödde and H. Kurz [2]. It covers all important biophysical properties of the flow and accounts for hemodynamic peculiarities like the non-Newtonian properties of blood. In the first phase an initial growth is performed by means of random assembly of vascular elements, starting from a predefined center and periphery. Then hemodynamic remodeling rules are applied to this pre-generated system of bifurcations, which is subsequently grown or degenerated accordingly. Although the graphical output of this advanced model is a little more convincing, it still provides far too regular and too predictable structures, not accounting for the variety of natural patterns, due to the very simple topology defined by a two-dimensional isometric hexagonal grid.

A more visually oriented group of models has been used in the work of W. Schreiner and P. Buxbaum [3]. In so-called constrained constructive optimization, new terminal segments are added to an initial root of a tree according to a set of bifurcation rules. The simulation, performed on a two-dimensional domain, is a global optimization routine for a target function, typically the total vessel volume of the tree. Although the model leads to a homogeneous perfusion of the tissue and an optimal blood flow, it turns out to be too simple to generate the complex and unpredictable structures encountered in natural vascular systems. The resulting patterns are too regular and visually not convincing enough.

A. Lindenmayer [4] proposed a mathematical model of cellular interaction known as L-systems. An L-system is an approach which can construct complex objects by subsequent replacements of parts of an initial object using a set of rewriting rules. The concept has been continuously developed over the years, resulting in many extensions, e.g. stochastic, context-sensitive or parametric. Stochastic L-systems were used to generate vascular systems by V. Meier [5]. The algorithm he used consisted of two phases. First, only a bifurcation pattern was generated from the root toward the leaves. Then, in the second
phase, geometric information on the lengths and radii of individual segments was added. This recursive approach is able to produce reasonable vascular structures; they are, however, limited to tree topology and are still not very diverse in visual patterns compared with real cases. Another disadvantage is that the model incorporates no biophysics at all: there is no information on the distribution of pressures, flows and stresses in the system.

Another approach has been described by the same author in order to combine technical methods with some biological aspects. A vessel generation algorithm has been proposed based on simple physiological mechanisms. The metabolic activity of the tissue is represented by a scalar field which depicts the oxygen consumption and carbon dioxide production of the tissue. In each simulation step the current O2 and CO2 concentrations are computed from the perfusion of the tissue by the arterial and venous systems. Based on their comparison to the metabolic activity, cells produce certain biochemical transmitters that can either stimulate or inhibit the local growth of the system. The advantage of this approach is that the growth process is not only treated as insertion of new bifurcations but also allows for fusion of colliding vessels, resulting in a network topology of the generated system. In real vascular structures this phenomenon is widely encountered and is known as anastomosis. The disadvantage of this model, however, is the very limited biophysical knowledge incorporated and the omission of capillary plexus formation, although this stage in the maturation of vessels has been experimentally proved to be crucial [6]. Moreover, blood pressure and flow are not addressed at all, and oxygen transport is simply a function of a segment's radius. While the generated visual patterns are of high quality, they are neither diverse nor complex enough to cover the whole range of real physiological cases.

There is therefore still a strong need for a more advanced procedure offering sufficient understanding of biophysical properties on the one hand, and good visual quality on the other. The new extended model should include the formation of a primitive capillary plexus prior to maturation of the vascular system and treat its later development as dynamic growth controlled by biophysical factors. This way the dynamic remodeling of the vascular system can be applied, and full information on the biophysical properties of the system can be provided at any time. The simulation should take into account existing experimental knowledge of the growing process, namely endothelial cell proliferation, migration and vessel retraction [7], the non-Newtonian properties of the blood, and the effects of flow-induced shear stress.
2 Definition of the Model
In order to supply a tissue with oxygen and nutrients, and to take away its metabolic wastes, new blood vessels penetrating the growing tissue are formed. This process is called angiogenesis and is one of the most crucial processes taking place in living organisms. In the first stage, as a response to the tissue’s oxygen demand, a primitive capillary plexus is created and a tissue becomes
covered/penetrated with a preliminary capillary bed. In a first approximation it is very reasonable to assume that the capillary plexus formation: (a) is directly controlled by so-called angiogenesis growth factors (AGF): the higher the oxygen demand, the higher the AGF concentration, the sprouting rate and thus the penetration; and (b) leads to effectively random connections between capillaries. Detailed issues like the different AGF response of arterial and venous cells will not be considered here. In the next stage the network grows and remodels in order to assure the optimal blood supply and disposal of wastes, and that is where the main differentiation of the vascularity's random shape comes into play. The third process contributing to the final pattern of the vascular system is external forces, or forces resulting from the shape change of the growing tissue. It happens very often that the development is very rapid and all three processes are either overlapping or even taking place at the same time. The proposed model consists basically of two parts: (i) creation of a preliminary capillary plexus and (ii) vascular growth according to biophysical and hemodynamic rules. External forces and shape change of the tissue will not be addressed in this version of the model. The first part should be realized by a general-purpose generator capable of producing random network structures with adjustable patterns. The input for the algorithm should contain information corresponding to a metabolic map of the tissue (oxygen demand and disposal of wastes). This way the pattern of the resulting vascular structure will depend on the tissue's function and its metabolic rate, which is much different in muscles or nerves (where oxygen demand is high) than in the cornea (which plays mostly a structural role). From the technical point of view the capillary network is handled by a graph of connections and is represented by an adjacency matrix containing information on the connections between the network's individual nodes. Each connecting vessel is assigned a flow conductance dependent on the geometry of the vessel. The vessels are assumed to be straight elastic pipes of radius r and length L. In the case of laminar flow through a pipe, the ratio between the flow and the pressure difference inducing it is constant (the so-called Hagen-Poiseuille law) and, by analogy to electricity, is called the conductance G:

G = Q/∆p = πr⁴/(8ηL) ,   (1)
where Q is the flow induced by a pressure difference ∆p, G is the vessel's conductance, η stands for the blood viscosity and L for the vessel's length. Note that in the case of a non-Newtonian fluid like blood, the viscosity is not constant and becomes a function of the vessel's radius (η = η(r)), a phenomenon known as the Fåhræus-Lindqvist effect. In the presented study a rough fit to experimental data has been used. In order to calculate the hemodynamic variables needed to control the growth of such a capillary network, the nodal analysis method from the theory of electrical circuits has been customized. In this method the network behavior is described by a matrix equation of the form:

Q = G · p ,   (2)
where Qi is the source flow entering the i-th node, pi is the pressure at the i-th node, and G is the so-called nodal admittance matrix, derived directly from the network's adjacency matrix carrying the conductance information on the system's structure. The problem of finding the individual pressures is therefore equivalent to finding the inverse of the nodal admittance matrix, which then allows all further biophysical and hemodynamic quantities like flow, wall tension, shear stress, oxygen transport etc. to be derived. This information can then be used for remodeling, and the system can grow according to the following rules:
1. When a certain threshold is exceeded, a vessel may increase its diameter in order to resist the stretching force.
2. A vessel can increase its diameter only when neighboring vessels have enough endothelial cells to support vessel formation and/or when the cell proliferation rate is high enough to provide a sufficient number of cells.
3. Endothelial cell recruitment (migration and proliferation) depends on the shear stress at the vessel wall and on the local concentration of AGF:

S = 4ηQ/(πr³) = ∆p·r/(2L) ,   (3)
AGF = const · (O2,demand − O2,delivery) .   (4)
4. When the flow drops below a threshold, the vessel can be deleted and its cells contributed to the neighbors.
5. Oxygen transport through the vessel depends on the wall tension, given by:

WT = pt·r/w ,   (5)
where pt is the so-called transmural pressure, defined as the difference between the pressure inside and outside the vessel, and w = w0·e^(r/r0) is the wall thickness fitted to experimental data. In order to calculate the total amount of oxygen transferred through a vessel, integration over its surface must be performed. Note that this increases with the pressure and the diameter on the one hand, but quickly drops with the wall thickness on the other hand. Therefore there should be an optimum in oxygen delivery for certain flow conditions.
6. When the tissue gets a sufficient amount of oxygen (the system enters a dynamic equilibrium), no more AGF is produced (no more cell recruitment) and the remodeling stops.
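To make the nodal analysis of Eqs. (1) and (2) concrete, the sketch below assembles the nodal admittance matrix of a vessel network and solves for the nodal pressures. The authors' implementation is in C++ (using the VXL classes mentioned in Sect. 3); this Python/SciPy version is only an illustrative re-expression, and the constant viscosity is an assumption replacing the radius-dependent η(r) fit described above.

import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def solve_network(n_nodes, edges, eta, inflow):
    """edges: list of (i, j, r, L) vessels; inflow: source flow Q per node.
    Returns the nodal pressures and the flow through every vessel."""
    G = lil_matrix((n_nodes, n_nodes))
    for i, j, r, L in edges:
        g = np.pi * r**4 / (8.0 * eta * L)   # Hagen-Poiseuille conductance, Eq. (1)
        G[i, i] += g; G[j, j] += g
        G[i, j] -= g; G[j, i] -= g
    # Ground one node (fix its pressure to zero) so that G is non-singular.
    G[0, :] = 0.0; G[0, 0] = 1.0
    Q = np.array(inflow, dtype=float); Q[0] = 0.0
    p = spsolve(G.tocsc(), Q)                # pressures from Eq. (2)
    flows = [(i, j, np.pi * r**4 / (8.0 * eta * L) * (p[i] - p[j]))
             for i, j, r, L in edges]
    return p, flows

From the resulting pressures and flows, the shear stress of Eq. (3) and the wall tension of Eq. (5) follow directly for each vessel.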
3 Results
A configurable generator to create random network structures has been developed in C++. The code is based on a random-walk algorithm and is capable of generating a wide range of different structures, from regular homogeneous grids and loops to very irregular and unpredictable patterns. The generator offers many parameters influencing the geometry of the network and an unlimited number of starting points or vessels with given initial properties. The vessels' mean
Fig. 1. Result from a simulation of a small test capillary system: an initial (left) and optimal (right) stage against the corresponding oxygen concentrations.
lengths and diameters, preferred global and local angles, bifurcation parameters and curvature are generated according to predefined probability distributions (uniform, normal, logarithmic normal). Any arbitrary probability distribution is also allowed facilitating future relation of growth probabilities to the metabolic map of the tissue. All the parameters can be interdependent and can also be an arbitrary function of vessel age. A collision detection of line segments has been implemented which is used for simulating fusion with certain segments (anastomosis) and/or avoiding selected vessels (e.g. to prevent so called arterio-venous malformation or to test if there is enough space to create a vessel). The program has been written in standard C++ and outputs a list of segments carrying information on a particular segment’s geometry (2D or 3D), age, and branching generation. These geometrical structures can then be visualized, studied or remodeled by a separate C++ codes using appropriate libraries. If blood vessels and capillaries are modeled as a network of very many small tubes it is possible to use this code to generate preliminary capillary beds forming bases for further growth. This approach may seem somewhat random, but firstly, by its randomness it offers a big diversity of network patterns, and secondly, the distribution of parameters can be actually related to physiology. For example, growth angles can be related to the concentration of angiogenic factors and metabolic activity of the tissue, lengths can depend on external physical forces or deformations etc. This option will be implemented in the later, refined version of the program. A code to calculate the pressure, flow and stress distributions in a flow network has been separately developed in C++. The output data of the capillary plexus generator is converted into an adjacency matrix of network connections, which can then be used to solve the biophysics of the whole system using the Nodal-Admittance Method described before. The previously described random walk algorithm was used to generate a sample vascular structure and the vnl sparse matrix linear system and vnl lsqr classes from the VXL package ([9]) were used to solve the linear problem. Figure 1 shows a preliminary result from a simulation of such a small test capillary system. The thickest vertical lines are pre-existing mother vessels. In-between there was a tissue with a simple metabolic map. First all mother vessels sprouted towards the tissue creating a capillary bed having more or less equal diameters. This corresponds to the first part of the simulation, where random connections between capillaries are established. In the second part hemodynamics driven remodeling was carried out and oxygen transfer was calculated for every iteration step. An optimum between oxygen delivery and complexity of the system has been found, like in
290
D. Szczerba and G. Sz´ekely
Fig. 2. Some structures generated by the code discussed in the text. Pictures in the second row show structures generated using collision detection. Full resolution images are available at http://www.vision.ee.ethz.ch/∼domi/angi/
normal living tissues. First, when only a capillary bed is present, the oxygen delivery is insufficient. Although the oxygen perfusion probability is high (because the walls are thinnest), there is not enough flow through the system, so not enough oxygen particles are delivered to the tissue. Growing vessels are increasing their diameters (thus increasing the flow), but as their walls become thicker the perfusion probability drops and the oxygen delivery is not any more sufficient. Figure 1 shows an initial and optimal stage of the system against the corresponding oxygen concentrations. Because of low speed of the remodeling part of the algorithm a simplified version has also been implemented. Instead of correct hemodynamic remodeling the diameters of the segments were simply related to the vessel’s age. As can be seen from resulting structures (Figure 2), even such a naive “remodeling” rule produces convincing visual results. This is biologically incorrect in general, but is very fast and can be a choice when the speed has a higher priority than physiological correctness.
Figure 3 shows one of the resulting vascular structures mapped onto an artificial myoma generated for surgical simulation purposes [8].
Fig. 3. One of the generated vascular structures mapped onto an artificial myoma generated for surgical simulation purposes [8].
4 Outlook
It is still necessary to improve the robustness of the program. Reducing the number of parameters, or possibly representing them in the form of distribution maps, would also make the algorithm more convenient to use. Tests with other algorithms for solving the linear problem are underway, while first approaches to the implementation of the third dimension, as well as retraction and merging of neighboring vessels, are planned for future simulations. At present, the metabolic activity of the tissue is represented by a static concentration map, and the arterial and venous parts of the system are coupled. It would be very interesting to see how a dynamically evolving metabolic map influences the growth of the capillary system and how the system changes if the arteries and veins are decoupled. It would also be interesting to perform tests with switching some of the remodeling factors on/off. This could allow for simple studies on abnormal development of vascular systems. In parallel, experimental data is being gathered to compare real distributions of biophysical quantities to the ones obtained with the simulation.
Acknowledgments
This work has been performed within the framework of the Swiss National Center of Competence in Research on Computer Aided and Image Guided Medical Interventions (NCCR CO-ME) supported by the Swiss National Science Foundation. Most of the calculations were performed on a standard PC running Linux. During calculations and analysis, the VXL [9] and ROOT [10] packages were used.
References
1. A.R.A. Anderson, M.A.J. Chaplain: Continuous and Discrete Mathematical Models of Tumor-induced Angiogenesis. Bull. Math. Biol. (1998) 60:857-900.
2. R. Gödde, H. Kurz: Structural and Biophysical Simulation of Angiogenesis and Vascular Remodelling. Devel. Dynam. (2001) 220:387-401.
3. W. Schreiner, P.F. Buxbaum: Computer Optimization of Vascular Trees. IEEE Transactions on Biomedical Engineering (1993) 40:482-491.
4. A. Lindenmayer: Mathematical Models for Cellular Interaction in Development. J. Theor. Biol. (1968) 18:280-315.
5. V. Meier: Realistic Visualization of Abdominal Organs and its Application in Laparoscopic Surgery Simulation. Diss. ETH No. 13215 (1999).
6. V.G. Djonov et al.: Intussusceptive arborization contributes to vascular tree formation in the chick chorio-allantoic membrane. Anat. Embryol. (2000) 202:347-357.
7. V. Nehls et al.: Guided migration as a novel mechanism of capillary network remodelling is regulated by basic fibroblast growth factor. Histochem. Cell Biol. (1998) 109:319-329.
8. R. Sierra: Generation of Pathologies for Surgical Training Simulators. Accepted for publication in MICCAI 2002.
9. VXL: C++ Libraries for Computer Vision Research and Implementation (http://vxl.sourceforge.net)
10. ROOT: An Object-Oriented Data Analysis Framework (http://root.cern.ch)
Spatio-temporal Directional Filtering for Improved Inversion of MR Elastography Images Armando Manduca, David S. Lake, and Richard L. Ehman MRI Research Lab, Mayo Clinic and Foundation, Rochester, MN 55905, USA {manduca.armando,lake.david,ehman.richard}@mayo.edu
Abstract. MR elastography can visualize and measure propagating shear waves in tissue-like materials subjected to harmonic mechanical excitation. This allows the calculation of local values of material parameters such as shear modulus and attenuation. Various inversion algorithms to perform such calculations have been proposed, but they are sensitive to areas of low displacement amplitude (and hence low SNR) that result from interference patterns due to reflection and refraction. A spatio-temporal directional filter applied as a preprocessing step can separate interfering waves so they can be processed separately. Weighted combinations of inversions from such directionally separated data sets can significantly improve reconstructions of shear modulus and attenuation.
1 Introduction
Magnetic resonance elastography (MRE, [1]) uses harmonic mechanical displacements as a probe of the material properties of soft tissues, spatially mapping and measuring motions with amplitudes of 1 µm or less. The resulting "wave images" reflect the displacement of spins due to acoustic strain wave propagation. These data allow the calculation of local values of material parameters such as shear modulus and attenuation, with possible uses ranging from tissue characterization to tumor detection and diagnosis. Wave images can be obtained at various phase offsets regularly spaced around a motion cycle. This allows extraction of the harmonic component of motion at the frequency of interest, giving the amplitude and the phase (relative to an arbitrary zero point) of the displacement at each point in space [2]. This complex displacement field is the input to the processing techniques described below.
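As a concrete illustration, the harmonic component at the driving frequency can be obtained by evaluating the first discrete Fourier coefficient of the phase-offset series at each pixel. The C++ sketch below shows this standard step under assumed array layouts and names; it is not the authors' implementation.

```cpp
#include <complex>
#include <vector>
#include <cmath>

// Extract the first-harmonic complex displacement U from N wave images
// u[n][pix], sampled at phase offsets 2*pi*n/N around the motion cycle.
std::vector<std::complex<double>>
first_harmonic(const std::vector<std::vector<double>>& u)
{
  const std::size_t N = u.size();      // number of phase offsets (typically 4-8)
  const std::size_t P = u[0].size();   // pixels per image
  std::vector<std::complex<double>> U(P, std::complex<double>(0.0, 0.0));
  for (std::size_t n = 0; n < N; ++n) {
    // DFT basis function at the driving frequency (temporal bin 1)
    const std::complex<double> w =
        std::polar(1.0, -2.0 * M_PI * double(n) / double(N));
    for (std::size_t p = 0; p < P; ++p)
      U[p] += u[n][p] * w;
  }
  for (auto& v : U) v *= 2.0 / double(N);  // 2/N scaling recovers the real amplitude
  return U;
}
// |U[p]| is the displacement amplitude; arg(U[p]) is the relative phase.
```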
A single MRE acquisition captures only a single component of motion. However, the experiment can be repeated with three orthogonal sensitization directions. Thus, the data set acquired by MRE can be very rich: full 3D complex harmonic displacement information can be acquired at MR pixel resolution throughout a 3D volume. A variety of inversion techniques have been proposed for MRE data. Under certain assumptions (linearity, incompressibility, local homogeneity), the equation of harmonic motion simplifies to the Helmholtz equation [2], and the shear modulus is essentially determined by simple division of displacement values by their spatial Laplacian. Direct inversion (DI) uses filtered estimates of the displacement and its Laplacian to perform such a division at each point. Local Frequency Estimation (LFE) adds the assumption of no attenuation and combines local estimates of instantaneous frequency over several scales. A simpler technique, phase gradient (PG), assumes propagation of a single shear wave, and simply calculates the gradient of its phase. It breaks down with any wave superposition (reflection, interference, etc.), and its use has been limited to specialized situations. By contrast, DI and LFE are based more fundamentally on the equation of motion, and depend simply on the presence of sufficient motion within the local region. Complex interference patterns from reflection, refraction, etc. pose difficulties only if they create areas of low displacement amplitude and hence low SNR. However, such conditions are frequently present in actual data, confounding these algorithms. In this paper, we propose the use of spatio-temporal directional pre-processing filters that separate complex wave fields into components, each of which can be analyzed separately.
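A minimal sketch of the direct-inversion idea under the stated assumptions: the Helmholtz relation mu*lap(u) = -rho*omega^2*u is solved pointwise, with a finite-difference Laplacian of the complex wave field. The grid layout, the assumed density value, and the use of the magnitude of the complex ratio are illustrative simplifications, not the authors' exact algorithm.

```cpp
#include <complex>
#include <vector>
#include <cmath>

using cd = std::complex<double>;

// Pointwise direct inversion: mu = -rho*omega^2 * u / laplacian(u),
// derived from the Helmholtz equation mu * lap(u) = -rho * omega^2 * u.
std::vector<double> direct_inversion(const std::vector<cd>& u,
                                     int nx, int ny,
                                     double dx, double freq_hz,
                                     double rho = 1000.0 /* kg/m^3, assumed */)
{
  const double omega = 2.0 * M_PI * freq_hz;
  std::vector<double> mu(u.size(), 0.0);
  for (int y = 1; y < ny - 1; ++y)
    for (int x = 1; x < nx - 1; ++x) {
      const int i = y * nx + x;
      // 5-point finite-difference Laplacian of the complex wave field
      const cd lap = (u[i - 1] + u[i + 1] + u[i - nx] + u[i + nx]
                      - 4.0 * u[i]) / (dx * dx);
      if (std::abs(lap) > 1e-12)  // guard against low-amplitude regions
        mu[i] = std::abs(-rho * omega * omega * u[i] / lap);  // |mu| as stiffness
    }
  return mu;
}
```

This pointwise division is exactly why low-amplitude interference zones are problematic: where both u and its Laplacian are dominated by noise, the ratio becomes unstable.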
2 Methods
An MRE experiment typically captures a number of phase offsets (usually 4-8) regularly spaced around a harmonic motion cycle. These can be thought of as snapshots of the wave field in time, although this is a simplification of the physics of the acquisition. Spatio-temporal filters can be designed in frequency space to select portions of the wave field propagating in specific directions. Since the filtering is a linear operation, the output of such filters satisfies the same equation of motion (under the assumptions above) as the original data. At the same time, such filtering can yield simplified wave fields with less interference that can be analyzed more easily, as shown below. Fig. 1 illustrates a sample wave image from a phantom object with four stiffer cylindrical inclusions perpendicular to the slice. Acoustic shear waves are introduced from the top of the image and propagate downward, but reflect off the cylinders and the boundaries of the phantom, giving rise to interference patterns. The effect of selecting waves propagating in the top-down and bottom-up directions with the filtering approach described below is shown in the center and right panels. These data can then be processed individually, and areas of interference are reduced. The filters used are a product of radial and spatially directional components, and are oriented in time as well. The data is first transformed to a 3D frequency space (two spatial directions, one temporal). A wave with a particular spatial frequency (denoted as ky with no loss of generality, to correspond with the top-down wave in Fig. 1) is represented by two peaks in frequency space: one at +ky in the positive temporal half-space, and the other at −ky in the negative temporal half-space. The wave traveling in precisely the opposite direction (bottom-up, in this example) is represented by symmetrically opposite peaks (+ky in the negative temporal half-space, etc.). The two can thus be easily separated. The radial components are Butterworth bandpass filters designed to cut off very low and very high frequencies (see profile in Fig. 2).
These have been used in the past as a pre-processing step: the high frequencies contain only noise, while removing very low frequencies can aid in removing longitudinal wave and bulk motion effects. The image at the left of Fig. 1 has undergone this radial bandpass filtering.
Fig. 1. Left: Original bandpass filtered wave image (for one of eight phase offsets). Center: Data after top-down directional filtering. Right: Data after bottom-up directional filtering
The radial component is then multiplied by a spatial directional component with a cos² dependence in the half-space about a selected direction, and zero in the opposite half-space, as illustrated in Fig. 2. This is similar to the orientation dependence of the filters used in the LFE, and has the useful property that, when performed in orthogonal directions, it decomposes the signal into components that can simply be added back together to form the original signal [3]. This filter is applied to the first positive temporal frequency plane (kt = +1). The other temporal frequency planes are discarded: the kt = −1 plane has conjugate symmetry and thus is not needed, while the other planes contain information at frequencies other than the mechanical driving frequency, which is the frequency of interest. These other harmonics can be filtered and kept if desired, but this is not usually the case.
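The sketch below evaluates such a filter weight at one point of the kt = +1 spatial-frequency plane: a Butterworth bandpass in the radial frequency multiplied by a cos² directional term that vanishes in the opposite half-space. The cutoff frequencies and the Butterworth order are assumed values, not those of the paper.

```cpp
#include <cmath>

// Weight of the spatio-temporal directional filter at spatial frequency
// (kx, ky), for the kt = +1 temporal plane. (dx, dy) is the unit vector of
// the selected propagation direction.
double filter_weight(double kx, double ky, double dx, double dy)
{
  const double k = std::sqrt(kx * kx + ky * ky);
  if (k == 0.0) return 0.0;

  // Radial part: product of high-pass and low-pass Butterworth responses
  // (cutoffs k_lo, k_hi and order n are illustrative values).
  const double k_lo = 0.02, k_hi = 0.25;  // cycles/pixel, assumed
  const int n = 4;
  const double hp = 1.0 / (1.0 + std::pow(k_lo / k, 2 * n));
  const double lp = 1.0 / (1.0 + std::pow(k / k_hi, 2 * n));

  // Directional part: cos^2 of the angle to the selected direction,
  // zero in the opposite half-space.
  const double c = (kx * dx + ky * dy) / k;  // cosine of angle to direction
  const double dir = (c > 0.0) ? c * c : 0.0;

  return hp * lp * dir;
}
```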
Fig. 2. Left: Profile of the radial (Butterworth bandpass) component of the filters. Center: Directional component (cos² dependence from the selected direction); the filter is zero in the opposite half-space. Right: The filter in frequency space
An additional example of the decomposition offered by these filters is presented in Fig. 3, for a synthetic data set consisting of waves radiating sinusoidally outward from a central point. Two of the four directional components are shown; all four added together would yield precisely the original data set.
Fig. 3. Left: One sample phase offset of a synthetic data set with waves radiating radially outward. Center: The bottom-to-top directional component as extracted from the filters described. Right: The left-to-right directional component
3 Results
Data sets were acquired for a gel phantom with 4 stiff cylindrical inclusions ranging from 5 to 25 mm in diameter (this is the source of the wave data in Fig. 1). The data were bandpass filtered, and then inversions were performed with all three processing techniques (LFE, DI, PG; Fig. 4, top row). The data were then passed through a top-down directional filter as described above (and illustrated in Fig. 1, center). The inversions were performed again on the directionally filtered data (Fig. 4, bottom). The effects of directional filtering in Fig. 4 vary strongly with the processing algorithm. The LFE result (left column) shows only minor improvements from the directional filtering; this is to be expected, since the filters resemble processing already present in the LFE [3]. The direct inversion (DI) results are significantly improved by directional filtering. The inclusions are sharp and clearly visible, the artifacts at region borders are eliminated, and the artifacts in the low amplitude areas with significant interference near the bottom of the image are greatly reduced. Finally, the phase gradient (PG) results are dramatically improved. This technique explicitly assumes a single wave and simply tracks its phase. Since this approach quickly breaks down when this assumption is violated, it has been used only in very restricted situations. The directional filter provides sufficient wave separation that the "single wave" assumption, though still simplistic, is now closer to reality, and the algorithm now yields far more useful results.
Fig. 4. Top row: The LFE, DI and PG reconstructions of shear modulus from the bandpass filtered data. Bottom row: the LFE, DI and PG reconstructions from the top-down directionally filtered data
The above results use a single top-down directional filter, since that is the main propagation direction in the phantom. In more general applications, especially in vivo, there may be no such preferred direction. In such a case, multiple filters in orthogonal directions may be applied and processed separately, and the results combined, weighted by the relative energy in each filtered data set at each pixel. Fig. 5 confirms that this technique works well on the phantom, even though two of the four directions contain mostly noise. Additionally, one could ask how sensitive the results are to the filter orientations in preferred directions (e.g., the top-down direction in this example). Fig. 5 (right) confirms that the results are essentially isotropic with respect to the filter orientations.
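A sketch of this combination rule, weighting each directional reconstruction at every pixel by the local energy of its filtered displacement field; the array layout and names are assumptions.

```cpp
#include <vector>
#include <complex>

// Combine per-direction reconstructions mu[d][p], weighting each pixel by
// the local energy |u_d(p)|^2 of the corresponding filtered wave field.
std::vector<double> combine(
    const std::vector<std::vector<double>>& mu,
    const std::vector<std::vector<std::complex<double>>>& u)
{
  const std::size_t D = mu.size(), P = mu[0].size();
  std::vector<double> out(P, 0.0);
  for (std::size_t p = 0; p < P; ++p) {
    double wsum = 0.0;
    for (std::size_t d = 0; d < D; ++d) {
      const double w = std::norm(u[d][p]);  // |u|^2: energy of filtered data
      out[p] += w * mu[d][p];
      wsum += w;
    }
    if (wsum > 0.0) out[p] /= wsum;
  }
  return out;
}
```

Directions that contain mostly noise contribute little energy at a given pixel, which is why the combination remains robust even when two of the four filtered data sets are nearly empty.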
Fig. 5. Left: Magnitude image of phantom. Center: Weighted combination of direct inversion result from four orthogonal directional filters in the horizontal and vertical directions. Right: Combined direct inversion result with four orthogonal filters in the diagonal directions
A more complicated example is shown in Fig. 6. In this case the object is a breast phantom with a stiff inclusion near the center. The driving is upward from the bottom, in a longitudinal (not shear) manner. Mode conversion generates shear waves throughout the object, with significant displacement in all three directions in most parts of the object.
The DI reconstruction of shear modulus from data sensitized to z (out of plane) displacement alone is shown in the left panel. The reconstruction is much improved by taking the weighted combination of four reconstructions from four orthogonal directional filters (center panel). At right is the result from additionally combining appropriately weighted reconstructions of directionally filtered data sensitized to displacements in the x and y directions (this is now a weighted combination of 12 separate reconstructions). The inclusion is now very well depicted.
Fig. 6. Left: Direct inversion reconstruction of the breast phantom with a stiff inclusion near the center, from non-directionally filtered displacement data of motion in the z (out of plane) direction only. Center: Weighted combination of the direct inversion results from four orthogonal directional filters in the horizontal and vertical directions. Right: Combined direct inversion result with four orthogonal filters in the diagonal directions
The DI technique also yields attenuation information, but such results typically are noisy, with artificially enhanced boundaries between regions, where the local homogeneity assumption is violated [4]. If directional filtering is applied, such reconstructions in areas away from boundaries are noticeably smoothed. Although artifacts at region boundaries remain, quantitative measurements of attenuation may become possible in sufficiently homogeneous areas (Fig. 7).
Fig. 7. DI reconstructions of attenuation without (left) and with (right) directional filtering
4 Conclusion
Directional filtering appears to be a very promising pre-processing technique for MRE inversion. By decomposing the wave field into separate components, it reduces the effects of interfering waves, and significantly improves the quality of shear modulus reconstructions. Inversions with the direct inversion and phase gradient algorithms are particularly improved, as are the attenuation maps generated by direct inversion. Orthogonal sets of filters can be combined and appear to handle propagation in arbitrary directions in an isotropic way, obviating any need to specify preferred wave directions. The application of this technique to in vivo data is currently being studied.
References
1. Muthupillai, R., Lomas, D.J., Rossman, P.J., Greenleaf, J.F., Manduca, A., Ehman, R.L.: Magnetic Resonance Elastography by Direct Visualization of Propagating Acoustic Strain Waves. Science 269 (1995) 1854-1857
2. Manduca, A., Oliphant, T.E., Dresner, M.A., Mahowald, J.L., Kruse, S.A., et al.: Magnetic Resonance Elastography: Non-invasive Mapping of Tissue Elasticity. Medical Image Analysis 5 (1997) 237-254
3. Knutsson, H., Westin, C.F., Granlund, G.: Local Multiscale Frequency and Bandwidth Estimation. In: Proceedings of IEEE Intl Conf on Image Processing (1994) 36-40
4. Oliphant, T.E., Manduca, A., Ehman, R.L., Greenleaf, J.F.: Complex-valued Stiffness Reconstruction for Magnetic Resonance Elastography. Magnetic Resonance in Medicine 45 (2001) 299-310
RBF-Based Representation of Volumetric Data: Application in Visualization and Segmentation
Yoshitaka Masutani
Image Computing and Analysis Laboratory, Department of Radiology, University of Tokyo (UT-RAD/ICAL), 7-3-1 Hongo Bunkyo-ku Tokyo 113-8655 Japan
tel. +81-3-5800-8666 ex. 37418; fax. +81-3-5800-8935
masutani-utrad@umin.ac.jp
Abstract. A new scheme of data-driven segmentation is proposed, which is based on detection of the object boundary and on volumetric pattern reconstruction as an implicit function, by using the detected object boundary and radial basis functions (RBF). Using clinical X-ray CT data, applications to visualization of the pancreatic duct by MINIP of a curved thin slab and to liver segmentation are shown.
1 Introduction
Segmentation procedures are required in many fields of medical imaging. For example, in computer-assisted detection of lung nodules, segmentation of the lung is an important process for limiting the search area. In general, methods of segmentation are often categorized into two classes: data-driven and model-based (mostly deformable models). The former is conventionally employed in many applications, based on thresholding, connected component analysis, mathematical morphology, etc. On the other hand, the latter has been reported extensively and shown to be promising in several applications since SNAKES [Kass87] and other related methods based on deformable models were proposed. However, several limitations of deformable models are also found in cases where the shape variation of the segmentation target is large, for example in colon segmentation. For such cases, data-driven methods still have advantages and are employed [Masutani01]. However, in data-driven methods, it is hard to introduce a priori knowledge for eliminating misconnected objects. In this paper, a new segmentation method is proposed by extending the notion of radial basis functions (RBF), which were conventionally employed as surface shape interpolators/extrapolators. A new notion of non-boundary control points for representing a priori knowledge is also presented.
2 Methods

2.1 Basic Theory of Radial Basis Functions for Construction of Volumetric Pattern
Radial basis functions (RBF) have been employed as a representation method for solids and as surface interpolators/extrapolators since they were reported by several research groups [Savchenko95][Carr97][Turk99].
An interpolation/extrapolation process based on RBF is to construct an implicit function s(x) by using several pairs of control point coordinates x_i and function values s_i at x_i. In this section, the theory of the RBF is briefly described.

2.1.1 Construction of Volumetric Pattern as Implicit Function by Control Points
An approximated implicit function s(x) in 3D space (x ∈ R³) is expressed as

s(\mathbf{x}) = p(\mathbf{x}) + \sum_{i=1}^{N} \lambda_i \, \phi(\lVert \mathbf{x} - \mathbf{x}_i \rVert) \quad (1)

where x = (x, y, z)^T denotes the coordinates of an arbitrary position in 3D space, p(x) is a low-degree polynomial function, N is the number of control points, λ_i is the coefficient for control point x_i, φ(r) is a basic function, and ‖·‖ is the Euclidean norm in R³. A basic function φ(r) is normally chosen from the well-known families of smoothing spline functions, such as the thin-plate, the Gaussian, the multi-quadric, the bi-harmonic, the tri-harmonic, and so on. In this study, the bi-harmonic spline function, shown in equation (2), is employed for its energy minimization properties:

\phi(r) = r \quad (2)

In addition, the bi-harmonic spline function is one of the non-compactly supported functions, which are suitable for interpolation/extrapolation of irregular and non-uniform control points [Carr01]. For the bi-harmonic spline function as RBF, the polynomial function is
p(\mathbf{x}) = c_1 + c_2 x + c_3 y + c_4 z \quad (3)
Therefore, the approximation process amounts to computing the following values:

\Lambda = (\lambda_1, \ldots, \lambda_N)^T \quad (4)

\mathbf{c} = (c_1, c_2, c_3, c_4)^T \quad (5)
However, it is clear that four degrees of freedom are still left in determining all these values by using only equation (1) and the N pairs of control point coordinates x and function values s. To guarantee that s(x) belongs to the Beppo-Levi space on R³, additional orthogonality conditions are required as follows:

\sum_{i=1}^{N} \lambda_i = \sum_{i=1}^{N} \lambda_i x_i = \sum_{i=1}^{N} \lambda_i y_i = \sum_{i=1}^{N} \lambda_i z_i = 0 \quad (6)
Finally, in the case the bi-harmonic spline is chosen as RBF, the computing process is summarized as solving the linear equations:

\begin{pmatrix} A & P \\ P^T & 0 \end{pmatrix} \begin{pmatrix} \Lambda \\ \mathbf{c} \end{pmatrix} = \begin{pmatrix} S \\ 0 \end{pmatrix} \quad (7)

where

A_{i,j} = \phi(\lVert \mathbf{x}_i - \mathbf{x}_j \rVert) = \lVert \mathbf{x}_i - \mathbf{x}_j \rVert, \qquad S = (s_1, \ldots, s_N)^T, \qquad P = \begin{pmatrix} 1 & x_1 & y_1 & z_1 \\ \vdots & \vdots & \vdots & \vdots \\ 1 & x_N & y_N & z_N \end{pmatrix}

Then, a volume data set is simply reconstructed by determining the voxel value at x as s(x).

2.1.2 Configuration of Control Points with Boundary Points and Normals
In constructing an implicit function representing an object shape that has a complex boundary, normal vectors at the boundary points are often necessary for stable construction of the function [Carr01]. As shown in Fig. 1, in such cases, three control points with function values (x, s) = (x, 0), (x+n, 1), (x−n, −1) are generated for a boundary point x with its normal n(x).
Fig. 1. Three Control Points Generated For A Boundary Point
2.2 Application in Visualization by Forming Curved Thin Slab
A simple application of RBF is the determination of a curved thin slab from sparsely distributed control points with normals. Fig. 2 shows an example of visualization of the pancreatic duct by minimum intensity projection. In this example, seven boundary points were interactively set on the pancreatic duct, and all the normals of the boundary points were set to the normal of the coronal plane. After computing the implicit function s(x), the entire volume was cleared except for the voxels for which s(x) is nearly zero, i.e. close to the boundary.

2.3 Application in Segmentation of Objects in Volumetric Data
By using RBF, segmentation of objects in volumetric data is regarded as construction of an implicit function whose value is 0 on the object boundary, positive inside the boundary, and negative outside the boundary. In other words, the RBF-based segmentation proposed in this paper amounts to defining boundary points of the object with their normals.

2.3.1 Segmentation by Defining Boundary Points with Normals
Unlike the nodes of SNAKES [Kass87], the boundary points are not required to be structured, because no structural or ordering information is utilized in the construction of the implicit function s(x). Therefore, by using RBF approximation, interactive definition of unstructured boundary points leads to interactive segmentation with a simple interface, and automated detection of boundary points leads to fully automated segmentation. In the automated boundary point definition, gradient vectors are used as normals at the boundary.
Fig. 2. Curved Thin Slab by RBF. Left: an axial slice of the reconstructed volume. Right: visualization of the main pancreatic duct by MINIP of the thin slab (arrows)
2.3.2 Introducing a Priori Knowledge by Non-boundary Control Points
In the RBF-based segmentation proposed in this paper, a priori knowledge is introduced by defining non-boundary control points, which are employed to suppress function values at the positions where the target object does not exist, as shown in Fig. 3. For example, in segmentation of the liver, points on other structures such as the body surface can be employed.
Fig. 3. Suppressing Function Value by Non-Boundary Control Points
3 Results

3.1 Semi-automated Liver Segmentation
Fig. 4 shows a result of liver segmentation based on detection of 605 boundary points. Because the details of the boundary point detection algorithm are out of the scope of this paper, the process is briefly summarized by the following steps: (1) pre-segmentation of the body surface, (2) initial segmentation of the liver by thresholding, (3) collecting boundary points with feature values, (4) classification of boundary points and elimination of false boundary points, if any, (5) generating control points by using the selected boundary points with their normals, (6) placing non-boundary control points on the body surface, and (7) construction of the implicit function by the control points.
Fig. 4. A Segmentation Result by Boundary Point Detection and RBF Reconstruction. Top left: detected boundary points. Top right: enlargement of top left; the cones represent boundary points and their normal directions. Bottom left: reconstructed volume data from the boundary points (only four slices shown). Bottom right: reconstructed volume data visualized by volume rendering (matrix size: 128x128x80)
3.2 Reducing Number of Boundary Points
An important question in the proposed segmentation method is the number of boundary points employed. Naturally, more boundary points yield finer structures of the shapes and require a higher computational cost. The optimal number of boundary points for object segmentation depends on the complexity of the object surface. Fig. 5 shows a result of an experiment of boundary point reduction for the liver shape segmented in the previous section. For several reduced numbers of boundary points, the percentages of true positive (TP), false positive (FP), and false negative (FN) voxels relative to the original liver volume were evaluated. Volume reconstruction using only 605 points covers 97 percent of the original liver volume segmented with over 20,000 boundary points.
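A sketch of this volumetric evaluation, comparing a reduced-point reconstruction against the reference segmentation as binary volumes; the names and layout are assumptions.

```cpp
#include <vector>
#include <cstdio>

// Volumetric overlap of a test segmentation against a reference, expressed
// as TP/FP/FN fractions of the reference volume (as in Fig. 5b).
void overlap_fractions(const std::vector<bool>& ref, const std::vector<bool>& test)
{
  std::size_t tp = 0, fp = 0, fn = 0, refvol = 0;
  for (std::size_t i = 0; i < ref.size(); ++i) {
    if (ref[i]) ++refvol;
    if (ref[i] && test[i]) ++tp;        // inside both volumes
    else if (!ref[i] && test[i]) ++fp;  // falsely included
    else if (ref[i] && !test[i]) ++fn;  // falsely excluded
  }
  std::printf("TP %.1f%%  FP %.1f%%  FN %.1f%%\n",
              100.0 * tp / refvol, 100.0 * fp / refvol, 100.0 * fn / refvol);
}
```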
Fig. 5. Segmentation Error by Reducing Boundary Points. (a) Segmentation results with reduced boundary points; the number of boundary points N=24,520 is the original; results are visualized by isosurfaces. (b) Segmentation error as volumetric fractions of the true positive, false positive, and false negative voxels
4 Discussion and Summary
A new scheme of segmentation in volumetric data was proposed based on radial basis function reconstruction. It is interesting to compare the proposed method with methods based on deformable models, as shown in Table 1. Because the proposed method is driven by the data of detected boundary points, no fitting process is required. In a segmentation process, just as some nodes of deformable models often miss the boundary due to unsharp edges, boundary point detection is not guaranteed to be successful in the proposed RBF-based segmentation method. In such cases of missing partial boundaries, deformable models manage to interpolate the missing part based on an internal energy that smooths the boundary, or the boundary is guaranteed to be smooth by defining the entire shape parametrically in the frequency domain [Szekely96][Chuang96]. RBF also guarantees the smoothness of the resulting implicit function. In general, such pre-guaranteed smoothness of objects and the initial shape of the models are kinds of a priori or anatomical standard information, which helps segmentation. In the proposed method, such a priori knowledge is represented more moderately, as a constraint that suppresses implicit function values. Apparently, the key to fully automated segmentation is methods for fully automated boundary point detection or for fully automated elimination of false boundary points. As shown in Fig. 6, false boundary points yield isolated regions when they are apart from the true boundary. In such cases, it is not hard to remove the false regions. An important feature of the proposed method is that the control points are not structured, unlike the point distribution model (PDM) [Cootes95]. However, the notion of points with explicit features (e.g. corners) might help more accurate boundary point detection. Currently, improvement of the boundary point detection based on statistical pattern analysis is in progress.

Table 1. Comparison of Proposed Method and Deformable Models

                       Proposed Method                          Deformable models
Category               Data-driven                              Model-based
Optimization process   -                                        Required for fitting
A priori knowledge     Features of BP, non-boundary CP          Initial shape, parameter range
Points/Nodes           Independent                              Structured as contiguous model
Typical errors         Isolated artifact region by false BP,    Fitting to other structures,
                       missing BP                               underestimation of structures
                                                                (missing corners, etc.)
Contiguity             No guarantee for contiguity              Compact as a model
Result                 Implicit function (volumetric pattern)   Geometric model
Variation              -                                        Parametric model (Fourier, wavelet),
                                                                point distribution model

BP: boundary points, CP: control points
Fig. 6. An Example of the Effect of False Boundary Points. Left: a slice including an isolated region. Right: an isolated region apart from the liver (normals are also shown by cones)
Acknowledgements
The author is grateful to the research groups in the laboratories of Associate Professor Hiromasa Suzuki and Professor Fumihiko Kimura in the School of Engineering, the University of Tokyo, for the encouraging discussions on computer graphics and modeling. The author also appreciates Dr. Shigeru Kiryuu for providing the clinical pancreas CT data and for advice on pancreas visualization. The clinical CT data of the liver were distributed by the Japanese Society of Computer-Assisted Diagnosis in Medicine (CADM) by courtesy of Dr. Shigeru Nawano.
References
[Kass87] Kass, M., et al.: Active contour models. International Journal of Computer Vision, pp. 321-331, 1987.
[Masutani01] Masutani, Y., et al.: Automated Segmentation of Colonic Walls for Computerized Detection of Polyps in CT Colonography. Journal of Computer-Assisted Tomography, vol. 25, no. 4, pp. 629-638, 2001.
[Savchenko95] Savchenko, V.V., et al.: Function Representation of Solids Reconstructed from Scattered Surface Points and Contours. Computer Graphics Forum, 14(4): 181-188, 1995.
[Carr97] Carr, C.J., Fright, W.R.: Surface Interpolation with Radial Basis Functions for Medical Imaging. IEEE Trans. on Med. Img., 16(1): 96-107, 1997.
[Turk99] Turk, G., O'Brien, J.: Shape Transformation using Variational Implicit Surfaces. ACM SIGGRAPH 1999, pp. 335-342, 1999.
[Carr01] Carr, C.J., et al.: Reconstruction and Representation of 3D Objects with Radial Basis Functions. ACM SIGGRAPH 2001, pp. 67-76, 2001.
[Szekely96] Szekely, G., et al.: Segmentation of 2-D and 3-D objects from MRI volume data using constrained elastic deformations of flexible Fourier contour and surface models. Medical Image Analysis, vol. 1, no. 1, pp. 19-34, 1996.
[Chuang96] Chuang, G.C.H., et al.: Wavelet Descriptor of Planar Curves: Theory and Applications. IEEE Trans. on Image Proc., vol. 5, no. 1, pp. 56-70, 1996.
[Cootes95] Cootes, T.F., et al.: Active Shape Models - their training and applications. Computer Vision and Image Understanding, 61(2), January 1995.
An Anatomical Model of the Knee Joint Obtained by Computer Dissection S. Martelli1, F. Acquaroli1, V. Pinskerova2, A. Spettol1, and A. Visani1 1
Istituti Ortopedici Rizzoli, Lab. Biomechanics, Bologna, Italy
2 Charles University, Prague, Czech Republic
smartelli@biomec.ior.it
Abstract. This paper reports the analysis of the articular surfaces of the femur and the tibia in normal knees. Six cadaveric joints were digitized with a FARO Arm electrogoniometer and elaborated off-line by fitting the profiles of multiplanar sections with least-square curves. We found that: the femoral medial condyle can be represented by an ellipsoid with its main axis in the AP direction, semi-axes equal to 30 and 23 ± 2 mm (roughly spherical posteriorly) and a circular ML section (radius = 20 ± 2 mm); the tibial medial plateau can be approximated by a semi-cylinder with its main axis in the AP direction and a circular ML section (radius = 22 ± 1.3 mm); the femoral lateral condyle can be represented by an ellipsoid with semi-axes equal to 26 and 20 ± 4 mm and an elliptic ML section with a sloped major axis parallel to the tibial spine; the articulating tibial surface in the lateral compartment is approximately flat.
Keywords: femur, tibia, knee, anatomy, electrogoniometer.
1 Introduction
Computer techniques and new electronic equipment are progressively changing medical investigation and treatment, but have seldom been used for accurate anatomical investigations [1], [2], [3]. In particular, anatomical studies of the knee could benefit from improved accurate knowledge of its structures, as controversial opinions still exist on the geometry of the knee joint and little quantitative information is available in 3D. The femoral condyles in the sagittal plane have been described as spirals [4], [5], [6], [7] or circles, at least posteriorly [4], [5], [8], [9], [10], [11], [12], [13], [14], [15]. The description of the femoral shape in the frontal plane and in 3D is scarcer. To the authors' knowledge the frontal sections of the femur were measured only by Wismans [16] and by Kurosawa [8] in 3D studies, and the surfaces were fitted respectively with a spline of degree 4 or a sphere. The description of the tibial anatomy in the literature is almost always qualitative and in any case very limited [17], [6], [11], [18], [19]. It was approximated by one or more planes in [15], [9], [6], [4], or by a complex parametric surface in [16], [20].
In this paper we report an original anatomical study of six human specimens, based on the use of an accurate digitizer, off-line computer elaboration of the data, least-square fitting of sections, and statistical evaluation. The goal of this study is to update the classical anatomical data, in order to complement present knowledge and provide reliable numerical data for 3D modeling and computer-assisted reconstruction. This work integrates a previous work by two of the authors using a different numerical approach [21].
2 Material and Methods
Six normal knee specimens were digitized using the FARO Arm electrogoniometer (FARO Technologies, Lake Mary, FL, USA; 0.3 mm / 0.3° accuracy), obtaining 6634 points on average on the femur and 3335 points on average on the tibia. For display purposes, the configuration of the joint at extension and at 10°, 30°, 60°, 90° and 120° of flexion was recorded via pins implanted in the bones. An anatomical joint reference frame was established according to the clinical conventions, choosing the vertical axis along the tibial anatomical axis and the perpendicular anterior-posterior direction. The knee anatomy was studied in the 3D computer reconstruction and in 2D sections of different orientations. In particular, this study consists of a statistical evaluation of the shape of the femoral and tibial articulating surfaces in consecutive 2 mm sagittal (30 per knee on average), frontal (17 per knee on average) and coronal sections (9 per knee). The 2D profile of the articulating area was fitted by least-square circles, ellipses and lines to find the best approximation (Figure 1).
Fig. 1. Software elaboration of knee data: 3D reconstruction (left) and sagittal sections (right)
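For reference, the least-square circle fitting of a 2D section profile can be done with the algebraic (Kasa) formulation, which reduces to a 3x3 linear system; the sketch below is one standard way to implement it and is not necessarily the authors' method.

```cpp
#include <vector>
#include <cmath>

struct Pt { double x, y; };
struct Circle { double cx, cy, r; };

// Algebraic (Kasa) least-squares circle fit: minimize
// sum (x^2 + y^2 + D*x + E*y + F)^2 over D, E, F, then
// center = (-D/2, -E/2), r = sqrt(D^2/4 + E^2/4 - F).
Circle fit_circle(const std::vector<Pt>& pts)
{
  double Sx = 0, Sy = 0, Sxx = 0, Syy = 0, Sxy = 0, Sxz = 0, Syz = 0, Sz = 0;
  const double n = (double)pts.size();
  for (const Pt& p : pts) {
    const double z = p.x * p.x + p.y * p.y;
    Sx += p.x;  Sy += p.y;  Sz += z;
    Sxx += p.x * p.x;  Syy += p.y * p.y;  Sxy += p.x * p.y;
    Sxz += p.x * z;    Syz += p.y * z;
  }
  // Normal equations for [D E F]^T, solved by Cramer's rule (3x3 system).
  double a[3][3] = {{Sxx, Sxy, Sx}, {Sxy, Syy, Sy}, {Sx, Sy, n}};
  double b[3] = {-Sxz, -Syz, -Sz};
  auto det3 = [](double m[3][3]) {
    return m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
         - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
         + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]);
  };
  const double det = det3(a);
  double w[3];
  for (int c = 0; c < 3; ++c) {
    double m[3][3];
    for (int i = 0; i < 3; ++i)
      for (int j = 0; j < 3; ++j) m[i][j] = (j == c) ? b[i] : a[i][j];
    w[c] = det3(m) / det;
  }
  const double cx = -w[0] / 2.0, cy = -w[1] / 2.0;
  return { cx, cy, std::sqrt(cx * cx + cy * cy - w[2]) };
}
```

Ellipse and line fits can be set up the same way as linear least-squares problems; the residual of each fit then selects the best approximating curve per section.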
3 Results

3.1 Femoral Shape in Sagittal Section
In the 15 mm central part of the medial condyle the whole profile was fitted (residual < 0.3 mm) by an ellipse with mean e = 0.62 ± 0.08 and semi-axes R1 = 30.4 ± 1.9 mm, R2 = 23.18 ± 1.9 mm. The ellipse also fitted the whole patello-femoral joint anteriorly in all knees and, posteriorly, an arc often going beyond the contact point between tibia and femur at 120° (but not including the final 5-10 mm arc), except in a few sections of three knees. Table 1 reports the mean eccentricity and axes of the least-square ellipses fitting the parallel sagittal profiles of each knee, over a range of equivalent variable arcs and sections, with residuals < 0.4 mm. The sagittal profile of this part of the medial condyle could also be interpolated by two different circles within the acquisition accuracy (i.e. residuals < 0.4 mm), one posterior and one fitting a smaller anterior arc. The identification of the posterior circle was quite stable from a numerical point of view, so that the variability of possible circles fitting arcs > 60° with residuals < 0.4 mm was limited in each section and also in adjacent sections. However, the least-square fitting of the remaining anterior part of the tibio-femoral joint was numerically uncertain, as it was affected by a very high variability due to the small size of the fitted arc (usually < 40°) and significant individual differences (e.g. knees 1 and 2). Details of this fitting and the discussion of its implications are reported in a different work [21]. In the 13 mm central part of the lateral condyle, and around the section of maximal curvature, the articulating profile could be fitted by an ellipse with mean e = 0.48 ± 0.05 and semi-axes R1 = 26.4 ± 3.5 mm, R2 = 20.4 ± 4.4 mm (Table 2). The ellipse fitted a large arc, extending from the posterior contact in full flexion to a large part of the patello-femoral joint, except in one knee in which the end of the tibio-femoral joint was clearly marked by an evident wedge. In 2 knees, and in single sections of the other specimens, the lateral profile clearly showed a similar "dent", even if smaller and below the fitting resolution. The sagittal profile of this part of the lateral condyle could also be interpolated by circles. Since this compartment is more "round" (lower mean eccentricity) than the medial one, least-square fitting of almost the whole profile was possible with a single circle, and the small remaining part could be described as a line in one knee or as an arc of a larger circle, which was numerically undefined (the mean arc was 30°) [21].

3.2 Tibial Shape in Sagittal Section
The central concavity of the medial plateau could be very clearly fitted by an arc of a circle in 2 knees; in most knees it could be fitted equivalently by an arc of a circle or by two lines making a large angle, degenerating into a single flat line in several sections [21]. The circle fitting the medial sagittal tibial plateau in this central articular region had a mean radius R = 75 ± 18 mm (mean over the sections where this fitting could be computed). However, the small size of the tibial sagittal profile made the numerical interpolation more unstable than in the femoral sagittal sections (higher standard deviation in the same knee) and less reliable (e.g. even in the same section each datum has more than 5 mm error on R for a 35° arc). The shape of the lateral plateau in sagittal sections was quite regular, because it basically stayed convex in all sections, gradually changing from the sharp edge of the tibial eminence to a circular convex profile with radius 36.02 ± 11.3 mm, over an arc of 69.44° ± 18.1°.

Table 1. Fitting of the medial femoral sagittal profile with an ellipse: mean eccentricity, major semi-axis (R1) and minor semi-axis (R2) ± standard deviation of the fitting in parallel sections

Knee   Eccentricity   R1 (mm)      R2 (mm)
1      0.68 ± 0.08    31.3 ± 3.0   22.4 ± 0.7
2      0.67 ± 0.02    30.3 ± 0.6   22.4 ± 0.6
3      0.59 ± 0.06    33.8 ± 1.0   27.2 ± 1.0
4      0.54 ± 0.05    27.9 ± 1.9   23.3 ± 1.2
5      0.64 ± 0.03    28.6 ± 1.8   21.9 ± 0.8
6      0.62 ± 0.04    30.4 ± 2.0   21.9 ± 1.9
Mean   0.62 ± 0.05    30.4 ± 1.9   23.2 ± 2.0
Table 2. Fitting of the lateral femoral sagittal profile with an ellipse: mean eccentricity, major semi-axis (R1) and minor semi-axis (R2) ± standard deviation of the fitting in parallel sections

Knee   Eccentricity   R1 (mm)      R2 (mm)
1      0.58 ± 0.02    27.3 ± 1.2   22.6 ± 1.5
2      0.57 ± 0.03    26.3 ± 2.3   21.4 ± 2.0
3      0.57 ± 0.03    30.1 ± 0.9   25.1 ± 0.5
4      0.65 ± 0.07    31.6 ± 3.8   13.5 ± 1.17
5      0.52 ± 0.02    22.7 ± 1.4   19.4 ± 1.5
6      0.0 ± 0.02     20.5 ± 0.8   20.5 ± 0.8
Mean   0.48 ± 0.05    26.4 ± 3.5   20.4 ± 4.4

3.3 Femoral Shape in Frontal Section
The articular surface of the medial condyle appeared circular, or occasionally elliptic with a low eccentricity. A circle with average radius R = 20.3 ± 2.4 mm fitted the whole condyle in all knees. The lateral condyle appeared almost flat near the contact with the tibial spine, and could be fitted by an ellipse with e = 0.81 ± 0.07, R1 = 16.5 ± 2.12 mm, R2 = 9.3 ± 1.3 mm and the longer axis parallel to the lateral wall of the tibial spine, making an average angle of -17.1° ± 2.5° with our horizontal plane. The part of the profile in contact with the tibia and near the notch could also be fitted by a line with a similar slope, and in one knee individual deformation of the bone or cartilage made this fitting clearly better than the elliptic interpolation, although numerically unstable [21]. The posterior frontal sections of the femoral condyles (in the extension position) became more and more circular in both compartments. In the posterior part of the femur the frontal sections of both compartments became very similar, and the profiles could be well fitted by circles with decreasing radius (maximum R = 19.4 ± 1.1 mm).
3.4 Tibial Shape in Frontal Section
In the central part of the tibial plateau the frontal profile was very typical, clearly circular in all knees and conforming to the femoral profile (average radius = 21.9 ± 1.3 mm over an arc of 69.2° ± 14°, including the medial side of the tibial spine and excluding the curved medial extremity). The lateral tibial plateau was more "flat" than the medial one, with small anterior and posterior flat portions and a wedge between the tibial spine and the flat plateau. The frontal profile was fitted by two lines making an average angle of 167°. The line fitting the lateral wall of the tibial plateau (average slope -21.3° ± 3.8°) was roughly parallel to the ellipse fitting the femoral lateral condyle or to the line describing its medial portion.

3.5 Joint Shape in Coronal Section
The profile of the femoral condyles varied as the coronal sections moved proximally. Both medial and lateral condyles appeared "round" posteriorly, with increasing radius, becoming maximal and stable in the area corresponding roughly to the contact area between tibia and femur at 90°. In this region (around 25 mm over the tibial plateau ± 4 mm) selected coronal sections were examined to describe the posterior curvature of the femoral condyles. The profile was well fitted by a circle in both compartments, but the lateral side was bigger (19.8 ± 2.5 mm average radius) than the medial one (17.3 ± 1.8 mm average radius). The coronal sections of the tibial plateau did not provide consistent profiles, because of the almost horizontal distribution of data. However, the coronal sections of the tibial spine showed a repeatable and interesting pattern in the articular region of the two compartments, of height 4 mm on average. The medial compartment had a linear profile, while the lateral one was C-shaped, even if it was not fitted satisfactorily by circles or ellipses.
4 Discussion and Conclusions
We have presented an original method to study the knee anatomy, which helps to understand the 3D shape of the femoral and tibial articulating surfaces more precisely, because it is based on multiple-section analysis (compared to single sections/slices [15], [22]) and on acquisition equipment more accurate than X-rays [8, 11, 12] or manual measurements on dissected joints [17], [19], and similar to recent methods for kinematic experimental studies [23], [24], [25], [26], [27]. Moreover, the numerical description of the tibial shape is an original result of the present study and can hardly be compared with previous qualitative results. The reported analysis of the tibio-femoral joint suggests a new representation of the bone segments, compatible with some of the previous anatomical models proposed in the literature [8], [10], [21], [11], [13], [14], [15], but providing a more global description in all views and for both bone segments. We propose a model of the tibio-femoral joint made of two rigidly connected ellipsoids with their main axes in the AP direction (medial semi-axes = 30 and 23 ± 2 mm; lateral semi-axes = 26 and 20 ± 4 mm), with the ML section circular on the medial side (R = 20 ± 2 mm) and elliptic on the lateral side (semi-axes = 17 and 9 ± 2 mm; slope = 160°). It articulates on a surface which is a cylinder with its axis in the AP direction and a circular section on the medial side (conformal to the femoral shape), and a flat wedge on the lateral side, representing the conic or flat tibial eminence and the central part of the tibial plateau (Figure 2). This new description of the anatomy of the normal knee suggests that new asymmetric designs of artificial knees and non-circular profiles might improve the outcomes of knee reconstruction, reproducing the balance of the natural joint. Moreover, an elliptic model of the femoral condyles may explain the apparently different patterns of motion in the first 20-30° of flexion and in the remaining range of flexion as the consequence of a continuous change in the curvature of the articulating surface, roughly from the bigger to the smaller ellipse radius.
Fig. 2. Anatomical model of the knee joint obtained by computer dissection (medial compartment on the left, lateral on the right)
References
1. L. Blankevoort, R. Huiskes: Validation of a three-dimensional model of the knee. J. Biomech., 29(7) (1996) 955-961.
2. H. Kurosawa, P.S. Walker, S. Abe, A. Garg, T. Hunter: Geometry and motion of the knee for implant and orthotic design. J. Biomech., 18(7) (1985) 487-499.
3. D.R. Wilson, J.D. Feikes, J.J. O'Connor: Ligaments and articular contact guide passive knee flexion. J. Biomech., 31 (1998) 1127-1136.
4. W. Weber, E. Weber: Mechanics of the human walking apparatus (Springer-Verlag, Berlin 1992 - first published in German), On the Knee, 4 (1836) 75-92.
5. U. Rehder: Morphometrical studies on the symmetry of the human knee joint: femoral condyles. J. Biomech., 16(5) (1983) 351-361.
6. C.H. Barnett: A comparison of the human knee and avian ankle. J. Anat., 88 (1954) 59-70.
7. T. Röstlund, L. Carlsson, B. Albrektsson, T. Albrektsson: Morphometrical studies of human femoral condyles. J. Biomed. Eng., 11 (1989) 442-448.
8. H. Kurosawa, P.S. Walker, S. Abe, A. Garg, T. Hunter: Geometry and motion of the knee for implant and orthotic design. J. Biomech., 18(7) (1985) 487-499.
9. J.J. O'Connor, J.W. Goodfellow: The geometry of the knee in the sagittal plane. IMechE, Proc. Instn Engrs, 203 (1989) 223-233.
10. H. Albrecht: XXIII Contribution to the anatomy of the knee joint (translated from German by P. Maquet). Diss. Deutsche Zeits. Chirurgie III. (1876)
11. J.S. Mensch, C. Harlan, C. Amstutz: Knee morphology as a guide to knee replacement. Clinical Orthopaedics and Related Research, 112 (1975) 231-241.
12. M.J. Erkman, P.S. Walker: A study of knee geometry applied to the design of condylar prostheses. J. Biomech. Eng., 9 (1974) 14-17.
13. T. Röstlund, L. Carlsson, B. Albrektsson, T. Albrektsson: Morphometrical studies of human femoral condyles. J. Biomed. Eng., 11 (1989) 442-448.
14. M. Zoghi, M.S. Hefzy, K.C. Fu, W.T. Jackson: The three-dimensional morphometrical study of the distal human femur. IMechE, Proc. Instn Engrs, 206 (1992) 147-157.
15. V. Pinskerova, H. Iwaki, M. Freeman: The shape and relative movements of the femur and tibia in the unloaded cadaveric knee: a study using MRI as an anatomical tool. In J.N. Insall and W.N. Scott (eds.): Surgery of the Knee, 3rd edition, Saunders Inc., Philadelphia, USA, in press.
16. J. Wismans, F. Veldpaus, J. Janssen: A three-dimensional mathematical model of the knee joint. J. Biomech., 13 (1980) 677-685.
17. V. Pinskerova, P. Maquet, M.A.R. Freeman: Writing on the knee between 1836 and 1917. J. Bone and Joint Surg., 82-B(8) (2000) 1100-02.
18. G.S. Langa: Experimental observations and interpretations on the relationship between the morphology and function of the human knee joint. Acta Anat., 55 (1963) 16-38.
19. Y. Yoshioka, D.W. Siu, A. Scudamore, T.D.V. Cooke: Tibial anatomy and functional axes. J. Orthop. Res., 7(1) (1989) 132-137.
20. Y.Y. Dhaher, L.D. Scott, W.Z. Rymer: The use of basis functions in modelling joint articular surfaces: application to the knee joint. J. Biomech., 33 (2000) 901-907.
21. S. Martelli, V. Pinskerova: The shapes of the tibio-femoral articular surfaces. To appear in J. Bone and Joint Surg.
22. A.M. Hollister, S. Jatana, A.K. Singh, W.W. Sullivan, A.G. Lupichuck: The axes of rotation of the knee. Clinical Orthopaedics and Related Research, 290 (1993) 259-268.
23. T.P. Quinn, C.D. Mote: A six degree-of-freedom acoustic transducer for rotation and translation measurements across the knee. J. Biomed. Eng., 112 (1990) 371-378.
24. S. Martelli, S. Zaffagnini, B. Falcioni, M. Marcacci: Intraoperative kinematic protocol for knee joint evaluation. Computer Methods and Programs in Biomedicine, 62 (2000) 77-86.
25. R.J. Minns, W.K. Walsh, J.A. Clarket: Techniques for measuring the static and dynamic properties of the patella. J. Biomed. Eng., 11 (1989) 209-14.
26. H. Fujie, K. Mabuchi, S.L.-Y. Woo, G.A. Livesay, S. Arai, Y. Tsukamoto: The use of robotics technology to study human kinematics: a new methodology. J. Biomed. Eng., 115 (1993) 211-7.
27. L.J. Ruijven, M. Beek, E. Donker, T.M.G.J. van Eijden: The accuracy of joint surface models constructed from data obtained with an electromagnetic tracking device. J. Biomech., 33 (2000) 1023-8.
Models for Planning and Simulation in Computer Assisted Orthognatic Surgery Matthieu Chabanas1, Christophe Marecaux1,2, Yohan Payan1, and Franck Boutault2 1
TIMC/IMAG laboratory, UMR CNRS 5525, Université Joseph Fourier, Grenoble, France
Institut Albert Bonniot - 38706 La Tronche cedex, France
Matthieu.Chabanas@imag.fr
2 CCRAO Laboratory - Université Paul Sabatier - Toulouse, France
Service de Chirurgie Maxillo-faciale, Hôpital Purpan - 31059 Toulouse cedex, France
Abstract. Two aspects required to establish a planning in orthognatic surgery are addressed in this paper: first, a 3D cephalometric analysis, which is clinically essential for the therapeutic decision; then, an original method to build a biomechanical model of the patient's facial soft tissue, which provides an evaluation of the aesthetic outcomes of an intervention. Both points are developed within a clinical application context for computer-aided maxillofacial surgery.
1 Introduction
Orthognathic surgery attempts to establish normal aesthetic and functional anatomy for patients suffering from dentofacial disharmony. In this way, current surgery aims at normalizing the patient's dental occlusion, temporo-mandibular joint function and morphologic appearance by repositioning maxillary and mandibular skeletal osteotomized segments. Soft tissue changes are mainly an adaptation to these bone modifications. The cranio-maxillofacial surgeon determines an operative planning consisting of quantitative displacements of the skeletal segments (maxillary and/or mandibular) to estimate their normal position. The real clinical problems are to respect the temporo-mandibular joint functional anatomy and to predict the soft tissue changes. This last point is important, on the one hand, for the surgeon, as the final soft tissue facial appearance might modify the operative planning, and, on the other hand, for the patient, who expects a reliable prediction of his post-operative aesthetic appearance. In current practice, the orthognatic surgical planning involves several multimodal data: standard radiographies for bidimensional cephalometric analysis, plaster dental casts for orthodontic analysis, and photographs and clinical evaluation for anthropometric measurements. In comparison with a normative data set, and according to the orthodontic and cephalometric analysis, the surgeon simulates dental arch displacements on plaster casts to build resin splints as references for the different occlusion stages. These splints are used during surgery to guide the repositioning of the maxillary and mandibular osteotomies. No reliable per-operative measurement guarantees that the defined planning is respected.
This difficult and laborious process might be responsible for imprecision and requires strong experience. Medical imaging and computer-assisted surgical technologies may improve the current orthognatic protocol as an aid in diagnosis, surgical planning and surgical intervention. This work presents our experience in computer-assisted orthognatic surgery. First, we recall the sequence of a computer-aided cranio-maxillofacial protocol as defined in the literature, stressing the points that are still missing for a clinical application of these techniques. Then, we address the two points which are, in our opinion, the two main remaining problems: a 3D cephalometric analysis and the prediction of the post-operative facial soft tissue appearance.
2 Computer Assisted Cranio-maxillofacial Surgery
The different steps of a computer-aided protocol in cranio-maxillofacial surgery are well defined in the literature [1,2]. They can be summarised in 7 points:
1. CT data acquisition, with computer generated 3D surface reconstruction
2. Three-dimensional cephalometric analysis and dental occlusion analysis for clinical diagnosis and operative planning
3. Surgical simulation, including osteotomies and real time mobilisation of the bone segments with 6 degrees of freedom
4. Prediction of the facial soft tissue deformation according to the repositioning of the underlying bone structures
5. Validation of the surgical planning according to the soft tissue simulations
6. Data transfer to the operative room and per-operative computer aided navigation, to ensure the established planning is accurately respected
7. Evaluation of the surgical outcomes

Different stages of this computer-aided protocol for cranio-maxillofacial surgery have been addressed in the literature. Three-dimensional cephalometric analysis, despite being essential for the planning decision, has been studied very little so far. A previous work proposed by our group [3] was an extension of a 2D cephalometry (from Delaire) used for repositioning the osteotomized segments. However, cephalometric and orthodontic planning were made in the traditional way (on 2D standard teleradiography and plaster dental casts). The most interesting and original work was proposed by Treil [4]. He introduces a new cephalometry based on CT scanner imaging, anatomic landmarks and mathematical tools (maxillofacial frame and dental axes of inertia) for skeletal and dental analysis. However, in our point of view, this cephalometric analysis is not relevant for operative planning and computer guided surgery. Most of the existing works deal with the interactive simulation of a surgical procedure on 3D skull computer generated models [5,6]. Physical models were also developed to evaluate the aesthetic outcomes resulting from the underlying bone repositioning [6,7,8,9]. However, despite their evident scientific interest, most of these works cannot be used in clinical practice, since the bone simulations are not clinically relevant.
Few works exist on per-operative guidance in cranio-maxillofacial surgery [10,11,12], and to our knowledge none of the working groups considers the whole computer-aided sequence. Our group strives to develop a facial skeleton model for cephalometry and osteotomy simulation, and a finite element model of the facial soft tissues for the simulation of postoperative aesthetic appearance. Moreover, we have already addressed bone segmentation, mobilisation and guidance for orthognathic surgery in previous works [10,3].
3 3D Cephalometry: A Morphometric Analysis
A complete computer-aided cranio-maxillofacial surgery sequence requires a bone skull model that enables the medical diagnosis, supports the simulation of surgical bone osteotomies, integrates the prediction of postoperative facial soft tissue deformation and can be used as an interface in computer-guided surgery. To be accepted by the medical community, this model must be coherent from an anatomical, physiological and organogenetic point of view. A 3D cephalometric tool is acknowledged as a useful aid to diagnosis [1,3,6]. 3D CT imaging is already routinely used to apprehend the difficult three-dimensional aspect of this pathology. However, there is no relevant method for directly analysing these images in three dimensions. A reliable cephalometry requires defining a referential for facial skeleton orientation, used for the reproducibility of intra- and inter-patient measurements and for the quantification of bone displacements, and a facial morphologic analysis for treatment planning decisions in comparison with a norm determined as an "equilibrated" face. This model should be able to be segmented for simulation, as in a surgical procedure. The finite element facial soft tissue model described in section 4 should also be integrated.
3.1 Referential Definition
We propose an invariant, reproducible, orthogonal referential defined by 3 planes (figure 1): a horizontal plane, close to the craniobasal planes of previous 2D cephalometries and to the horizontal vestibular plane defined as the craniofacial physiologic plane. Its construction uses reliable anatomic landmarks: the heads of the right and left mallei and the middle point between both supraorbital foramina. The medial sagittal and frontal planes are orthogonal to the horizontal plane and contain the middle point between both malleus heads. As defined, this referential is independent of the analysed, operated facial skeleton. The x, y and z coordinates of each voxel are transferred from the original CT scanner referential to this new referential. These normalised coordinates allow the comparison of locations or measurements between two patients, or within the same patient across time.
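As an illustration, the following sketch (a minimal Python/NumPy example; the landmark coordinates and helper names are ours, not the authors') builds an orthonormal referential from the three landmarks described above and re-expresses voxel coordinates in it.

```python
import numpy as np

def build_referential(mallei_right, mallei_left, supraorbital_mid):
    """Build an orthonormal frame from three anatomic landmarks.

    The horizontal plane contains both malleus heads and the midpoint
    between the supraorbital foramina; the sagittal and frontal planes
    are orthogonal to it and pass through the inter-mallei midpoint.
    """
    origin = 0.5 * (mallei_right + mallei_left)   # inter-mallei midpoint
    x_axis = mallei_left - mallei_right           # lateral direction
    x_axis /= np.linalg.norm(x_axis)
    forward = supraorbital_mid - origin           # towards the face
    # Normal of the horizontal plane (spanned by x_axis and forward):
    z_axis = np.cross(x_axis, forward)
    z_axis /= np.linalg.norm(z_axis)
    y_axis = np.cross(z_axis, x_axis)             # antero-posterior axis
    R = np.vstack([x_axis, y_axis, z_axis])       # rows = new axes
    return R, origin

def to_referential(points_ct, R, origin):
    """Map points from CT-scanner coordinates to the cephalometric frame."""
    return (points_ct - origin) @ R.T

# Hypothetical landmark coordinates (mm, CT frame), for illustration only:
R, o = build_referential(np.array([40.0, 80.0, 30.0]),
                         np.array([-40.0, 80.0, 30.0]),
                         np.array([0.0, 160.0, 55.0]))
print(to_referential(np.array([[10.0, 120.0, 40.0]]), R, o))
```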
3.2 Maxillofacial Framework for Skull Analysis
The cephalometry definition requires both a maxillofacial frame for morphologic analysis and a norm, quantitative or qualitative, defined as an ideal for a pleasant, equilibrated face. The operative planning is defined by the differences between the current patient state and this norm. We propose a maxillofacial frame (figure 1) composed of 15 reliable anatomic landmarks and 9 surfaces [13]. Mathematical tools allow metric, angular and surface measurements. Contrary to traditional 2D cephalometry, these are direct values and not measurements between points projected and constructed on a sagittal radiograph.
Fig. 1. The craniofacial referential, and a 3D analysis example
4 Finite Element Model of the Face Soft Tissue

4.1 Methodology
Different face models have been developed for simulating maxillofacial surgery outcomes. Although the first ones were based on discrete mass-spring structures [7], most of them use the Finite Element Method to solve the mechanical equations describing soft tissue behavior [6,8,9]. These models are based on a 3D mesh generated from patient CT images using automatic meshing methods. Such algorithms are not straightforward in this case, as the boundary of the facial soft tissue, i.e. the skin and skull surfaces, must be semi-automatically segmented, which is time-consuming and cannot be used in clinical routine. Moreover, these meshes are composed of tetrahedral elements, which are less efficient than hexahedral ones in terms of accuracy and convergence. Our methodology consists, first, in manually building one "generic" model of the face, integrating skin layers and muscles. Then, the mesh of this generic model is conformed to each patient's morphology, using an elastic registration method and patient data segmented from CT images.
The automatically generated patient mesh then has to be regularized in order to perform Finite Element analysis.

4.2 Patient Mesh Generation
A volumetric mesh was manually designed, representing the soft tissue of a "standard" human face [14]. It is composed of two layers of hexahedral elements representing the dermis and hypodermis (figure 2). Elements are organized within the mesh so that the main muscles responsible for facial mimics are clearly identified.
Fig. 2. The generic 3D mesh, with embedded main facial muscles
The generic mesh is adapted to each patient's morphology using the Mesh-Matching algorithm [15]. This method, based on the Octree Spline elastic registration algorithm [16], computes a non-rigid transformation between two 3D surfaces. The external skin and skull surfaces of the patient are automatically built out of CT images [17]. Then, the patient mesh is generated in two steps (figure 3), as sketched below:
1. An elastic transformation is computed to fit the external nodes of the generic model to the patient skin surface, then applied to all the nodes of the mesh.
2. Another transformation is then calculated between the internal nodes of the mesh and the patient skull surface. This second transformation is applied to the non-fixed internal nodes, i.e. those not located in the lips and cheeks area.
A mesh conformed to the specific patient morphology is then available, still integrating the skin and muscle structures. Since the nodes of the mesh are displaced during the registration, some elements can be geometrically distorted. If an element is too distorted, the "shape function" that maps it to the reference element in the Finite Element Method cannot be calculated. An automatic algorithm was therefore developed to correct these mesh irregularities, by slightly displacing some nodes until every element is regular [18]. A regularized patient mesh is thereby obtained, which can be used for Finite Element analysis.
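The following sketch illustrates the two-step conformation in simplified form. It is an assumption-laden toy version: the true method uses the Octree Spline transform [16], whereas here a generic smooth fit is stood in for by scipy's RBF interpolator, and all names are ours.

```python
import numpy as np
from scipy.interpolate import RBFInterpolator

def fit_elastic(src_pts, dst_pts):
    """Return a smooth R^3 -> R^3 mapping taking src_pts onto dst_pts.
    Stand-in for the Octree Spline registration of the actual method;
    assumes point correspondences are already established (the real
    algorithm estimates them)."""
    return RBFInterpolator(src_pts, dst_pts, kernel="thin_plate_spline")

def conform_mesh(nodes, ext_idx, int_idx, free_int_idx,
                 skin_surface_pts, skull_surface_pts):
    """Two-step, Mesh-Matching-like conformation of a generic mesh.

    nodes:            (n, 3) generic mesh node coordinates
    ext_idx/int_idx:  indices of external / internal nodes
    free_int_idx:     internal nodes allowed to move (not lips/cheeks)
    """
    nodes = nodes.copy()
    # Step 1: fit external nodes to the patient skin surface,
    # then apply the transform to ALL nodes of the mesh.
    t_skin = fit_elastic(nodes[ext_idx], skin_surface_pts)
    nodes = t_skin(nodes)
    # Step 2: fit internal nodes to the patient skull surface,
    # applied only to the non-fixed internal nodes.
    t_skull = fit_elastic(nodes[int_idx], skull_surface_pts)
    nodes[free_int_idx] = t_skull(nodes[free_int_idx])
    return nodes
```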
Mechanical Properties and Boundary Conditions
In a first step, simple modeling assumptions are made, with linear elasticity and a small-deformation hypothesis [14]. The anisotropy of the face due to muscular organization is taken into account by setting linear transverse elasticity along the muscle fiber directions.
As boundary conditions, internal nodes are rigidly fixed to the skull model, except in the lips and cheeks area. To simulate bone repositioning, the nodes fixed to the mandible or maxilla are displaced according to the surgical planning. Muscular activation can also be simulated to produce facial mimics [14].
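To make the boundary-condition handling concrete, here is a minimal sketch of how imposed bone displacements enter a linear FE solve (our own toy formulation with an already-assembled stiffness matrix; the paper's model is built in a full FE framework).

```python
import numpy as np

def solve_with_imposed_displacements(K, fixed_dofs, fixed_vals):
    """Solve K u = 0 subject to Dirichlet conditions u[fixed] = vals.

    fixed_dofs collects skull-attached nodes (zero displacement) and
    mandible/maxilla nodes (displacement prescribed by the planning).
    """
    n = K.shape[0]
    free = np.setdiff1d(np.arange(n), fixed_dofs)
    u = np.zeros(n)
    u[fixed_dofs] = fixed_vals
    # Condense the prescribed displacements into the right-hand side:
    rhs = -K[np.ix_(free, fixed_dofs)] @ fixed_vals
    u[free] = np.linalg.solve(K[np.ix_(free, free)], rhs)
    return u

# Toy 1D example: 5 dofs in a chain, ends driven like "bone" nodes.
K = np.diag([2.0] * 5) - np.diag([1.0] * 4, 1) - np.diag([1.0] * 4, -1)
u = solve_with_imposed_displacements(K, np.array([0, 4]), np.array([0.0, 1.0]))
print(u)  # interior dofs interpolate the imposed boundary motion
```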
Fig. 3. External nodes of the generic mesh are non-rigidly matched to the patient skin surface (left). Then, internal nodes are fitted to the patient skull surface (right). Muscles are still integrated in the new patient mesh
Fig. 4. Three models of patients with different morphologies. Each model is quasi-automatically built in about 15 minutes
4.4 Results
This mesh generation method was successfully used to build six models of patients with different morphologies. Three of them are presented in figure 4. First simulation results, carried out on the patient presented in figure 3, are shown in figure 5. The accuracy, given by the matching algorithm, is under 1 mm. Moreover, one of the main advantages of this straightforward, easy-to-use method is the time required to build a patient model.
The process is almost automatic and does not require the interactive definition of landmarks on patient data. The only things the user has to check are the quality of the marching-cubes reconstruction and the initial position of the generic mesh with respect to the patient skin surface before the registration. The total reconstruction time for a patient model is 15 minutes on average, spent mainly on the marching-cubes and mesh regularization computations. Hence, this model generation method is suitable for routine use by a surgeon in the elaboration of a surgical plan.
Fig. 5. Simulation of soft tissue deformation resulting from mandible and maxilla repositioning
4.5 Validation Protocol
A primary concern when using biomechanical models is their validation, especially in a quantitative way. Since our model has been designed in the framework of computer-aided maxillofacial surgery, the simulations of soft tissue deformation will be validated within this framework, using the developed clinical application. Postoperative CT data will be acquired, initially for at least two patients. Three steps will then be carried out:
1. The first point is to determine the surgical displacements, in the referential defined in section 3, from anatomical landmarks located on the maxilla and mandible.
2. These displacements will then be input into the model, to simulate the surgical outcome in terms of bone repositioning, and therefore soft tissue deformation.
3. Finally, these simulations will be quantitatively compared to the real postoperative appearance of the patient; a possible distance metric is sketched below.
Once these quantitative measurements are available, the biomechanical model could be improved (large deformations, nonlinear constitutive law) to enhance the quality of the simulations.
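A natural quantitative criterion for step 3 is a surface-to-surface distance between the simulated and the postoperative skin. The sketch below (our own choice of metric, not one specified by the authors) computes mean and worst-case closest-point distances between two point-sampled surfaces.

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_distances(simulated_pts, postop_pts):
    """Closest-point distances from simulated skin samples to the
    postoperative skin surface (both given as (n, 3) point clouds)."""
    d, _ = cKDTree(postop_pts).query(simulated_pts)
    return d.mean(), d.max()  # mean error and Hausdorff-like worst case

# Illustration with synthetic data:
rng = np.random.default_rng(0)
postop = rng.normal(size=(1000, 3))
simulated = postop + rng.normal(scale=0.5, size=postop.shape)
print(surface_distances(simulated, postop))
```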
5 Conclusion
New concepts of 3D cephalometry and soft tissue prediction were introduced for computer-aided techniques and surgical practice. Current research concerns the clinical validation of both models and their integration in a complete protocol.
References
1. Cutting, C., Bookstein, F.L., Grayson, B., Fellingham, L., McCarthy, J.G.: 3D computer-assisted design of craniofacial surgical procedures: optimization and interaction with cephalometric and CT-based models. Plast Reconstr Surg 77(6), pp. 877-885, 1986.
2. Altobelli, D.E., Kikinis, R., Mulliken, J.B., Cline, H., Lorensen, W., Jolesz, F.: Computer-assisted three-dimensional planning in craniofacial surgery. Plast Reconstr Surg 92, pp. 576-587, 1993.
3. Bettega, G., Payan, Y., Mollard, B., Boyer, A., Raphaël, B., Lavallée, S.: A Simulator for Maxillofacial Surgery Integrating 3D Cephalometry and Orthodontia. Journal of Computer Aided Surgery 5(3), pp. 156-165, 2000.
4. Treil, J., Borianne, Ph., Casteigt, J., Faure, J., Horn, A.J.: The Human Face as a 3D Model: The Future in Orthodontics. World Journal of Orthodontics 2(3), pp. 253-257, 2001.
5. Barré, S., Fernandez, C., Paume, P., Subrenat, G.: Simulating Facial Surgery. Proc. of the IS&T/SPIE Electronic Imaging, vol. 3960, pp. 334-345, 2000.
6. Zachow, S., Gladilin, E., Zeilhofer, H.-F., Sader, R.: Improved 3D Osteotomy Planning in Cranio-Maxillofacial Surgery. MICCAI 2001, Springer-Verlag, LNCS 2208, pp. 473-481, 2001.
7. Lee, Y., Terzopoulos, D., Waters, K.: Realistic Modeling for Facial Animation. SIGGRAPH'95, pp. 55-62, 1995.
8. Keeve, E., Girod, S., Kikinis, R., Girod, B.: Deformable Modeling of Facial Tissue for Craniofacial Surgery Simulation. J. Computer Aided Surgery 3, pp. 228-238, 1998.
9. Schutyser, P., Van Cleynenbreugel, J., Ferrant, M., Schoenaers, J., Suetens, P.: Image-Based 3D Planning of Maxillofacial Distraction Procedures Including Soft Tissue Implications. MICCAI 2000, Springer-Verlag, LNCS 1935, pp. 999-1007, 2000.
10. Bettega, G., Dessenne, V., Cinquin, P., Raphaël, B.: Computer assisted mandibular condyle positioning in orthognatic surgery. J. Oral Maxillofac. Surg. 54(5), pp. 553-558, 1996.
11. Marmulla, R., Niederdellmann, H.: Surgical planning of computer-assisted repositioning osteotomies. Plast Reconstr Surg 104, pp. 938-944, 1999.
12. Schramm, A., Gellrich, N.C., Naumann, S., Buhner, U., Schon, R., Schmelzeisen, R.: Non-invasive referencing in computer assisted surgery. Med Biol Eng Comp 37, pp. 644-645, 1999.
13. Chabanas, M., Marecaux, Ch., Payan, Y., Boutault, F.: Computer Aided Planning for Orthognatic Surgery. Computer Assisted Radiology and Surgery, CARS 2002.
14. Chabanas, M., Payan, Y.: A 3D Finite Element Model of the Face for Simulation in Plastic and Maxillo-Facial Surgery. MICCAI 2000, Springer-Verlag, LNCS 1935, pp. 1068-1075, 2000.
15. Couteau, B., Payan, Y., Lavallée, S.: The Mesh-Matching algorithm: an automatic 3D mesh generator for finite element structures. Journal of Biomechanics 33(8), pp. 1005-1009, 2000.
16. Szeliski, R., Lavallée, S.: Matching 3-D anatomical surfaces with non-rigid deformations using octree-splines. Int. Journal of Computer Vision 18(2), pp. 171-186, 1996.
17. Lorensen, W.E., Cline, H.E.: Marching Cubes: A High Resolution 3D Surface Construction Algorithm. ACM Computer Graphics 21, pp. 163-169, 1987.
18. Luboz, V., Payan, Y., Swider, P., Couteau, B.: Automatic 3D Finite Element Mesh Generation: Data Fitting for an Atlas. Proc. of the Fifth Int. Symposium on Computer Methods in Biomechanics and Biomedical Engineering, CMBBE 2001.
Simulation of the Exophthalmia Reduction Using a Finite Element Model of the Orbital Soft Tissues

Vincent Luboz1, Annaig Pedrono2, Pascal Swider2, Frank Boutault3, and Yohan Payan1

1 Laboratoire TIMC-GMCAO, UMR CNRS 5525, Faculté de Médecine, Domaine de la Merci, 38706 La Tronche, France
{Vincent.Luboz, Yohan.Payan}@imag.fr
2 INSERM, Biomécanique, CHU Purpan, BP 3103, 31026 Toulouse Cedex 3, France
{apedrono, pascal.swider}@toulouse.inserm.fr
3 CHU Purpan, Service de Chirurgie Maxillo-faciale, 31026 Toulouse Cedex 3, France
Boutault.F@chu-toulouse.fr
Abstract. This paper proposes a computer-assisted system for the surgical treatment of exophthalmia. This treatment is classically characterized by a decompression of the orbit, by means of an osteotomy of the orbital walls. The planning of this osteotomy consists in defining the size and the location of the decompression hole. A biomechanical model of the orbital soft tissues and of their interactions with the walls is provided here, in order to help surgeons in the definition of the osteotomy planning. The model is defined by a generic Finite Element poro-elastic mesh of the orbit. This generic model is automatically adapted to the morphologies of four patients, extracted from CT exams. Four different FE models are then generated and used to simulate osteotomies in the maxillary or ethmoid sinus regions. Heterogeneous results are observed, with different backward movements of the ocular globe according to the size and/or the location of the hole.
1 Introduction
Exophthalmia is a pathology characterized by an excessive forward displacement of the ocular globe outside the orbit (Figure 1 (a)). This forward displacement ("protrusion") is a consequence of an increase of the orbital content behind the globe. The consequences of exophthalmia are aesthetic and psychological. It may also have functional consequences, such as too long an exposition of the cornea or, in the worst case, a distension of the optic nerve that leads to a decrease of visual acuity and sometimes to total blindness. Four origins can be found for exophthalmia [1]. First, following a trauma, a haematoma can compress the optic nerve, which can reduce visual perception. A surgical decompression of the haematoma may then be necessary. The second cause of exophthalmia is cancerous, with a tumor in one or both orbits that may reduce the mobility of the globe. Radiotherapy can be used, and surgical extraction of the tumor is sometimes needed. The third cause of exophthalmia is infection, which is treated with antibiotics. Finally, and in most cases, exophthalmia can be due to an endocrine dysfunction, such as Basedow's disease, which is related to a thyroid
pathology. This cause often leads to a bilateral exophthalmia, as the dysfunction induces an increase of the ocular muscle and fat tissue volume. Once the endocrine situation is stabilized, a surgical reduction of the exophthalmia is usually needed.
Fig. 1. (a) Left: three degrees of exophthalmia (from top to bottom, the protrusion becomes more and more severe). (b) Right: decompression in the sinuses region; some fat tissues are evacuated inside those new cavities
Two surgical techniques are classically used for this decompression. The first one aims at extracting a few cm³ of fat tissue via an external eyelid incision. The advantage of this method is the relative safety during surgery, since the optic nerve is far from the fat tissues that are extracted. The drawbacks are (1) a small backward displacement of the globe, due to the limited fat tissue in the eyelid region, and (2) an aesthetic risk, since the eyelid incisions can leave visible scars. The second surgical technique aims at increasing the volume of the orbital cavity, with an osteotomy (i.e. a hole in the maxillary or ethmoid sinus regions, Figure 1 (b)) of the orbital walls [2][3]. No fat tissue is therefore extracted. Instead, following this osteotomy and with an external pressure exerted by the surgeon onto the ocular globe, some fat tissue can be evacuated through the hole, towards the sinuses. The advantages of this method are (1) limited scars, since the incision is made inside the eyelid region, and (2) a backward displacement of the ocular globe that can be much larger. The drawback is the risk for the surgeon to cause a decrease of visual acuity, or even blindness, since he works near the optic nerve. The surgical gesture therefore has to be very precise to avoid critical structures. Moreover, this intervention is technically difficult, since the small eyelid incision gives only limited visibility of the operating field. This gesture could be assisted within a computer-aided framework from two points of view. First, a computer-guided system could help the surgeon localize the tip of the surgical tool inside the orbit and therefore maximize the precision of the intervention. Second, a model could assist the surgeon in the definition of the planning: where should the orbital walls be opened, and to what extent? These two questions are directly related to the surgical objective, which is expressed in terms of the desired backward displacement of the ocular globe. This paper addresses the second point, by proposing a biomechanical 3D Finite Element (FE) model of the orbit, including the fat tissues (modeled by a FE mesh) and the walls (simulated by boundary conditions). First qualitative simulations of wall opening are provided on different patient models and compared with observed phenomena.
2 Orbital Soft Tissue Modeling and Surgical Simulation

2.1 Strategy
The modeling strategy adopted by our group is driven by the surgeons' needs: (1) the models must be precise, as they are used for clinical predictions, and (2) tools must be provided to automatically adapt the models to each patient's morphology. Our strategy to follow those requirements is (see [4] for a review):
– the manual elaboration of a "generic" ("atlas") biomechanical model; this step is long and tedious, but is a prerequisite for the accuracy of the modeling;
– patient data acquisition, from US, CT or MRI exams;
– conformation of the generic biomechanical model to the patient morphology, by means of elastic registration.
In the case of exophthalmia, pre-operative and post-operative CT exams are systematically collected. This modality is therefore used for the elaboration of the generic model, as well as for its conformation towards patient geometries.
2.2 A Generic Biomechanical Model of the Orbit Cavity
In order to define the generic model of the orbital soft tissues, CT data collected for a given patient (considered therefore as the "generic" patient) are used for the extraction of the orbital cavity, fat tissues, ocular muscles and optic nerve geometries.

2.2.1 Three-Dimensional Reconstruction of the Orbit Cavity
For each image of the CT exam, a manual segmentation of the main ocular structures is made through B-Spline definitions (figure 2 (a)). By connecting those curves, a 3D reconstruction is provided for the orbital cavity, the muscles as well as the optic nerve (figure 2 (b)).
Fig. 2. (a) Manual B-Spline segmentation. (b) 3D representation of the orbital cavity, with the orbit limits, the four muscles and the optic nerve (the ocular globe is not displayed)
In addition to the 3D reconstruction, this image processing tool can be used to compute the volumes of individual ocular structures. For example, in the case of the patient used to define the geometry of the "generic" model, a value of 30 cm³ is measured for the orbital cavity.

2.2.2 Biomechanical Modeling of the Orbital Structures
Besides the study of the intra-orbital soft tissue volumes, the segmentation of the CT exam provides a basis for the definition of a 3D Finite Element mesh. To our knowledge, the human orbit has not yet been modeled through a FE approach. The ocular globe has been studied using a FE model [5], but this model is not appropriate for the exophthalmia study, since it does not include the soft tissues behind the globe. Starting from the successive splines segmented from the CT slices, a volumetric mesh, made of 20-node hexahedral elements, was manually built. Because of the huge amount of work needed for this meshing, the internal soft tissues, namely the fat, muscles and nerve, were all modeled as a single biomechanical entity. The biomechanics of this entity was modeled with the FE Method, considering a poro-elastic behavior of the structure. This material was chosen following discussions with clinicians, who describe the intra-orbital soft tissues as behaving like a sponge full of water. The material is composed of a fluid phase that saturates and flows through the pores of a deformable solid skeleton (see Biot [6]). The MARC Finite Element package was used to model this material, with mechanical values close to the soft tissue values reported in the literature [7]: 20 kPa for the Young modulus and 0.4 for the Poisson ratio. A value of 1 mm⁴/(N·s) was chosen for the permeability, in order to model the observed retention of the fluid.
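For orientation, the sketch below (our own back-of-the-envelope helper; the actual simulations are run in the MARC package) derives the Lamé constants and a Biot-type consolidation coefficient from the quoted values, which gives a feel for how slowly the fluid phase redistributes.

```python
# Elastic and poro-elastic constants derived from the values quoted
# above (E = 20 kPa = 0.02 N/mm^2, nu = 0.4, k = 1 mm^4/(N.s)).
E, nu = 0.02, 0.4          # N/mm^2, dimensionless
k = 1.0                    # permeability, mm^4/(N.s)

lam = E * nu / ((1 + nu) * (1 - 2 * nu))   # first Lamé constant, N/mm^2
mu = E / (2 * (1 + nu))                    # shear modulus, N/mm^2
m_confined = lam + 2 * mu                  # confined (oedometric) modulus

# Biot consolidation coefficient c = k * M_confined (mm^2/s) and the
# characteristic time for fluid redistribution over a length L of 40 mm
# (approximate orbit depth; our assumption, not a value from the paper):
c = k * m_confined
L = 40.0
print(f"c = {c:.4f} mm^2/s, t ~ L^2/c = {L**2 / c:.0f} s")
```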
Fig. 3. Left: 3D FE generic mesh: all elements are hexahedra; the simulated hole is located in the ethmoid sinus region. Right: schematic of the orbit and location of the mesh contour
2.2.3 Simulation of the Surgical Gesture
In order to simulate the surgical gesture, characterized by (1) an osteotomy of the orbital walls, (2) a pressure exerted by the surgeon onto the globe and (3) an initial orbital pressure, specific boundary conditions were defined for the generic model. First, to simulate the osteotomy, the nodes located in the hole region (figure 3) were assumed to be free to move, whereas the other surface nodes located along the orbital
walls were kept fixed. All the other nodes, located inside the volumetric mesh, were modeled without any boundary conditions. Second, in order to model the contact forces between the internal soft tissues and the ocular globe resulting from the pressure exerted by the surgeon, an imposed force of 10 Newtons was distributed over the nodes located at the interface (figure 3). Third, an initial pressure of 0.01 MPa was introduced in the model to simulate the increase of the soft tissue volume.
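A schematic of this boundary-condition setup, in the same toy spirit as before (the node sets and the mesh container are hypothetical names of ours, not the authors' MARC input):

```python
import numpy as np

def orbit_boundary_conditions(mesh, hole_nodes, wall_nodes,
                              globe_interface_nodes,
                              total_force_n=10.0, initial_pressure_mpa=0.01):
    """Collect the three boundary conditions described above.

    mesh is assumed to expose node normals (mesh.normal is a hypothetical
    helper); all node sets are index arrays.
    """
    bc = {}
    # (1) Osteotomy: hole nodes free, remaining wall nodes fixed.
    bc["fixed_nodes"] = np.setdiff1d(wall_nodes, hole_nodes)
    # (2) Surgeon's pressure: 10 N spread evenly over the globe interface,
    #     each nodal force directed along the inward node normal.
    f_per_node = total_force_n / len(globe_interface_nodes)
    bc["nodal_forces"] = {int(n): -f_per_node * mesh.normal(n)
                          for n in globe_interface_nodes}
    # (3) Initial pore pressure modeling the increased tissue volume.
    bc["initial_pore_pressure"] = initial_pressure_mpa
    return bc
```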
2.3 Conformation of the Generic Model to Patient Morphology
Following the strategy introduced in section 2.1, data have to be collected for each patient. A CT exam being classically required for exophthalmia, surface points located along the orbital walls were manually extracted from the CT images, by means of the manual B-Spline segmentation process described above. A set of 3D points describing the patient orbit surface geometry is thus obtained. The Mesh-Matching algorithm [4], followed by a correction method [8], is then applied to match the generic model towards the patient geometry. A new 3D mesh is therefore automatically generated, adapted to the patient geometry and sufficiently regular to perform a FE analysis.
3 Results

3.1 Patient Conformation and Simulation of the Orbital Decompression
The aim of this study is to assist the surgical planning by estimating (1) the influence of the hole size and location and (2) the mechanical behavior of the orbit, with the simulated evacuation of soft tissues through the hole. Four patients were studied for this paper. In collaboration with surgeons, and for each patient, four different holes were simulated, assuming two locations (the forward and backward ethmoid sinus regions) and two sizes for the osteotomy (1.4 cm² and 2.9 cm²). Figure 4 plots one patient FE model generated by the conformation process, and the corresponding four holes. For these figures, a 10 Newton pressure force was applied to the globe and the displacement of the hole elements is outlined. Two aspects of the simulations were studied: (1) the relation between the hole size/location and the backward displacement of the ocular globe, and (2) the observed decompression of the orbit, with the mechanical behavior of the soft tissues.

3.2 Relation between Hole Size/Location and Ocular Globe Backward Displacement

For each of the four patients, simulations were carried out, and the relationships between the exerted globe pressure and the corresponding simulated backward displacement were computed.
Fig. 4. Patient simulations with four different holes, from left to right: large holes at the front of the orbit and at the back of the orbit (top), and medium holes at the front of the orbit and at the back of the orbit (bottom). Moving elements are darker than non-moving elements
The first interesting point is that, whatever the patient, all the relationships are very similar. Figure 5 plots the relationship for one given patient. Whatever the level of exerted pressure force, the globe displacements due to the big hole located at the front of the orbit are 10% greater than the displacements observed with a backward hole, and 40% greater than the displacements due to the medium holes. Two consequences can be drawn from those results. First, neither medium hole seems able to provide a globe backward displacement larger than 3 mm. Second, the big hole located at the front of the orbit seems more "efficient", as it provides a greater backward displacement of the globe. For example, with a 0.02 MPa pressure value (which corresponds to a force of 6 N applied onto the globe), a backward displacement of nearly 5 mm is reached with this hole. Those results seem realistic, as they stand in the range of values reported by [9]. Of course, clinical measurements are needed to confirm this analysis. Note also that the time required for a FE simulation with our model still does not fulfill the needs of the surgeon. Indeed, on a PC equipped with a 1.7 GHz processor and 1 GB of memory, the process lasts nearly 2 hours, which is far from the real-time model needed for computer-assisted surgery.
[Figure 5 plot: globe backward displacement (mm, 0–6) versus pressure on the globe (MPa, 0–0.02), for the four holes: BigHoleFront, BigHoleBack, MedHoleFront, MedHoleBack.]
Fig. 5. Globe backward displacement resulting from various pressure forces (simulations are carried out for a given patient and with the four holes described above)
3.3 Decompression of the Orbit
This part evaluates the mechanical behavior of the intrinsic soft tissues, and in particular the total amount of tissue that is evacuated through the hole. Figure 6 plots, for each patient, the computed evacuated volume under different levels of pressure force. The interesting observation is that, whereas little heterogeneity among patients was observed for the globe backward displacement, significant variations of the volume of fat tissue evacuated through the hole are obtained from one patient to the other. Those variations are probably due to the shape of the FE mesh, and therefore to the geometry of each patient's orbit. For example, a difference of 30% can be seen between patientZJ and patientFJ, for the same hole and the same pressure.
[Figure 6 plot: volume of fat tissues decompressed (cm³, 0–3.5) versus pressure on the globe (MPa, 0–0.02), for patientBJ, patientFJ, patientLF and patientZJ.]
Fig. 6. Volume of fat tissues decompressed and evacuated through the large hole located at the front of the orbit. The volume is computed for the four patients
4 Conclusion
This paper has presented a first evaluation of the feasibility of Finite Element modeling of the surgical orbital decompression, in the context of exophthalmia. First simulations were carried out on four patients. The Finite Element models of these patients were automatically generated from an existing generic model. For each patient, four different holes were simulated in the ethmoid sinus regions: two locations and two sizes. The backward displacements of the ocular globe, as well as the intrinsic soft tissues evacuated through the hole, were carefully studied, as both can be considered as criteria for the "efficiency" of the surgery. Heterogeneity is the main conclusion of these simulations. Indeed, different levels of globe backward displacement are observed according to the sizes and/or the locations of the holes. Moreover, for a given hole size and location, different volumes of soft tissue evacuated through this hole are observed according to the patients' geometries. It seems therefore that surgeons could use such results to optimize their surgical planning. Before such a perspective, two points must be addressed. First, the method must be clinically validated, by comparing the simulations provided by the model with pre- and post-operative data collected on a given patient, and then on a set of standard patients. Second, computation times must be studied and drastically improved, in order to provide a solution that allows fast simulations.
References
1. Saraux, H., Biais, B., Rossazza, C.: Ophtalmologie. Ed. Masson, chap. 22, Pathologie de l'orbite, pp. 341-353, 1987.
2. Wilson, W.B., Manke, W.F.: Orbital decompression in Graves' disease. The predictability of reduction of proptosis. Arch. Ophthalmology 109, pp. 343-345, 1991.
3. Stanley, R.J., McCaffrey, T.V., Offord, K.P., DeSanto, L.W.: Superior and transantral orbital decompression procedures. Effects on increased intra-orbital pressure and orbital dynamics. Arch. Otolaryngology Head Neck Surgery 115, pp. 369-373, 1989.
4. Couteau, B., Payan, Y., Lavallée, S.: The mesh-matching algorithm: an automatic 3D mesh generator for finite element structures. Journal of Biomechanics 33, pp. 1005-1009, 2000.
5. Sagar, M.A., Bullivant, D., Mallinson, G.D., Hunter, P.J., Hunter, I.W.: A virtual environment and model of the eye for surgical simulation. Supercomputing 2(7), 1994.
6. Biot, M.A.: General theory of three-dimensional consolidation. J. Appl. Phys. 12, pp. 155-164, 1941.
7. Fung, Y.C.: Biomechanics: Mechanical Properties of Living Tissues. Springer-Verlag, New York, 1993.
8. Luboz, V., Payan, Y., Swider, P., Couteau, B.: Automatic 3D Finite Element Mesh Generation: Data Fitting for an Atlas. Proc. of the Fifth Int. Symposium on Computer Methods in Biomechanics and Biomedical Engineering, CMBBE 2001.
9. Jin, H.R., Shin, S.O., Choo, M.J., Choi, Y.S.: Relationship between the extent of fracture and the degree of enophthalmos in isolated blowout fractures of the medial orbital wall. Journal of Oral and Maxillofacial Surgery 58, pp. 617-620, 2000.
A Real-Time Deformable Model for Flexible Instruments Inserted into Tubular Structures

Markus Kukuk1,2 and Bernhard Geiger1

1 SIEMENS Corporate Research, Imaging & Visualization, Princeton, NJ, USA
2 University of Dortmund, Computer Science VII, Germany
[email protected]
Abstract. In this paper we present an approach to the problem of modelling long, flexible instruments, such as endoscopes or catheters. The idea is to recursively enumerate all possible shapes and subsequently filter them according to given mechanical and physical constraints. Although this brute-force approach has an exponential worst-case complexity, we show with a typical example that in the case of tubular structures the empirical complexity is polynomial. We present two approximation methods that reduce this bound to a linear complexity. We have performed accuracy, run-time and robustness tests in preparation for first clinical studies.
1 Introduction
Flexible instruments like endoscopes or catheters play a central role in the field of minimally invasive surgery. They provide access to even remote operating sites within the human body, through natural body openings or small incisions. However, performing endoscopic procedures or catheterizations presents a challenge to the physician. Though an endoscope has a CCD camera inside its tip that can be actively moved around by the physician, it is still difficult to derive the 3D position of the tip from a 2D video image. In recent years, numerous attempts have been made to guide endoscopic procedures by determining the position and orientation of the instrument's tip. Solomon et al. (see [1]) use position sensors attached to the tip, Bricault et al. [2] introduced the idea of analyzing only the video images, and Mori et al. [3] significantly improved this approach by achieving continuous tracking independent of the presence of strong features in the image. The authors report a processing time of 6 seconds per frame. Our group has recently presented a new approach to the problem [1]. We have suggested the use of a flexible endoscope model to guide a "blind" biopsy (TBNA). The model has been used in a pre-operative planning phase to derive a set of parameters that describe how to handle the endoscope in order to manoeuvre the biopsy needle inside the target. This approach requires no computer in the operating room and is inherently real-time. A dynamic endoscope model based on multibody mechanics has been published by Ikuta [4] as part of a virtual endoscopy simulator with force sensation.
Models for catheters have been published by van Walsum et al. [5], who use a snake-like approach, and Anderson et al. [6], who use FEM analysis. In this paper we present a new deformable model for flexible instruments that is accurate, fast and robust. For a given insertion depth and target site, the model calculates the shape of the instrument by considering its mechanical constraints and physical environment.
2 Model Description
The model described in this paper consists of the following three components:
1. A discrete representation of the instrument.
2. A generator that enumerates, based on (1), all possible shapes considering given internal mechanical constraints.
3. A filter that selects from (2) only those shapes that comply with the instrument's physical constraints.
Note that, depending on the filter selectivity, more than one solution may be found. A natural way to represent a flexible tube-like structure is as a chain of rigid links, interconnected by discrete ball-and-socket joints. A link is represented by a cylinder of a certain length and diameter. A joint connects two adjacent links. If a joint restricts a link to a finite number of positions (joint positions) with respect to its predecessor, we call it a discrete joint. Link length and joint range determine the maximum flexibility of the endoscope. The mechanical constraints, like varying diameter, rigid sleeves and maximum flexibility, are modeled by determining for each link of the chain a suitable link diameter, length and maximum joint range. Let $L$ be the set of all links and $\mathcal{L} = \mathcal{P}(L)$ the power set of $L$. This system can formally be described as the concatenation of two functions:

$$I = f_{\text{filter}} \circ f_{\text{gen}}(L) \eqno(1)$$

with $f_{\text{gen}}: L \to \mathcal{L}$ the generator, $f_{\text{filter}}: \mathcal{L} \to \mathcal{L}$ the filter and $I \subset \mathcal{L}$ the resulting instruments. Algebraically, the generator can be described as the concatenation of two filter functions operating on $L$:

$$f_{\text{gen}}(L) = f_{\text{joint}}^{u,v,\theta} \circ f_{\text{link}}^{s,N}(L) \eqno(2)$$

with $u$ the number of rotation axes and $v$ the number of discrete rotation steps of angle $\theta$ for each axis. The number of possible positions between two adjacent links is $uv$ and the joint range is $v\theta$. The parameter $s \in L$ denotes the start link and $N$ the number of links (desired instrument length). Filter $f_{\text{link}}$ controls the length and size constraints and $f_{\text{joint}}$ controls the flexibility. Function $f_{\text{filter}}$ is given as a concatenation of a geometry, a tube and an energy filter:

$$f_{\text{filter}} = f_{\text{energy}}^{\alpha,\beta,p} \circ f_{\text{tube}} \circ f_{\text{geom}} \eqno(3)$$
with $\alpha, \beta$ two material constants for bending and torsion and $p$ the filter selectivity. Filter $f_{\text{geom}}$ filters out those links that collide with the organ wall, and $f_{\text{tube}}$ represents a simple bounding-tube filter which, based on an insertion protocol, defines a ROI in case of the existence of bifurcations. Finally, filter $f_{\text{energy}}$ finds the global minima (for $p = 1$) regarding the instrument's deformation energy:

$$f_{\text{energy}} = \{ A \in \mathcal{L} \mid (E_\kappa(A) + E_\tau(A)) = \min_p \} \eqno(4)$$

with $\min_p$ reading "among the $p$ smallest values", $E_\kappa$ the internal bending energy and $E_\tau$ the internal torsion energy of an instrument. The two discrete energy terms are given by:

$$E_\kappa(A) = \sum_{i=1}^{N-1} \alpha\, \kappa(A,i)^2 \quad \text{and} \quad E_\tau(A) = \sum_{i=1}^{N-1} \beta\, \tau(A,i)^2 \eqno(5)$$
Implementation
We introduced the concept of modelling a flexible instrument by enumerating all possible shapes and filtering the result according to given constraints. A natural way to implement this concept is to recursively create a spatial tree (depthfirst search backtracking), whose growth is constraint according to a set of filter functions. A spatial tree is an ordinary tree data structure, where each node represents a joint in 3D space. Each edge that connects a node with its child represents a link in 3D space. Each path from the root to a leaf represents a chain of links and therewith a flexible instrument. The entire spatial tree, so all path from the root to the leafs represent the instrument’s full workspace under the given constraints. A link is represented by a coordinate system attached to a cylinder with length c and diameter d. The coordinate system is attached to the cylinder, in a way that it’s z-axis is the centerline of the cylinder. The cylinder’s bottom and top bases lie in the z = 0 and z = c plane. An algorithm fgen that creates such a spatial tree, takes a link l ∈ L as input, attaches uv links to l and recursively calls fgen for each attached link: fgen (l) = fgen (Ri,j l T(c)) for i = 1, . . . , u , j = 1, . . . , v
(6)
[Figure 1 plot: number of basic operations (recursive calls, ×10⁵) versus number of links N (recursion depth, 0–35), with the fitted curves y = 1.1N⁴ + 39.4N³ + 426.3N² + 1339.5N for the naive method and y = 761.4N + 4358 for the approximation.]
Fig. 1. Measured complexity of a catheter inserted into a brain artery. Left: Naive method. Right: Same data set using (n, k)- and (u, v)-approximation. Center: Fourth-order polynomial model for the naive method. Linear model for the approximation.
with $T(c)$ a translation matrix that moves $l$ a distance of $c$ mm along its main axis towards the link end, and $R_{i,j}$ a rotation matrix that rotates $l$ into the new position. The rotation matrix is given as an entry in a pre-computed look-up table of $u$ columns and $v$ rows:

$$R_{i,j} = R(r_i,\, j\theta) \quad \text{for } i = 1,\dots,u,\; j = 1,\dots,v \eqno(7)$$
with $j\theta$ the rotation angle and $r_i$ the $i$-th rotation axis. For example, for $u = 9$:

$$r_i \in \left\{ (x\;\, y\;\, 0)^T \mid x, y \in \{-1, 0, 1\} \right\} \eqno(8)$$
2.2 Complexity
We propose to create a spatial tree to enumerate all shapes an endoscope can take, given a start link and a maximum number of $N$ links. The time and space complexity of this naive approach is $O((uv)^N)$. However, depending on how much the growth of the spatial tree is constrained by the filter functions, the practical complexity, given a real anatomy, is more feasible. Especially tubular structures, such as the tracheobronchial tree and the vasculature, greatly limit the growth of the tree. The following experiment confirms this hypothesis: for a fixed start position, we use our flexible instrument model ($u = 9$, $v = 1$) to calculate a catheter of length $N$ inserted into a brain artery. We do this 31 times, for $N = 1, \dots, 31$. For each calculation we count the number of recursive calls needed to enumerate all possibilities, given the geometry and energy filters. As shown in Fig. 1, a fourth-order polynomial can be fit (least squares) to the resulting curve, indicating that the real complexity for this anatomy is $O(N^4)$. With this naive method, 17 links could be computed in less than one second on a Pentium 4, 1.3 GHz processor. The second pair of curves shows the complexity after activation of the two approximation methods described in the following two sections.
2.3 (n, k)-Approximation
We have shown with a typical example that the actual complexity appears to be polynomial ($O(N^4)$) for tubular structures, instead of exponential. We have also shown that for a considerable number of links $n \ll N$, the computation can be done in real time, given off-the-shelf PC hardware. A straightforward idea for accelerating the computation is to compose an instrument of length $N$ from several instrument segments of length $n$: the instrument of length $N$ is calculated by connecting together $N/k$ segments of length $k$. Each segment of length $k$ is obtained by simply taking the first $k$ links of a "sub-instrument" of length $n$. The start link of each sub-instrument is given by the $k$-th link of the previous sub-instrument. All root-to-leaf paths of a sub-instrument's spatial tree can be regarded as a set of "tentacles" that reach out to explore the environment ahead. Formally:

$$I(s, N) = \bigoplus_{i=1}^{N/k} I_i(s_i, n)[0, \dots, k-1] \quad \text{with } s_0 = s,\; s_i = I_{i-1}(s_{i-1}, n)[k] \eqno(9)$$

where $\oplus$ denotes the concatenation of the segments.
$I(s, N)$ is the resulting instrument with start link $s$ and length $N$, and $I_i()$ the $i$-th "sub-instrument". The notation $I()[i]$, resp. $I()[i, \dots, j]$, denotes the $i$-th, resp. $i$-th to $j$-th, link (head to tip) of instrument $I$. The new complexity is:

$$O\!\left( \frac{N}{k} (uv)^n \right) \le O\!\left( (uv)^N \right) \eqno(10)$$

Fig. 2, top right, shows an example with $n = 16$ and $k = 2$. It shows, as an intermediate result, the first 13 segments, which means that 26 links out of $N = 44$ have been computed. For the bottom figure, $n = 20$ and $k = 4$. To take account of the frictional forces acting on the endoscope's tip, we set $p = 7$ (eqn. 4) for the last segment and $p = 1$ for all others. The segment chaining is sketched below.
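A compact rendering of equation (9), in the same toy spirit as the `f_gen` sketch above (our own simplifications: a segment is the list of its link poses, the best sub-instrument is chosen by a caller-supplied scoring function, and the special p = 7 handling of the last segment is omitted):

```python
def nk_approximation(start, N, n, k, grow_subinstrument, score):
    """Compose an instrument of length N from ceil(N/k) segments (eq. 9).

    grow_subinstrument(start, n) returns candidate shapes of length n
    (e.g. the filtered output of f_gen); score ranks them, lower = better.
    """
    instrument, s = [], start
    for _ in range((N + k - 1) // k):        # ceil(N / k) segments
        candidates = grow_subinstrument(s, n)  # tentacles exploring ahead
        best = min(candidates, key=score)      # minimum-energy tentacle
        instrument.extend(best[:k])            # keep only the first k links
        s = best[k]                            # next start = k-th link
    return instrument[:N]
```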
2.4 (u, v)-Approximation
We now describe a technique to further speed up the computation. It is based on the observation that the energy filter favors configurations with many small joint angles instead of a few big angles. In other words, it tends to distribute bending over many joints, using a small angle for each joint. As a consequence, the dispersion of all angles tends to be small. This is particularly true for short tentacles of length $n \ll N$ used in the (n, k)-approximation scheme. The idea now is to constrain, for each execution of the model, the maneuverability of each joint to only one possible angle, and to stepwise increase this angle between subsequent executions:

$$I_h(s, n, j) \;\text{ for } j = 1,\dots,v \quad \text{with} \quad f_{\text{gen}}(l, j) = f_{\text{gen}}(R_{i,j}\, l\, T(c)) \;\text{ for } i = 1,\dots,u \eqno(11)$$

with $I_h()$ the "sub-instrument" of equation 9, extended by a parameter $j$. Note that a 0° angle is included in the $u$ principal directions. The resulting configuration is the one with minimum energy among the $v$ executions of $I_h()$. The new complexity is given by:
$$O\!\left( \frac{N}{k}\, n\, u\, v \right) \le O\!\left( \frac{N}{k} (uv)^n \right) \eqno(12)$$
Fig. 1 shows a linear complexity for N = 32, n = 10, k = 5, u = 9, v = 20, θ = 2◦ .
3 Experiments

3.1 Model Calibration and Validation
Objective: We describe an experiment to determine the intrinsic model parameters. The idea is to measure the center line of a real endoscope inserted into a calibration phantom and to generate a matching virtual endoscope by finding suitable values for the model parameters.
Material: Hardware, see Fig. 2, left: optical tracking system ARTtrack 1 (A.R.T. GmbH, www.ar-tracking.de), comprising two IR cameras and a set of passive, retro-reflective markers. Video endoscope OLYMPUS GIF-100 (9.5 mm diameter) with 30 stripe markers (10 mm width) wrapped around its shaft like rings, spaced ca. 25 mm apart. Board with an "M"-shaped calibration path, 1050 mm long, 50 mm wide. We have attached three disk markers to the board. Pointing device with a calibrated tip.
Design and Methods: (1) Since the tracking system is designed to determine the center of either ball or disk markers, we have to verify its accuracy in measuring the center of our ring markers. The idea is to verify whether the center line of the endoscope (given by the center of each ring marker) is at a distance from the board corresponding to the radius of the endoscope. The board surface plane is determined using the three disk markers. (2) We insert the endoscope a distance of 900 mm into the "M" path. After the insertion, we record a reading of all endoscope markers. To draw the measured markers and the digital model of the "M" in one common reference frame, we need to find a rigid body transformation that maps one frame into the other. To find the transform, we calculate the best fit, in the least-squares sense, between a set of reference points obtained with the pointer and the corresponding virtual points. To solve the resulting non-linear minimization problem, we use the Levenberg-Marquardt method (a closed-form alternative is sketched below).
Results: (1) We are able to accurately determine the center line of a real endoscope using ring markers. The tracking system could determine the position of 26 out of 30 markers. The average distance between the markers and the board is 5.17 mm (SD: 0.8 mm), given a 4.75 mm shaft radius and a 0.5 mm marker thickness. (2) Fig. 2, bottom right, shows the best match between the markers (black balls) and the model. The model consists of 9 segments (color coded); the first 8 consist of 4 links, the last consists of 12 links. Each link is 20.45 mm long. By calculating the 7 smallest energies for the last segment, we obtained one segment that matched the last 6 markers with a Hausdorff distance of 0.8 mm.
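For the rigid alignment step, a closed-form SVD-based fit (the Kabsch/Horn solution) gives the same least-squares optimum for known point correspondences as the iterative Levenberg-Marquardt route described above; this sketch is our own substitution, not the authors' implementation.

```python
import numpy as np

def rigid_fit(P, Q):
    """Least-squares rigid transform (R, t) mapping points P onto Q.

    P, Q: (n, 3) arrays of corresponding points (pointer vs. model).
    Returns R (3x3 rotation) and t (3,) such that R @ p + t ~ q.
    """
    cp, cq = P.mean(axis=0), Q.mean(axis=0)
    H = (P - cp).T @ (Q - cq)                 # cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                        # proper rotation (det = +1)
    t = cq - R @ cp
    return R, t

# Synthetic check: recover a known rotation about z and a translation.
rng = np.random.default_rng(1)
P = rng.normal(size=(10, 3))
ang = np.radians(30)
R_true = np.array([[np.cos(ang), -np.sin(ang), 0],
                   [np.sin(ang), np.cos(ang), 0], [0, 0, 1]])
Q = P @ R_true.T + np.array([5.0, -2.0, 1.0])
R, t = rigid_fit(P, Q)
assert np.allclose(R, R_true) and np.allclose(t, [5.0, -2.0, 1.0])
```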
Fig. 2. Left: Experimental setup (photo, flash). Right: Screen shots. Top: Resulting tentacles (p = 1) of the first 13 segments. Bottom: Real endoscope (black balls) and final result showing the 7 smallest energies for the last segment.
3.2 Accuracy, Run-time and Robustness Tests
Material: In addition to the hardware from the previous experiment: a phantom of the tracheobronchial tree made of transparent plastic tubes. We placed 37 sticks as reference points for the rigid body transformation in the model. Software: the phantom was scanned (CT, 512x512x382, 1 mm slice distance, 1.2 mm thickness), the reference points were manually identified and a lung phantom model was reconstructed (ca. 13000 triangles).
Design and Methods: (1) To assess the influence of refraction caused by the plastic tubes on the measurements, we inserted a calibration wand with markers of known distance into the phantom. (2) We insert the endoscope into different branches of the phantom (see Fig. 3 (a)) and record the position of the markers for different insertion depths. The position and orientation of the real endoscope's tip is given by its first rigid sleeve (d), to which we have attached two markers. To assess the accuracy of our model, we compare these two markers to the position and orientation of the corresponding virtual sleeve. We performed 5 tests, and for each we calculated the best result regarding a match between 10 virtual sleeves (p = 10) and the real sleeve.
Results: (1) The influence of refraction is within the accuracy of the tracking system (0.1 mm). (2) The worst result out of the 5 best results is 2.5 mm (position) and 7° (orientation). The best result is 0.7 mm and 4°. The time needed to calculate and display the model shown in Fig. 3 (b), with a maximum insertion depth of 355 mm, is 0.6 seconds on a Pentium 4, 1.3 GHz. We were able to generate a continuous animation of a 240 mm long insertion by increasing the insertion depth in steps of 1 mm, which demonstrates the algorithm's robustness. Computation and display time for the animation (240 frames) is 27 seconds.
Fig. 3. (a) OLYMPUS GIF-100 endoscope with reflective markers inserted into a lung phantom. (b) Model of the GIF-100 inside the same branch as in (a). (c) Tentacles shown for each segment. (d) The two rigid sleeves (35 and 25 mm long) of the GIF-100 digital model. The model’s tip was bent by 90◦ to reach into the smaller branch.
4 Conclusion
We have presented a deformable model for flexible instruments. The model reflects special mechanical constraints often found in flexible instruments, like a tip that can be bent to a much higher degree than the shaft (Fig. 3 (d)), rigid sleeves within flexible sections, and a non-negligible shaft diameter. We demonstrated its use as a catheter inserted into the vasculature and as an endoscope inserted into the tracheobronchial tree. An interesting property of our model is the option to generate several possible shapes for the instrument's tip. Furthermore, it requires no initialization in the form of an initial "good guess" for the final shape (unlike the snake approach) and no preprocessing (unlike the FEM approach).
References
1. Kukuk, M., et al.: TBNA-protocols - Guiding TransBronchial Needle Aspirations Without a Computer in the Operating Room. In MICCAI 2001, LNCS 2208, pp. 997-1006. Springer, 2001.
2. Bricault, I., Ferretti, G., Cinquin, P.: Multi-level Strategy for Computer-Assisted Transbronchial Biopsy. In MICCAI 1998, LNCS 1496. Springer, 1998.
3. Mori, K., et al.: A Method for Tracking the Camera Motion of Real Endoscope by Epipolar Geometry Analysis and Virtual Endoscopy System. In MICCAI 2001, LNCS 2208, pp. 1-8. Springer, 2001.
4. Ikuta, K., et al.: Portable Virtual Endoscope System with Force and Visual Display for Insertion Training. In MICCAI 2000, LNCS 1935. Springer, 2000.
5. van Walsum, T., et al.: Deformable B-splines for catheter simulation. In CARS'99, 1999.
6. Anderson, J., Brody, W., et al.: daVinci - A vascular catheterization and interventional radiology-based training and patient pretreatment planning simulator. In Proc. of the Society of Cardiovascular and Interventional Radiology (SCVIR), March 1996.
Modeling of the Human Orbit from MR Images

Zirui Li1, Chee-Kong Chui1, Yiyu Cai2, Shantha Amrith3, Poh-Sun Goh4, James H. Anderson5, Jeremy Teo1, Cherine Liu6, Irma Kusuma6, Yee-Shin Siow6, Wieslaw L. Nowinski1

1 Biomedical Imaging Lab, Singapore
[email protected]
2 School of MPE, Nanyang Technological University, Singapore
3 The Eye Institute, National University Hospital, Singapore
4 Dept. Diagnostic Imaging, National University Hospital, Singapore
5 Johns Hopkins University School of Medicine, Baltimore, USA
6 School of Computer Engineering, Nanyang Technological University, Singapore
Abstract. We previously described a parametric eyeball modeling system for real-time simulation of eye surgery (MICCAI 2001). However, in the simulation of ophthalmologic surgery, the model of the eyeball alone is not sufficient: the orbital structures are as important as the eyeball. In this paper, we describe an approach to model the orbital structures from a patient-specific MRI data set and to integrate the orbital model with the parametric eyeball model. The orbital tissues, including the eyeball, muscles, and orbital fat, are segmented from the MRI data. An interactive image-based geometrical modeling tool is developed to generate a finite element model of the orbit. Preliminary results include biomechanical models of three human subjects, one of which is a young patient with a benign tumor in the right orbit. The biomechanical model can provide quantitative information that is important in diagnosis. It can also be used to accurately analyze the result of an intervention, which is an important component of the simulator for training and treatment planning. Our analysis includes a deformation study of an eyeball subjected to simulated tumor growth using the finite element method.
1 Introduction
Eye surgery, a specialty of microsurgery, involves very delicate tissues. The surgery is often performed relying on a magnified view of the eye structures. Performing such a procedure requires a thorough knowledge of the clinical diagnosis and a very careful and accurate execution. Complications of ophthalmologic diseases, however, often pose additional difficulties for surgeons when making critical decisions. Very intensive training is required for ophthalmologic professionals in order to provide surgical services of a high quality standard. Current imaging techniques are able to generate 3D data of the patient. In imaging the orbit, MRI provides considerable information on the different soft tissues. The surgeons must investigate these data, slice by slice, and try to identify the pathology. However, without a patient-specific model of the eye tissues, even the most experienced experts face difficulty in planning interventions. In order to help the ophthalmologist examine the patient-specific data, and to help the student get a better understanding of the related anatomy and surgery, we have
developed a system to construct a biomechanical model of the human orbit. Such a model contains the eyeball, the muscles responsible for eye movement, the optic nerve, and the fat which fills the spaces in between. The model is aimed to help the ophthalmologist in different ways:
Quantification. The patient-specific model is able to provide information on the sizes, thicknesses, or volumes of different tissues. Such quantitative information is very important to the ophthalmologists in certain pathological conditions, such as thyroid eye disease, and in cases of orbital trauma.
Diagnosis. The model can help the surgeons to better identify abnormalities. If a certain disease is suspected to be developing, the process could be detected and monitored through scanning and modeling of the patient data as a function of time. For example, creating 3D views of tumors in the orbit can enable the surgeon to understand the proximity and the exact relationship of the lesion to vital structures, so that a more accurate diagnosis can be achieved.
Analysis. Along with proper material properties, the patient-specific model can be analyzed with FEM or other tools. The behavior of the eye tissues when interacting with surgical instruments can be accurately simulated, and a more realistic simulation of the intervention can be achieved.
Training. With the predefined models and the result of the surgical procedure, the model can play an important role in training. If a virtual reality system is set up properly, with friendly graphic and haptic user interfaces, the trainee can obtain first-hand experience, and the training cost and duration can be decreased. Moreover, orbital surgery is very highly specialized, and training on such models can equip surgeons with the necessary skills to perform safe and efficient surgery.
Pretreatment planning. Such a model can help the surgeon to work out a better pretreatment plan. As in the case of the brain, surgery in the orbit has a very small margin for error. Practice with a virtual model can certainly help to reduce the risk of complications. Small-incision surgery or stereotactic surgery in the orbit can be made possible in the future (such techniques are not available for orbital surgery at the moment) if a 3D image of the orbit and the pathological lesion is constructed.
At present, several medical simulators have been reported [1-7]. For example, Hanna et al. described their simulation work on accurate keratotomy for astigmatism [2]. Sagar et al. built an eye surgical simulator within a virtual reality environment [3]. In this paper we discuss the patient-specific modeling of the human orbit and how this model can be used to expand our virtual reality eye surgical simulation system [1].
2 Patient Specific Orbital Modeling
Current MRI scanners are able to scan patient data at a resolution of 0.5 mm in three dimensions. However, acquiring higher resolution data needs a longer scan time, and if the scanned objects move or deform, distortion is introduced. Because the eye is moving almost all the time, the scanning process should be finished as fast as possible. Therefore, MRI data for the eye usually come at a moderate resolution of, for example, 1.0 mm. This resolution is good enough to work out the geometric model of the eyeball, muscles, nerves, and orbital bones. However, it is insufficient to define an image-based model of the internal structure of the eyeball. In our approach, we construct an image-
based model for the muscles, nerve, fat, and eyeball as a whole. For the structures contained in the eyeball, we employ a parametric model. This allows us to build a patient-specific model suitable for simulation and analysis. Figure 1 illustrates our approach.
Fig. 1. Orbital modeling for ophthalmologic simulation
2.1 Knowledge-Based Segmentation
Our approach employs a database of orbital models to extract the orbital tissues from MR images. Currently, we have three MRI volumetric scans of human orbits, each including both the left and right eye. The first set is of an infant with a diseased right eye; the remaining two sets are of adult volunteers with no apparent pathologies in either eye. The voxel sizes are about 1.0×1.0×1.0 mm for the adults and 0.625×0.625×0.625 mm for the infant. Since the MRI data set does not contain sufficient information on the internal structures within the eyeball, we have to treat the eyeball as a single object in segmentation. The virtual orbital model includes the eyeball, the muscles, and the outer boundary of the orbital fat, which is similar in shape to a cone. Critical feature points in the patient data, such as the anterior and posterior ends of the eyeball and the tip of the geometrical cone, are used to establish a rough correspondence between the patient data and the virtual model. After that, the contour of each tissue from the virtual model is displayed on top of each slice of the patient data in the axial, sagittal, and coronal directions, respectively. The user then identifies and confirms the exact position and shape of these tissues. Because a predefined model is used in segmentation and modeling, our method is knowledge-based. To implement knowledge-based segmentation, we developed a suitable software tool. Using this software, registration and segmentation can be performed interactively, and the user can adjust the shapes of the virtual model to fit the patient data as closely as possible.
Adjustment of the tissue shape is implemented through interactive region growing. If a region is identified as a particular structure and its boundary differs from the one defined by the virtual model, the user can adjust the boundary of the virtual model manually. This can be done by changing the position of a control point and/or generating a new one through interactive region growing. Figure 2(a) and (b) illustrate the contour of the optic nerve before and after adjustment through region growing.
Fig. 2. (a) Initial position of the optic nerve from the model database; (b) after editing
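The interactive editing described above rests on a standard region-growing operation. The following sketch shows the basic voxel-level mechanism under simple assumptions (a fixed intensity interval and 6-connectivity); the actual tool additionally couples the growth to the virtual model and its control points, which is omitted here.

```python
import numpy as np
from collections import deque

def region_grow_3d(volume, seed, low, high):
    """Collect 6-connected voxels whose intensity lies in [low, high],
    starting from a user-picked seed voxel (a sketch, not the paper's tool)."""
    mask = np.zeros(volume.shape, dtype=bool)
    queue = deque([seed])
    mask[seed] = True
    offsets = [(1,0,0), (-1,0,0), (0,1,0), (0,-1,0), (0,0,1), (0,0,-1)]
    while queue:
        z, y, x = queue.popleft()
        for dz, dy, dx in offsets:
            n = (z + dz, y + dy, x + dx)
            if all(0 <= n[k] < volume.shape[k] for k in range(3)) and not mask[n]:
                if low <= volume[n] <= high:
                    mask[n] = True
                    queue.append(n)
    return mask
```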
2.2 Modeling
The main structures from the segmentation process consist of the eyeball (EB), optic nerve (ON), superior rectus (SR), inferior rectus (IR), medial rectus (MR), and lateral rectus (LR). In addition to these structures, the outer boundary of the soft orbital tissues is also defined; we have considered these tissues as orbital fat (OF). There might also be a tumor in addition to the orbital fat. All the structures are segmented separately and regarded as independent objects, which are then combined to obtain a single human orbital model. At this stage, the eyeball is modeled as a single, homogeneous 3D structure, because the resolution of the patient-specific MRI data is not high enough to segment the individual eyeball tissues for modeling and analysis. In our previous work, we modeled the eyeball parametrically [1]: individual parts of the eyeball were parametrically represented, and variation of the geometry was possible. Figure 3 illustrates the polygonal eyeball representation. By adjusting the parameters, the parametric eyeball is fitted to the segmented contours of the eyeball from the patient, and a full geometrical model of the eyeball can be constructed. The parametric eyeball, together with the other component models, constitutes our virtual orbital model. Figure 4 shows an example of the component models. From these intermediate geometrical models, quantitative information such as the dimensions and volumes of the various components can be determined. If further biomechanical analysis is required, material properties are assigned to each component model representing the respective anatomical structure. The resultant biomechanical model is used to update the orbital model database.
Fig. 3. Parametric representation of eyeball model
Fig. 4. Component models of various anatomic structures
3 Finite Element Analysis

3.1 Biomechanical Properties
Studies have shown that 90% of the corneal thickness is formed by the stroma. We assume that this layer controls the majority of the biomechanical behavior of the cornea; it consists mainly of water (78%) and layered protein fibres (16%). For such materials, the stress-strain relationship is nonlinear, and we adopt the formula introduced by Uchio et al. [8]. Compared with the other soft tissues in the orbit, the orbital bone can be regarded as rigid. In our model, the functions of the orbital bones are implemented through the application of appropriate boundary conditions. As listed in Table 1, the physical properties of the aqueous, vitreous, and fatty tissues are taken from Power et al. [9] and Todd et al. [10].

Table 1. Material properties of different orbit tissues

Structure      Young's modulus E (MPa)   Poisson ratio ν   Density ρ (kg/m³)
Cornea         Nonlinear                 NA                1400
Sclera         Nonlinear                 NA                1400
Lens           Rigid                     NA                315
Ciliary body   20                        0.40              1600
Aqueous        0.037                     0.49              999
Vitreous       0.042                     0.49              999
Muscles        20                        0.40              1600
Fatty tissue   0.047                     0.49              999
Orbital bone   Rigid                     NA                NA
3.2 Governing Equation of Deformation of Biostructures
In our previous work [1], we considered the small incremental deformation involved in the interaction of a point surgical device with the virtual eyeball model. That model could calculate the deformation of the tissue and the force feedback from the scalpel before cutting happens, and linear analysis was performed. In the current model, we use non-linear analysis to determine the deformation over the outer surface of the eyeball subjected to an incremental contact force from a growing tumor in the orbit. Because the stress-strain relationships of some of the materials in the human tissues are nonlinear, which means that the stiffness is a function of the deformation, the problem can be expressed as

K(d) d = Q    (1)

where

K = \sum_{i=1}^{N} \int_{V_i^0} B^T D(d) B \, dv_i    (2)

d = [d_{1x} d_{1y} d_{1z}, d_{2x} d_{2y} d_{2z}, \ldots, d_{Nx} d_{Ny} d_{Nz}]^T    (3)

Q = [q_{1x} q_{1y} q_{1z}, q_{2x} q_{2y} q_{2z}, \ldots, q_{Nx} q_{Ny} q_{Nz}]^T    (4)

In the above equations, d is the displacement vector, K is the global stiffness matrix, and Q is the force vector applied to the orbit structure. In order to solve these equations, K and d have to be built and solved step by step repeatedly. In our system, we adopted a load-increment algorithm, in which the load is applied in several steps. If we define

\Delta q_m = Q_{m+1} - Q_m    (5)

the iterative displacement change can be obtained using

d_{m+1} - d_m = (K_T)_m^{-1} \Delta q_m    (6)
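A minimal sketch of the load-increment scheme of Eqs. (5)-(6) is given below. The FEM assembly routine `assemble_tangent_K` is an assumed interface (the paper does not specify one); the loop applies the load in equal steps and updates the displacement with the tangent stiffness at the current deformation.

```python
import numpy as np

def incremental_load_solve(assemble_tangent_K, Q_total, n_dof, n_steps=10):
    """Sketch of the load-increment algorithm; assemble_tangent_K is an
    assumed callable d -> K_T(d) provided by the FEM assembly."""
    d = np.zeros(n_dof)
    dq = Q_total / n_steps                  # equal increments, Eq. (5)
    for _ in range(n_steps):
        K_T = assemble_tangent_K(d)         # stiffness depends on deformation, Eq. (2)
        d = d + np.linalg.solve(K_T, dq)    # d_{m+1} = d_m + K_T^{-1} dq, Eq. (6)
    return d
```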
4 Results and Discussions
In previous work, Yu et al. [11] built a 3D orbital model from the Visible Human Female data set and Scheimpflug images. They segmented the orbital tissue manually for the purpose of visualization and atlas construction, but the model is not complete enough for the purpose of surgical simulation. Existing ophthalmic simulators, such as [1-7], define their ophthalmic models geometrically and/or physically. These simulators address retinal coagulation, cataract surgery, and vitrectomy, and are mainly concentrated on the eyeball. There are, however, numerous procedures in the orbital region [12]. To the best of our knowledge, there is no orbital simulator using patient-specific data; our work contributes towards building a suitable simulator for this purpose. The biomechanical model we reconstructed from the patient MRI data set includes the geometrical structure and the definition of the physical properties of the tissues. The 3D geometrical model serves the simulation system in the aspect of realistic visualization, which is one of the most critical components of the medical
simulator. Quantitative information such as thickness and volume is also important to the ophthalmologist in diagnosis and treatment planning. In addition, the physics-based model provides the displacement or deformation of the orbit tissues under interventions. The force feedback calculated from the model can be applied to haptic devices to achieve realistic force feedback.

Table 2. Examples of orbital models and their volumes in mm³

Tissue   Model 1   Model 2   Model 3
EB        11623     10291      2980
ON          591       748       142
IR          790       725       247
SR          663       963       257
MR         1032       805       230
LR          909       708       182
OF         8673     10493      3485
Table 2 is a snapshot of the orbital models and the volumes of the different tissues in the three orbital models. In this table, the volume of the orbital fat includes the soft tissues not otherwise segmented and defined, such as the eyelid. We have extended our analysis to the study of the deformation and stress distribution on a pathological eyeball using our orbital model. Compared with other work based on surface models, our 3D nonlinear model can yield better results because more accurate physical models are employed. Figure 5 shows the deformation of a patient's eyeball with a growing tumor in the orbit. Such results can help clinicians evaluate the pressure inside the orbit and take proper measures for diagnosis and treatment. Essentially, the model represents the orbit structures in a static state when the eyes are closed and relaxed; this is the zero-stress state from which the deformation and force feedback are calculated. However, the position of the eyeball and the shapes of the muscles are changing all the time, so a more flexible model is under development. At the current stage, our FEM analysis does not include cutting and remeshing components, although such functionalities are essential to ophthalmic surgery simulation. In addition, the current model is not yet complete: many important structures, such as blood vessels, nerves, and the different orbital bones, are not included. From the simulation point of view, quantitative validation has to be carried out on the speed of the system, the accuracy of the model, etc.
Fig. 5. Deformation of the eyeball under tumor pressure
5 Conclusions
We developed a method for the patient-specific modeling of the orbit. The method makes use of an available anatomic model to extract and build the patient-specific model. Confirmation of the patient structure is done using interactive image processing, mainly 3D region growing. Using published material properties, an analyzable FEM model is generated, and a customized FEM algorithm was developed to analyze the behavior of the patient-specific model. We demonstrated that our model can be used to study deformation in a tumorous human orbit. The orbital model can provide the ophthalmologist with: (1) quantitative information that is important in diagnosis; (2) a better way to identify abnormalities; (3) more accurate results in analyzing interventions and a more realistic simulation; (4) cost-efficient training; and (5) better pretreatment planning through practice with a virtual patient orbital model. As future work, we are developing techniques to provide a truly useful simulator. In addition to functions such as cutting and remeshing, we are also working on speeding up the analysis processes. Comparative evaluation between simulation results and real cases is also under way in order to evaluate the efficacy of the simulation system.
Acknowledgements
Support of this research by the Agency for Science, Technology and Research, Singapore, is gratefully acknowledged.
References
1. Cai Y, Chui CK, Wang Y, Wang Z, Anderson JH: Parametric eyeball model for interactive simulation of ophthalmologic surgery. MICCAI 2001, pp. 465-472.
2. Hanna KD, Jouve FE, Waring GO, Ciarlet PG: Computer simulation of arcuate keratotomy for astigmatism. Refractive & Corneal Surgery, Vol. 8, 1989, pp. 152-163.
3. Schill MA, Wagner C, Hennen M, Bender HJ, Maenner R: EyeSi - a simulator for intraocular surgery. MICCAI'99, pp. 1166-1174.
4. Sagar MA, Bullivant D, Mallinson GD, Hunter PJ, Hunter IW: A virtual environment and model of the eye for surgical simulation. Proceedings of SIGGRAPH Annual Computer Graphics Conference 1994, Florida, pp. 205-211.
5. Parshall RF: Computer-aided geometric modeling of the human eye and orbit. J. Biomedical Computing 1991, 18(2), pp. 32-39.
6. Li Z, Chui CK, Anderson JH, Chen X, Ma X, Hua W, Peng Q, Cai Y, Wang Y, Nowinski WL: Computer environment for interventional neuroradiology procedures. Simulation and Gaming, 2001, 32(3), pp. 405-420.
7. Peifer J: Virtual environment for eye surgery simulation. Medicine Meets Virtual Reality II: Interactive Technology and Healthcare, San Diego, 1994, pp. 166-173.
8. Uchio E, Ohno S, Kudoh J, Aoki K, Kisielewicz LT: Simulation model of an eyeball based on finite element analysis on a supercomputer. British Journal of Ophthalmology, Vol. 83, 1999, pp. 1106-1111.
9. Power ED, Stitzel JD, West RL, Herring IP, Duma SM: A nonlinear finite element model of the human eye for large deformation loading. Proceedings of the 25th Annual Meeting of Biomechanics, San Diego, August 2001, pp. 44-45.
10. Todd BA, Thacker JG: Three-dimensional computer model of the human buttocks in vivo. J. Rehabil. Res. Dev. 1994, 31(2), pp. 111-119.
11. Yu CP, Jagannathan L, Srinivasan R, Nowinski WL: Development of an eye model from multimodal data. Proceedings SPIE Medical Imaging 1998: Image Display, Vol. 3335, San Diego, California, Feb 1998, pp. 93-99.
12. Rootman J, Stewart B, Goldberg RA: Orbital Surgery: A Conceptual Approach. Lippincott-Raven, Philadelphia, 1995.
Accurate and High Quality Triangle Models from 3D Grey Scale Images
P.W. de Bruin1, P.M. van Meeteren2, F.M. Vos2, A.M. Vossepoel2, and F.H. Post1
1 Computer Graphics, Faculty of Information Technology and Systems, Delft University of Technology {P.W.deBruin,F.H.Post}@its.tudelft.nl
2 Pattern Recognition Group, Department of Applied Physics, Delft University of Technology {magchiel,frans,albert}@ph.tn.tudelft.nl
Abstract. Visualization of medical data requires the extraction of surfaces that represent the boundaries of objects of interest. This paper describes a method that combines finding these boundaries accurately and ensuring that this surface consists of high quality triangles. The latter is important for subsequent visualization and simulation. We show that the surfaces created using this method are both accurate and have good quality triangles.
1 Introduction
Analysis of 3D medical images is aided by creating 3D surface representations that describe the boundaries of anatomical structures. A triangle mesh facilitates viewing and manipulating the data easily. Apart from that, these meshes can be used to calculate metrics of size and shape. Applications include computer-aided diagnosis, surgical planning, and simulation. Extracting a mesh, or segmenting data, requires detecting and grouping parts of the data that share certain characteristics. A straightforward method to segment data is isosurfacing (e.g., Marching Cubes []), where points that share the same greyscale value (the isovalue) are assumed to describe the surface. However, this assertion does not always hold, due to noise and bias (e.g., in Magnetic Resonance Imaging). Another approach to segmenting is to identify objects by an edge: an abrupt change in greyscale intensity. By doing so, the exact greyscale value of the points on the boundary is not relevant. Edge detection, however, is hampered by the sensitivity of the derivative operator to high-frequency noise; therefore, a Gaussian derivative is often used to suppress noise. Unfortunately, the reduction of noise coincides with dislocation of edges. A good trade-off between detecting and locating edges and suppressing noise is achieved by using a scale-space approach [, ]. The quality of an extracted mesh is important for subsequent processing steps, such as visualisation and finite element modeling methods. Here, good quality triangles and a smooth mesh are preferred and sometimes even required (see Section 2.3). Surface extraction methods such as deformable models [, ] and snakes [, ] do not explicitly attempt to produce meshes of good triangle quality. Apart from that, deformable models and snakes require several parameters that have to be tuned for each case; therefore, it is difficult to maintain reproducibility. The SurfaceNets method does take triangle quality into account, but is limited to isosurfaces [, ]. The conventional approach is to proceed after extraction by applying mesh improvement techniques. However, this
Fig. 1. Constraining regions based on (from left to right) the Voronoi, edge midpoint, and centroid criteria. The triangulation does not meet the Delaunay criterion, hence the odd shape. The edge midpoint regions do not cover the entire surface. The centroid method is insensitive to the Delaunay criterion and fills the entire mesh area.
is an undesirable situation, because such techniques are purely geometrical and topological methods: the connection between the mesh and the original data is lost. Summarizing, a good mesh extraction method should meet requirements concerning accuracy (edge detection), reproducibility (number of parameters), and smoothness and triangle quality of the generated mesh. In this paper we present a new method that produces meshes fulfilling these criteria without losing the connection with the greyscale data. Our main contribution is the combination of two types of methods: multi-scale edge detection and mesh improvement. We discuss several novel adaptations of the techniques used, and we show how the number of user-specified parameters is kept to a minimum. In the next section, each of the techniques indicated above is explained first, followed by our method of applying these techniques. Section 3 shows the results of our method when applied to synthetic and to medical data. Conclusions and future research can be found in Section 4.
2 Methods
To achieve the goals and criteria outlined in Section 1, a two-fold approach is used in which high-quality scale-space edge detection and mesh improvement methods are combined. Our approach consists of alternating between edge detection and mesh improvement until the desired result is obtained. Each of the separate methods is explained below, followed by aspects of their combination.

2.1 Scale-space edge detection
Edge detection based on image gradients is the identification of intensity changes between a region of interest and its surroundings. The modulus of the first derivative of an image has a local maximum at locations corresponding to edges. However, high-frequency noise will yield local maxima as well; therefore, noise suppression is necessary. A common solution to this problem is to apply a low-pass filter to the image. Although this suppresses noise, it dislocates and suppresses edges as well. The Gaussian kernel optimises the criteria of good signal-to-noise ratio and good localisation simultaneously []. The parameter σ of the Gaussian kernel determines the level of smoothing that is applied to the image. Scale-space theory embeds an image in a one-parameter family of derived images: the scale space [, ]. A stack of images is created by convolving the original image with increasingly smoothing Gaussian derivative kernels.
The image corresponding to the largest value of σ is severely blurred, but noise is suppressed, and in this image the detection of edges is a simpler task. Once an edge is detected, the found location is taken as the initial position in the image corresponding to the next lower value of σ. This process continues until the lowest σ is reached, at which point the edge is localised completely. In order to limit the search space, a simple representation of the object to extract is required. This coarse segmentation does not need to be accurate, but an estimate of the maximum distance from the mesh to the real object is needed; the search for edges is limited to the region indicated by the mesh and this maximum distance. Three important points have to be addressed: 1) selecting the upper and lower bound of σ, 2) sampling the scale space, and 3) locating the edge (selection of the maximum). The selection of the upper and lower bounds of σ is important because not every scale is relevant. The upper bound effectively determines the region of capture of the edge detector and is set equal to the maximum deviation of the initial mesh from the real object. The lower bound of σ is set equal to one voxel length. Next, a discrete set of σ's between the upper and lower bound is calculated. A straightforward linear sampling is not appropriate, because this might lead to aliasing at fine scales and over-sampling at coarse levels of scale. We have chosen a logarithmic sampling method described by σ_i = 2^{i/n} [?]. The number n determines how many times σ is sampled per doubling of the scale parameter. By doing so, a proper and intuitive sampling is obtained, because a large σ corresponds to a large step size and a small σ corresponds to a small step size. At every vertex in the initial mesh the grey-scale gradient vector is calculated, and along this vector the search for the edge is performed. The search window is made scale-relative to avoid distraction by spurious edges: the total width of the search window is set equal to 2σ. At five points along this line the modulus of the gradient is calculated, with step size equal to 0.5σ. The point corresponding to the maximum value of the five points is the starting point in the next lower scale. Sampling the search window at 5 discrete positions introduces an uncertainty of half the step size. Because the lowest scale we encounter is σ = 1, the localisation will have an uncertainty equal to 0.25 voxel lengths. When going from one scale to the next lower one, the new search window is placed centered at the found maximum. If the new window exceeds the boundaries, it is translated along the gradient vector until it is completely located inside the region of capture. Without this correction the search could move outside the region of capture of the detector.

2.2 Mesh improvement techniques
Mesh improvement techniques operate either on the geometry or on the topology to improve the quality of a mesh [] (for measures of mesh quality see Section 2.3). Geometric methods reposition vertices, whereas topological methods operate on the connections between the vertices. For our work we have chosen one of each type: Laplacian smoothing (geometric) and edge swapping (topological). The Laplacian smoothing operator S_L (see e.g., []) moves each vertex v_i of a mesh to the average position of its N linked neighbour vertices v_j by

\bar{v}_i^L = S_L(v_j) = \frac{1}{N} \sum_{j=1}^{N} v_j, \qquad j \neq i.
Accurate and High Quality Triangle Models from 3D Grey Scale Images
351
The position of each vertex in the mesh is updated after all new positions are calculated. Advantages of Laplacian smoothing are that it is a computationally inexpensive operation, that it produces a mesh with a smooth surface, and that it does not require any parameters to tune. Disadvantages of this method are that it does not guarantee an increase of triangle quality (e.g., for vertices placed symmetrically around a vertex), that it shrinks the mesh, that it converges slowly, and that it can introduce geometric errors. A geometric error is an inversion of the triangle orientation, which is detected by a change in direction of the triangle normal (e.g., pointing outward from a closed object surface). To prevent geometric errors, a region of constrained movement for each vertex is created. These regions should be mutually exclusive and collectively exhaustive; the region of movement for each vertex is then as large as possible without the possibility of geometric errors. Figure 1 shows three constraining regions. The first is a Voronoi region. Here, the regions are constructed by connecting the regions created by a Delaunay triangulation (i.e., the circumcircle of each triangle is an empty circle). The figure shows that if the mesh does not meet the Delaunay criterion, the regions overlap. The second idea is to connect the midpoints of the edges, as is shown in Figure 1. These regions do not overlap, but do not cover the entire surface area either. We propose a new method where we create an allowable region of movement for each vertex by connecting the centroids of the triangles surrounding the vertex. This method covers the entire mesh surface and the regions are mutually exclusive. Our centroid smoothing method S_C prevents geometric errors and is expressed as a weighted sum of standard Laplacian smoothing S_L and the current vertex v_i:

\bar{v}_i^C = S_C(v_i, v_j) = \frac{1}{3} v_i + \frac{2}{3} S_L(v_j).

Edge swapping is a technique to improve the quality of the triangles in a mesh. Each pair of triangles sharing an edge is considered as a quad with the edge as a diagonal; according to a criterion, the edge is swapped, or not. Here, the edge swapping algorithm proposed by [] and improved by [] is used. The algorithm improves the regularity of a mesh. For each quad in the mesh a relaxation index R_i is calculated that depends on the connectivity number of each vertex. Let the degrees of the four vertices be d_1, d_2, d_3, and d_4; then R_i is defined by

R_i = (d_1 - 6)^2 + (d_2 - 6)^2 + (d_3 - 6)^2 + (d_4 - 6)^2.

An edge is swapped if the relaxation index after swapping is closer to zero. This method actively creates vertices with connectivity number 6, which allows all the incident angles to be close to 60 degrees.
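The two operations just described can be sketched as follows. The helper names and the array-based mesh representation are assumptions for illustration; the constraining centroid regions themselves are not enforced in this sketch.

```python
import numpy as np

def centroid_smooth(vertices, neighbours):
    """One pass of the centroid-weighted smoothing S_C = (1/3) v_i + (2/3) S_L.

    vertices: (n, 3) array; neighbours: list of neighbour-index lists."""
    new = vertices.copy()
    for i, nbrs in enumerate(neighbours):
        laplace = vertices[nbrs].mean(axis=0)        # S_L: average of linked neighbours
        new[i] = vertices[i] / 3.0 + 2.0 * laplace / 3.0
    return new                                        # synchronous update, as in the text

def relaxation_index(degrees):
    """R_i for a quad, given the degrees (d1..d4) of its four vertices."""
    return sum((d - 6) ** 2 for d in degrees)

# An edge is swapped when the relaxation index decreases:
# swap if relaxation_index(degrees_after) < relaxation_index(degrees_before).
```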
2.3 Quality measures
The edge detector places vertices on the edge with an a priori known inaccuracy of 0.25σ. The inaccuracy of a mesh is defined by the distance from the centroid of each triangle to the surface as defined by the edge detector, averaged over all triangles. There is no generally agreed definition of mesh quality; the quality measure depends on the subsequent use of the generated mesh. For specific cases, quality indicators can be derived based on the approximation function and a known target. More generally, the consensus appears to be that a good quality mesh consists of triangles with not too small and not too large angles. Apart from that, meshes with regular or smoothly varying elements are "visually pleasing" []. A simple approach is to define the equilateral triangle as the highest quality triangle.

Fig. 2. A comparison of three triangle quality criteria. Starting with an equilateral triangle, the top vertex is moved along the line indicated by the arrow until the triangle is degenerate. At each point along the line the quality is calculated and displayed in the graph.

Straightforward expressions of the quality of a triangle are the ratio of the shortest
edge to the longest edge and the ratio of the smallest to the largest angle. Both criteria define the equilateral triangle as the perfect triangle, with quality 1 (one). However, it is also required that an ideal mesh (with average quality 1) consists of equal-area triangles. Therefore, a third criterion is considered that weighs the area A of the triangle in the quality measure [] (the area of a unit equilateral triangle is \sqrt{3}/4). The criteria are:

q_{length} = \frac{\min(|e_1|, |e_2|, |e_3|)}{\max(|e_1|, |e_2|, |e_3|)}, \qquad
q_{angle} = \frac{\min(\alpha, \beta, \gamma)}{\max(\alpha, \beta, \gamma)}, \qquad
q_{area} = \frac{4\sqrt{3}\,A}{|e_1|^2 + |e_2|^2 + |e_3|^2}

where the edges are indicated by e_i, the angles are represented by α, β, and γ, and A denotes the area of the triangle. Figure 2 shows a graphical representation of the behaviour of the quality functions in a specific situation. Clearly, the q_length criterion is not well behaved, because for increasingly lower quality triangles the criterion outcome increases. The q_angle and q_area criteria are well behaved because they are bijective and monotonic mappings. We have chosen the q_area criterion for our experiments.
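A direct transcription of the chosen criterion, assuming 3D vertex coordinates as NumPy arrays:

```python
import numpy as np

def q_area(p0, p1, p2):
    """Triangle quality q_area = 4*sqrt(3)*A / (|e1|^2 + |e2|^2 + |e3|^2).

    Returns 1.0 for an equilateral triangle and approaches 0 as the
    triangle degenerates."""
    e1, e2, e3 = p1 - p0, p2 - p1, p0 - p2
    area = 0.5 * np.linalg.norm(np.cross(e1, -e3))   # area from the cross product
    denom = np.dot(e1, e1) + np.dot(e2, e2) + np.dot(e3, e3)
    return 4.0 * np.sqrt(3.0) * area / denom
```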
Protocol
We have described three processes that are necessary to extract a mesh with accuracy and quality. Here we describe the protocol consisting of successive applications of each of the methods. The consecutive steps in the method are: ) Edge detection (ED), ) Edge swap (ES), ) (Multiple) improvement steps (S), ) repeat steps ,, and until desired quality is reached, and ) a final ED and ES. The first step should be an edge detection step to ensure that all the vertices are positioned on the boundary. Next, we proceed by regularising the mesh using the edge swap. We have found in our experiments that smoothing steps after an edge swap converge faster than starting without an edge swap. Subsequently, one or more modified Laplacian smoothing steps are applied, followed by and edge detection step and edge swap. The region of capture of the edge detector is known and, therefore, several smoothing steps can be applied to achieve faster mesh improvement. The distance that each vertex travels from its initial position is calculated and if this distance exceeds the region of capture of the edge detector the next cycle is started. When the quality of the triangles has reached a desired value the sequence is ended with an edge detection step and an edge swap to ensure that each vertex is on the boundary.
Accurate and High Quality Triangle Models from 3D Grey Scale Images
353
Table . The results of the protocol on the sphere dataset. The original mesh consists of vertices and triangles. An edge-detection step is indicated by ED, modified smoothing by S, and edge swapping by ES. Of the multiple smoothing steps only the first and last are shown in each case. Operation Average Min. Angle Average Quality Average Inaccuracy Mean Radius Variance Org . . . . ED . . . . . ES . . S . . . . S smoothing steps performed S . . . . ED . . . . . ES . . S . . . . S smoothing steps performed S . . . . ED . . . . . ES . .
Results The method was applied to D images of a sphere and CT images of a human wrist. A grey scale volume (100 × 100 × 100) containing a sphere of diameter 25 placed at the centre is used to test our method. Table shows the results from a run of the complete protocol. The table shows that the difference between each radius of the mesh after an edge-detection step (ED) and the real radius never exceeds 0.25. Therefore, the vertices are placed within the precision of the edge detector. The convergence of modified smoothing steps (S) is slow, but several steps can be applied before ED is necessary. The variance after a modified smoothing step is lower than after ED. This can be attributed to the low-pass nature of smoothing and the high-pass nature of edge detection. Note that the variance after ED is a tenth of the initial variance. Figure shows a histogram of the triangle quality of the mesh before and after processing by our method. Clearly, the method removes low-quality triangles and increases the amount of high-quality triangles. Similar results are obtained where the sphere was corrupted with additive Gaussian noise (σ = 2). Next, we present some of the results obtained from a CT dataset of a human wrist. Four different metacarpal bones were processed: the os lunatum, the os scaphoid, the os triquetrum, and the os trapezium (see Figure ). Table shows the results of the protocol. From the table it follows that the quality obviously increases and the inaccuracy decreases for all bones. The os lunatum and the os scaphoid allow more modified smoothing steps until an edge detection is required. Figure shows histograms of triangle quality of the os scaphoid and the os lunatum before and after processing. The histograms show a distinct improvement of the mesh quality.
Conclusions
We have created a high-quality scale-space edge detector that requires an initial approximation of the shape to extract and an upper bound estimate of the deviation from the actual object. A method to improve the mesh without generating geometric errors was shown. Both methods were succesfully coupled by the sequential application protocol.
354
P.W. de Bruin et al. Histogram of triangle quality
Histogram of triangle quality 800
300
600
200
400
100
200
0
0 0
0.2
0.4 0.6 Triangle quality
0.8
1
0
0.2
0.4 0.6 Triangle quality
0.8
1
Fig. 3. Histograms of triangle quality (q_area criterion) before (solid grey) and after processing of a sphere with additive noise (left) and the os lunatum (right).
Fig. 4. A rendering of the wrist. The highlighted bones were processed by our method. From left to right the bones are: the os trapezium, the os scaphoid, the os lunatum, and the os triquetrum.
The meshes generated by our method are both accurate and consist of triangles of high quality. Apart from the initial mesh and the maximum deviation of this mesh, the only input required from a user is the desired quality of the mesh. Further research will focus on adjusting the resolution of the mesh. Validation using phantom models in CT and MRI modalities is planned.
References
1. Babaud, J., Witkin, A., Baudin, M., Duda, R.: Uniqueness of the Gaussian kernel for scale-space filtering. IEEE Transactions on Pattern Analysis and Machine Intelligence.
2. Bank, R.E., Smith, R.K.: Mesh smoothing using a posteriori error estimates. SIAM Journal on Numerical Analysis.
3. Berzins, M.: Mesh quality: a function of geometry, error estimates or both? In: International Meshing Roundtable, Sandia National Lab.
4. B., E.: Reliable Delaunay-based mesh generation and mesh improvement. Communications in Numerical Methods in Engineering.
5. Canny, J.: A computational approach to edge detection. IEEE Transactions on Pattern Analysis and Machine Intelligence.
6. de Bruin, P., Vos, F., Post, F., Frisken-Gibson, S., Vossepoel, A.: Improving triangle mesh quality with SurfaceNets. In: Medical Image Computing and Computer-Assisted Intervention - MICCAI 2000, S. Delp, A. DiGioia, and B. Jaramaz, Eds. Third International Conference, Pittsburgh, PA, USA.
Table 2. Results using the protocol on four carpal bones (os lunatum, os scaphoid, os triquetrum, os trapezium). Edge detection is indicated by ED, modified smoothing by S, and edge swapping by ES. For each ED step the inaccuracy is calculated. Note that the number of S steps is different for some bones (indicated by ↓). For each bone the number of vertices and triangles is indicated, and the table lists Min. Angle, Quality, and Inaccuracy over the operation sequence Org, ED, ES, S, ..., ED, ES.
7. Florack, L., ter Haar Romeny, B., Koenderink, J., Viergever, M.: Scale and the differential structure of images. Image and Vision Computing.
8. Freitag, L.A.: On combining Laplacian and optimization-based mesh smoothing techniques. In: AMD Trends in Unstructured Mesh Generation, ASME.
9. Frey, W.H., Field, D.A.: Mesh relaxation: a new technique for improving triangles. International Journal for Numerical Methods in Engineering.
10. Gibson, S.: Constrained elastic surface nets: generating smooth surfaces from binary sampled data. In: Proceedings Medical Image Computation and Computer Assisted Interventions, MICCAI '99. http://www.merl.com/reports/TR99-24/.
11. Lobregt, S., Viergever, M.A.: A discrete dynamic contour model. IEEE Transactions on Medical Imaging.
12. Lorensen, W., Cline, H.: Marching cubes: a high resolution 3D surface construction algorithm. In: Proc. ACM SIGGRAPH '87.
13. McInerney, T., Terzopoulos, D.: Deformable models in medical image analysis: a survey. Medical Image Analysis.
14. P., J., C., D., H., J.: A snake for model-based segmentation of biomedical images. Pattern Recognition Letters.
15. S., J.G.: Wrists in Space: deformable models for segmentation and matching techniques for registration of 3D MR and CT images of the wrist. PhD thesis, Universiteit van Amsterdam.
16. Witkin, A.: Scale space filtering. In: Proceedings, International Joint Conference on Artificial Intelligence.
Intraoperative Fast 3D Shape Recovery of Abdominal Organs in Laparoscopy
Mitsuhiro Hayashibe1, Naoki Suzuki1, Asaki Hattori1, and Yoshihiko Nakamura2
1 Institute for High Dimensional Medical Imaging, Jikei Univ. School of Medicine, 4-11-1 Izumihoncho, Komae-shi, Tokyo 201-8601, Japan
2 Department of Mechano-Informatics, University of Tokyo, 7-3-1 Hongo, Bunkyo-ku, Tokyo 113-8656, Japan
Abstract. Precise measurement of geometry should accompany robotic equipment in operating rooms if its advantages are to be further pursued. For deforming organs, including the liver, intraoperative geometric measurements play an essential role in computer surgery, in addition to preoperative geometric information from CT/MRI. The laser-scan endoscope system acquires and visualizes the shape of the area of interest in a flash of time in laparoscopy. Results of in-vivo experiments on the liver of a pig verify the effectiveness of the proposed system. This system offers surgeons high-speed 3D geometric visualization to provide an intuitive orientation during laparoscopic surgery.
1 Introduction
Endoscopic surgery forces surgeons to operate under mental tension, with mechanical and visual constraints. The visual difficulties are due to the narrow field of view of endoscopes and the lack of depth perception. This means surgeons are required to extract a sense of orientation in the abdominal cavity from the anatomical structure and the direction of the laparoscope, which leads to awkward operating technique. As a result, this may cause misperceptions due to the difference in environment from open surgery. Laparoscopic surgery would be technologically improved if surgeons were provided with a 3D representation of the internal geometry in an intuitive manner. Integrated with a real-time interface, such technology would make a significant difference in operational environments. We believe this would be effective both in surgery directly operated by surgeons and in robotic surgery. The technical advance of minimally invasive surgery has introduced more and more such equipped environments [1]. As there are many mechanical devices, such as the laparoscope with its holding arm and forceps, it is essential to manage the positions of these devices and organs from the viewpoint of safety. The 3D shape of the internal geometry can play an important role in enabling a surgeon to approach his target from the correct direction. In this paper, we show the function of an intraoperative fast 3D shape recovery system. The laser-scan endoscope acquires and visualizes the shape of the area of interest in a flash of time in laparoscopy. This system offers surgeons high-speed 3D geometric visualization to provide an intuitive orientation during laparoscopic surgery.
2 Intraoperative 3D Geometric Visualization
Here, we depict models measured by this system (first prototype); please refer to our previous paper [6] for the system configuration. Figure 1 shows the scanned surface of a 50-yen coin obtained by the Laser-Pointing Endoscope. The hole in the 50-yen coin could be measured, and the error is within 1%. The scanned 3D data is automatically re-described in VRML 2.0 by our program, and the surface is composed of numerous triangle patches. The prime purpose of this research is to obtain 3D geometry during laparoscopic surgery. We made an in-vivo experiment under laparoscopy to obtain the intraoperative 3D geometry, as shown in Fig. 2. We scanned an area of 8 cm square and obtained 400 points of data. Sampling took 1.2 ms per point, including the calculation time for the 3D position, and the total measuring time was 0.5 seconds. The intraoperative 3D geometry of the liver surface could thus be quickly obtained, as shown in Fig. 3.
Fig. 1. Scanned surface of 50yen coin
Fig. 2. The in-vivo experiment using pig’s liver in laparoscopy
Fig. 3. Scanned VRML liver model in laparoscopy
3 Extension for Real-Time Deformation Imaging
The precise anatomical data of a patient is accessible using CT/MRI. The data is usually evaluated in pre-operative conferences to discuss and plan the surgical procedures. One current issue is how to utilize such data in the operating room. The CT or MRI data shows geometrical inconsistency with the deformation due to changes in the patient's body posture; regarding deformation, a surgeon has to rely on his imagination to map the pre-surgical data onto the deformed anatomy. If we scan the organ in-situ using the laser-scan endoscope, we can compute and estimate its internal deformation using the scanned data. The first system we developed mainly focused on surface geometric management; as a useful application, its effectiveness was shown in the field of robotic safety support [7]. However, visualization of the organ surface was not on-line. Now, we have developed a fast on-line 3D visualization system to show the real-time deformation of organs under laparoscopy. In laparoscopy, it is very difficult to obtain spatial perception. To supply a convenient interpretation in laparoscopy, we think it is useful to superimpose the deformed preoperative 3D model, which is reconstructed from CT/MRI, onto the actual organ in the laparoscopic image [8]. In order to deform the internal structure model of the organ, the intraoperative 3D geometric information of the deformed organ is required in real-time. Fig. 4 shows the system configuration of the second prototype. Controlled by the mirror of a galvano scanner, a laser line beam is projected inside the patient's body through an endoscopic optic device. We used a closed-loop galvano scanner which responds up to 1 kHz. The laser line pattern is captured by a 262 fps high-speed camera (528×512 pixels, 256 gray levels) and a high-speed image processing board. The laser and camera coordinate systems are identified by an OPTOTRAK attached to the devices. The 3D coordinates of the reference lines are reconstructed based on triangulation between the high-speed camera image and the mirror angles of the galvano scanner. The image captured through the endoscope is distorted by the relay lens inside; it is corrected and used to calculate the 3D position. The computer graphics are automatically produced from the reconstruction of the scanned intraoperative 3D geometry. The feature of this system is on-line visualization of the organ shape. We executed this task by parallel processing with 2 PCs. The task can
be divided into measurement and visualization. Scanned point data are stacked into the shared memory board; the other PC retrieves these points and builds polygons for the surface data. The scene is updated in OpenGL to visualize the deformation of the object. The present frame rate of entire-surface visualization is 5-6 frames per second (15 lines sampled in every frame), although we estimate that the processing time can be further shortened. The concept of this system (second prototype) is to provide the surgeon with a completely intuitive interface. Therefore, a surgical navigation program is also being developed (Fig. 6). The surgeon is provided with a laparoscopic view, or another favored point of view, to see the intraoperative liver deformation.
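The division of labor between the two PCs follows an ordinary producer-consumer pattern. The sketch below illustrates it with threads and a queue standing in for the shared memory board; `scanner` and `renderer` are assumed device wrappers, not APIs from the paper.

```python
import queue
import threading

scan_buffer = queue.Queue()   # stands in for the shared memory board between the 2 PCs

def measurement_task(scanner):
    """Producer (measurement PC): stack scanned line points into the buffer."""
    while scanner.running:                   # `scanner` is an assumed device wrapper
        scan_buffer.put(scanner.acquire_line())

def visualization_task(renderer):
    """Consumer (visualization PC): pull points, rebuild polygons, redraw."""
    while renderer.running:                  # `renderer` is an assumed OpenGL wrapper
        points = scan_buffer.get()
        renderer.update_surface(points)      # triangulate and update the scene

# wiring (scanner/renderer instances are application-specific):
# threading.Thread(target=measurement_task, args=(scanner,), daemon=True).start()
# threading.Thread(target=visualization_task, args=(renderer,), daemon=True).start()
```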
Fig. 4. System configuration
4 Auto-polygon Generation from Range Data
The laser-scan endoscope first outputs a range data set of the scanned organ. For an effective visualization of the organ shape using this range data, the surface should be composed of numerous triangle patches. The combinations of points which form the triangles can also be used for geometric computation to check collisions between the target organ and tracked medical tools such as forceps. We developed a program which automatically generates triangular patches from the range data every frame. Seen from the laser scanner side, the range data is a set of points mostly located on a lattice, as shown in Fig. 5 (left). When three closely neighboring points exist in this ordered set of points, the surface is constituted as a collection of triangles of these three points. If the number of sampled points in a line is nDot
and the number of scanned lines is nLine, the vertex indices of the upper triangle are

(i · nDot + j, (i+1) · nDot + j, i · nDot + j + 1)    (1)

and those of the lower triangle are

((i+1) · nDot + j, (i+1) · nDot + j + 1, i · nDot + j + 1)    (2)
For columns i = 0, 1, 2, ..., nLine-1 and rows j = 0, 1, 2, ..., nDot-1, the vertex combinations of the triangle patches are determined repeatedly. Fig. 5 (right) depicts the surface model produced by this algorithm. The condition of unevenness is easily perceived from the shading and the sequentially scanned images.
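Eqs. (1) and (2) translate directly into an index-generation routine; the loop bounds are chosen so that all indices stay inside the nLine-by-nDot lattice:

```python
def lattice_triangles(nLine, nDot):
    """Triangle index tuples for an nLine-by-nDot lattice of range points,
    following Eqs. (1) and (2). The point sampled at scan line i, sample j
    has vertex index i*nDot + j."""
    triangles = []
    for i in range(nLine - 1):
        for j in range(nDot - 1):
            triangles.append((i*nDot + j, (i+1)*nDot + j, i*nDot + j + 1))          # Eq. (1)
            triangles.append(((i+1)*nDot + j, (i+1)*nDot + j + 1, i*nDot + j + 1))  # Eq. (2)
    return triangles
```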
Fig. 5. Auto Polygon Generation
5 Results
5.1 Deformation Imaging
Figure 6 shows the result of deformation imaging in the laboratory. We scanned the deformation of a pinned cloth pulled by a string. The reconstructed polyhedron in the virtual space deformed in real-time in accordance with how the cloth was pulled by the string. The scanning area was 9 cm by 9 cm.
Fig. 6. Real-time deformation imaging. Left: video image; middle: scanned surface data (view 1); right: scanned surface data (view 2).
The frame rate of surface visualization depends on the scanning resolution. In rough scan mode, the number of sampled lines is decreased to accelerate the refresh cycle; if a fine surface measurement is required, the scanning settings can be changed from the GUI of our program. In rough scan mode, we usually set the number of scan lines to 15, in which case the frame rate is 6-7 fps. In other words, 100 lines per second can be captured with this system. Even in rough mode, this is enough to perceive the entire shape of the object. Next, we tested the measurement and visualization of the surface of an isolated pig's liver, as shown in Fig. 7. The laser scope and the camera scope were positioned at a distance of 15 cm from the organ. Even with the shiny and wet surface condition, the image processing of the high-speed camera succeeded, and the deformation of the organ surface could be observed in the rendered computer graphics. An incision was made on the liver surface in Fig. 8; the length of the incision is 44 mm and the depth 7 mm. Figure 9 depicts the image of the reconstructed surface of the liver in one frame. The scanning area is 8 cm by 6.5 cm; the yellow lines are normal vectors. In Fig. 10, one more incision was made crossing the former one. This incision, with a length of 42 mm and a depth of 5 mm, was added on the surface. Even on this uneven surface, the visualization of the liver deformation was successfully achieved, as shown in Fig. 11.
Fig. 7. Real-time deformation imaging experiment in vitro
Fig. 8. Incision on the liver surface (one incision). Fig. 9. Visualization of the shape of the incised organ (one incision).
Fig. 10. Incision on the liver surface (crossed incision). Fig. 11. Visualization of the shape of the incised organ (crossed incision).
5.2 Registration of Preoperative 3D Organ Model
The precise anatomical data of a patient is accessible using CT or MRI. The data is usually evaluated in pre-operative conferences to discuss and plan surgical procedures, and one of the current issues is how to utilize such data in the operating room. The CT or MRI data shows geometrical inconsistency with the deformation due to changes in the patient's body posture, so a surgeon has to use his imagination to map the pre-surgical data onto the deformed anatomy. As a first step, we tested the registration of the preoperative 3D model assuming that the organ is rigid. Here, a human liver preserved in formalin was used. Figure 12 shows the plaster cast of the liver scanned by this system. As the preoperative model, CT slices of the preserved liver were reconstructed into a 3D model. To register the 3D model with the in-situ scanned surface of the plaster cast, a few points are manually chosen as feature points from the scanned surface; here, anatomical feature points around the gallbladder are used. Figure 13 shows the 3D liver model fitted onto the scanned surface. This system can be applied for navigation in laparoscopy, because the positions of the scope and forceps are also observed and displayed continuously. In the future, we plan to take the elasticity of the organ into account.
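For the rigid case, the landmark-based fit can be computed in closed form. The paper does not state which algorithm it uses, so the sketch below shows the standard SVD-based least-squares solution for corresponding point pairs:

```python
import numpy as np

def rigid_register(model_pts, scan_pts):
    """Least-squares rigid registration (rotation R, translation t) between
    corresponding feature points, e.g. landmarks around the gallbladder.
    model_pts and scan_pts are (n, 3) arrays with matching rows."""
    mc, sc = model_pts.mean(axis=0), scan_pts.mean(axis=0)
    H = (model_pts - mc).T @ (scan_pts - sc)    # cross-covariance of centered sets
    U, _, Vt = np.linalg.svd(H)
    R = Vt.T @ U.T
    if np.linalg.det(R) < 0:                    # guard against reflections
        Vt[-1] *= -1
        R = Vt.T @ U.T
    t = sc - R @ mc
    return R, t                                 # maps a model point p to R @ p + t
```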
6 Conclusion
The conclusions of this paper are summarized in the following two points:
– The intraoperative geometric information of the internal surface in laparoscopy was obtained successfully in the pig experiment. Preliminary results of in-vivo experiments verified the functionality and showed the performance.
– Real-time deformation imaging has been realized using parallel processing and automatic polygon generation. The reconstructed polyhedron in the virtual space deformed in real-time in accordance with how the shape of the object changed, in the experiment using an isolated pig's liver.
Fig. 12. Plaster cast of human liver. Fig. 13. Registered 3D liver model onto scanned surface.
References
1. Schurr, M.O.: Robotic devices for advanced endoscopic surgical procedures. Journal of the Robotics Society of Japan, Vol. 18, No. 1 (2000) 16-19
2. Guthart, G.S., Salisbury, J.K.: The Intuitive Telesurgery System. Proceedings of the IEEE International Conference on Robotics and Automation (2000) 618-621
3. Computer Motion Inc.: Internet Home Page. http://www.computermotion.com
4. Nakamura, Y., Hayashibe, M.: Laser-Pointing Endoscope System for Natural 3D Interface between Robotic Equipments and Surgeons. Proceedings of Medicine Meets Virtual Reality 2001 (2001) 348-354
5. Larsen, E., Gottschalk, S., Lin, M., Manocha, D.: Fast proximity queries with swept sphere volumes. Technical report TR99-018, Department of Computer Science, UNC Chapel Hill (1999)
6. Hayashibe, M., Nakamura, Y.: Laser-Pointing Endoscope System for Intraoperative 3D Geometric Registration. 2001 IEEE International Conference on Robotics and Automation (2001) 1543-1548
7. Hayashibe, M., Nakamura, Y., Shimizu, H., Okada, M.: A Laser-Pointing Endoscope System Providing the Operational Support of Surgical Robot. Proc. of the 32nd International Symposium on Robotics (2001) 636-641
8. Hattori, A., Suzuki, N., Hashizume, M., Akahoshi, T., Konishi, K., Yamaguchi, S., Shimada, M., Hayashibe, M.: Development of Data Fusion System for Robotic Surgery (da Vinci). Journal of Japan Society of Computer Aided Surgery, Vol. 18, No. 1 (2002) 45-48
9. Wang, Y.: The Evolving Role of Robotics in Surgery. Journal of the Robotics Society of Japan, Vol. 18, No. 1 (2000) 45-48
10. Haneishi, H., Yagihashi, Y., Miyake, Y.: A new method for distortion correction of electronic endoscope images. IEEE Trans. Med. Imag., Vol. 14 (1995) 548-555
11. Haneishi, H., Ogura, T., Miyake, Y.: Profilometry of a gastrointestinal surface by an endoscope with laser beam projection. Optics Letters, Vol. 19, No. 9 (1994) 601-603
Integrated Approach for Matching Statistical Shape Models with Intra-operative 2D and 3D Data
M. Fleute1, S. Lavallée1, and L. Desbat2
1 PRAXIM, 4 Av. Obiou, 38700 La Tronche, France
2 TIMC Laboratory, University Joseph Fourier, Grenoble, France
Abstract. This paper presents an approach to the problem of intra-operative reconstruction of 3D anatomical surfaces. The method is based on the integration of intra-operatively available shape and image data of different dimensionality, such as 3D scattered point data, 2.5D ultrasound data, and X-ray images, by matching them to a statistical shape model, thus providing the surgeon with a complete surface representation of the object of interest. Previous papers of the authors describe the matching of either 3D or 2D data to a statistical model, and clinical applications. The work presented here combines previously published ideas with a new approach to the complex task of shape analysis required for the computation of the statistical model, thus providing a generic approach for intra-operative surface reconstruction based on statistical models. The method for shape extraction/analysis is based on a generic model of the object and is used to segment training shapes and to establish point-to-point correspondence simultaneously in a set of CT images. Reconstruction experiments are performed on a statistical model of lumbar vertebrae. Results are provided for 3D/3D, 2D/3D, and hybrid matching with simulated data, and for 3D/2D matching for a cadaveric spine.
1 Introduction
One key issue in CAS systems is the availability of patient-specific 3D models of the anatomical structures on which the surgery is performed. For many applications the computation of detailed (and precise) 3D gray-scale information based on pre-operative imaging is not mandatory, i.e. reconstruction of the organ shape is sufficient. It is therefore desirable to be able to infer 3D information from intra-operative data only, to facilitate navigation within the patient and thus make it possible to abandon CT data acquisition (pre- or intra-operatively), at least for many standard surgical applications. In many existing CAS systems, optical localizers or laser scanners are used to acquire scattered point data on patients' bone surfaces in order to register physical space with pre-operative images. Furthermore, X-ray images are the dominating image modality in the operating room. In previous papers the authors presented approaches aiming at reconstructing 3D anatomical surfaces for multiple purposes, such as surgical planning and visualization, relying on such intra-operative data only [FL98,FL99]. In order to infer the complete 3D shape of
the object of interest, it is necessary to incorporate a priori knowledge into the reconstruction algorithm. The statistical shape model introduced by Cootes and Taylor describes the average shape and characteristic shape variation of a set of training samples that are defined by a corresponding set of boundary points. Along with its simple representation, the capability of dealing with arbitrary shape topologies favours the use of this model. The shape reconstruction is carried out by fitting the data to a statistical shape model, based on a generalization of the Iterative Closest Point (ICP) algorithm [BM92]. Two-dimensional statistical shape models have been in use for years and are well established; two crucial problems have so far prevented them from being widely used in the 3D case. First, the necessary segmentation of a sufficiently high number of training shapes for the statistical analysis, usually available in the form of a CT exam, is a cumbersome and tedious task using today's available manual or semi-automatic segmentation tools. Second, it is necessary to establish point-to-point correspondence between all training shapes. This is a nontrivial task and becomes manually infeasible, and the few known automatic methods that address this problem rely on already segmented images. The present work proposes a new approach that addresses both above-mentioned problems simultaneously. A generic model of the object is used to segment the training shapes in CT images and to establish point-to-point correspondence (semi-landmark positioning). A volumetric coarse-to-fine deformation method based on free-form deformations is used to match the generic model to the image data. It is further shown in this work that the combination of both data sources - X-ray images and 3D point data - allowing hybrid registration with the statistical shape model, may be a very interesting option for certain applications. Figure 1 shows a flow diagram of the chosen global approach, whose different components are presented in the following sections.
2 Automatic Shape Extraction from CT Images for Statistical Model Computation
Prior to establishing the correspondence between the training shapes required for the statistical analysis, the training shapes must be segmented from the raw 3D data (CT data). This segmentation is usually performed in a slice-by-slice fashion, which is rather cumbersome, time-consuming, and labor-intensive. To overcome these inconveniences, the template matching and the data segmentation can be coupled together, i.e. performed simultaneously. Based on a general shape template, a representation of the external surface is inferred from the model to the data, providing a dense set of corresponding points between the template and the data, which can be used for the subsequent statistical analysis. It is worth noting that the matching will be based only on object boundaries, in contrast to the work in [ea01a], for instance, where the gray-level information of the whole volume is used to drive the deformation. However, the resulting deformation will be volumetric, thus allowing for automatic inference of other structures embedded in the volume. In our
Fig. 1. Flow diagram for Shape Recovery by Nonrigid Registration of a Statistical Shape Model with intra-operative data. The statistical shape model is computed offline and can be matched intra-operatively with 2D or 3D data. It is also possible to use both types of data simultaneously (hybrid matching)
approach we use a volumetric coarse-to-fine registration method based on Free-Form Deformations [SP86] in order to match the generic model to the image data. The method performs a least-squares minimization of the distances between model boundary points and matched feature points. The deformation defined by the computed displacement field is described as a warping of the space containing the surface model, based on 3D tensor-product deformation splines. For increased efficiency, the Free-Form Deformations are applied in a multi-resolution framework. The result is a rapid and efficient registration algorithm which does not require the prior segmentation (manual or automatic) of features in the data image, and which can work on arbitrarily shaped surfaces. The proposed method conceptually follows [SL96] but introduces important improvements and extensions, sharing ideas from [ea99,Pic97,ea00]. The problem can be formulated as the minimization of a cost function
E(p) = \sum_{i=1}^{N} \left[ \mathrm{dist}(P_i, T_p(M_i)) \right]^2 + R(p),
where dist is the Euclidean distance between a model point M_i and its corresponding data point P_i in the gray-level image, T_p is a suitable deformation function depending on the parameter vector p, and R defines a regularization term which is applied to T in order to smooth the deformation.
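To make the structure of this cost concrete, the following sketch evaluates E(p) for a candidate parameter vector. The deformation function, the feature-matching step, and the quadratic regularizer are placeholders (the paper's actual T_p is a tensor-product free-form deformation and its R is a smoothness term), so this is a minimal illustration rather than the authors' implementation:

```python
import numpy as np

def registration_cost(p, model_pts, deform, match_feature, reg_weight=0.1):
    """Evaluate E(p) = sum_i ||P_i - T_p(M_i)||^2 + R(p).

    model_pts:     (N, 3) array of model boundary points M_i
    deform:        callable T_p(p, pts) warping points (placeholder)
    match_feature: callable returning the matched feature point P_i in the
                   gray-level image for a warped model point (placeholder)
    """
    warped = deform(p, model_pts)                           # T_p(M_i)
    matched = np.array([match_feature(m) for m in warped])  # P_i
    data_term = np.sum((matched - warped) ** 2)             # squared distances
    reg_term = reg_weight * np.sum(np.asarray(p) ** 2)      # toy R(p)
    return data_term + reg_term
```

Such a cost can then be handed to a generic optimizer (e.g. a conjugate-gradient routine, as the paper does via [PFTV92]); the essential point is only that the data term couples model points to matched image features while R(p) penalizes rough deformations.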
Fig. 2. Left: The influence of different bone thresholds on the resulting model. The mesh represents the registered model; the solid model represents the manually segmented CT model for comparison: (a) too low a threshold, (b) optimal threshold, (c) too high a threshold. Right: Evolution of the energy during function minimization
The generic model is represented by a triangle mesh. The original mesh, obtained by applying the marching cubes algorithm to a manually segmented vertebra CT exam, is decimated and smoothed to avoid overfitting. Two methods have been investigated to attract the model boundary to the object contours in the gray-level image. The first method is based on the bone threshold in the gray-level image. Experiments performed on CT images of the vertebral column have shown that good segmentation results can be obtained except in regions where two neighboring vertebrae articulate and the bone gaps are thus too narrow. Figure 2 shows the effect of varying the bone threshold on the final registered model. In (a) the chosen threshold is too low, in (b) it is optimal (found experimentally), and in (c) it is too high. A second approach, based on a statistical gray-level model for each model vertex, leads to similar results. The optimization of E(p) is performed using a conjugate gradient algorithm [PFTV92] and a multi-resolution representation of T in order to smooth the solution and to speed up the minimization. Initial registration is performed by manual alignment. Figure 2 illustrates the evolution of the energy as a function of the number of iterations. The steps correspond to resolution changes of the deformation grid. A statistical model of lumbar vertebrae has been computed based on the statistical analysis of 30 vertebrae (L1-L4). Fig. 4 (left) shows the effect of applying ±2 standard deviations of the first two modes of the obtained model to the mean shape. Fig. 4 (right) shows the captured variability of the statistical vertebra model as a function of the first n eigenmodes.
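Once the 30 training vertebrae are segmented and brought into dense correspondence, the statistical model itself reduces to a principal component analysis of the stacked landmark coordinates. A minimal sketch, assuming the shapes are already corresponded and aligned (names and the SVD route are illustrative, not the authors' code):

```python
import numpy as np

def build_shape_model(shapes):
    """shapes: list of (n, 3) corresponded, aligned training shapes."""
    X = np.stack([s.ravel() for s in shapes])        # (30, 3n) data matrix
    mean = X.mean(axis=0)
    _, svals, Vt = np.linalg.svd(X - mean, full_matrices=False)
    eigvals = svals ** 2 / (len(shapes) - 1)         # variance per mode
    captured = np.cumsum(eigvals) / eigvals.sum()    # curve of Fig. 4, right
    return mean, Vt, eigvals, captured

def synthesize(mean, modes, eigvals, sd_weights):
    """Apply e.g. sd_weights = [+2, -2] standard deviations to modes 1-2,
    as visualized in Fig. 4, left."""
    k = len(sd_weights)
    b = np.asarray(sd_weights) * np.sqrt(eigvals[:k])
    return (mean + b @ modes[:k]).reshape(-1, 3)
```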
3 Experiments and Results
3D matching. In a first experiment, about 450 points were collected on the dorsal surface of a lumbar vertebra model, in an area accessible during spine surgery, by sliding a virtual pointer over the model surface. The computed statistical model was then matched to the scattered data. Figure 3(a) shows the mean shape rigidly matched to the vertebra. Figure 3(b) shows the deformed model after applying the nonrigid matching algorithm. One may observe that the overall fit
Fig. 3. 3D/3D registration: (a) after rigid registration, (b) after nonrigid registration
Fig. 4. Left: First two deformation modes of the shape model (axial view). Right: Captured variability of the statistical vertebra model as a function of the first n eigenmodes in percent
is quite good although only very limited shape information was available (only a few points in a restricted area were palpated). Over 5 experiments, the mean RMS error between the vertebra to be reconstructed and the final model was 1.2 mm.
3.1 2D Matching
Experiments based on pure 2D data were performed using 2 orthogonal simulated X-ray views of 10 different CT models that were not contained in the population used to build the model. In each simulated projection, 400 randomly chosen contour points were retained, thus taking into account the fact that in practice not all contour points will be reliably detectable. All datasets were then registered with the vertebra shape model using all deformation modes. Fig. 5 shows the shape model (mesh) after manual alignment
Fig. 5. Lumbar vertebra after manual alignment (left), after rigid registration (middle), and after non-rigid registration (right). Far right: manually segmented contour points on a lateral X-ray image of a cadaver spine (segmented points are enlarged for visualization)
(a), after rigid (b), and after non-rigid (c) registration, together with the CT model from which the projections used for this experiment were generated. The final average RMS error between the deformed shape model and the underlying CT model (reference) was 0.62 mm. The average overlap between the deformed shape model and the underlying CT model was 91.12 percent. Another experiment was performed using real data. A cadaveric lumbar spine was attached to a computer-controlled turntable and imaged twice at orthogonal turntable angles using a prototype of an interventional X-ray imaging system equipped with a digital X-ray detector. Prior to the experiment, a CT scan of the cadaver was acquired (voxel size: 0.27 × 0.27 × 1.0 mm³). The vertebra used for the experiment (L2) was segmented manually for later evaluation. L2 was likewise segmented manually in the two orthogonal X-ray images; see Fig. 5 for the lateral view. Subsequently, the extracted contours were registered with the vertebra shape model using all deformation modes. The final average RMS error between the deformed shape model and the CT model was 1.27 mm. The poorer result with respect to the experiment carried out with simulated projective data might be due to the fact that the cadaver spine belongs to an 80-year-old specimen; the vertebrae show heavy degenerative changes. The computed shape model was based on a younger population, so shape variations occurring in older specimens are not captured by the model. Hybrid matching. The 3D matching experiment above shows good reconstruction results based on relatively few scattered point data acquired in a highly restricted surface area. However, due to insufficient geometrical constraints (depending on the shape and on the spatial distribution of the acquired points), it might be mandatory to provide the correct pose parameters prior to the non-rigid matching. In [ea01a] the authors report that the intra-operative rigid registration of a point cloud acquired on a restricted dorsal surface area of a vertebra with a pre-operatively acquired CT model can lead to unstable results (rotational uncertainty around the transversal axis) due to insufficient geometrical constraints in the accessible surface area of the vertebra. It is interesting to investigate whether an additional X-ray image could provide sufficient extra information to
sufficiently constrain the solution. This results in a combination of the methods proposed in the previous subsections, constituting a hybrid registration approach relying on both 2D projective and 3D point data. This is interesting not only for rigid registration but also for the more difficult problem of non-rigid registration. An experiment was carried out using approximately 70 data points acquired on a CT-based model in a small area accessible during spine surgery. Subsequently, the shape model was fitted to both data sources, the 3D points and the segmented contours in the X-ray image, simultaneously. The final average RMS error between the deformed shape model and the underlying CT model (reference) was 0.68 mm. This compares well with the results for pure 2D/3D registration using two orthogonal X-ray views presented in the previous subsection.
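Conceptually, the hybrid fit simply sums a 3D point-to-surface term and a 2D contour-to-silhouette term in one cost. The sketch below illustrates this structure only; `deform` and `project` stand in for the statistical-model instantiation and the (assumed calibrated) X-ray projection, neither of which is specified here:

```python
import numpy as np

def nearest_dist(candidates, q):
    # distance from point q to the nearest of a (k, d) array of candidates
    return np.min(np.linalg.norm(candidates - q, axis=1))

def hybrid_cost(params, deform, project, points3d, views, w3d=1.0, w2d=1.0):
    """deform(params) -> (n, 3) model surface points; each entry of `views`
    is (projection_geometry, contour_points) for one X-ray image, and
    project(surface, geometry) -> (m, 2) silhouette points in that view.
    All of these are hypothetical placeholders."""
    surface = deform(params)
    cost = w3d * sum(nearest_dist(surface, q) ** 2 for q in points3d)
    for geometry, contour in views:
        silhouette = project(surface, geometry)
        cost += w2d * sum(nearest_dist(silhouette, c) ** 2 for c in contour)
    return cost
```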
4 Discussion
Obviously, the presented approach can only be applied to healthy organs and to shape pathologies that can be captured by statistical analysis of a population. Fractured organs cannot be modeled. However, various interventions could benefit from such methods. In the case of the reconstruction of a torn cruciate ligament, for instance, there is no pathologic shape variation of the adjacent bones (tibia, femur). Considering pedicle screw placement for spine instrumentation in the case of a vertebral compression fracture, for instance, the shape of the vertebrae to which the screws are attached (which are adjacent to the fractured one) is not pathologic; at least there is no pathologic shape variation associated with the reason for the surgery, the fracture. The possible benefits range from a reduced radiation dose delivered to the patient, through decreased intervention time and overall inpatient time, to the fact that the development of CAS systems with a more favorable cost/benefit ratio would be facilitated. This would help increase the number of cases where sophisticated CAS technology is applicable and affordable.
Acknowledgments

The Philips research laboratories in Hamburg, the Helmholtz Institut in Aachen, and Johns Hopkins Medical School in Baltimore are gratefully acknowledged for providing CT data. Financial support from the Region Rhone-Alpes and the European projects IGOSII, CRIGOS, and MI3 is also gratefully acknowledged.
References

BM92. P.J. Besl and N.D. McKay. A method for registration of 3-D shapes. IEEE Transactions on Pattern Analysis and Machine Intelligence, 14(2):239–256, 1992.
ea99. J. Lotjonen et al. Model extraction from magnetic resonance volume data using the deformable pyramid. Medical Image Analysis, 3(4):387–406, 1999.
ea00. J. Montagnat et al. Surface simplex meshes for 3D medical image segmentation. In ICRA, 2000.
ea01a. A.F. Frangi et al. Automatic 3D ASM construction via atlas-based landmarking and volumetric elastic registration. In Proc. Information Processing in Medical Imaging (IPMI'01), pages 78–91, 2001.
ea01b. C. Huberson et al. Surgical navigation for spine: CT virtual imagery versus virtual fluoroscopy about 223 pedicle screws, in 88 patients. In CAOS-USA, pages 203–205, 2001.
FL98. M. Fleute and S. Lavallée. Building a Complete Surface Model from Sparse Data Using Statistical Shape Models: Application to Computer Assisted Knee Surgery. In W. M. Wells, A. Colchester, and S. Delp, editors, Medical Image Computing and Computer-Assisted Intervention (MICCAI'98), pages 880–887. Springer-Verlag, October 1998.
FL99. M. Fleute and S. Lavallée. Nonrigid 3-D/2-D Registration of Images Using Statistical Models. In MICCAI'99, pages 138–147, 1999.
PFTV92. W. H. Press, B. P. Flannery, S. A. Teukolsky, and W. T. Vetterling. Numerical Recipes in C: The Art of Scientific Computing. Cambridge University Press, Cambridge, England, second edition, 1992.
Pic97. R. Pichumani. Construction of a Three-dimensional Geometric Model for Segmentation and Visualization of Cervical Spine Images. PhD thesis, Stanford University School of Medicine, 1997.
SL96. R. Szeliski and S. Lavallée. Matching 3-D anatomical surfaces with non-rigid deformations using octree-splines. Int. J. of Computer Vision (IJCV), 18(2):171–186, 1996.
SP86. T.W. Sederberg and S.R. Parry. Free-form deformations of solid geometric models. Computer Graphics (SIGGRAPH'86), 20(4):151–160, 1986.
Q1. What is the most important original contribution of the paper? This paper presents an integrated approach to the problem of intra-operative reconstruction of 3D anatomical surfaces. The method allows the use of different intra-operatively available sparse shape and image data of different dimensionality to reconstruct anatomical surfaces based on statistical shape models. A new approach for the automatic shape extraction from CT images necessary for the computation of the statistical model is also presented.
Q2. What is the clinical relevance of the work presented? 2. applied science ready for clinical inputs 5. cadaveric or animal studies were used in this work
Q3. What is the most closely related work by other groups and how does your work differ? In [1] the authors use biplanar X-ray images to reconstruct the 3D shape of scoliotic vertebrae. Our work differs in that it allows the use of different data sources simultaneously (scattered point data, ultrasound, X-ray images, etc.). Furthermore, it is not explained in [1] how the vertebrae were segmented from the CT images and how they are aligned to compute the statistical model. This is a complex task, especially when considering the large databases that are necessary for a well-performing statistical model. Our work presents a highly automated approach for this task.
Q4. Specify if this paper presents an extension of or closely relates to some of your previously published work and state precisely the difference. Our work presents an extension of [2], [3], and [4]. The approach presented in this paper allows the use of 3D data ([2]) and 2D data ([3]) simultaneously to reconstruct anatomical surfaces. The idea for the automatic shape extraction method was presented in [4], but no matching results with real data were provided.
Q5. Specify the thematic categories that characterize your work best. Computer Assisted Surgery, Medical Imaging, Data Fusion, X-Ray Images, Scattered Point Data, Shape Reconstruction, Shape Analysis, Deformable Model, Statistical Shape Model, Model Based Segmentation, Non-Rigid 3D/3D Registration, Non-Rigid 3D/2D Registration
Q6. References, if any, for answering Q1 to Q5.
1. S. Benameur, M. Mignotte, S. Parent, H. Labelle, W. Skalli, J. De Guise. 3D biplanar reconstruction of scoliotic vertebrae using statistical models. International Conference on Computer Vision and Pattern Recognition (CVPR 2001), Kauai Marriott, Hawaii, USA, Vol. 2, pages 577–582, December 2001.
2. M. Fleute, S. Lavallée. Building a Complete Surface Model from Sparse Data Using Statistical Shape Models: Application to Computer Assisted Knee Surgery. Medical Image Computing and Computer-Assisted Intervention (MICCAI'98), Boston, USA, 1998.
3. M. Fleute, S. Lavallée. Nonrigid 3D/2D registration of images using statistical models. MICCAI'99, Cambridge.
4. M. Fleute, L. Desbat, R. Martin, S. Lavallée, M. Defrise, X. Liu, R. Taylor. Statistical model registration for a C-arm CT system. IEEE Medical Imaging Conference 2001, San Diego, USA.
Building and Testing a Statistical Shape Model of the Human Ear Canal

Rasmus Paulsen¹, Rasmus Larsen¹, Claus Nielsen², Søren Laugesen², and Bjarne Ersbøll¹

¹ Informatics and Mathematical Modelling, Technical University of Denmark, Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby, Denmark, {rrp,rl,be}@imm.dtu.dk, http://www.imm.dtu.dk/
² Oticon Research Centre Eriksholm, Kongevejen 243, DK-3070 Snekkersten, Denmark, {cni,slu}@oticon.dk, http://www.oticon.com/

Abstract. Today the design of custom in-the-ear hearing aids is based on personal experience and skills and not on a systematic description of the variation of the shape of the ear canal. In this paper it is described how a dense surface point distribution model of the human ear canal is built based on a training set of laser scanned ear impressions and a sparse set of anatomical landmarks placed by an expert. The landmarks are used to warp a template mesh onto all shapes in the training set. Using the vertices from the warped meshes, a 3D point distribution model is made. The model is used for testing for gender related differences in size and shape of the ear canal.
1 Introduction
Hearing aids come in a number of different styles. The smallest of these styles is called CIC (Completely In the Canal). A CIC hearing aid consists of an acrylic shell containing a microphone, amplifier, loudspeaker, and battery. It is placed completely in the ear canal, rendering it invisible to an observer viewing the bearer from the front. This is cosmetically appealing, and it also offers a number of acoustical advantages. A CIC is produced for the individual patient based on a silicone mold of the ear canal. It is obvious that the space available inside a CIC hearing aid is severely limited. Hence, both the design of the internal components of the CIC and their placement and orientation are very critical as to whether it is actually possible to build a CIC for a given ear. Today the aforementioned designs are based on the experience and skills of the mechanical engineers in the hearing aid industry and on general knowledge about the anatomy and geometry of the ear. It is acknowledged that systematic knowledge of the geometry of ear canals and the variation thereof could potentially be extremely helpful in the mechanical design of new components for hearing aids. To our knowledge no systematic description of the variation of the human ear canal across a population exists. Measurements of the anatomy of a single ear canal have been made for the purpose of prediction of sound-pressure
level distribution [1]. Manufacturers of hearing aids have made initial tests of rapid prototyping of hearing aid shells using laser scans of ear impressions but have not performed statistical analysis of these. It is obvious that the systematic description of the variation of the shape of the ear canal must be done using statistical methods. In recent years shape analysis has been used in the description, identification, and segmentation of biological shapes. A popular method for building shape models is the Active Shape Model method by Cootes et al. [2]. It is dependent on corresponding landmarks placed on the shapes in the training set. Most previous approaches to automated and semi-automated landmark generation and registration of 3D surfaces use the local surface geometry. Examples of this are the non-rigid registration technique by Feldmar and Ayache [3], the local geometry and surface geodesic approach by Wang et al. [4], and the surface signature technique by Yamany et al. [5]. A method for automatic landmark generation based on a symmetric version of the iterative closest point algorithm is presented in [6]. However, this method is dependent on the global variation of the shape and apparently does not handle boundary areas or areas that are not well defined for all shapes. In this project the ear canals are represented as 3D surfaces constructed by a laser scanner. The local surface geometry of the ear canals varies much from one individual to another, and therefore only very few ridges and extremal points are stable when comparing groups of ear canals. The surfaces of the ear canals are not closed, due to the opening of the ear canal and because the ear impressions are terminated in front of the eardrum. Automatic landmark generation and correspondence are therefore difficult to establish with methods based on surface geometry. We have chosen to base our method on the assumption that it is possible to place anatomical landmarks on the ear canal. The anatomical landmarks do not constitute an exhaustive description of the surface of the ear canal, and it is therefore necessary to generate a denser set of landmarks describing the shape. Interpolating landmarks is straightforward in 2D, but in 3D no spatial ordering of landmarks is usually defined and therefore the interpolation is much more difficult. A landmark-based approach is found in [7], where a template mesh is deformed to each shape in the training set using a thin plate spline transformation based on the annotated landmarks. This is followed by a regularisation step where each vertex of the template mesh is adapted to the current shape without causing folds. This method is well suited for closed surfaces since it does not incorporate the possibility that patches of the surface are not well defined on all shapes in the training set. A similar approach is found in [8], with the addition that the method is able to handle surfaces with ill-defined areas. This is done by pruning the template mesh to include only the vertices that are well defined for all training shapes. In this method, the training shapes are warped to a template shape, and the correspondence is then made by projecting the vertices from the template shape to the warped training shapes. This is followed by an inverse warping of
the adapted template shape, which gives the dense correspondence between the training shapes. Our approach is similar to [8], but instead of warping each shape in the training set to the template mesh, the template mesh is warped to each shape in the training set, eliminating the need for an inverse warp. Other methods that could be well suited for the tube-like ear canals are the spherical harmonics approach by Gerig [9] and the M-Rep approach by Pizer [10]. However, it is not clear how these methods would work on non-closed surfaces. A similar method that supports non-closed surfaces is the Fourier surface approach explored by Staib et al. [11].
2 Method
The data were collected by laser scanning a set of 29 ear impressions taken from 20 male and 9 female subjects. The surfaces are reconstructed using the Power Crust [12]. Each reconstructed surface contains approximately 20,000 vertices and 35,000 triangles. An example of a scanned ear impression is seen in Fig. 1. An anatomical description of the external ear and a formal naming convention used for ear impressions can be found in [13,14].
2.1 Annotation of Anatomical Landmarks
Using a custom-made surface annotation tool, the third author, who is an expert on the anatomy of the ear canal, has annotated the 29 ear canals. 18 anatomical landmarks are placed on each ear canal. The landmarks constitute a sparse correspondence between the surfaces of the ear canals in the training set. The surface of the ear canal is not closed, and it is therefore necessary to identify the invalid areas. Planes that separate the valid parts of the surface from the invalid parts are defined for that purpose. In Fig. 1 an ear canal with the anatomical landmarks and separating planes is seen.
2.2 Surface Correspondence Using Thin Plate Spline Warping
The anatomical landmarks do not constitute an exhaustive description of the surface of the ear canal, and it is therefore necessary to generate a denser set of landmarks describing the shape. For that purpose a template mesh is constructed and applied to all shapes in the training set. The mesh constructed by the surface reconstruction contains far more vertices than needed to give a satisfactory shape description. To construct a template mesh that can be used as a basis for the further shape analysis, a well-defined ear canal surface is chosen and decimated by a standard algorithm [15]. The anatomical landmarks and the template mesh are used to establish a dense surface correspondence between the shapes in the training set. Since the template mesh is made from one of the actual ear canals in the training set, anatomical landmarks for this mesh exist.
The template mesh is applied to each of the shapes in the training set using a Thin Plate Spline (TPS) warp based on the corresponding anatomical landmarks. TPS is a warp function that minimises the bending energy [16]. Since the TPS transform is only exact at the anatomical landmark locations, the vertices of the template mesh will not lie on the surface of the target shape. Finally, moving each vertex in the warped template mesh to the closest point on the target surface completes the dense correspondence. This introduces the risk of so-called inversions, where the vertices of the template mesh shift place and cause folds in the mesh. Techniques to avoid this exist [17,7], but they have not been necessary here. When the template mesh is warped to another shape, it can happen that some points are placed outside the valid area on the target shape. When warped to each shape in the training set, the template mesh must only cover valid areas. This is accomplished by investigating whether any points from the template mesh are warped into the areas marked as invalid by the separation planes. The template mesh is then pruned to contain only the points that are warped to valid areas for all shapes in the training set. The template mesh contains 3,000 vertices after decimation and pruning. When the dense correspondence has been established, the connectivity of the pruned template mesh can be applied to the points of correspondence on each shape to yield a set of new meshes. It is now possible to dispose of the anatomical landmarks as well as the original meshes of the training set. The set of meshes with dense correspondence is used in the following statistical shape analysis.
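A landmark-driven TPS warp of this kind can be written compactly. The sketch below uses the 3D biharmonic kernel U(r) = r, a common choice for volumetric thin plate splines; it solves for the spline coefficients from the landmark pairs and then carries the template vertices along (a minimal illustration, not the authors' implementation, and the final closest-point projection onto the target surface is omitted):

```python
import numpy as np

def tps_warp_3d(src_landmarks, dst_landmarks, vertices):
    """Warp template `vertices` with the TPS that maps src -> dst landmarks.
    src_landmarks, dst_landmarks: (n, 3); vertices: (m, 3)."""
    n = len(src_landmarks)
    # Radial kernel U(r) = r between all pairs of source landmarks
    K = np.linalg.norm(src_landmarks[:, None] - src_landmarks[None], axis=-1)
    P = np.hstack([np.ones((n, 1)), src_landmarks])   # affine part
    A = np.zeros((n + 4, n + 4))
    A[:n, :n], A[:n, n:], A[n:, :n] = K, P, P.T
    b = np.vstack([dst_landmarks, np.zeros((4, 3))])
    coefs = np.linalg.solve(A, b)                     # (n+4, 3) coefficients
    # Evaluate the spline at the template vertices
    U = np.linalg.norm(vertices[:, None] - src_landmarks[None], axis=-1)
    Q = np.hstack([np.ones((len(vertices), 1)), vertices])
    return U @ coefs[:n] + Q @ coefs[n:]
```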
2.3 Building the Point Distribution Model
The set of corresponding meshes is aligned by a generalised Procrustes analysis [18]. The pure shape model is built by using a similarity transformation in the Procrustes alignment, while a rigid-body transformation is used to build the size-and-shape model [19]. Following the approach found in the Active Shape Model [2], a principal component analysis (PCA) is performed on the Procrustes-aligned shapes. Each shape is represented as a vector of concatenated x, y, and z coordinates, x_i = [x_{i1}, y_{i1}, z_{i1}, ..., x_{in}, y_{in}, z_{in}]^T, i = 1, ..., s, where n is the number of vertices and s is the number of shapes. The PCA is performed on the shape matrix D = [(x_1 - \bar{x}) | ... | (x_s - \bar{x})], where \bar{x} is the average shape. The eigenvectors can be regarded as translation vectors that, when added to the mean shape, deform the shape according to the modes of variation found in the training shapes. A new shape exhibiting the variance seen in the training set is made by adding a combination of eigenvectors to the average shape, x_{new} = \bar{x} + \Phi b, where b is a vector of weights controlling the modes of shape variation and \Phi = [\phi_1 | \phi_2 | ... | \phi_t] is the matrix of the first t eigenvectors. An arbitrary shape x aligned to the Procrustes average can be approximated by the shape model by projecting the residuals from the average shape onto the eigenvectors, b = \Phi^T (x - \bar{x}). The resulting parameter vector b is used in the statistical analysis below.
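The generalised Procrustes step can be sketched as iterating between aligning every mesh to the current consensus and recomputing the consensus as the mean. A minimal similarity-alignment version (illustrative only; the rigid-body variant for the size-and-shape model would simply skip the unit-size scaling):

```python
import numpy as np

def align_similarity(shape, ref):
    """Align one (n, 3) shape to a reference: centre, scale to unit
    centroid size, and rotate optimally (orthogonal Procrustes)."""
    a = shape - shape.mean(axis=0)
    r = ref - ref.mean(axis=0)
    a = a / np.linalg.norm(a)
    r = r / np.linalg.norm(r)
    U, _, Vt = np.linalg.svd(a.T @ r)
    if np.linalg.det(U @ Vt) < 0:       # exclude reflections
        U[:, -1] *= -1
    return a @ (U @ Vt)

def generalized_procrustes(shapes, iterations=5):
    ref = shapes[0]
    for _ in range(iterations):
        aligned = [align_similarity(s, ref) for s in shapes]
        ref = np.mean(aligned, axis=0)  # new consensus (average) shape
    return aligned, ref
```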
3 Results

3.1 General Observations
The first three modes of variation of the pure shape model are seen in Fig. 2, and the average shape is seen in Fig. 1. All the generated shapes look like real ear canals, with no deformations or folds in the mesh. It is seen that the mode 1 deformation consists of a bending of the canal and a flattening of the concha part. Mode 2 explains some of the shape variation seen in the inner part of the ear canal. Mode 3 is a combination of a flattening and twisting of the inner part of the ear canal and a general shape change of the concha. The distribution of the modes against each other has been examined using pairwise plots, and no obvious abnormalities were found.
Fig. 1. To the left, an example of a surface representation of an ear canal with the anatomical landmarks and the planes that separate the valid areas from the invalid areas. The thin structure at the top is the actual canal. The larger lower part is the concha. Only part of the concha is used, and therefore a plane through the concha is defined. To the right is the average shape from the pure shape model
3.2 Classification of Surfaces
Testing the validity and usability of the shape model is done by examining its ability to reflect gender-related differences in the size and shape of the ear canals. It is first examined whether there is a systematic gender-related difference in the centroid sizes of the ear canals. This test is performed using the centroid sizes of the dense surface meshes of the training set, calculated prior to the Procrustes alignment. The centroid size is the square root of the sum of squared Euclidean distances from each landmark to the center of mass (the centroid) [19]. A standard t-test shows a highly significant difference in size between males and females (p = 0.0003). The gender-related difference in size corresponds to 9% of the average centroid size. Testing for a shape difference between genders is done using the vertices from the Procrustes-aligned dense surface meshes from the pure shape model, meaning that they are scaled to unit size. In this context, shape is thus unrelated
Fig. 2. Pure shape model: (a) mode 1, (b) mode 2, (c) mode 3. Each shape has been generated by varying the first three modes of variation between −3 (top) and +3 (bottom) standard deviations
to size. In order to avoid the problem of multiple testing (as would be the case for stepwise selection, for example), the following procedure is adopted. First the dimensionality is reduced by a principal component analysis as described earlier, secondly the number of components to retain is chosen, and finally a multivariate analysis of variance [20] is performed on these components. A typical method for determining the number of principal components to retain is to include just enough components to explain some arbitrary amount (typically 98%) of the variance. This criterion often results in far too many components being included in the further analysis, and therefore Horn's parallel analysis [21] is chosen as a more objective way of deciding how many components to include. The eigenvalues of the shapes are compared to those obtained for equivalent uncorrelated data, obtained by randomly scrambling each row in the shape matrix D. In this way, the number of modes to retain is 7, as seen in the scree plot in Fig. 3. A multivariate analysis of variance (which in this case is equivalent to Hotelling's T²) of these 7 principal component scores per shape, although not strictly significant (p = 0.083), does indicate a shape difference between genders. Univariate analyses on the single principal components show that this is mainly due to the first mode of variation (p = 0.0052). This is also seen in Fig. 3, where centroid size is plotted against mode 1 from the pure shape model. The exaggerated male-female mode shape variation is seen as the shape variation of mode 1 in Fig. 2.
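Horn's parallel analysis admits a very short implementation: compare the eigenvalue spectrum of the real shape matrix with that of row-scrambled copies and keep the modes whose eigenvalues exceed the scrambled baseline. A sketch under the paper's conventions (D has one row per coordinate and one column per shape); details such as the number of randomisation rounds are illustrative:

```python
import numpy as np

def parallel_analysis(D, rounds=100, seed=0):
    """Return the number of PCA modes to retain for the (3n, s) shape
    matrix D, following Horn's parallel analysis."""
    rng = np.random.default_rng(seed)

    def spectrum(M):
        s = np.linalg.svd(M - M.mean(axis=1, keepdims=True),
                          compute_uv=False)
        return s ** 2                    # eigenvalue spectrum

    real = spectrum(D)
    rand = np.zeros_like(real)
    for _ in range(rounds):
        scrambled = np.array([rng.permutation(row) for row in D])
        rand += spectrum(scrambled)
    rand /= rounds                       # mean spectrum of decorrelated data
    return int(np.sum(real > rand))      # modes above the random baseline
```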
4 Summary and Conclusions
In this paper a method to generate dense surface distribution models of the human ear canal has been described. The generated models show consistency,
Fig. 3. To the left is a plot of the eigenvalues of the shapes from the pure shape model, compared to those for a randomised version of the data (each row of the shape matrix D was scrambled). The lines cross approximately where mode = 7. To the right is a plot of centroid size versus mode 1 from the pure shape model. The full dots are females, while the plus signs are males. It is seen that both size and mode 1 separate males from females
and a large part of the variation found in the training data is explained by a small number of modes of variation. The method is general and can be applied to all types of surfaces as long as it is possible to mark the valid areas of the surfaces and to place landmarks. From an anatomical point of view it is interesting that the model is able to differentiate the ear canals of males from those of females based on both their size and their shape. The anatomical results based on this small data set are only strong enough to support general conclusions. For more detailed results a larger and more balanced data set is required. The most important results of this paper are the proof of concept regarding the ability to build a meaningful statistical shape model of the human ear canal and the demonstration that it is possible to describe a complex gender-related shape variation using a single parameter.
References
1. Stinson, M.R., Lawton, B.W.: Specification of the geometry of the human ear canal for the prediction of sound-pressure level distribution. J. Acoust. Soc. Am. 85 (1989) 2492–2503
2. Cootes, T., Cooper, D., Taylor, C., Graham, J.: Active shape models - their training and application. Computer Vision and Image Understanding 61 (1995) 38–59
3. Feldmar, J., Ayache, N.: Rigid, affine and locally affine registration of free-form surfaces. International Journal of Computer Vision 18 (1996) 99–119
4. Wang, Y., Peterson, B.S., Staib, L.H.: Shape-based 3D surface correspondence using geodesics and local geometry. Computer Vision and Pattern Recognition 2 (2000) 644–651
5. Yamany, S.M., Farag, A.A.: Free-form surface registration using surface signatures. In: ICCV (2). (1999) 1098–1104
6. Brett, A., Taylor, C.: A method of automated landmark generation for automated 3D PDM construction. In: British Machine Vision Conference. (1998) 914–923
7. Lorenz, C., Krahnstöver, N.: Generation of point-based 3D statistical shape models for anatomical objects. Computer Vision and Image Understanding 77 (2000) 175–191
8. Hutton, T., Buxton, B., Hammond, P.: Dense surface point distribution models of the human face. In: Proceedings of the Workshop on Mathematical Methods in Biomedical Image Analysis, CVPR (2001)
9. Gerig, G., Styner, M., Jones, D., Weinberger, D., Lieberman, J.: Shape analysis of brain ventricles using SPHARM. In: Workshop on Mathematical Methods in Biomedical Image Analysis, IEEE Computer Society (2001) 171–178
10. Pizer, S., Thall, A., Chen, D.: M-reps: A new object representation for graphics. Technical report, University of North Carolina at Chapel Hill (1999)
11. Staib, L.H., Duncan, J.S.: Model-based deformable surface finding for medical images. IEEE Trans. Medical Imaging 15 (1996) 720–731
12. Amenta, N., Choi, S., Kolluri, R.: The power crust. In: Proceedings of 6th ACM Symposium on Solid Modeling. (2001) 249–260
13. Alvord, L.S., Morgan, R., Cartwright, K.: Anatomy of an earmold: A formal terminology. J. Am. Acad. Audiol. 8 (1997) 100–103
14. Alvord, L.S., Farmer, B.L.: Anatomy and orientation of the human external ear. J. Am. Acad. Audiol. 8 (1997) 383–390
15. Schroeder, W., Zarge, J., Lorensen, W.: Decimation of triangle meshes. Computer Graphics 26 (1992) 65–70
16. Bookstein, F.: Shape and the information in medical images: A decade of the morphometric synthesis. Computer Vision and Image Understanding 66 (1997) 97–118
17. Andresen, P., Nielsen, M.: Non-rigid registration by geometry constrained diffusion. In: Proceedings of Medical Image Computing and Computer Assisted Intervention. Volume 1679. (1999) 533–543
18. Gower, J.: Generalized Procrustes analysis. Psychometrika 40 (1975) 33–51
19. Dryden, I., Mardia, K.: Statistical Shape Analysis. Wiley, Chichester (1997)
20. Johnson, R.A., Wichern, D.W.: Applied Multivariate Statistical Analysis. Prentice-Hall (1982)
21. Horn, J.L.: A rationale and test for the number of factors in factor analysis. Psychometrika 30 (1965) 179–186
Shape Characterization of the Corpus Callosum in Schizophrenia Using Template Deformation

Abraham Dubb, Brian Avants, Ruben Gur, and James Gee

Departments of Bioengineering, Psychiatry and Radiology, University of Pennsylvania, Philadelphia, PA, USA 19104
{adubb,avants}@grasp.cis.upenn.edu
[email protected] [email protected]
Abstract. The presence of morphologic differences in the corpus callosum of people with schizophrenia has been the subject of intense investigation for a number of years. Researchers, however, have been unable to produce consistent results through comparison of total and partitioned areas. As an alternative to these indices, we use template deformations to find regional size variations in a population of 100 patients. We generated a set of Jacobian determinant maps using k-means cluster segmentation followed by a curve registration algorithm. We performed several comparisons, including control versus schizophrenia, control versus schizophrenia within gender, and age-related effects in controls versus schizophrenia. Statistical plots revealed a substantial area of contraction in the anterior callosum associated with schizophrenia. Gender stratification showed that a large contribution of this effect was from females. In addition, patients failed to demonstrate an age-related expansion of the splenium that was present among controls. Our results show that template deformation morphometry can be used to demonstrate morphologic differences in the callosa of people with schizophrenia.
1 Introduction
The corpus callosum is the largest white-matter tract in the human brain and serves as the primary means of communication between the two cerebral hemispheres. In many instances, higher cortical function is mediated by this tract, as it allows integration of cortical processes from opposite sides of the brain. The corpus callosum has attracted much attention from researchers in several fields, including neurology, psychiatry, and computer science. There are several reasons for this. First, the corpus callosum is easily identified on a mid-sagittal section of the brain. Its white matter fibers in cross-section stand out in stark relief against the surrounding cingulate gyrus and lateral ventricles. This anatomical property also aids in efficient automated segmentation of the corpus callosum. Second, the actual shape of the callosum is fairly well preserved, with only subtle phenotypic variations present in the general population. This factor aids in
template matching and template deformation morphometry (TDM). Finally, the literature provides evidence that the corpus callosum may be involved in a number of disorders, including schizophrenia, Alzheimer's dementia [1], and mental retardation [2]. Many have attempted to relate the actual cross-sectional shape of this tract to such conditions. The role of the corpus callosum in schizophrenia has been the subject of much interest for a number of years. Rosenthal [3] performed a post-mortem comparison of the brains of people with schizophrenia versus controls and showed a 1 mm increase in thickness in the callosa of patients. The advent of MRI has allowed researchers to study callosa in vivo in order to discover morphologic differences. Traditionally, researchers have used length, thickness, and cross-sectional area as indices to compare with controls. A major difficulty with this approach lies in whether or not to normalize the callosa for brain size. Normalization of the corpus callosum is confounded by the fact that the relationship between callosal size and brain volume is not linear, and brain volume itself is decreased in schizophrenia [4]. Even if one attempts to normalize the callosum, debate exists on how best to do this. For example, do we normalize to brain volume, total mid-sagittal area, or to the anterior-posterior (AP) dimension of the brain? The difficulty of using area as an index is well demonstrated in a meta-analysis of callosal size in schizophrenia performed by Woodruff [4]. Of the 11 studies that were included in the final analysis, 8 demonstrated greater callosal area in controls while the other 3 yielded the opposite result. Perhaps more importantly, only 3 of the 11 were able to achieve statistical significance. A more recent method for studying the callosum involves dividing the cross-section of the tract into partitions and comparing regional area instead of total area. Downhill [5] used this technique in a comparison of callosa from 27 people with schizophrenia and 30 controls. In this study, the callosum was divided into 30 AP sectors, and statistical analysis showed the genu and splenium to be smaller in schizophrenia. While this approach may seem like a viable alternative to total area, the normalization question is not resolved. Furthermore, the selection of a partitioning method is arbitrary, since no one has established the best way to divide up the callosum. Given the issues raised with area calculation and the historically inconclusive results associated with this technique, we chose to apply a registration and TDM approach in order to demonstrate local size variations within the callosum [6,8]. Pettey and Gee [8] describe this approach and apply it to the comparison of male and female callosa. Our current study makes use of a curve registration algorithm over the extracted callosal boundaries, while Pettey directly registered images of the callosa. The underlying premise, however, is identical. TDM offers several advantages over more traditional methods of callosal analysis; TDM provides information on the regional shape of a structure rather than simply overall size, offering a potentially richer and more flexible description of anatomic variations. Furthermore, we believe that TDM can be used to demonstrate phenotypic variation for a wide array of neuroanatomical structures.
2 Methods

2.1 Subjects and Data Acquisition
The Schizophrenia Center at the University of Pennsylvania maintains a database containing hundreds of cranial MRIs of both psychiatric patients and healthy volunteers. Our selection criteria specified right-handed participants with a diagnosis of schizophrenia or control status, for whom a 1 mm slice thickness scan was available. While many MRIs acquired with 5 mm slice thickness were available, we opted to exclude these studies on the basis that their inferior pixel resolution would make them poor substrates for TDM analysis. Magnetic resonance imaging scans were acquired on a 1.5 T scanner (Signa; General Electric Co., Milwaukee, WI) with a spoiled gradient recalled pulse sequence using the following parameters: flip angle of 35°, repetition time of 35 milliseconds, echo time of 6 milliseconds, field of view of 24 cm, 1 repetition, 1 mm slice thickness, and no interslice gaps. Transaxial images were in planes parallel to the orbitomeatal line, with a resolution of 0.9375 × 0.9375 mm².
2.2 Reslicing
Our goal in reslicing the brain volumes was to obtain the most precise mid-sagittal image possible. This procedure was necessitated by the fact that most participants had tilted their heads slightly to the left or right side during the scan. We adopted the reslicing protocol used by the Schizophrenia Center, performing the following steps: first, the volume was rotated around a manually drawn anterior-posterior axis until the eyeballs were positioned in the same horizontal plane. Next, an approximate mid-sagittal image was obtained by manually drawing the slice plane through mid-line structures at the level of the interventricular septum. Third, the sagittal plane was moved laterally until the falx cerebri and an uninterrupted central canal (at the level of the midbrain) were present in the same plane. In some cases, the volume was tilted slightly around the AP axis until both structures were present in a single plane. This mid-sagittal image was then extracted and rotated around its center so that the anterior commissure and posterior commissure were horizontal. Figure 1 shows one sample mid-sagittal image processed in this way.
2.3 Segmentation
We employed a k-means clustering algorithm, which groups image voxels into k clusters by minimizing the intensity variance within each cluster [10]. Because the performance of this algorithm was highly dependent on the brightness of the input image and the number of clusters, we opted to perform cluster segmentation on the same data set using k = 3, 4, and 5, scaled at multiple brightness levels. This process yielded 18 sets of segmentation images (3 segmentation levels × 6 brightness settings). For each mid-sagittal image, the best segmentation image from the set of 18 was chosen and modified so that the callosum was completely
Fig. 1. Raw mid-sagittal image (left) and segmentation mask (right). The raw image on the left shows the three anatomical structures which signify a mid-sagittal slice. The frequent presence of the interventricular septum within this slice complicates segmentation. On the right, a successful k-means probability based segmentation image which shows the corpus callosum as a solitary island.
separated from surrounding structures. The modified image was then used to extract the corpus callosum from the mid-sagittal section. Figure 1 shows one sample segmentation image. Once we had generated the set of segmented callosa, our next step was to crop the images and perform registration and deformation to a single template callosum. While the selection of a particular registration method probably has a profound impact on the final results, choosing the optimal algorithm remains a subject of debate. Our lab is currently involved in an ongoing study to determine the effect that registration method and template selection may have on final results. In this particular case, we applied a curve-registration algorithm using a male control as a template.
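The sweep over cluster counts and brightness settings can be sketched as follows; the intensity-only k-means and the particular brightness scalings shown here are illustrative assumptions (the paper does not list its six settings):

```python
import numpy as np

def kmeans_intensity(img, k, iterations=20):
    """Minimal 1D k-means on pixel intensities; returns a label image."""
    vals = img.ravel().astype(float)
    centers = np.linspace(vals.min(), vals.max(), k)   # simple init
    for _ in range(iterations):
        labels = np.argmin(np.abs(vals[:, None] - centers[None]), axis=1)
        for c in range(k):
            if np.any(labels == c):
                centers[c] = vals[labels == c].mean()  # update centers
    return labels.reshape(img.shape)

def candidate_segmentations(img, scales=(0.5, 0.7, 0.9, 1.1, 1.3, 1.5)):
    """18 candidate segmentations: k = 3, 4, 5 at six brightness scalings."""
    return {(k, s): kmeans_intensity(np.clip(img * s, 0, 255), k)
            for k in (3, 4, 5) for s in scales}
```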
2.4 Callosal Matching
We want to find a vector field that smoothly maps the image I_1 to the image I_2, such that the borders and interiors of the anatomy are well matched. We base this mapping entirely on the boundary of the callosum for two reasons. First, for our purposes, the callosum is completely defined by the shape of its boundary. Second, a globally optimal solution to the one-dimensional correspondence problem can be found with dynamic programming [7]. We then interpolate the boundary correspondence to the interior of the shapes. Our formulation of the matching problem [9] is in terms of the variational calculus. We want to find a monotonic path through the s_1 × s_2 reparameterization space, where s denotes the arc length parameter, so that the template callosal
curve is better matched with a subject curve. Solutions to Laplace's equation are well known to be maximally smooth. We thus choose to find g : s_1 → s_2 that minimizes the integral form of Laplace's equation. This gives the energy functional

E(g) = \frac{1}{2} \int_0^1 \lVert \nabla V(s_1) \rVert^2 \, ds_1, \qquad (1)

where

V(s_1) = C_2(g(s_1)) - C_1(s_1). \qquad (2)

We note that this formulation is inherently asymmetric, but that the asymmetry has little consequence in this application. We then use this solution as a boundary condition for solving Laplace's equation on the reference callosum interior. Discretization and successive over-relaxation allow the Euler-Lagrange form of the following energy function to be minimized efficiently:

E(F) = \frac{1}{2} \int_{s_1} (F - V)^2 \, ds_1 + \frac{1}{2} \int_{\Omega} \lVert \nabla F \rVert^2 \, d\Omega. \qquad (3)

Here, F is the vector field on the interior, Ω. Note that we minimize the same energy for the boundary and interior.
2.5 Statistical Analysis of the Jacobian
The registration map F may be viewed as a deformation field that takes the pixels of the template callosum to those of the subject callosum. The deformation field is defined as a set of N displacement vectors. The following equation,

u_k = (u_k^1, u_k^2), \qquad (4)

gives the displacement vector needed to bring the k-th pixel of the template to the corresponding pixel of the patient. Using the vector field u_s, which describes the displacement field that relates the atlas to subject callosum s, we can generate the following transformation equation:

T_s(x) = x + u_s(x). \qquad (5)

T_s(x) gives the corresponding position in subject callosum s for pixel x in the template. In order to describe the expansion or contraction that occurs at each pixel in this transformation, we use the quantity

\left| \frac{\partial T_s}{\partial x} \right|, \qquad (6)

which is known as the determinant of the Jacobian of the transformation. For ease of notation, we will simply refer to this value as the Jacobian, or J. In order to normalize the Jacobian for global variation [8], we scale this quantity at each pixel k:

j_k = \frac{J_k}{\sum_k J_k}. \qquad (7)
If we are comparing the Jacobians of controls with patients, we may calculate the t-score at each pixel k:

t_k^+ = \frac{\bar{j}_k^c - \bar{j}_k^s}{\sigma_k \sqrt{\frac{1}{N_s} + \frac{1}{N_c}}}, \qquad t_k^- = \frac{\bar{j}_k^s - \bar{j}_k^c}{\sigma_k \sqrt{\frac{1}{N_s} + \frac{1}{N_c}}}, \qquad (8)

where

\sigma_k = \sqrt{\frac{(N_c - 1)\sigma_c^2 + (N_s - 1)\sigma_s^2}{N_c + N_s - 2}}. \qquad (9)
\bar{j}_k^c and \bar{j}_k^s are the mean normalized Jacobians for the control and schizophrenia groups at the k-th pixel, respectively. σ_c and σ_s are the standard deviations of j_k for controls and patients, respectively. N_c and N_s are the numbers of subjects in the control group and the schizophrenia group. We calculate the t-score in both directions so that we can differentiate areas of contraction and expansion. The t-scores, t_k^+ and t_k^-, may now be converted to p-values using Student's t cumulative distribution function and then threshold-plotted over the template mask. We performed this series of calculations using several different subpopulations, including control/schizophrenia, control/schizophrenia within sex, and control/schizophrenia within age groups.
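The per-pixel statistics of Eqs. (8)-(9) vectorize directly over stacks of normalized Jacobian maps. A minimal sketch (array names and the epsilon guard are illustrative):

```python
import numpy as np
from scipy import stats

def t_maps(jc, js):
    """Pixelwise t-scores from normalized Jacobian maps (Eqs. 8-9).

    jc: (Nc, H, W) maps for controls; js: (Ns, H, W) maps for patients.
    Returns t_plus (contraction in patients) and its p-value map;
    t_minus is simply -t_plus."""
    Nc, Ns = len(jc), len(js)
    mc, ms = jc.mean(axis=0), js.mean(axis=0)
    vc, vs = jc.var(axis=0, ddof=1), js.var(axis=0, ddof=1)
    sigma = np.sqrt(((Nc - 1) * vc + (Ns - 1) * vs) / (Nc + Ns - 2))
    se = sigma * np.sqrt(1.0 / Ns + 1.0 / Nc) + 1e-12   # guard zeros
    t_plus = (mc - ms) / se
    p_plus = 1.0 - stats.t.cdf(t_plus, Nc + Ns - 2)     # one-sided p-values
    return t_plus, p_plus
```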
3 Results
Our query of the Schizophrenia Center database yielded a total of 290 cranial MRIs, 190 control and 100 schizophrenic. The average ages among these subgroups were 28.0 ± 9.9 years and 33.7 ± 11.2 years, respectively. A number of findings are present in the statistical plots shown in Fig. 2. Figure 2a shows the comparison between controls and schizophrenia. There is a substantial area of contraction in the anterior third of the callosa in patients as compared to controls. There is a small, relatively insignificant area of expansion in the mid-body. Stratification between the sexes shows that the anterior callosa are more contracted in female than in male patients. The small area of mid-body expansion seems to be due solely to an effect in males. In contrast, females with schizophrenia have a focal area of expansion in the caudal splenium that does not exist in males. The comparison of age-related effects reveals a pronounced difference. Healthy subjects aged 36 to 55 were found to have a larger splenium than their younger counterparts. This splenial expansion is virtually absent in the corresponding age-related study of patients. Moreover, there is a relatively insignificant area of expansion within the isthmus among older patients that is not present in older controls.
4 Discussion
The goal of this paper was to demonstrate the utility of template deformation morphometry in showing subtle morphologic differences in the corpus callosum
Fig. 2. P-value plots for five different subpopulation comparisons. Figures on the left of each panel show areas of contraction in the study group in comparison to controls, whereas figures on the right show areas of expansion in the study group. Figure 2a compares controls with patients. Figure 2b compares male controls with male patients. Figure 2c compares female controls with female patients. The last two sets of callosa compare age-related effects in controls and then in patients. The left side shows areas of contraction in the older population. Figure 2d compares controls aged 18–35 with controls aged 36–55. Figure 2e compares patients aged 18–35 with patients aged 36–55.
of patients versus controls. Using this method, we discovered localized regions where the shape of the callosa was different in patients. It is noteworthy that our result showing contraction of the anterior third of the callosa in patients is opposite to the findings of Narr [11]. Using a parametric surface modeling approach, Narr et al. showed that the anterior callosal segment was thicker in patients with schizophrenia. Narr's study was limited, however, by a relatively small study population (25 patients and 28 controls). The comparison of age-related changes merits a closer look. Bartzokis [12] studied the age-related changes in the frontal and temporal lobes in men. He found that while gray matter decreases with age, men undergo an age-related expansion of white matter up to age 47. In a more recent study, Bartzokis [13] found that this normal age-related white matter expansion was absent in patients with schizophrenia. These findings are consistent with our results showing a relatively decreased degree of splenial expansion in older patients. The question of morphologic changes in the callosum of people with schizophrenia is far from resolved. Methods of analysis continue to dictate results. Through the application of template deformation morphometry, we hope we can produce results that are more objective and robust than those of more traditional methods.
Acknowledgements

This work was supported in part by the USPHS under grants NS-33662, LM03504, MH-62100, AG-15116, AG-17586 and DA-14418.
References
1. Pantel, J., Schroder, J., Jauss, M., Essig, M., Minakaran, R., Schonknect, P., Schneider, G., Schad, L. R., Knopp, M. V.: Topography of Callosal Atrophy Reflects Distribution of Regional Cerebral Volume Reduction in Alzheimer's Disease. Psychiatry Research. 90 (1999) 180-192
2. Marszal, E., Jamroz, E., Pilch, J., Kluczewska, E., Jablecka-Deja, H., Krawczyk, R.: Agenesis of Corpus Callosum: Clinical Description and Etiology. J. of Child Neurology. 15 (2000) 401-405
3. Rosenthal, R., Bigelow, L. B.: Quantitative Brain Measurements in Chronic Schizophrenia. Br. J. of Psychiatry. 121 (1972) 259-64
4. Woodruff, P. W. R., McManus, I. C., David, A. S.: Meta-Analysis of Corpus Callosum Size in Schizophrenia. Neurology, Neurosurgery, & Psychiatry. 58 (1995) 457-461
5. Downhill Jr., J. E., Buchsbaum, M. S., Wei, T., Spiegel-Cohen, J., Hazlett, E. A., Haznedar, M. M., Silverman, J., Siever, L. J.: Shape and Size of the Corpus Callosum in Schizophrenia and Schizotypal Personality Disorder. Schizophrenia Research. 42 (2000) 193-208
6. Davatzikos, C., Resnick, S.: Sex Differences in Anatomic Measures of Interhemispheric Connectivity: Correlations with Cognition in Women but not Men. Cerebral Cortex. 8 (1998) 635-640
7. Amini, A., Weymouth, T., Jain, R.: Using Dynamic Programming for Solving Variational Problems in Vision. IEEE Transactions on Pattern Analysis and Machine Intelligence. 12 (1990) 855-867
8. Pettey, D. J., Gee, J. C.: Sexual Dimorphism in the Corpus Callosum: a Characterization of Local Size Variations and a Classification Driven Approach to Morphometry. (2002) submitted
9. Avants, B., Gee, J. C.: Soft Parametric Curve Matching in Scale-Space. SPIE Medical Imaging. (2002) in press
10. Machado, A. M. C., Gee, J. C.: Atlas Warping for Brain Morphometry. SPIE Medical Imaging. (1998) 642-651
11. Narr, K. L., Thompson, P. M., Sharma, T.: Mapping Morphology of the Corpus Callosum in Schizophrenia. Cerebral Cortex. 10 (2000) 40-49
12. Bartzokis, G., Beckson, M., Lu, P. H., Nuechterlein, K. H., Edwards, N., Mintz, J.: Age-Related Changes in Frontal and Temporal Lobe Volumes in Men: A Magnetic Resonance Imaging Study. Arch. of General Psychiatry. 58 (2001) 461-465
13. Bartzokis, G., Nuechterlein, K. H., Lu, P. H., Mintz, J.: Dysregulated Brain Development in Adult Men with Schizophrenia: a Magnetic Resonance Imaging Study. Arch. of General Psychiatry. (2001) submitted
3D Prostate Surface Detection from Ultrasound Images Based on Level Set Method

Shao Fan 1, Ling Keck Voon 2, and Ng Wan Sing 3

1 School of Electrical & Electronic Engineering, [email protected]
2 School of Electrical & Electronic Engineering, [email protected]
3 School of Mechanical & Production Engineering, [email protected]

CIMIL - Computer Integrated Medical Intervention Laboratory, Nanyang Technological University, 50 Nanyang Avenue, Singapore 639798
http://mrcas.mpe.ntu.edu.sg/
Abstract. Accurate detection of prostate boundaries is required in many diagnostic and treatment procedures for prostate diseases. In this paper, a new approach based on the level set method for 3D prostate surface detection from transrectal ultrasound (TRUS) images is presented. In contrast to many other deformable models, the level set method offers several advantages, such as minimal need for user input, flexible topology, and straightforward extension to 3D. However, it is subject to the “boundary leaking” problem in ultrasound image segmentation due to poor image quality. In this work, we first develop a fast discrimination method to extract the prostate region; this region information, instead of the spatial image gradient, is then incorporated into the level set method to remedy the “boundary leaking” problem. Various experimental results show the effectiveness of the proposed method.
1 Introduction
Prostate diseases are common in adult and elderly men. Typical conditions are benign prostatic hyperplasia (BPH) and prostate cancer. With the number of men seeking medical care for prostate disease rising steadily, the need for a fast and accurate prostate boundary detection and volume estimation tool increases correspondingly. Currently, boundary detection and volume measurement are performed manually, which is arduous and user dependent. A possible solution is to improve efficiency by automating the boundary detection and volume estimation process with minimal manual involvement. This paper presents a new approach based on the level set method [1] to semi-automatically detect the prostate surface from 3D transrectal ultrasound (TRUS) images. There have been a number of works so far on automatic segmentation of the prostate from ultrasound images. A straightforward strategy is to use edge detectors, such as the minimum/maximum filter [2], derivative edge detectors [3, 4], and sticks and weak
membrane fitting [5], to identify all the edges in the image, followed by edge selection and linking to outline the prostate boundary. However, due to the poor quality of ultrasound images, such methods usually lead both to spurious boundaries in highly textured areas and to missed boundaries where the prostate boundary is not well delineated. Richard et al. [6] used texture features to segment 2D images of the prostate gland into prostate and non-prostate regions, for forming a 3D image of the prostate from a set of parallel 2D images. Although some progress has been made, they acknowledged that the effect of using texture information is marginal. A more efficient way is to use deformable models, such as the discrete dynamic contour (DDC) [7] and 3D deformable surfaces [8]. However, the success of these approaches depends on careful initialization of the contour or surface, which requires the user to select points on the prostate boundary. In [9], wavelet-based techniques have been used to attempt to address this problem. Some researchers employed neural networks [10] and feature modeling [11] to segment the prostate from TRUS images. As reported in their work, these methods have good accuracy and robustness. However, neural networks require extensive training sets, which makes them very slow, and feature modeling is only suitable for certain shape-based prostate images. In this work, we develop a new approach based on the level set method to automatically detect the prostate surface from 3D TRUS images. We first develop a fast discrimination method to extract the prostate region. This region information is then incorporated into the level set method in place of the spatial image gradient. In the following, we first give a brief description of our method and then discuss the results.
2 Methods
While deformable models, such as Snakes [12], Fourier surfaces [13] and Free-Form Deformation (FFD) [14], have been widely used in medical imaging applications, they have severe limitations: they are unable to handle complex geometry and changing topology without additional machinery, and their implementation in 3D is complex. To overcome these difficulties, the level set method has been proposed [1]. In this approach, a 2D curve C is represented by a 3D function ψ. The value of the 3D function at a point p is defined as the distance d from p to C according to equation (1):
ψ(p, t = 0) = ±d,    (1)

where p ∈ ℜ² are points in the image space, and the plus (minus) sign is chosen if the point p is outside (inside) the 2D curve C(t = 0). In this manner, the 2D curve is represented by the zero level set C(t) = {p | ψ(p, t) = 0} of the level set function ψ. The level set method then evolves the 3D function ψ(p, t) that contains the embedded motion of C(t) instead of the original 2D curve. The evolution of the 3D function ψ can be expressed by means of a partial differential equation (PDE) as
∂ψ(p, t)/∂t + F |∇ψ| = 0    (2)
with a given initial condition ψ(p, t = 0), where ∇ψ denotes the gradient of ψ with respect to the spatial coordinates and F is the evolving speed. For the numerical solution of equation (2), it is necessary to perform discretization in both space and time. For this purpose we discretize the space coordinates using a uniform mesh of spacing h, with grid nodes denoted by indices ij. Let ψij^n be the approximation to the solution ψ(ih, jh, nΔt), where Δt is the time step. The expression for ψij^(n+1) can be derived using the forward finite difference method:
ψij^(n+1) = ψij^n − Δt F |∇ψij^n|.    (3)
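For concreteness, the update of Eq. (3) can be written in a few lines of NumPy. The sketch below is only an illustration of the update rule, not the authors' implementation: it uses simple central differences for |∇ψ|, whereas a production level set solver would use the upwind schemes discussed in [1].

```python
import numpy as np

def evolve_step(psi, F, dt, h=1.0):
    """One explicit Euler step of Eq. (3): psi <- psi - dt * F * |grad(psi)|.

    psi is the level set function on a uniform 2D or 3D grid of spacing h;
    F may be a scalar or an array of the same shape as psi.  Central
    differences are used here for simplicity (not the upwind scheme of [1]).
    """
    grads = np.gradient(psi, h)                      # derivative along each axis
    grad_mag = np.sqrt(sum(g ** 2 for g in grads))   # |grad(psi)|
    return psi - dt * F * grad_mag
```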
Let K be the mean curvature of the evolving front. K can be easily obtained as the divergence of the unit normal vector to the front, i.e.,

K = ∇ · (∇ψ/|∇ψ|) = (ψxx ψy² − 2ψx ψy ψxy + ψyy ψx²) / (ψx² + ψy²)^(3/2).    (4)
As reported in [1], the speed term F depends on the curvature K and is separated into a constant term F0 and the remainder F1(K), that is
F(K) = F0 + F1(K).    (5)
The constant term F0 causes the model to seek object boundaries and the curvature component F1 controls the regularity of the deforming shape. In practice, speed
F(K) = 1 − εK    (6)
is commonly used, where ε is the entropy condition, which regulates the smoothness of the curve; ε must be greater than zero [1]. To ensure that the propagating curve front stops in the vicinity of the desired object boundary, F is pre-multiplied by an image-dependent quantity kI [1]:
kI(x, y) = 1 / (1 + |∇(Gσ ∗ I(x, y))|^p),   p = 1, 2,    (7)
where Gσ ∗ I denotes the image convolved with a Gaussian smoothing filter of characteristic width σ. kI has values close to zero in regions of high image gradient (for example, possible edges) and close to unity in regions of relatively constant intensity. Clearly, the key task of this level set method is to design an appropriate speed function F that can drive the evolving front to the desired boundary. Unlike other deformable models, extending the level set method to 3D is straightforward [1]. In that case, ψ is a 4D scalar function, ψ : ℜ³ × ℜ⁺ → ℜ, which evolves over time, and its mean curvature can be expressed as:
K = [ψxx(ψy² + ψz²) + ψyy(ψx² + ψz²) + ψzz(ψx² + ψy²) − 2(ψxy ψx ψy + ψxz ψx ψz + ψyz ψy ψz)] / (ψx² + ψy² + ψz²)^(3/2).    (8)
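Equation (8) need not be expanded term by term in code: since K is the divergence of the normalised gradient, it can be computed directly, as in the following illustrative NumPy sketch (again using central differences, which is an assumption of ours rather than the paper's scheme).

```python
import numpy as np

def mean_curvature(psi, h=1.0, eps=1e-12):
    """Mean curvature K = div(grad(psi)/|grad(psi)|), equivalent to Eq. (8)
    for a 3D level set function psi on a uniform grid of spacing h."""
    grads = np.gradient(psi, h)
    mag = np.sqrt(sum(g ** 2 for g in grads)) + eps  # avoid division by zero
    # divergence of the unit normal: sum over i of d/dx_i (psi_{x_i} / |grad psi|)
    return sum(np.gradient(g / mag, h)[i] for i, g in enumerate(grads))
```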
Due to intrinsic features of ultrasound images, such as noise, speckle, shadowing, and low contrast, the boundary features of the object are usually not salient enough and the image gradient information is weak. These cause the “boundary leaking” problem when we apply the level set method to detect the 3D prostate surface, as shown in Fig. 1. In this test, F = 1 − 0.375K, the Gaussian smoothing filter characteristic width σ is set to 0.75, and a Gaussian kernel width of 5 has been used.
Fig. 1. An example of the “boundary leaking” problem of the level set method. White curve: the boundary detected by the level set method; green curve: the manually outlined boundary; yellow arrow: boundary leaking caused by shadowing; red arrow: boundary leaking caused by low contrast.
Motivated by the region-based strategy for active contour models as reported in [15, 16], we integrated the region information instead of the image gradient into the level set method to improve the model performance. First we develop a fast discrimination method to extract the prostate region according to the intensity likelihood as:
R(x, y, z) = 0 if 2 < [max(In) − min(In)] < givenThreshold, and R(x, y, z) = 1 otherwise,    (9)
where max(In) and min(In) stand for the maximal and minimal intensity, respectively, in an n³ sliding window. We take the lower bound as 2 to exclude the black background in the images. In our algorithm, we take n = 5 empirically; the givenThreshold, set by the user, can take different values for different images, and we denote it RT for simplicity in the following sections.
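The region indicator of Eq. (9) amounts to thresholding the local intensity range in an n³ window; a minimal sketch, assuming a 3D intensity volume and using SciPy's rank filters, could look as follows (the parameter name threshold is ours, standing in for RT).

```python
import numpy as np
from scipy.ndimage import maximum_filter, minimum_filter

def prostate_region(volume, threshold, n=5):
    """Region indicator R of Eq. (9): 0 inside the candidate prostate
    region, 1 elsewhere.  n = 5 as in the paper, and the lower bound 2
    excludes the black image background."""
    local_range = maximum_filter(volume, size=n) - minimum_filter(volume, size=n)
    inside = (local_range > 2) & (local_range < threshold)
    return np.where(inside, 0.0, 1.0)
```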
This function is then incorporated into the level set method and forms a new speed function as:
Fnew = F0 · (1 − R) − εK.    (10)
Fnew no longer depends on the image gradient; it depends only on the prostate region and the curvature of the evolving front. Consequently, Fnew has the following properties: 1. R = 0, then
Fnew = F0 − εK.    (11)
This means that inside the prostate region, the evolving front deforms according to a speed of the form in equation (6); no image constraints are involved. 2. R = 1, then
Fnew = −εK.    (12)
This means that outside the prostate region, the evolving front shrinks. The interaction of these two properties eventually attracts the evolving front to the desired boundary. In our algorithm, F0 = ε = 2 is selected empirically, and the Gaussian filter is replaced by a median filter, which removes speckle noise efficiently while preserving boundary information. All input images to our algorithm were collected at Gleneagles Hospital, Singapore, using a Voluson 530D scanner.
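Putting the pieces together, one evolution step with the region-driven speed of Eq. (10) could be sketched as below, reusing the hypothetical evolve_step and mean_curvature helpers from the earlier sketches; F0 = ε = 2 as in the paper.

```python
def evolve_with_region(psi, R, dt, F0=2.0, eps=2.0, h=1.0):
    """One explicit step with F_new = F0 * (1 - R) - eps * K (Eq. 10).

    Inside the region (R = 0) the front expands as in Eq. (11); outside
    (R = 1) it shrinks under curvature alone, as in Eq. (12)."""
    K = mean_curvature(psi, h)
    F_new = F0 * (1.0 - R) - eps * K
    return evolve_step(psi, F_new, dt, h)
```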
3 Experimental Results
We applied the proposed method to eight 3D TRUS images to detect the prostate surface. Figure 2 shows one of the results detected by our new approach. The 3D image size is 256×256×256. The initial and final 3D shapes are shown on the right, and the slices shown as transverse, sagittal, and coronal views respectively are selected by the cutting plane shown in (f). For this 3D image, we set RT to 48. To validate the effectiveness of our new approach, we first compare the above result with the manually outlined contours (drawn by an expert from Singapore General Hospital) in the cross-sectional images in Figure 3. There is good agreement between the detected contours and the manual contours, except for some divergence on the right sides of the sagittal and coronal views.¹ The new method was then applied to other patients' images. One of the tests is shown in Figure 4, where the image size is 250×174×236 and RT is set to 32. Although the image quality is quite poor, our new approach successfully detected most of the desired boundaries.
¹ The validation presented in this paper is purely subjective; a quantitative assessment method is still under development, as discussed in Section 4.
Fig. 2. 3D prostate surface and its cross-sectional contours detected by our new approach. (a) Transverse view, (b) Sagittal view, (c) Coronal view [(1) A slice of the original image, (2) Slice with the initial contour overlaid, (3) Slice with the final contour overlaid.], (d) Initial surface, (e) Final surface, (f) Final surface with cutting planes.
Fig. 3. The comparison between detected contours and manual contours in 2D images. (a) Cross-sectional slices, (b) Detected contours, (c) Manual contours, (d) Contour comparison.
Fig. 4. An example of algorithm tests on different patients’ images. (a) Transverse view, (b) Sagittal view, (c) Coronal view [(1) A slice of the original image, (2) Final detected contour, (3) Final detected contour with manual contour overlaid.], (d) Initial surface, (e) Final surface, (f) Final surface with cutting planes.
4 Discussion
In our algorithm, parameters such as F0, ε, and the sliding window size n for region discrimination are pre-determined experimentally and kept unchanged across runs on different images. In these runs, we take a small sphere placed at the image center as the initial surface to start the detection procedure; thus only one parameter, RT, has to be set by the user, and consequently human involvement is minimized. However, because the prostate region discrimination in our algorithm is still simple and coarse, the final detected results are sensitive to the choice of RT. This problem might be solved by including statistical information about the image intensity distribution; an important piece of future work on this research is therefore to investigate the image intensity distributions. We also plan to introduce a priori knowledge of the prostate shape to constrain the surface deformation and improve the algorithm's accuracy [17]. A quantitative assessment method should also be included to evaluate the experimental results.
References

1. Sethian, J.A.: Level Set Methods and Fast Marching Methods: Evolving Interfaces in Geometry, Fluid Mechanics, Computer Vision, and Materials Science. Cambridge University Press (1996)
2. Aarnink, R.G., Pathak, S.D., de la Rosette, J.J.M.C.H., Debruyne, F.M.J., Kim, Y., Wijkstra, H.: Edge detection in prostatic ultrasound images using integrated edge maps. Ultrasonics 36 (1998) 635–642
3. Lee, J.Y., Chen, C.H., Hsien, H.B., Yang, D.L., Sun, Y.N.: 3D reconstruction of prostate from transrectal ultrasound images. In: Proceedings of the International Conference & Exhibition on Electronic Measurement & Instrumentation, Shanghai, China (1995) 19–26
4. Chen, C.H., Lee, J.Y., Yang, W.H., Chang, C.M., Sun, Y.N.: Segmentation and reconstruction of prostate from transrectal ultrasound images. Biomed. Eng. Appl. Basis Comm. 8(3) (1996) 287–292
5. Pathak, S.D., Chalana, V., Haynor, D.R., Kim, Y.: Edge-Guided Boundary Delineation in Prostate Ultrasound Images. IEEE Transactions on Medical Imaging 19(12) (2000) 1211–1219
6. Richard, W.D., Keen, C.G.: Automated texture-based segmentation of ultrasound images of the prostate. Comput. Med. Imag. Graph. 20(3) (1996) 131–140
7. Ladak, H.M., Mao, F., Wang, Y., Downey, D.B., Steinman, D.A., Fenster, A.: Prostate segmentation from 2D ultrasound images. Med. Phys. 27(8) (2000) 1777–1788
8. Ghanei, A., Soltanian-Zadeh, H., Ratkewicz, A., Yin, F.-F.: A three-dimensional deformable model for segmentation of human prostate from ultrasound images. Med. Phys. 28(10) (2001) 2147–2153
9. Knoll, C.J., Alcaniz-Raya, M.L., Monserrat, C., Grau Colomer, V., Juan, M.: Multiresolution segmentation of medical images using shape restricted snakes. SPIE Med. Imag. 3661 (1999) 222–233
10. Prater, J.S., Richard, W.D.: Segmenting ultrasound images of the prostate using neural networks. Ultrason. Imag. 14 (1992) 159–185
11. Wu, R.Y., Ling, K.V., Ng, W.S.: Automatic Prostate Boundary Recognition in Sonographic Images Using Feature Model and Genetic Algorithm. Journal of Ultrasound in Medicine 19(11) (2000) 771–782
12. Kass, M., Witkin, A., Terzopoulos, D.: Snakes: active contour models. International Journal of Computer Vision 1(4) (1988) 321–331
13. Staib, L.H., Duncan, J.S.: Model-based deformable surface finding for medical images. IEEE Transactions on Medical Imaging 15(5) (1996) 720–731
14. Sederberg, T.W., Parry, S.R.: Free-form deformation of solid geometric models. Computer Graphics (Proceedings of SIGGRAPH) 20(4) (1986) 151–160
15. Ronfard, R.: Region-based strategies for active contour models. International Journal of Computer Vision 13(2) (1994) 229–251
16. Poon, C.S., Braun, M.: Image segmentation by a deformable contour model incorporating region analysis. Phys. Med. Biol. 42(9) (1997) 1833–1841
17. Leventon, M., Faugeras, O., Grimson, W., Wells, W.: Level set based segmentation with intensity and curvature priors. In: Workshop on Mathematical Methods in Biomedical Image Analysis Proceedings (2000) 4–11
A Bayesian Approach to in vivo Kidney Ultrasound Contour Detection Using Markov Random Fields

Marcos Martín and Carlos Alberola

ETSI Telecomunicación, Universidad de Valladolid, Cra. Cementerio s/n, 47011 Valladolid, Spain
{marcma,caralb}@tel.uva.es
Abstract. Automatic detection of structures in medical images is of great importance for the implementation of tools that can obtain accurate measurements for an eventual diagnosis. In this paper, a new method for the creation of such tools is presented. We focus on in vivo kidney ultrasound, a target on which classical methods fail due to the inherent difficulty of this imaging modality and organ. The proposed method operates on every slice by detecting kidney contours within a probabilistic Bayesian framework. We make use of Markov Random Field ideas to model the problem and find the solution. An easy-to-use computer interface to the model is also presented.
1 Introduction
Neonatal hydronephrosis is a disease of great relevance in the fetus and in newborn children. It consists of an enlargement of the renal pelvis and calyces. Its early diagnosis is a common task thanks to the use of echography, both during pregnancy and in the newborn, and it is becoming the most frequent prenatal urologic diagnosis. Echographical analyses permit determining whether this or other urological diseases are present; the current inspection process is as follows: after scanning an adequate slice, the specialist manually adjusts an ellipse, helped by cursors, to the guessed external boundary of the kidney. The system approximates the kidney volume as the volume of the ellipsoid generated by rotating the sketched ellipse about its main axis. The pelvis volume is determined similarly. From those approximations, the specialist reaches a diagnosis using tabulated tendency data. An automatic or semiautomatic segmentation tool would, in our opinion, be of valuable importance, not only for determining the kidney and pelvis contours by a more accurate method than the one described above, but also for automatically propagating those contours to the rest of the slices of the scanned volume. With such an approach, the volume estimates are expected
The authors acknowledge the Comisión Interministerial de Ciencia y Tecnología for research grants 1FD97-0881 and TIC2001-3808-C02-02, and the Junta de Castilla y León for research grant VA91/01.
to be better in terms of accuracy, and also to relax the need for the specialist to find a slice of sufficient quality. In this paper, we describe how such a tool can be developed. After reviewing the literature on this and related topics, we highlight the deficiencies of previously proposed methods; this is done in Sect. 2. In Sect. 3 we describe our proposal. Sect. 4 is dedicated to showing how the computer application is used; it will be obvious from the description that no technical knowledge is needed to use the application. Finally, Sect. 5 describes some experiments carried out on real echographical data to validate the method proposed here.
2 State of the Art
Several approaches are described in the literature to improve measurements of the volume enclosed within a kidney with respect to that obtained by the ellipse method. In [15] the authors describe a wholly manual segmentation method. In that approach, the contours are sketched manually on a slice-by-slice basis on ultrasound data acquired from an in vivo kidney. From those contours, the volume is calculated by means of voxel counting. Such a measurement is proven to be better than the one obtained with the ellipse method. In [12] the authors propose an in vitro semiautomatic segmentation method; kidney contours are fairly obvious due to the organ-liquid interface in which the kidney is submerged. Classical approaches to image segmentation are only valid under extremely controlled situations. In most cases, high-level techniques which exploit prior knowledge about the shape of the structures to be segmented out are the only feasible solution. Active contours (snakes) are one of the most successful approaches [9]. The solution contour is determined by finding optimal contours in a neighborhood of the initial guess. The optimality of the solution in a real setting, together with the dependence on the initial guess, is a matter of discussion. One alternative to the optimization methods proposed in [9] is to reformulate the problem in a Bayesian probabilistic framework and make use of Markov Random Fields (MRFs) [6]. In that framework, prior distributions model our knowledge about the contours, and data-driven likelihood terms describe the image statistics related to the contour in search. The maximum a posteriori (MAP) estimation under the MRF models gives rise to the optimal contour. Several proposals that make use of this philosophy have already been reported. In [4] an automatic algorithm was developed for the detection of the left ventricular (LV) cavity boundaries in sequential 2D echocardiograms. This work was a pioneer in defining a contour model in polar coordinates, as shown in Fig. 1(a). From a given center (Cx, Cy) the plane is discretized in polar coordinates, which gives rise to J contour points ρj (the radii) at equispaced angular positions θj. Each contour point ρj can only take one out of K values rk. In the approach of [4] a region of interest is defined beginning with an ellipse adjusted using the Hough transform. The field is defined by means of image borders, contour smoothness, maximal volume, and temporal continuity. The optimal contour
Fig. 1. (a) Classical contour model. (b) Proposed contour deformation.
is found by using the Simulated Annealing (SA) algorithm [6] on the induced MRF. The parameters are determined experimentally. The approach of [5] is very similar to [4], but it is developed for infrared images. The prior model is rather heuristic and the likelihood is based on the image gradient. The model weights are fixed by means of a trial-and-error method. A further step is [3], in which the model of [4] is modified for the detection of the LV boundaries in angiographic images. The prior model is based on contour smoothness and the likelihood assumes a Gaussian distribution. The approach is Bayesian; they use the Iterated Conditional Modes (ICM) algorithm [1] for optimization, and the parameters are estimated from image data. In [13] the contour model is represented in Cartesian coordinates, where the number of nodes is random. The optimal contour is given by a Bayesian estimation assuming a fractal measure as the prior model and a likelihood based on a Gaussian distribution and the image gradient. The model is not strictly an MRF; instead, a Markov chain is constructed whose limiting distribution is the MAP estimate (a modified version of the SA algorithm is proposed). Several results are presented for LV echocardiograms and brain MRI. The time taken by the algorithm is unacceptable in a clinical setting. In [2] the authors make use of the results of [3,4] for the detection of the internal and external boundaries of the LV cavity in sequences of echocardiograms. The probabilistic model captures the heart morphology and the physical image generation mechanisms. They assume a Rayleigh distribution for the image with data-driven parameters. For the MAP estimation they developed a suboptimal iterative multigrid dynamic programming (IMDP) algorithm. Recently, the model proposed in [2] has been used for segmenting 3D intravascular ultrasonic images [7]. Two previous approaches to our model have already been published. In [10] a sequence of echographies from an in vitro fetus was segmented using active contours with several new energy terms. The optimization approach was based on the relaxation labeling method [8], which provides suboptimal solutions. In [11] the problem of object detection in speckle images was addressed. A regularized maximum likelihood (ML) method was used to determine the contour of the object. A Beta distribution was used to model the image intensity distribution.
That distribution was compared with others proposed in the literature [14] and, in addition to its simplicity, the results obtained were satisfactory. In both approaches the contour representation was the one shown in Fig. 1(a). The segmentation problem for in vivo kidney ultrasound volumes is rather involved and must take into account several aspects not dealt with in the above-mentioned proposals. The kidney is a soft organ, and its shape and position may vary with time. The kidney interior is by no means homogeneous, due to several distinguishable structures. There is no clear difference between the kidney and the surrounding tissues. Only some parts of the kidney contour show a significant image gradient. In some scans, the kidney might be partially occluded. All these issues highlight the need for a novel method that can solve the problem.
3 The Model
We propose an MRF of deformations with respect to a predefined template of a kidney contour. Specifically, the template is manually or automatically adjusted to the image and then smoothly deformed, following the outstanding boundaries in the ultrasound image and guided by an empirical speckle distribution model as well. The smoothness constraint is imposed by the range of allowable deformation values and also by the prior distribution, which enforces smoothness of the first and second order derivatives. The final contour is given as the MAP estimate.

3.1 The Prior Function

A manually adjusted template (ρ, θ) in polar coordinates is assumed to be available (see Sect. 4). Assume it consists of J points at fixed angles θj from a given center. From that template we define a deformation zone as shown in Fig. 1(b). The deformation vector will be denoted by dρ. This vector is assumed to be a sample of a prior MRF dω. Each dρj can take on values in the finite set dΛ. The field of sites will be denoted by S = {1, . . . , J} and the configuration space by dΩ = dΛ^S; therefore, each configuration dρ belongs to dΩ. The finite set dΛ has K equispaced points in the interval [−dr_max, dr_max]. We have defined a homogeneous and periodic prior neighborhood system ∂, given by ∂(j) = {j − 2, j − 1, j + 1, j + 2}. The prior MRF distribution Π defined for dω is induced by a neighborhood potential V with two types of functions. The first type is Uj(dρ) = ϑ1 Ψ(dρj−1 − dρj+1); these functions make use of two-site cliques and impose first-order-derivative smoothness. The second type is Vj(dρ) = ϑ2 Ψ(dρj−1 − 2dρj + dρj+1); in this case, three-site cliques are used and these functions impose second-order-derivative smoothness [9]. The function Ψ(x) must be even and monotonic for x ≥ 0. The prior local field characteristics are
Π(dωj = dρj | dω∂(j) = dρ∂(j)) = (1/Zj) exp{−Uj−1(dρ) − Uj+1(dρ) − Vj−1(dρ) − Vj(dρ) − Vj+1(dρ)},    (1)

with Zj a normalizing constant.
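As an illustration of this prior, the total prior energy of a deformation vector can be evaluated as below; Ψ(x) = |x| is a placeholder choice of ours, since the paper only requires Ψ to be even and monotonic for x ≥ 0.

```python
import numpy as np

def prior_energy(drho, theta1, theta2, psi=np.abs):
    """Sum of the clique potentials U_j and V_j over a periodic contour.

    U_j = theta1 * Psi(drho_{j-1} - drho_{j+1})             (1st-derivative term)
    V_j = theta2 * Psi(drho_{j-1} - 2*drho_j + drho_{j+1})  (2nd-derivative term)
    """
    prev, nxt = np.roll(drho, 1), np.roll(drho, -1)  # periodic neighbours
    U = theta1 * psi(prev - nxt)
    V = theta2 * psi(prev - 2.0 * drho + nxt)
    return U.sum() + V.sum()
```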
3.2 The Likelihood Function
The ultrasonic image will be denoted by I. From that image we determine a non-linearly compressed gradient image denoted by B. The probability density function of the data conditioned on the deformation is

f(I, B | dω = dρ) ∝ ∏_{j=1}^{J} f(I(kj, j) | dωj = dr_kj) f(B(kj, j) | dωj = dr_kj),    (2)
where the indexes kj are chosen such that {dρ1, . . . , dρJ} = {dr_k1, . . . , dr_kJ}. I(k, j) and B(k, j) are subimages of I and B, respectively, whose set of pixels β(k, j) lies near the contour deformation at angular position j and distance position k. We define a log-likelihood function as

LB(k, j) = −(1/|β(k, j)|) ∑_{(m,n)∈β(k,j)} B(m, n) ∝ ln f(B(k, j) | dωj = dr_k).    (3)
Using a Beta distribution [11] with parameters depending on the pixel positions with respect to the contour deformation, we also define the log-likelihood function

LI(k, j) = (1/|β^i(k, j)|) ∑_{(m,n)∈β^i(k,j)} [(1−α1^i(j)) ln I(m, n) + (1−α2^i(j)) ln(1−I(m, n))]
         + (1/|β^e(k, j)|) ∑_{(m,n)∈β^e(k,j)} [(1−α1^e(j)) ln I(m, n) + (1−α2^e(j)) ln(1−I(m, n))]
         ∝ ln f(I(k, j) | dωj = dr_k),    (4)
where β^i(k, j) and β^e(k, j) are partition subsets of β(k, j) containing the internal and external sets of pixels with respect to the contour deformation point j at position k, respectively. The function LI(k, j) has 4J Beta shape parameters (α1^i(j), α2^i(j), α1^e(j), α2^e(j)), which can be estimated directly from the ultrasound image I [11].
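Each interior or exterior term of Eq. (4) is an average of Beta log-densities over the pixels on one side of the candidate contour. A minimal sketch, assuming intensities already normalised to (0, 1), is:

```python
import numpy as np

def beta_loglike(pixels, a1, a2):
    """One side's contribution to L_I(k, j) in Eq. (4): the mean of
    (1 - a1) ln I + (1 - a2) ln(1 - I) over the pixel set beta(k, j).
    a1, a2 are the Beta shape parameters alpha_1(j), alpha_2(j)."""
    p = np.clip(np.asarray(pixels, dtype=float), 1e-6, 1 - 1e-6)  # avoid log(0)
    return np.mean((1.0 - a1) * np.log(p) + (1.0 - a2) * np.log(1.0 - p))
```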
3.3 The Posterior Model
Bayes' theorem allows us to write

P(dω = dρ | I, B) ∝ Π(dω = dρ) ∏_{j=1}^{J} f(I(kj, j) | dωj = dr_kj) f(B(kj, j) | dωj = dr_kj),    (5)

which is proportional to

Π^p(dω = dρ) = (1/Z^p) Π(dω = dρ) ∏_{j=1}^{J} exp(−ϑ3 LI(kj, j) − ϑ4 LB(kj, j)),    (6)
where Π^p is a posterior MRF distribution induced by a posterior neighborhood potential V^p defined on the same ∂, and Z^p is a normalizing constant. The posterior
potential functions are of three types: Uj^p(dρ) = Uj(dρ), Vj^p(dρ) = Vj(dρ), and Wj^p(dρ) = ϑ3 LI(kj, j) + ϑ4 LB(kj, j), with ϑ^p = (ϑ1, ϑ2, ϑ3, ϑ4) the vector of posterior parameters. The potential functions Wj^p are for one-site cliques and impose image restrictions [9]. The posterior local field characteristics are

Π^p(dωj = dρj | dω∂(j) = dρ∂(j)) = (1/Zj^p) exp{−U^p_{j−1}(dρ) − U^p_{j+1}(dρ) − V^p_{j−1}(dρ) − V^p_j(dρ) − V^p_{j+1}(dρ) − W^p_j(dρ)},    (7)

with Zj^p a normalizing constant. The solution contour is the one that maximizes this posterior, which can be found with the SA algorithm [6].
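A schematic of the SA maximisation, written as a Gibbs sampler over the sites with decreasing temperature, is given below; posterior_local is a hypothetical stand-in for the unnormalised local characteristic of Eq. (7), and the cooling schedule is illustrative rather than the authors'.

```python
import numpy as np

def simulated_annealing(d_lambda, J, posterior_local, n_iter=250, T0=1.0):
    """Sample each site from its tempered local characteristic, cooling as
    iterations proceed, so that the field settles near the MAP estimate."""
    drho = np.zeros(J)
    rng = np.random.default_rng(0)
    for it in range(n_iter):
        T = T0 / np.log(2.0 + it)                    # logarithmic cooling
        for j in range(J):
            probs = np.array([posterior_local(j, v, drho) ** (1.0 / T)
                              for v in d_lambda])
            drho[j] = rng.choice(d_lambda, p=probs / probs.sum())
    return drho
```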
4 Running the Application
The proposed method is driven through an easy-to-use computer application; this environment allows the end user to interact with the data, supervise results, manually adjust parameters if desired, and so forth.
Fig. 2. (a) A canonical template is drawn. (b)-(d) Template adjustment to the slice.
As previously described, the model needs a template from which deformations start taking place. This model can be created once and stored for future use. Note that the user does not need to draw a template every time a segmentation is needed, but just once. This model could even be built-in; however, we have chosen the former option just to give more flexibility to the end user. The template definition is very simple: on a chosen slice, the center, together with several points around the kidney contour, are clicked in with the mouse. This set of points is interpolated to get a smooth contour. Fig. 2(a) shows the
Bayesian Approach to in vivo Kidney Ultrasound Contour Detection
403
progress of this operation. This is the only information needed to create the polar representation of the template that will be used in the segmentation process. As far as segmentation is concerned, the procedure is fairly straightforward as well: the user has to choose a candidate slice from which the segmentation process is to be triggered. The application then superimposes a normalized version of the stored template (see Fig. 2(b)). Then, the user manually adjusts two hot points in the template to the actual contour of the kidney (see Figs. 2(c)-(d)). At this point, the segmentation procedure described in the paper is launched. To automatically evolve to further slices, the original template is affinely deformed according to the contour solution of the current slice and superimposed onto the consecutive slice, thus forcing smoothness along the depth coordinate.
5 Some Experimental Results
Two experiments illustrate and validate the segmentation capability of the proposed model. In the first one, we employed a series of 2D echographies of a healthy adult kidney. The template is adjusted to slice number 101 (out of 126). We used J = 70 rays, K = 15 points per ray, and a deformation zone given by dr_max = 20 pixels. The vector of posterior parameters of the energy function is ϑ^p = (25, 25, 5, 10). The SA algorithm was run for 250 iterations. Fig. 3(a) shows slice number 89 and Fig. 3(d) the resulting segmented contour. Fig. 3(b) shows slice number 122 and Fig. 3(e) the corresponding segmented contour.
Fig. 3. (a)-(f) Experimental results.
In the second experiment we used data belonging to another healthy adult patient. The template was adjusted to slice number 30. The conditions of the experiment are as before, except for the number of iterations of the SA algorithm, which is 200, and for the vector of posterior parameters of the energy
function, which is now ϑ^p = (20, 20, 7, 12). Fig. 3(c) shows slice number 20, in which a severe occlusion makes the kidney contour unclear for visual inspection. Fig. 3(f) shows the segmented contour.
6 Conclusions and Further Work
The method proposed here is a step towards an eventual objective procedure that would make diagnosis less dependent on the echographist's skills. Prior knowledge about the average kidney, together with empirical knowledge about the echographies, has been brought together in a probabilistic framework. Segmentation has therefore been posed as an estimation problem for which the existence of an optimal solution is guaranteed. Several issues remain unaddressed, though. Internal structures of the kidney are still unsegmented. However, once the external contour is found, the internal structures are confined in space, which makes a subsequent search more tractable.
References

1. Besag, J.: On the Statistical Analysis of Dirty Pictures (with Discussion). J. Roy. Stat. Soc. B48 (1986) 259–302
2. Dias, J., Leitao, J.: Wall Position and Thickness Estimation from Sequences of Echocardiographic Images. IEEE TMI 15 (1996) 25–38
3. Figueiredo, M., Leitao, J.: Bayesian Estimation of Ventricular Contours in Angiographic Images. IEEE TMI 11 (1992) 416–429
4. Friedland, N., Adam, D.: Ventricular Cavity Boundary Detection from Sequential Ultrasound Images Using Simulated Annealing. IEEE TMI 8 (1989) 344–353
5. Friedland, N., Rosenfeld, A.: Compact Object Recognition Using Energy-Function-Based Optimization. IEEE TPAMI 14 (1992) 770–777
6. Geman, S., Geman, D.: Stochastic Relaxation, Gibbs Distributions and the Bayesian Restoration of Images. IEEE TPAMI 6 (1984) 721–741
7. Haas, C., et al.: Segmentation of 3D Intravascular Ultrasonic Images Based on a Random Field Model. Ultr. Med. Biol. 26 (2000) 297–306
8. Hummel, R., Zucker, S.: On the Foundations of Relaxation Labeling Processes. IEEE TPAMI 5 (1983) 267–287
9. Kass, M., et al.: Snakes: Active Contour Models. I. J. Comp. Vis. 2 (1988) 321–331
10. Martín, M., et al.: Energy Functions for the Segmentation of Ultrasound Volume Data using Active Rays. Proc. IEEE ICASSP, Istanbul, Turkey (2000) 2274–2277
11. Martín, M., et al.: Maximum Likelihood Contour Estimation Using Beta-Statistics in Ultrasound Images. Proc. IEEE ISPA, Pula, Croatia (2001) 207–212
12. Matre, K., et al.: In Vitro Volume Estimation of Kidneys Using 3D Ultrasonography and a Position Sensor. Eur. J. Ultr. 10 (1999) 65–73
13. Storvik, G.: A Bayesian Approach to Dynamic Contours through Stochastic Sampling and Simulated Annealing. IEEE TPAMI 16 (1994) 976–986
14. Tuthill, T., et al.: Deviation from Rayleigh Statistics in Ultrasonic Speckle. Ultr. Imag. 10 (1988) 81–89
15. Yu, C.H., et al.: Fetal Renal Volume in Normal Gestation: A 3D Ultrasound Study. Ultr. Med. Biol. 26 (2000) 1253–1256
Level Set Based Integration of Segmentation and Computational Fluid Dynamics for Flow Correction in Phase Contrast Angiography

Masao Watanabe 1,2, Ron Kikinis 1, and Carl-Fredrik Westin 1

1 Laboratory of Mathematics in Imaging, Brigham and Women's Hospital, Harvard Medical School, Boston MA, USA
{watanabe,kikinis,westin}@bwh.harvard.edu
2 Dept. of Mechanical Engineering, Kyushu University, Fukuoka, Japan
[email protected]
Abstract. A novel approach to correct flow data from phase contrast angiography (PCA) is presented. The method is based on combining computational fluid dynamics (CFD) and segmentation in a level set framework. The PCA-MRI velocity data is used in a partial differential equation (PDE) based level set method for vessel segmentation, and in a second level set equation that solves for a physically meaningful flow. The second level set is implemented using the ghost fluid method, where the MR data defines initial and boundary conditions. The segmentation and CFD systems are integrated simultaneously to provide a robust method yielding a physically correct velocity and an optimal vessel geometry. The application of this system to both synthetic and clinical data is demonstrated and its validity is discussed.
1 Introduction
In 3D phase contrast angiography (PCA) sequences, the velocities of blood flow in three orthogonal directions are mapped to phase differences, controlled by a variable known as “velocity encoding” or venc [1]. This sequence results in phase wrapping in areas where the flow speed exceeds the venc. As an additional complication, signal quality typically deteriorates because of phase dispersion from turbulence and vortices stemming from pulsation or vessel branching. These artifacts impede accurate flow quantification, especially with respect to flow direction. The need to establish vessel diameter is also critical, but because PCA-MRI sequences produce very weak MR signals in the neighborhood of the vessel wall, signal resolution in this region predictably degenerates. Computational approaches based on MR segmentation have previously been applied to arterial biomechanics [2], hemodynamics of carotid artery bifurcations [3], and general vascular segmentation using level set methods [4]. Combined computational fluid dynamics (CFD) and MRI studies have been conducted on the reconstruction of blood flow patterns in a human carotid bifurcation [5]. These studies generally employ complicated, unstructured CFD grid systems
constructed from medical images [6], and use the MRI data only for the segmentation, i.e., grid system generation. The available velocity information in PCA-MRI is in general not applied in CFD studies of blood flow. We have developed a numerical scheme to model blood flow in vessels by solving the incompressible Navier-Stokes equation with vessel geometry segmented by a partial differential equation (PDE) based fast local level set method [7]. By implementing the level set Ghost Fluid Method (GFM) [8], we have effectively enforced a zero-velocity boundary condition on the vessel wall (the zero level set) without smearing physical properties near the wall. This approach has enabled us to use a simple structured computational grid. The improvement of velocity fields is verified for both synthetic and clinical data.
2 Numerical Formulation
For the purpose of characterizing blood flow in vessels, we have developed a numerical scheme to model incompressible fluid flow in a tubular flow path bounded by a solid, rigid wall. This approach enables us to define the solid tubular wall boundary as the interface between the incompressible fluid and the rigid solid, and to solve a stationary interface problem by applying the proven level set method. The level set method was originally developed by Osher and Sethian [9] as a simple and versatile method for computing and analyzing the motion of an interface in two or three dimensions, for example when computing two-phase incompressible Navier-Stokes flows [10]. However, the original level set method smears out both the density and the viscosity across the interface, in order to prevent spurious oscillatory solutions at the interface. As explained in [11], the original GFM was developed to solve this problem by populating cells next to the interface with “ghost values”, extrapolating values across the interface. In this work, we couple a high-accuracy incompressible Navier-Stokes solver, which combines the level set scheme with the projection method developed by Sussman et al. [12], with the GFM developed by Fedkiw et al. [8]. This has resulted in a stable zero-velocity boundary condition on the vessel wall.

2.1 Governing Equations
At each time step, we solve the following dimensionless evolution equations for the velocity and pressure:

∇ · u = 0,    (1)

∂u/∂t + u · ∇u = −∇p + (1/Re) ∇²u,    (2)

where t is time, u is velocity, and p is pressure. The dimensionless parameter Re in Eq. (2) is the Reynolds number (Re = ρLU/µ), where L and U are the characteristic length and velocity, respectively, and ρ and µ are the density and viscosity of blood, respectively. We used the values 1.055 × 10³ kg/m³ for ρ and 4.50 × 10⁻³ kg/(m·s) for µ.
2.2 Discretization and Time Integration
We follow the discretization methodology and time integration procedure developed by Sussman et al. [12]. This scheme implements an essentially non-oscillatory third-order method to evaluate the convection term and a fractional time step projection method, while enforcing that the continuity equation (1) is satisfied. These methods guarantee stability in high-velocity fields and robustness in complicated geometries, and give high accuracy without smearing the solution.

2.3 Solid Wall Conditions Using Ghost Cells
The ghost cells are defined in the solid-side neighborhood of the fluid-solid interface (i.e., the vessel wall) [8]. We can modify the pressure in the ghost cells using the isobaric fix technique [8], by defining the unit normal at every grid point as N = ∇φ/|∇φ| and then solving a partial differential equation for “constant extrapolation” in the normal direction:

∂p/∂τ + N · ∇p = 0.    (3)

We have developed the zero-velocity fix on the solid wall by a simple extension of the isobaric fix technique. We consider the new variable v = u/φ. First, the constant extrapolation of v is calculated in the direction of the normal to the solid wall in the neighborhood of the wall. Then the zero-velocity fix is completed by setting u = vφ. The partial differential equation governing v is the same as Eq. (3). These equations need to be solved only for a few τ steps to populate a narrow band of ghost cells.
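A few pseudo-time steps of the constant-extrapolation equation (3) suffice to populate the ghost band. The sketch below illustrates the idea with central differences; the actual GFM uses upwind differencing along the normal, so treat this purely as an illustration.

```python
import numpy as np

def constant_extrapolate(q, phi, dtau=0.5, n_steps=5, h=1.0):
    """Advect q along N = grad(phi)/|grad(phi)| for a few tau steps,
    i.e. solve dq/dtau + N . grad(q) = 0 as in Eq. (3).  For the
    zero-velocity fix, q would be v = u / phi, with u = v * phi set
    afterwards."""
    grads_phi = np.gradient(phi, h)
    mag = np.sqrt(sum(g ** 2 for g in grads_phi)) + 1e-12
    N = [g / mag for g in grads_phi]                 # unit normal components
    for _ in range(n_steps):
        grads_q = np.gradient(q, h)
        q = q - dtau * sum(n_i * g for n_i, g in zip(N, grads_q))
    return q
```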
2.4 Vessel Segmentation
The vessel segmentation was carried out by applying the PDE-based local level set method [7] to the T1W PCA-MRI velocity data. The reinitialization technique presented in [10] was used, where the Hamilton-Jacobi type equation

∂d/∂τ + S(d0)(|∇d| − 1) = 0    (4)

is solved to steady state, with the initial conditions

d(x, 0) = d0(x) = −2.0Δl if ||um|| ≥ δ,  and  d(x, 0) = d0(x) = +2.0Δl if ||um|| < δ,    (5)
(6)
408
M. Watanabe, R. Kikinis, and C.-F. Westin
as the initial conditions instead of Eq. (5), where d1 is the solution of Eq. (4) with Eq. (5). The choice of ξ, typically 0.0 < ξ < 1.0, will be discussed in a later section. The distance function d obtained by this procedure is employed as the level set function φ determining the flow path geometry for the flow field calculation. 2.5
Boundary Conditions
Calculations have been performed within a rectangular parallelepiped extracted from the original PCA-MRI data. On the surface of the calculation domain, we need to specify the boundary conditions for both velocity and pressure. The velocity boundary conditions are set equal to the PCA-MRI velocity data. Pressure boundary conditions for the cross section with maximum inlet velocity are set to zero. For the other inlet and all the outlet cross sections, the pressure gradient normal to the calculated boundary surface are set to vanish.
3
Results
3.1
Flow in a Tube: Poiseiulle Flow
We first calculated the flow field in a straight tube with circular cross section of constant radius. If pressure gradient along the tube is constant and known, the flow is known as a Poiseiulle flow [14]. We chose this flow to verify the validity of the zero-velocity fix procedure on the solid wall.
(a) Velocity
(b) Pressure
Fig. 1. Comparison between theoretical and numerical results for Poiseiulle flow.
Figure 1(a) shows the comparison of the velocity distribution between the theoretical (solid line) and the numerical (open circles) results. Clearly, both curves agree well, confirming that an enforcement of the zero-velocity condition on the tube wall is a reasonably accurate model. These results strongly suggest that the treatment of both isobar and zero-velocity fixes are valid and effective. Figure 1(b) shows the pressure distribution along the tube. Our numerical scheme employs the zero gradient condition for the outlet pressure, hence the pressure gradient cannot be constant. Discrepancies from theoretical result are
Level Set Based Integration of Segmentation
(a) Initial
409
(b) Calculated
Fig. 2. Numerical simulation with contaminated initial condition for Poiseiulle flow (left) and calculation result (right).
therefore inevitable. However, since these discrepancies are sufficiently small, we can verify that our method works well when simulating flow with tubular geometry. We next evaluated our method with both boundary and initial conditions contaminated with noise. Figure 2(a) shows the synthetic velocity field generated by adding Gaussian white noise to the three components of the velocity field. Figure 2(b) shows the calculated result after 10 time steps. It can clearly be observed that the velocity field is improved with respect to the flow direction, except for the inlet and outlet boundaries where no improvement can be achieved with the current approach. Furthermore, the vessel boundary has not been altered as it would have been with direct smoothing, for example. 3.2
PCA-MRI Data
In this section, we present results of applying our method to clinical data. The size of the data set is 256 × 256 × 60, with a field of view (FOV) of 240 mm, slice thickness of 1.5 mm, and velocity encoding of 40 cm/s. Figure 3 shows the MIP of this image.
(a) Original MR data
(b) Close up
Fig. 3. Maximum Intensity Projection (MIP) of a PCA-MRI data set.
410
M. Watanabe, R. Kikinis, and C.-F. Westin
(a) PCA-MRI
(b) Calculated
Fig. 4. Comparison between PCA-MRI and CFD velocity field for common carotid artery.
After the segmentation procedure described in the previous section, φξ in Eq. (6) is calculated with various ξ. This parameter controls the vessel wall location. The velocity field, uξ , for a given φξ and uM can then be calculated. We assume that the most appropriate ξ, for a given set of PCA-MRI velocity data, minimizes the discrepancy between the PCA-MRI data and the calculated results. We employ the following expression for this discrepancy, e; e= ||uM − uξ ||/ ||uM || . The first data set is a section of the common carotid artery (shown as “A” in Fig. 3(b)). Table 1. Level set correction term for a section of the common carotid artery. ξ e
0.0 0.2 0.3 0.4 0.5 0.6 0.7 0.8 1.0 1.5 0.580 0.534 0.516 0.490 0.468 0.451 0.456 0.469 0.502 0.738
The flow field in a bending vessel geometry is calculated. This flow is chosen for the test of both stability and robustness of the method, since the geometry is rather easily segmented. In Table 1, the effect of the level set correction term, ξ, on the velocity calculations is shown. We chose ξ = 0.6, since the discrepancy e is minimum, and calculated results are shown in Fig. 4. It can be observed that the flow field is significantly improved, especially the direction of velocity vectors which are now naturally aligned along the vessel direction. Notice also that speeds in the original data set, ||uM ||’s, tend to be greater than those in the calculation, ||uξ ||’s, around both elbow and outlet regions. Considering the continuity equation (1), and the velocity distribution around the elbow region shown in Fig. 4(a), it is most possible that the segmentation process provided a thicker vessel diameter around the elbow region. This result also suggests that modification of outlet boundary conditions may provide an improved velocity distribution.
Level Set Based Integration of Segmentation
(a) PCA-MRI
411
(b) Calculated
Fig. 5. Comparison between PCA-MRI and CFD velocity field for bifurcation of basilar and vertebral arteries.
The second data set is the bifurcation region of the basilar artery and vertebral arteries (shown as “B” in Fig. 3(b)). This is chosen in order to test a more complicated flow than the previous one. The effect of the level set correction term, ξ, on the resulting flow is listed in Table 2. Table 2. Level set correction term for bifurcation. ξ e
0.0 0.1 0.2 0.3 0.4 0.5 0.6 0.574 0.487 0.482 0.494 0.500 0.509 0.550
It should be emphasized that the optimal value of ξ depends on both vessel geometry and the PCA-MRI signal intensity distribution. The numerical results with ξ = 0.2 are shown in Fig. 5. Notice that the result is smooth and stable, even with the relatively small amount of data points in our example. Close to the bifurcation, erroneous velocity can be observed, possibly due to phase wrapping, in Fig. 5(a). These errors are successfully suppressed as shown in Fig. 5(b). Considering the strength of velocity from the right vessel at the bifurcation, it is also possible that the vessel diameter is overestimated, and this information can be employed for the re-segmentation of the vessel.
4
Conclusion
A novel correction procedure of PCA-MRI velocity data has been developed, by coupling an incompressible Navier-Stokes equation solver with projection level set GFM, to a PDE-based fast local level set vessel segmentation method.
412
M. Watanabe, R. Kikinis, and C.-F. Westin
Applying this procedure to both synthetic and clinical data, significant improvements of the blood velocity field could be observed, such as a smooth velocity distribution aligned along the vessels and the removal of burst or error vectors. This procedure also opens possibilities for improved vessel segmentation. The authors are aware of the necessity of more quantitative validation of this procedure, for example using a flow phantom.
Acknowledgments

This work is supported in part by the International Academic Exchange Foundation of the Faculty of Engineering, Kyushu University (MA), and by CIMIT and NIH grant P41-RR13218 (RK, CFW).
References

1. Leidholdt, E.M. Jr., Bushberg, J.T., Seibert, J.A., Boone, J.M.: The Essential Physics of Medical Imaging. Williams and Wilkins (1994)
2. Vorp, D.A., Steinman, D.A., Ethier, C.R.: Computational Modeling of Arterial Biomechanics. Comput. Sci. Eng. 3(5) (2001) 51–64
3. Milner, J.S., Moore, J.A., Rutt, B.K., Steinman, D.A.: Hemodynamics of Human Carotid Artery Bifurcations: Computational Studies with Models Reconstructed from Magnetic Resonance Imaging of Normal Subjects. J. Vascular Surgery 28(1) (1998) 143–156
4. Lorigo, L.M., Faugeras, O.D., Grimson, W.E.L., Keriven, R., Kikinis, R., Nabavi, A., Westin, C.-F.: CURVES: Curve Evolution for Vessel Segmentation. Medical Image Analysis 5 (2001) 195–206
5. Long, Q., Xu, X.Y., Ariff, B., Thom, S.A., Hughes, A.D., Stanton, A.V.: Reconstruction of Blood Flow Patterns in a Human Carotid Bifurcation: A Combined CFD and MRI Study. J. Mag. Res. Imag. 11 (2000) 299–311
6. Cebral, J.R., Löhner, R.: From Medical Images to CFD Meshes. Proceedings of the 8th International Meshing Roundtable (1999) 321–331
7. Peng, D., Merriman, B., Osher, S., Zhao, H., Kang, M.: A PDE-Based Fast Local Level Set Method. J. Comput. Phys. 155 (1999) 410–438
8. Fedkiw, R.P., Aslam, T., Merriman, B., Osher, S.: A Non-Oscillatory Eulerian Approach to Interfaces in Multimaterial Flows (The Ghost Fluid Method). J. Comput. Phys. 152 (1999) 457–492
9. Osher, S., Sethian, J.A.: Fronts Propagating with Curvature Dependent Speed: Algorithms Based on Hamilton-Jacobi Formulations. J. Comput. Phys. 79 (1988) 12–49
10. Sussman, M., Smereka, P., Osher, S.: A Level Set Approach for Computing Solutions to Incompressible Two-Phase Flows. J. Comput. Phys. 114 (1994) 146–154
11. Osher, S., Fedkiw, R.P.: Level Set Methods: An Overview and Some Recent Results. J. Comput. Phys. 169 (2001) 463–502
12. Sussman, M., Almgren, A.S., Bell, J.B., Colella, P., Howell, L.H., Welcome, M.L.: An Adaptive Level Set Approach for Incompressible Two-Phase Flows. J. Comput. Phys. 148 (1999) 81–124
13. Fedkiw, R.P.: Coupling an Eulerian Fluid Calculation to a Lagrangian Solid Calculation with the Ghost Fluid Method. J. Comput. Phys. (2002) in press
14. Batchelor, G.K.: An Introduction to Fluid Dynamics. Cambridge University Press (1969)
Comparative Exudate Classification Using Support Vector Machines and Neural Networks

Alireza Osareh 1, Majid Mirmehdi 1, Barry Thomas 1, and Richard Markham 2

1 Department of Computer Science, University of Bristol, Bristol, BS8 1UB, UK
{osareh,majid,barry}@cs.bris.ac.uk
2 Bristol Eye Hospital, Bristol, BS1 2LX, UK
markham@gifford.co.uk
Abstract. After segmenting candidate exudate regions in colour retinal images, we present and compare two methods for their classification. The Neural Network based approach performs marginally better than the Support Vector Machine based approach, but we show that the latter is more flexible given criteria such as control of sensitivity and specificity rates. We present classification results for different learning algorithms for the Neural Net and use both hard and soft margins for the Support Vector Machines. We also present ROC curves to examine the trade-off between the sensitivity and specificity of the classifiers.
1 Introduction
Intraretinal fatty (hard) exudates (EXs) are a visible sign of diabetic retinopathy and also a marker for the presence of co-existent retinal oedema. If present in the macular area, oedema and exudates are a major cause of visual loss. Automated early detection of the presence of EXs can assist ophthalmologists in preventing the spread of the disease more efficiently. We are working towards an automatic computer-assisted system for classification of Diabetic Retinopathy (DR). Identifying the proportion of the colour retinal image that contains exudates is one of our key objectives. In this paper, we briefly present our retinal image segmentation process using Fuzzy C-Means clustering (FCM) to segment candidate EX regions [1], and then concentrate on a comparative analysis of their classification. We apply various configurations of neural networks (NN) and compare their classification performance to Support Vector Machines (SVMs) using hard and soft margins. A NN based on the backpropagation learning method classified the segmented EXs with an overall diagnostic accuracy of 93.4%, with 93.0% sensitivity and 94.1% specificity in terms of lesion classification. Similarly, our SVM classifier achieved an overall accuracy of 90.4%, with 83.3% sensitivity and 95.5% specificity. The most common parameter estimation algorithms used to estimate the parameters of a NN are based on the Empirical Risk Minimization (ERM) principle, which achieves minimum risk on the training set. This is one of the reasons why NNs can get stuck in local saddle points and are therefore susceptible to many training problems, including overfitting and convergence difficulties. In contrast, SVMs follow the Structural Risk Minimization (SRM) principle, which results in a classifier with the least expected risk on the test set and hence good generalisation. Also, unlike NNs, SVMs always converge to the same solution for a given data set, regardless of initial conditions. However, in this application, while NNs performed slightly more accurately for EX classification, SVMs are more suitable for sensitivity and specificity control and do not suffer from overfitting problems, and hence
Overall, our results show that the performance of these two different classifiers is close and comparable. In [2], Wang et al. applied a Bayesian statistical classifier based on colour features to differentiate yellowish lesions (including EXs) from dark objects. They achieved a global 100% sensitivity and 70% specificity, measured on whether EXs were present anywhere in the image. They did not measure lesion-based performance, which represents how accurately the system can distinguish EXs among other lesions. Sinthanayothin [3] identified EXs in greylevel images based on a recursive region growing technique. The sensitivity and specificity reported were 88.5% and 99.7%; however, these measurements were based on 10x10 windows. Gardner et al. [4] used a NN to identify the EX lesions in greylevel images. The authors reported a sensitivity of 93.1%. Again, this was the result of classifying whole 20x20 regions rather than a pixel-level classification. One novelty of our proposed method here is that we locate EXs at pixel resolution in colour images and evaluate the performance of the system applying both lesion-based and image-based criteria. Section 2 briefly outlines our automatic method for identification of the EX pathologies in colour retinal images. Section 3 reviews the features used in the classification stage to distinguish EX candidates from other segmented regions. In Section 4, the SVM and NN classifiers are compared in how they perform in classifying the EX regions. The paper is concluded in Section 5.
2 Fuzzy C-Means Segmentation
In this study, we used 142 colour retinal images obtained from a non-mydriatic retinal camera with a 45° field of view. The image resolution was 760x570 at 24-bit RGB. We obtained candidate EX regions by performing FCM segmentation directly on our colour retinal images. This involved two important pre-processing steps. Typically, there is wide variation in the colour of the fundus between different patients, which is strongly correlated with the person’s race and iris colour. In the first step, we therefore performed a normalisation of our colour images (see Figure 1(c)). In the next pre-processing step, local contrast enhancement was performed to distribute the values of the pixels around the local mean, to facilitate later segmentation (see Figure 1(d)).
Fig. 1. Colour normalisation and local contrast enhancement: (a) reference image, (b) typical retinal image (including EXs), (c) colour normalised version, (d) after contrast enhancement
Hard segmentation methods take crisp decisions about regions. However, the regions in an image are not always crisply defined. Fuzzy approaches provide a mechanism to represent and manipulate uncertainty and ambiguity and allow pixels to belong to multiple classes with varying degrees of membership. We segment our retinal
images using a two-stage colour segmentation algorithm based on Gaussian-smoothed histogram analysis and Fuzzy C-Means clustering [5], comprising a coarse and a fine segmentation step. At the coarse stage, an initial classification is performed by interval analysis of the zero-crossings of the histogram second derivative at multiple scales in each colour band. This results in the number of classes (K) and the centre for each cluster. In the fine stage, FCM assigns any remaining unclassified pixels to the closest class based on the minimization of the objective function:
J_m(P, V) = Σ_{i=1}^{c} Σ_{k=1}^{n} (µ_ik)^m ‖x_k − v_i‖²    (1)
where P is a fuzzy partition of the data (x_k, k=1,…,n) and V is a vector of cluster centres (v_i, i=1,…,c). Also, µ_ik represents the membership value of x_k in cluster i. These memberships must lie between 0 and 1 and, for each data point x_k, must sum to 1 over the clusters i. The parameter m is a weight that determines the degree to which partial members of a cluster affect the clustering result. The fuzzy partitioning is carried out through an iterative optimisation in order to find both the prototypes v_i and the membership functions µ_ik that minimise J_m. Here, m = 2 and the algorithm was iterated until the Euclidean distance between two successive membership values reached 0.5, at which point FCM could distinguish three different clusters. Figure 2(a) shows the colour segmentation of Figure 1(d). Figure 2(b) shows the candidate EX regions overlaid on the original image.
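As an illustration of the fine stage, the following is a minimal NumPy sketch of the FCM iteration that minimises Equation (1). It is not the authors' implementation: random initial memberships stand in for the histogram-based coarse stage described above.

import numpy as np

def fuzzy_c_means(X, c, m=2.0, tol=0.5, max_iter=100, seed=0):
    # X: (n, d) array of pixel feature vectors; c: number of clusters.
    # tol mirrors the paper's stopping rule on successive memberships.
    rng = np.random.default_rng(seed)
    U = rng.random((c, X.shape[0]))
    U /= U.sum(axis=0)                       # memberships sum to 1 per pixel
    for _ in range(max_iter):
        Um = U ** m
        V = (Um @ X) / Um.sum(axis=1, keepdims=True)   # update centres v_i
        d = np.linalg.norm(X[None, :, :] - V[:, None, :], axis=2)
        d = np.fmax(d, 1e-12)
        U_new = d ** (-2.0 / (m - 1.0))      # closed-form membership update
        U_new /= U_new.sum(axis=0)           # minimises J_m for fixed centres
        if np.linalg.norm(U_new - U) < tol:
            return U_new, V
        U = U_new
    return U, V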
Fig. 2. Colour image segmentation: (a) FCM segmented image, (b) candidate EX regions overlaid on the original image, and (c) final classification
The full segmentation approach was straightforward to implement, fast, and had fixed parameters; most importantly, it allowed us to segment colour images. To assess the accuracy of the proposed segmentation technique, an expert clinician marked the EX regions in all 75 of our retinal images. Accurate, manual, pixel-level registration of small pathologies like EXs is very difficult due to the wide variability in their colour and size. The FCM-based technique could segment 97% of all the EXs based on this groundtruth. Only extremely faint EXs were not identified. It is worth noting that false positive EX candidates can arise due to general retinal reflections.
3 Feature Extraction
To classify the segmented regions into EX or non-EX classes we must represent them using relevant and significant features. Clinically, ophthalmologists use colour to differentiate various pathological conditions.
Similarly coloured objects, like cotton-wool spots and EXs, are differentiated using further features such as size, edge strength, shape, and texture. The feature set should be selected such that the between-class discrimination is maximised while the within-class variation is minimised. Indeed, in order to avoid the curse of dimensionality it is desirable for the feature set to be as small as possible. We selected 18 features after a comparative study of the discriminative attributes of a much larger set. We experimented with a number of colour spaces, including RGB, HSI, Lab, and Luv, and found that colour spaces which separate luminance and chrominance are more suitable. The extracted feature set comprised the mean Luv and standard deviation of the Luv values inside a candidate region, the mean Luv and standard deviation of the Luv values around the region, the Luv values of the region centroid, region size, region compactness, and region edge strength. To evaluate the usefulness of the selected features, the within-class scatter matrix (Sw) and between-class scatter matrix (Sb) were computed. The value J = trace(Sb/Sw) was determined based on the sequential forward selection search strategy and used as a measure of feature-set efficiency [1]. As expected, features that provide colour information seem to contribute significantly more than the other features. The class separability can naturally be improved by including additional features, but at the expense of extra features and classifier complexity.
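A sketch of how such a criterion can be evaluated: here J is computed as trace(Sw⁻¹Sb), a common reading of trace(Sb/Sw), on a candidate feature subset, and a greedy sequential forward selection adds the feature that most increases J. This is illustrative, not the authors' code, and it assumes Sw is invertible on the subset.

import numpy as np

def scatter_criterion(X, y, idx):
    # J = trace(Sw^-1 Sb) evaluated on the feature subset idx
    Xs = X[:, idx]
    mean_all = Xs.mean(axis=0)
    Sw = np.zeros((len(idx), len(idx)))
    Sb = np.zeros_like(Sw)
    for c in np.unique(y):
        Xc = Xs[y == c]
        mc = Xc.mean(axis=0)
        Sw += (Xc - mc).T @ (Xc - mc)        # within-class scatter
        d = (mc - mean_all)[:, None]
        Sb += len(Xc) * (d @ d.T)            # between-class scatter
    return np.trace(np.linalg.solve(Sw, Sb))

def forward_selection(X, y, n_select):
    selected, remaining = [], list(range(X.shape[1]))
    while len(selected) < n_select:
        best = max(remaining, key=lambda f: scatter_criterion(X, y, selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected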
4 Classification

4.1 Support Vector Machine Classification
SVMs have been successfully applied to a wide range of pattern recognition problems, and the reader is referred to [6] for details. Here, we investigate them for classifying the segmented EX lesions. SVMs are based on the SRM principle, in contrast to the ERM principle used in NNs to minimize the error on the training data. SRM minimizes a bound on the test error, thus allowing SVMs to generalise better than NNs. For a separable classification task, the idea is to map the training data into a higher-dimensional feature space using a kernel function, in which a separating hyperplane (w,b), with w the weight vector and b the bias, can be found which maximises the margin, or distance from the closest data points. The optimum separating hyperplane can be represented in terms of the kernel function:
f(x) = sign( Σ_{i=1}^{n} α_i y_i K(x, x_i) + b )    (2)
where n is the number of training examples, y_i is the label value of example i, K represents the kernel, and the α_i coefficients must be found so as to maximise a particular Lagrangian representation. Subject to the constraints α_i ≥ 0 and Σ α_i y_i = 0, there is a Lagrange multiplier α_i for each training point, and only those training examples that lie close to the decision boundary have nonzero α_i. These examples are called the support vectors. However, in real-world problems data are noisy and in general there will be no linear separation in the feature space. The hyperplane margins can be relaxed by penalising the training points the system misclassifies. Hence, the optimum hyperplane constraints can be defined as

y_i (w · x_i + b) ≥ 1 − ξ_i ,   ξ_i ≥ 0    (3)

and the following expression is minimized in order to obtain the optimum hyperplane
‖w‖² + C Σ_{i=1}^{n} ξ_i    (4)
where ξ_i is a positive slack variable that measures the amount of violation of the constraints. The penalty C is a regularisation parameter that controls the trade-off between maximizing the margin and minimizing the training error. This approach is called soft margins [7]. Therefore, specifying a SVM requires two parameters: the kernel function and the regularisation parameter C. For training the SVM classifier, the Kernel-Adatron technique using a Gaussian kernel was used [6]. Segmentation of our 75 colour images (comprising 25 normal and 50 abnormal images) resulted in 3860 segmented regions, consisting of 2366 EXs and 1494 non-EXs. These regions were labelled by a consultant ophthalmologist to create a fully marked groundtruth dataset. To obtain the optimal values for the Gaussian kernel width (σ) and C we experimented with different SVM classifiers using a range of values. Ten-fold cross-validation was applied to find the best classifier based on validation error. The performance of the selected SVMs was quantified based on sensitivity, specificity, and overall accuracy. In the first experiment, with no restrictions on the Lagrange multipliers (hard margin), we achieved an overall accuracy of 88.6% with 86.2% sensitivity and 90.1% specificity for σ=0.3. Figure 3(a) illustrates the generalisation performance of this classifier against varying values of σ, as well as the number of support vectors in each case. This result represents a good performance over positive and negative cases. To illustrate the effect of the soft margins, we trained the SVMs with σ fixed at 0.3 and for a wide range of C values, which were applied as an upper bound on the α_i (Figure 3(b)). The best overall accuracy, using the soft margin technique (hereafter referred to as SVM*), increased to 90.4% at C=1.5. However, in many medical diagnosis tasks the overall accuracy is not the most appropriate measure, since the balance between false positives and false negatives is very important. Moreover, sometimes the data within a class are limited and thus statistically underrepresented with respect to the other classes. Consequently, controlling the performance of a system on a particular class of the data is very important. To do this, we applied different misclassification costs C+ and C− (giving asymmetric soft margins) to the two classes (EXs and non-EXs) to adjust the cost of false positives vs. false negatives. This modifies (4) to the following optimisation problem [8]:

‖w‖² + C+ Σ_{i: y_i = 1} ξ_i + C− Σ_{i: y_i = −1} ξ_i    (5)
subject to (3). Figure 3(c) illustrates the effect of a wide range of upper bounds C+ on the α_i of the positive class (i.e. EXs), while there is no restriction on the α_i of the negative class (i.e. C− = ∞). As C+ decreases, the number of false negatives increases, accompanied by a decrease in the number of false positives. Therefore, specificity is increased while sensitivity is reduced. In this application, the maximum overall accuracy, obtained at C+=8.0, is found to be 89.8%, with sensitivity decreased to 85.0% and specificity increased to 93.1% compared against the hard-margin results. The opposite effect can be achieved by considering an upper bound C− on the α_i of the negative class. In this case, the maximum overall accuracy, achieved at C−=8.5, is 85.2%, with sensitivity increased to 95.1% and specificity decreased to 78.5%. Figure 2(c) shows a typical EX-classified image.
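The asymmetric-cost idea in (5) maps directly onto per-class penalties in modern SVM libraries. The sketch below uses scikit-learn's class_weight to reproduce the C+/C− trade-off; it is an illustration only (the paper trains with the Kernel-Adatron algorithm), and the synthetic data stand in for the 18-dimensional region features.

import numpy as np
from sklearn.datasets import make_classification
from sklearn.svm import SVC
from sklearn.model_selection import cross_val_predict
from sklearn.metrics import recall_score

# Stand-in for the 3860 labelled regions with 18 features each
X, y = make_classification(n_samples=400, n_features=18, random_state=0)

sigma = 0.3
gamma = 1.0 / (2.0 * sigma ** 2)   # RBF: K(x, x') = exp(-||x - x'||^2 / (2 sigma^2))

# A heavier penalty on errors in the positive (EX) class plays the role of a
# larger C+: fewer false negatives, i.e. higher sensitivity, at the cost of
# specificity; weighting the negative class instead has the opposite effect.
clf = SVC(kernel="rbf", gamma=gamma, class_weight={1: 8.0, 0: 1.0})

y_pred = cross_val_predict(clf, X, y, cv=10)           # 10-fold CV as in the paper
sensitivity = recall_score(y, y_pred, pos_label=1)
specificity = recall_score(y, y_pred, pos_label=0)
print(f"sensitivity={sensitivity:.3f}, specificity={specificity:.3f}")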
4.2 Neural Network Classification
To assess the performance of the SVMs we also classified our segmented EXs using neural networks [9] with a number of different algorithms and architectures. Again, a 10-fold cross-validation technique was used for estimating the generalisation error of all classifiers. We experimented with two different learning methods: standard BackPropagation (BP) and Scaled Conjugate Gradient (SCG) descent. We investigated a single hidden layer with a range of 2 to 35 hidden units to find an optimum. The network with the smallest validation error was selected as the best classifier, and the selected architecture was then tested against an unseen test set. In this way a NN classifier using BP learning performed best in terms of overall generalisation performance. Table 1 summarises the NN and SVM results. These are the best results from a selection of configurations used for training the classifiers.
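The hidden-unit search can be sketched as follows with scikit-learn (backpropagation training only; SCG is not available there, so this is illustrative rather than a reproduction, and the synthetic data again stand in for the real feature vectors).

import numpy as np
from sklearn.datasets import make_classification
from sklearn.neural_network import MLPClassifier
from sklearn.model_selection import cross_val_score

X, y = make_classification(n_samples=400, n_features=18, random_state=0)

best_h, best_score = None, -np.inf
for h in range(2, 36):                    # single hidden layer, 2-35 units
    nn = MLPClassifier(hidden_layer_sizes=(h,), max_iter=2000, random_state=0)
    score = cross_val_score(nn, X, y, cv=10).mean()   # 10-fold CV estimate
    if score > best_score:
        best_h, best_score = h, score
print(f"best hidden units: {best_h} (CV accuracy {best_score:.3f})")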
Fig. 3. Generalisation performance of the SVM classifier against (a) different σ values, (b) different C values, (c) different C+ values. Each plot shows the overall accuracy, sensitivity, and specificity together with the number of support vectors.
Although the diagnostic accuracy of the NN classifiers is slightly better than that of the SVMs, the classifier performances are very close and there is a good balance between sensitivity and specificity in all cases. However, in most medical applications the overall accuracy is not a sufficient measure by which to choose the optimal configuration.
Table 1. Performances of different classifiers for lesion-based classification (as %s)

Classifier              Threshold   Accuracy   Sensitivity   Specificity
SVM  σ=0.3              (T=0.0)     88.6       86.2          90.1
SVM* σ=0.3, C=1.5       (T=0.0)     90.4       83.3          95.5
SVM* σ=0.3, C+=8.0      (T=0.0)     89.8       85.0          93.1
SVM* σ=0.3, C−=8.5      (T=0.0)     85.2       95.1          78.5
NN-BP (15 hidden)       (T=0.50)    93.4       93.0          94.1
NN-SCG (15 hidden)      (T=0.45)    92.8       97.9          85.2
In order to assess and analyse the behaviour of the classifiers throughout the whole range of output threshold values, the ROC [10] curves shown in Figure 4 have been produced (with true positives plotted against false positives, describing the trade-off between sensitivity and specificity). The bigger the area under the ROC curve, the higher the probability of making a correct decision. The BP and SCG classifiers show a higher performance, with areas 0.966 and 0.962 respectively. The SVM (without soft margins) and SVM* (soft margins with C=1.5) show slightly lower performance over the entire ROC space, with areas 0.907 and 0.924. So far we have discussed pixel-by-pixel based lesion classification. We can also use our trained classifiers to evaluate the effectiveness of our proposed approach by assessing the image-based accuracy of the system. A population of 67 different retinal images was considered (40 abnormal and 27 normal). Each retinal image was evaluated using the NN-BP and SVM* (C=1.5) classifiers separately, and a final decision was made as to whether the image showed some evidence of Diabetic Retinopathy. As Table 2 illustrates, the NN-BP classifier could identify affected retinas with 95.0% sensitivity while recognising 88.9% of the normal images, i.e. the specificity. In the SVM* case, the diagnostic accuracy was 87.5% sensitivity on the abnormal images and 92.0% specificity on the normal cases. However, as in the lesion-based case, the SVM sensitivity and specificity rates can easily be manipulated by varying the values of C+ and C− to obtain different results.
Fig. 4. ROC curves for the classifiers in Table 1 (Az refers to the area under ROC curve)
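For reference, the ROC points and the Az area can be obtained from raw classifier outputs by sweeping the decision threshold; a small self-contained sketch (not the authors' code) follows.

import numpy as np

def roc_points(scores, labels):
    # labels in {0, 1}; sweep the threshold from high to low scores
    order = np.argsort(-np.asarray(scores))
    labels = np.asarray(labels)[order]
    tpr = np.cumsum(labels == 1) / max((labels == 1).sum(), 1)
    fpr = np.cumsum(labels == 0) / max((labels == 0).sum(), 1)
    return np.concatenate(([0.0], fpr)), np.concatenate(([0.0], tpr))

def area_under_curve(fpr, tpr):
    # Az: area under the ROC curve by trapezoidal integration
    return float(np.trapz(tpr, fpr))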
EXs usually appear in groups, and therefore missing some very faint EXs is not very important. However, when there are only a few new and very faint EXs in the retina, the identification task will be more difficult. When we manually analysed the
system’s decisions on normal images, we found that in most cases where a normal image had been wrongly identified as abnormal only a very few false positives had been detected (of the order of 2 or 3 individual false lesions). In such cases we can use proximity information to make a final decision, since EXs usually appear in dense groups rather than randomly scattered across the image.

Table 2. System performance for assessing the evidence of DR

Classifier   Image Type   No. of Patients   Detected as Abnormal   Detected as Normal   Sensitivity/Specificity
BP-NN        Abnormal     40                38                     2                    Sensitivity = 95.0%
             Normal       27                3                      24                   Specificity = 88.9%
SVM*         Abnormal     40                35                     5                    Sensitivity = 87.5%
             Normal       27                2                      25                   Specificity = 92.0%

5 Conclusion
In this study we investigated SVM and NN classifiers to obtain good class separability between the EX and non-EX classes. The results of the two classification approaches are very similar; however, we believe that SVMs are a more practical solution for our application. They have a significant advantage compared to NNs: they can achieve a trade-off between false positives and false negatives using asymmetric soft margins, they always converge to the same solution for a given data set regardless of initial conditions, and, finally, they remove the danger of overfitting.

Acknowledgements

The first author is on a scholarship funded by the Iranian Ministry of Science, Research and Technology. The authors also thank the UK National Eye Research Centre for their support.
References
1. Osareh, A., Mirmehdi, M., Thomas, B.T., Markham, R.: Classification and Localisation of Diabetic Related Eye-Disease. Proc. 7th European Conf. on Computer Vision (2002) 502-516
2. Wang, H., Hsu, W., Goh, K., Lee, M.: An Effective Approach to Detect Lesions in Colour Retinal Images. IEEE Conf. on Computer Vision and Pattern Recognition (2000) 181-187
3. Sinthanayothin, C.: Image Analysis for Automatic Diagnosis of Diabetic Retinopathy. PhD Thesis, King’s College London (1999)
4. Gardner, G., Keating, D., Williamson, T., Elliott, A.: Automatic Detection of Diabetic Retinopathy Using an ANN: A Screening Tool. BJO 80 (1996) 940-944
5. Lim, Y., Lee, S.: On the Colour Image Segmentation Algorithm Based on the Thresholding and the Fuzzy C-Means Techniques. Pattern Recognition 23(9) (1990) 935-952
6. Cristianini, N., Shawe-Taylor, J.: An Introduction to Support Vector Machines. Cambridge University Press (2000)
7. Burges, C.J.C.: A Tutorial on Support Vector Machines for Pattern Recognition. Data Mining and Knowledge Discovery 2(2) (1998) 121-167
8. Veropoulos, K., Campbell, C., Cristianini, N.: Controlling the Sensitivity of Support Vector Machines. Proc. Int. Joint Conf. on Artificial Intelligence (1999) 55-60
9. Bishop, C.M.: Neural Networks for Pattern Recognition. Oxford University Press (1995)
10. Centor, R.M.: Signal Detectability: The Use of ROC Curves and their Analysis. Medical Decision Making (1991) 102-106
A Statistical Shape Model for the Liver

Hans Lamecker1, Thomas Lange2, and Martin Seebass1
1 Konrad-Zuse-Zentrum Berlin, Germany
2 Charité Berlin, Germany
{[email protected],thomas.lange}@charite.de, [email protected]

Abstract. The use of statistical shape models is a promising approach for robust segmentation of medical images. One of the major challenges in building a 3D shape model from a training set of segmented instances of an object is the determination of the correspondence between them. We propose a novel geometric approach that is based on minimizing the distortion of the mapping between two surfaces. In this work we investigate the accuracy and completeness of a 3D statistical shape model for the liver built from 20 manually segmented individual CT data sets. The quality of the shape model is crucial for its application as a segmentation tool.
1 Introduction
The variable functional anatomy [1] of the liver requires individual pre-operative planning for hepatic surgery. The segmentation of the liver is essential for this process. A model-based approach is promising for fast, automated, and robust 3D image segmentation [2], because it uses global shape constraints to approximate the object contour even where it is not clearly defined in the image data. Some attempts have been made to extend the original 2D technique to 3D. The main obstacle is the lack of a suitable 3D shape correspondence, due to the difficulty of defining an optimality criterion for a good correspondence. Once criteria have been established, efficient optimization schemes have to be devised. Fleute et al. [3] establish correspondence by elastic registration of a template shape with all other shapes, based on minimizing Euclidean distance. In [4] the correspondences are defined using an information-theoretic minimal description length approach. Here the issue of optimality is addressed by optimizing the compactness of the model. The method has been applied to 2D examples, and an extension to 3D shapes with sphere-like topology is outlined. In [5] the shapes are represented by their expansion into a series of elliptical harmonics. The principal component analysis is performed on the parametric description, in contrast to the point distribution model used in [2]. Only shapes with sphere-like topology can be treated. A method for 3D shape correspondence using local geometry and geodesics is described in [6], without its application to statistical shape models; it is evaluated by computing residual surface distances. Thompson et al. [7] compute a mapping from each shape onto a sphere using a deformable model approach. Correspondence between two spheres, constrained by matched anatomical feature lines, is established by means of a warping algorithm.
In this work we describe a shape model for the liver using a novel approach for computing the surface correspondence, which easily extends to arbitrary topology. We investigate the compactness and completeness of a liver model with an application for segmentation in mind.
2 Building a Statistical Shape Model
In order to build a statistical shape model, a set of segmentations of the shape of interest is required. From these segmentations we reconstruct triangulated surfaces for representing shape. For shape comparison and statistical analysis, finding a good correspondence between the surfaces is crucial. Using the correspondence, a principal component analysis of the shapes can be performed.

2.1 Extraction of 3D-Shapes from CT-Data
The shape of interest has to be labelled (segmented) in all CT images in the training set. Due to a lack of fully automated methods, manual and semi-automatic approaches (live wire, region growing, etc.) are used. With the Amira Visualization and Modelling Software [8], a typical segmentation of a liver in CT images (5mm slice thickness) takes about 45 minutes. Applying the marching cubes method to the labelled images yields a triangulated surface with a stepped structure (see Fig. 1, left) due to the anisotropy of the voxels. To obtain a superior representation of the true anatomical 3D shape, we interpolate the labelling by inserting two intermediate slices between each two consecutive slices of the original image stack. A variational implicit function approach [9] produces smooth results (cf. Fig. 1, right).
Fig. 1. Triangulated surface of the liver: before (left) and after interpolation (right)
For reasons of efficiency, the surface is simplified [10] by reducing the number of triangles. A typical surface mesh of the liver consists of about 25000 triangles and 12500 nodes.

2.2 Solving the 3D-Correspondence Problem
A correspondence between two 3D-shapes is a one-to-one mapping between the points on both surfaces. We propose a novel geometric approach that is based on minimizing the distortion of the mapping given a few user-defined feature points. The method extends a morphing algorithm presented in [11]. The user defines the feature points by decomposing the surface into patches, each topologically equivalent
to a disk. The decomposition must not only be topologically consistent across all patches but must also represent similar semantic regions, in order to obtain a meaningful correspondence. The patch boundaries are constructed by specifying only a few points on the surface and then computing the shortest paths between them. We introduced a metric on the surface that favours paths along lines of high curvature, in order to define characteristic contours (cf. Fig. 2, left). Our goal is to map any patch of one surface onto the corresponding patch on another surface while minimizing the local distortion (a related idea is pursued in [7]). We define distortion as local scaling and shearing. Each patch is mapped onto a disk by applying Floater’s [12] shape-preserving mapping (cf. Fig. 2). This mapping locally approximates the geodesic polar mapping, which preserves arc length in each radial direction and hence minimizes distortion. The coordinates of the parameterized points on the disk are computed by solving a sparse linear system of equations. Two corresponding disks give a one-to-one mapping from one patch to another [11], resulting in the transfer of one triangulation onto the other surface. To achieve continuity across patch borders, border points are mapped according to their arc length in the particular patch of the original surface.
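The disk parametrisation step can be sketched as follows. For brevity, uniform (Tutte) weights replace Floater's shape-preserving weights, but the structure is the same: the boundary is pinned to the unit circle by arc length, and the interior is obtained from a sparse linear solve. Names and the data layout are illustrative.

import numpy as np
from scipy.sparse import lil_matrix
from scipy.sparse.linalg import spsolve

def parametrize_patch(vertices, neighbours, boundary):
    # vertices: (n, 3) array; neighbours: list of adjacency lists;
    # boundary: ordered list of boundary vertex indices (a closed loop)
    n = len(vertices)
    uv = np.zeros((n, 2))
    # 1. Pin the boundary to the unit circle, spaced by arc length
    loop = boundary + boundary[:1]
    edge_len = np.linalg.norm(np.diff(vertices[loop], axis=0), axis=1)
    t = 2 * np.pi * np.concatenate(([0.0], np.cumsum(edge_len[:-1]))) / edge_len.sum()
    uv[boundary] = np.column_stack((np.cos(t), np.sin(t)))
    # 2. Each interior vertex is the average of its neighbours (uniform weights)
    bset = set(boundary)
    interior = [i for i in range(n) if i not in bset]
    index = {v: k for k, v in enumerate(interior)}
    A = lil_matrix((len(interior), len(interior)))
    b = np.zeros((len(interior), 2))
    for i in interior:
        A[index[i], index[i]] = len(neighbours[i])
        for j in neighbours[i]:
            if j in index:
                A[index[i], index[j]] = -1.0
            else:                      # neighbour on the boundary: move to rhs
                b[index[i]] += uv[j]
    A = A.tocsr()
    uv[interior] = np.column_stack([spsolve(A, b[:, k]) for k in range(2)])
    return uv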
Fig. 2. Surface decomposition into patches along lines of high curvature (left) and one parametrized patch (right)
In our example, the surface of the liver is divided into four patches. The manual interaction time for the patchification is about 5 minutes per surface. The computation of the parametrization takes about 10 seconds.

2.3 Registration of Surfaces and Principal Component Analysis
We are now able to compare two 3D-shapes using the computed correspondence. In order to compute the mean of two shapes, they have to be aligned first. We use two different methods for this surface registration. One is the alignment of the centers of gravity of the shapes by a mere translation (TRA). The other is a rigid (MLS) transformation computed by a mean least squares fit of the displacements of the corresponding points. We apply principal component analysis, which is a standard method for analysing variability over a set of training data [2], to the set of corresponding liver surfaces.
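A compact sketch of the model construction for the simpler TRA alignment (translation of centroids only) followed by PCA via the singular value decomposition; the array shapes are illustrative.

import numpy as np

def build_shape_model(shapes):
    # shapes: (N, n, 3) array of N corresponding surfaces with n points each
    aligned = shapes - shapes.mean(axis=1, keepdims=True)   # TRA: align centroids
    X = aligned.reshape(len(shapes), -1)                    # one row per shape
    mean_shape = X.mean(axis=0)
    U, s, Vt = np.linalg.svd(X - mean_shape, full_matrices=False)
    eigenvalues = s ** 2 / (len(shapes) - 1)                # per-mode variance
    return mean_shape, Vt, eigenvalues                      # rows of Vt = modes

def synthesize(mean_shape, modes, eigenvalues, b):
    # shape = mean + sum_i b_i * sqrt(lambda_i) * mode_i; e.g. b = (+2, 0, ...)
    # reproduces a +2*sqrt(lambda_0) instance of the first mode (cf. Fig. 3)
    return mean_shape + (b * np.sqrt(eigenvalues)) @ modes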
3 Results
Based on 20 individual CT datasets taken from a spiral CT with contrast enhancement, we built an atlas following the procedure described in Section 2. The CT data had a slice thickness of 5mm and were taken with different scanners. The training data comprised livers of 8 men and 12 women.

3.1 Measuring Surface Similarity
We visually perceive two shapes to be similar if no large areas on the surface deviate much from one another. Instead of using purely global or local measures, such as the mean surface distance (computed from the one-sided Hausdorff distance) or the maximal distance, we adopt an intermediate measure of “region similarity”. Two shapes are defined to be “region similar by X mm” if they deviate less than X mm on 90% of the surface.
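As a sketch, assuming the per-point deviations between two corresponding surfaces are already available:

import numpy as np

def region_similar(distances, x_mm, fraction=0.90):
    # "Region similar by x_mm": at least `fraction` of the surface points
    # deviate by less than x_mm from the other surface.
    d = np.abs(np.asarray(distances))
    return np.mean(d < x_mm) >= fraction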
3.2 Shape Modes
Fig. 3 shows the first three modes of the principal component analysis. The modes of the model are sorted in decreasing magnitude of their corresponding eigenvalues. In the first column, the mode corresponding to the largest eigenvalue λ_0 is varied between −2√λ_0 and +2√λ_0; in the second column the same is done for the second mode, and so on. The result shows the large variability included in the liver model.
Fig. 3. First three eigenmodes of the liver atlas (columnwise)
3.3 Compactness of the Model
The compactness of a model is its ability to describe the variability of a shape using as few modes as possible. To determine the compactness of our model, we first ask how much cumulative relative variance of the model needs to be taken into account when describing shapes.
For each surface in the model we determine the number of modes needed to create a “region similarity by 3mm” to the original one. It turns out that on average 12 modes meet this criterion. The average distance lies between 0.9 and 1.5mm, while the maximal distance ranges between 5.1 and 8.1mm. In Fig. 4 the absolute distance between the original and the projected surface is color-coded onto the surface.
Fig. 4. Distance from original surface: 12 modes (left) vs. 10 modes (right). Black indicates a distance of 0mm and white a distance of 4mm and above
Fig. 5 (right) shows the cumulative relative variance plotted against the number of modes in the models, one for the TRA alignment and the other for the MLS alignment. These curves describe the compactness of our models. The curve for the MLS alignment indicates that the first 12 modes in the model contain 95% of the model’s variance. Moreover, we see that the TRA model is more compact than the MLS model, while the absolute variance is larger for the TRA model. Next we determine the dimensionality of the projection space making up more than 95% of cumulative relative variance. We plot the number of shape modes below 95% relative variance against the number of training data. For a complete model, the dimensionality should converge to a fixed value with increasing number of data in the training set. In Fig. 5 (left) convergence cannot be observed.
Fig. 5. Number of shapes against dimensionality (left), compactness of two models using two different alignment strategies (right)
3.4 Representation of Arbitrary Shapes by the Model
As a prerequisite for the application of the shape model in image segmentation, we determine how accurately an arbitrary shape that is not included in the model can be approximated by the modes of the model. We perform a leave-one-out test on one arbitrary example surface not contained in the model to show the model’s generalization capability. The shape not contained in the model is projected onto the model using the computed correspondence. We then determine the “region similarity by 5mm” between the surface and its projection. We change the threshold from 3mm to 5mm since we expect larger deviations than in the compactness examination. The results are shown in Fig. 6.
Fig. 6. Results of the leave-one-out-test: area of surface with error >5mm (left), mean and maximum surface distance (right)
4 Discussion and Conclusion
We have analyzed the variations of a 3D statistical shape model. One major challenge in building a shape model is the determination of a good correspondence between two surfaces. We have presented a novel method based on a patchification and the geometric idea of minimizing the distortion (local scaling and shearing of the surface). The algorithm discussed here gives only an approximate solution to this objective. As an improvement, we intend to relax the surface nodes after the initial parametrization, constraining only those nodes that represent distinct features (landmarks, nodes along lines of high curvature, etc.). This relaxation should improve the correspondence and hence the representation of new shapes by the model. The correspondence method can be extended to arbitrary topologies without difficulty. This certainly reflects one major advantage of the method presented here. However, some more manual effort for decomposing the surface into patches is needed for more complicated topologies. We are working on an automatic extraction of anatomic feature lines. The results of the compactness analysis show that this correspondence method is suitable for 3D shape analysis of complex and variable anatomical shapes. Fig. 5 (left) indicates that the dimensionality of the model has not reached convergence. This implies that a larger training set is needed. Of course, a liver model might never be complete, though it may be complete enough for image segmentation. Our experiments with the leave-one-out test suggest that the essential features of an arbitrary shape will be accounted for. For the task of accurate image segmentation the model must be extended, as Fig. 6 exemplifies. Comparison between different strategies for establishing correspondence would be of high interest. As mentioned in the beginning, judging the quality of a correspondence is not trivial and may depend on the application.
We have developed an efficient and intuitive approach for the creation of 3D shape models from segmented training data. This provides a basis for automated 3D image segmentation incorporating a priori knowledge.
Acknowledgements

Thomas Lange is supported by the Deutsche Forschungsgemeinschaft (DFG) project “Intraoperative Navigation” 201879. Martin Seebass and Hans Lamecker are supported by the DFG collaborative research project “Hyperthermia: Clinical Aspects and Methodology” SFB 273.
References
1. Fasel, J., Selle, D., Evertsz, C., et al.: Segmental Anatomy of the Liver: Poor Correlation with CT, Radiology, 206:151-156, 1998
2. Cootes, T., Hill, A., Taylor, C., Haslam, J.: Use of Active Shape Models for Locating Structures in Medical Images, Image and Vision Computing, 12:355-366, 1994
3. Fleute, M., Lavallée, S., Julliard, R.: Incorporating a Statistically Based Shape Model into a System for Computer-Assisted Anterior Cruciate Ligament Surgery, Medical Image Analysis, 3(3):209-222, 1999
4. Davies, R., Cootes, T., Taylor, C.: A Minimum Description Length Approach to Statistical Shape Modelling, IPMI 2001, LNCS 2082:50-63
5. Kelemen, A., Szekely, G., Gerig, G.: Three-dimensional Model-based Segmentation of Brain MRI, IEEE Trans. on Medical Imaging, 18(10):828-839, 1999
6. Wang, Y., Peterson, B.S., Staib, L.H.: Shape-Based 3D Surface Correspondence Using Geodesics and Local Geometry, CVPR ’00, 2:644-651
7. Thompson, P.M., Toga, A.W.: Detection, Visualization and Animation of Abnormal Anatomic Structure with a Deformable Probabilistic Brain Atlas Based on Random Vector Field Transformations, Medical Image Analysis, 1(4):271-294, 1996/97
8. Amira – Visualization and Modelling System, http://www.AmiraVis.com
9. Turk, G., O’Brien, J.F.: Shape Transformation Using Variational Implicit Functions, SIGGRAPH 99:335-342, 1999
10. Garland, M., Heckbert, P.S.: Surface Simplification Using Quadric Error Metrics, SIGGRAPH 97:209-216, 1997
11. Zöckler, M., Stalling, D., Hege, H.-C.: Fast and Intuitive Generation of Geometric Shape Transitions, The Visual Computer, 16(5):241-253, 2000
12. Floater, M.S.: Parameterization and Smooth Approximation of Surface Triangulations, Computer Aided Geometric Design, 14(3):231-250, 1997
Statistical 2D and 3D Shape Analysis Using Non-Euclidean Metrics

Rasmus Larsen, Klaus Baggesen Hilger, and Mark C. Wrobel
Informatics and Mathematical Modelling, Technical University of Denmark
Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby, Denmark
{rl,kbh,mcw}@imm.dtu.dk, http://www.imm.dtu.dk

Abstract. The contribution of this paper is the adaptation of data driven methods for non-Euclidean metric decomposition of tangent space shape coordinates. The basic idea is to extend principal component analysis to take into account the noise variance at different landmarks and at different shapes. We show examples where these non-Euclidean metric methods allow for easier interpretation by decomposition into biologically meaningful modes of variation. The extensions to PCA are based on adaptation of the maximum autocorrelation factors and the minimum noise fraction transforms to shape decomposition. A common basis of the methods applied is the assessment of the annotation noise variance at individual landmarks. These assessments are based on local models or on repeated annotations by independent operators.
1 Introduction
For the analysis and interpretation of multivariate observations, a standard method has been the application of principal component analysis (PCA) to extract latent variables. Cootes et al. applied PCA to the analysis of tangent space shape coordinates [1]. For various purposes, different procedures for PCA using non-Euclidean metrics have been proposed. The maximum autocorrelation factor (MAF) transform proposed by Switzer [2] defines maximum spatial autocorrelation as the optimality criterion for extracting linear combinations of multispectral images. Contrary to this, PCA seeks linear combinations that exhibit maximum variance. Because imaged phenomena often exhibit some sort of spatial coherence, spatial autocorrelation is often a better optimality criterion than variance. We have previously adapted the MAF transform for the analysis of tangent space shape coordinates [3]. In [4] the noise adjusted PCA or minimum noise fraction (MNF) transformations were used for decomposition of multispectral satellite images. The MNF transform is a PCA in a metric space defined by a noise covariance matrix estimated from the data. For image data the noise process covariance is conveniently estimated using spatial filtering. In [5] the MNF transform is applied to texture modelling in active appearance models [6]. Bookstein proposed using the bending energy and inverse bending energy as metrics in the tangent space [7]. Using the bending energy puts emphasis on the large scale variation; using the inverse bending energy puts emphasis on small scale variation.
2 Methods

2.1 Maximum Autocorrelation Factors
Let the spatial covariance function of a multivariate stochastic variable Z_k, where k denotes spatial position and ∆ a spatial shift, be Π(∆) = Cov{Z_k, Z_{k+∆}}. Then, letting the covariance matrix of Z_k be Σ and defining the covariance matrix Σ_∆ = D{Z_k − Z_{k+∆}}, we find

Σ_∆ = 2Σ − Π(∆) − Π(−∆)    (1)

The autocorrelation at shift ∆ of a linear combination of Z_k is then

Corr{w_i^T Z_k, w_i^T Z_{k+∆}} = 1 − (1/2) · (w_i^T Σ_∆ w_i) / (w_i^T Σ w_i)    (2)
The MAF transform is given by the set of conjugate eigenvectors of Σ_∆ with respect to Σ, W = [w_1, …, w_m], corresponding to the eigenvalues κ_1 ≤ … ≤ κ_m [2]. The resulting new variables are ordered so that the first MAF is the linear combination that exhibits maximum autocorrelation. The ith MAF is the linear combination that exhibits the highest autocorrelation subject to being uncorrelated with the previous MAFs. The autocorrelation of the ith component is 1 − κ_i/2.
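As a sketch, the MAF directions follow from the symmetric generalized eigenproblem Σ_∆ w = κ Σ w, assuming Σ is non-singular (the singular case requires the generalized SVD mentioned in Section 2.3). The pairing of the data with a spatially shifted copy of itself is left to the caller.

import numpy as np
from scipy.linalg import eigh

def maf(Z, Z_shifted):
    # Z, Z_shifted: (n_obs, m) matrices holding the field and a spatially
    # shifted version of it; rows are observations, columns are variables.
    Zc = Z - Z.mean(axis=0)
    D = Z - Z_shifted                        # difference process
    Dc = D - D.mean(axis=0)
    Sigma = Zc.T @ Zc / (len(Z) - 1)
    Sigma_delta = Dc.T @ Dc / (len(Z) - 1)
    # Generalized symmetric eigenproblem: Sigma_delta w = kappa * Sigma w
    kappa, W = eigh(Sigma_delta, Sigma)      # eigenvalues ascending
    autocorr = 1.0 - 0.5 * kappa             # autocorrelation of each factor
    return W, autocorr                       # first column = first MAF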
2.2 Minimum Noise Fractions
As before, we consider a multivariate stochastic variable Z_k. We assume an additive noise structure Z_k = S_k + N_k, where S_k and N_k are uncorrelated signal and noise components with covariance matrices Σ_S and Σ_N, respectively. Thus Cov{Z_k} = Σ = Σ_S + Σ_N. Defining the signal-to-noise ratio (SNR) as the ratio of the signal variance to the noise variance, we find for a linear combination of Z_k

SNR_i = V{w_i^T S_k} / V{w_i^T N_k} = (w_i^T Σ_S w_i) / (w_i^T Σ_N w_i) = (w_i^T Σ w_i) / (w_i^T Σ_N w_i) − 1    (3)
So the minimum noise fractions are given by the set of conjugate eigenvectors of Σ with respect to Σ_N, W = [w_1, …, w_m], corresponding to the eigenvalues κ_1 ≥ … ≥ κ_m [4]. The resulting new variables are ordered so that the first MNF is the linear combination that exhibits maximum SNR. The ith MNF is the linear combination that exhibits the highest SNR subject to being uncorrelated with the previous MNFs. The SNR of the ith component is κ_i − 1. The central problem in the calculation of the MNF transformation is the estimation of the noise, with the purpose of generating a covariance matrix that approximates Σ_N. Usually the spatial nature of the data is utilized, and the noise is approximated by the difference between the original measurement and a spatially filtered version or a local parametric function (e.g. a plane or quadratic function).
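A sketch of the MNF computation with the noise covariance estimated from the residuals of a local model as described above; here the residual is simply the deviation from a spatially smoothed version of the data (the local plane fits used for the mandible surfaces in Section 4.2 follow the same pattern), and Σ_N is assumed non-singular.

import numpy as np
from scipy.linalg import eigh

def mnf(Z, neighbour_mean):
    # Z: (n_obs, m) data; neighbour_mean: (n_obs, m) spatially smoothed Z,
    # e.g. the average of each landmark's neighbours on the surface mesh.
    Zc = Z - Z.mean(axis=0)
    noise = Z - neighbour_mean               # residual from the local model
    noise = noise - noise.mean(axis=0)
    Sigma = Zc.T @ Zc / (len(Z) - 1)
    Sigma_N = noise.T @ noise / (len(Z) - 1)
    # Solve Sigma w = kappa * Sigma_N w; the SNR of factor i is kappa_i - 1
    kappa, W = eigh(Sigma, Sigma_N)
    order = np.argsort(kappa)[::-1]          # descending: first MNF = max SNR
    return W[:, order], kappa[order] - 1.0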
2.3 MNF and MAF for Shape Decomposition
We have previously [3] shown how to adapt the MAF transform to shape decomposition by utilizing the ordering of landmarks (variables) instead of the ordering of pixels (observations), i.e. by transposing the data matrix. Furthermore, it was shown that Molgedey-Schuster’s [8] independent components analysis (ICA) is equivalent to MAF. If the matrices in Equations (2) and (3) are singular, the solution must be found in the affine support of the matrix in the denominator, e.g. by means of a generalized singular value decomposition.
3 Materials
We demonstrate the properties of the proposed techniques on two datasets. The first dataset consists of 2D annotations of the outlines of the right and left lungs from 115 standard PA chest radiographs. The chest radiographs were randomly selected from a tuberculosis screening program and contained normal as well as abnormal cases. The annotation process was conducted by identification of three anatomical landmarks on each lung outline, followed by equidistant distribution of pseudo-landmarks along the 3 resulting segments of the outline. In Fig. 1(b) the landmarks used for annotation are shown. Each lung field was annotated independently by two observers - Dr. Bram van Ginneken and Dr. Bart M. ter Haar Romeny. The dataset was supplied to us by Dr. Bram van Ginneken. For further information the reader is referred to the Ph.D. thesis of van Ginneken [9]. The second dataset consists of 4D landmarks of a set of surfaces of human mandibles (the lower jaw) registered over time. The surfaces were extracted in a previous study by Dr. Per R. Andresen from CT scans of 7 Apert patients, each imaged 3-5 times from age 3 months to 12 years. The mandibles are assumed to exhibit normal growth. The scans were performed for diagnostic and treatment planning purposes and supplied by Dr. Sven Kreiborg (School of Dentistry, University of Copenhagen, Denmark) and Dr. Jeffrey L. Marsh (Plastic and Reconstructive Department for Pediatric Plastic Surgery, Washington University School of Medicine at St. Louis Children’s Hospital, St. Louis, Missouri, USA). The surface extraction and registration were carried out using matching of the extremal mesh followed by a geometry-constrained diffusion procedure described in [10,11]. The surfaces contain approximately 14,000 homologous points.
4 Results

4.1 Lung Dataset
We intend to use the annotations by the two independent observers to estimate the annotation uncertainty. Initially, the lung annotations are aligned to a common reference frame by concatenating the annotations of the two observers and performing a generalized Procrustes analysis (GPA) [12,13].
Fig. 1. Landmarks of the left and right lung. Landmark numbers are shown in the middle. The right lung is annotated by 40 landmarks, and the left lung by 36. The anatomical landmarks on the right field are points 1, 17, and 26, on the left field the anatomical landmarks are points 1, 17, and 22. (a),(c) Inter-observer difference canonical correlations between landmarks for the right and left lungs. (d),(e) Interneighbour landmark difference canonical correlations between landmark for the right and left lung.
Now we can compute the differences between the two sets of annotations and estimate an inter-observer covariance matrix of the landmark coordinates. Obviously we would like to view the intercorrelation per landmark and not per coordinate: rotation of the frame of reference will shift the correlation between x and y coordinates, which may cause some confusion. In order to overcome this problem, for each pair of landmarks we estimate the maximum correlation between linear combinations of their coordinates. These are the canonical correlations [14]. In Fig. 1 we see these correlations for the right and left lungs. The inter-lung correlations are negligible. For both sets of lungs we see a high degree of correlation along the curved top outline of the lungs. For both lungs, landmark 1 is the top point. Again, for both lungs there is no or little correlation across the two anatomical landmarks that delimit the bottom segment of the outlines. The inter-observer covariance matrix defines one sensible metric to use when decomposing the shape variability. This would put less emphasis on landmarks with high annotation variance and more emphasis on landmarks with low annotation variance, and results in a minimum noise fraction transform. As an alternative to assessing the inter-observer differences, we may consider the covariance of the differences of neighbouring landmarks. The correlation structure of these is also shown in Fig. 1.
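For two landmarks, each described by its 2D inter-observer difference vectors, the largest canonical correlation can be computed as follows: a standard CCA on the corresponding 2x2 covariance blocks (a sketch, not the authors' code; the blocks are assumed positive definite).

import numpy as np

def max_canonical_correlation(A, B):
    # A, B: (n_obs, 2) coordinate differences for two landmarks
    A = A - A.mean(axis=0)
    B = B - B.mean(axis=0)
    Saa = A.T @ A / (len(A) - 1)
    Sbb = B.T @ B / (len(A) - 1)
    Sab = A.T @ B / (len(A) - 1)
    # Whiten both blocks; the canonical correlations are the singular
    # values of the whitened cross-covariance matrix
    Wa = np.linalg.inv(np.linalg.cholesky(Saa))
    Wb = np.linalg.inv(np.linalg.cholesky(Sbb))
    M = Wa @ Sab @ Wb.T
    return np.linalg.svd(M, compute_uv=False)[0]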
Fig. 2. The 6 most important principal components (PC), principal components on a standardized dataset (PCC), annotation noise adjusted principal components (EPC), maximum autocorrelation factors (MAF), and relative warps (REL). The blue curve is the mean shape, and the green and red curves represent ±5 standard deviations as observed in the training set.
Here the partitioning of the landmarks into three segments for each lung is more pronounced. Using this covariance as the metric corresponds to the MAF transform. In Fig. 2 the 6 most important principal components (PC), principal components on a standardized dataset (PCC), annotation noise adjusted principal components (EPC), maximum autocorrelation factors (MAF), and relative warps (REL) are shown. The relative warps use the bending matrix of the estimated mean shape as the metric. The PCs and PCCs are fairly similar, but the EPCs, MAFs, and RELs are different. The latter three all represent uses of metrics that are significantly different from the Euclidean one. The first EPC is an aspect ratio variation, and the following 5 EPCs seem to be a mix of the first PCs. The first MAF is also an aspect ratio variation, and the following MAFs also have evident large scale interpretations. In particular, MAF4 is the relative size of the lungs. The relative warps also give various large scale variations, but they are not as easily interpretable as the MAFs.
Fig. 3. Scatter plots of the 3 first PCs and MNFs against age, log age, and centroid size. A strong correlation of the shape variation in PC1 and MNF1 with size is demonstrated. Because size has been filtered out of the shape decomposition in the Procrustes analysis, these components can be interpreted as shape change due to growth. The lower order components exhibit variation between individuals.
4.2 Mandible Dataset
A major objective of the analysis/decomposition of the mandible dataset is the construction of a growth model that allows prediction of mandible size and shape from early scans (1-3 months). When performing pediatric cranio-facial surgery, prediction of growth patterns is extremely important. Growth modelling will also add to basic understanding as well as having teaching implications. Here we demonstrate the use of the MNF transformation for decomposition of a 3D dataset as an alternative to PCA. The mandibles are aligned using a generalized 3D Procrustes analysis [15] and projected into tangent space. Each mandible is represented by a triangulated surface based on the 14,000 landmarks. This triangulation allows us to determine the neighbouring landmarks easily. We estimate the noise covariance matrix in Equation (3) as the covariance matrix of the deviations from the mean displacements between landmark coordinates and planes fitted locally to all landmarks in a neighbourhood. In the example shown we have used a 4th order neighbourhood. In Fig. 3, pairwise scatter plots of the first three components against age, log age, and centroid size are shown for the PCs as well as the MNFs. For the PCs we see that there is a strong relationship between PC1, age, and size. This means that PC1 relates to mandible growth, as was also concluded and utilized in [10]. PC2 and PC3 do not correlate with age or size but contain variation between individuals. For the MNFs we see that we have captured two uncorrelated modes of variation, namely MNF1 and MNF2, that relate to size and age. MNF3 is a contrast between the three younger mandible scans of subject number 5 and the rest of the mandibles. In Figs. 4 and 5 the first two PCs and MNFs are shown. In each plot a
greenish mean shape and a goldish positive or negative deviation are shown. For PC1 we see a contrast between young, broad, flat mandibles with small condyles and older, slimmer, higher mandibles with large condyles and erupted teeth. For MNF1 and MNF2 we see different patterns of growth.
Fig. 4. Principal components 1 and 2 shown as ±2 standard deviations across the training set: (a) PC1 ’−’, (b) PC1 ’+’, (c) PC2 ’−’, (d) PC2 ’+’
Fig. 5. Minimum noise fractions 1 and 2 shown as ±2 standard deviations across the training set: (a) MNF1 ’−’, (b) MNF1 ’+’, (c) MNF2 ’−’, (d) MNF2 ’+’
5 Conclusion
We have demonstrated a series of data driven methods for constructing non-Euclidean metric linear decompositions of the tangent space shape variability in 2D and 3D. We have demonstrated ways of constructing such a metric based on repeated measurements as well as by use of the spatial nature of the outline and surface models considered. It turns out that the MAF and MNF transforms are superior in terms of interpretability for decomposing large scale variation. These methods are tools for determining uncorrelated biological modes of variation.
Acknowledgements

The work was supported by the Danish Technical Research Council under grant number 26-01-0198, which is hereby gratefully acknowledged. The authors thank Dr. Bram van Ginneken for the use of the lung annotation data set. The authors also thank Dr. Sven Kreiborg and Tron Darvann (School of Dentistry, University of Copenhagen, Denmark) for providing insight into the study of mandibular growth.
References
1. T. F. Cootes, C. J. Taylor, D. H. Cooper, and J. Graham, “Training models of shape from sets of examples,” in British Machine Vision Conference: Selected Papers 1992, (Berlin), Springer-Verlag, 1992.
2. P. Switzer, “Min/max autocorrelation factors for multivariate spatial imagery,” in Computer Science and Statistics (L. Billard, ed.), pp. 13-16, Elsevier Science Publishers B.V. (North Holland), 1985.
3. R. Larsen, H. Eiriksson, and M. B. Stegmann, “Q-MAF shape decomposition,” in Medical Image Computing and Computer-Assisted Intervention - MICCAI 2001, 4th International Conference, Utrecht, The Netherlands, vol. 2208 of Lecture Notes in Computer Science, pp. 837-844, Springer, 2001.
4. A. A. Green, M. Berman, P. Switzer, and M. D. Craig, “A transformation for ordering multispectral data in terms of image quality with implications for noise removal,” IEEE Transactions on Geoscience and Remote Sensing, vol. 26, pp. 65-74, Jan. 1988.
5. K. B. Hilger, M. B. Stegmann, and R. Larsen, “A noise robust statistical texture model,” in Medical Image Computing and Computer-Assisted Intervention - MICCAI 2002, 5th International Conference, Tokyo, Japan, 2002. 8 pp. (submitted).
6. T. F. Cootes, G. J. Edwards, and C. J. Taylor, “Active appearance models,” in Proceedings of the European Conf. on Computer Vision, pp. 484-498, Springer, 1998.
7. F. L. Bookstein, Morphometric Tools for Landmark Data. Cambridge University Press, 1991. 435 pp.
8. L. Molgedey and H. G. Schuster, “Separation of a mixture of independent signals using time delayed correlations,” Physical Review Letters, vol. 72, no. 23, pp. 3634-3637, 1994.
9. B. van Ginneken, Computer-Aided Diagnosis in Chest Radiographs. PhD thesis, Image Sciences Institute, University Medical Center Utrecht, Utrecht University, Utrecht, the Netherlands, 2001. 184 pp.
10. P. R. Andresen, F. L. Bookstein, K. Conradsen, B. K. Ersbøll, J. L. Marsh, and S. Kreiborg, “Surface-bounded growth modeling applied to human mandibles,” IEEE Transactions on Medical Imaging, vol. 19, pp. 1053-1063, Nov. 2000.
11. P. R. Andresen and M. Nielsen, “Non-rigid registration by geometry-constrained diffusion,” Medical Image Analysis, vol. 5, no. 2, pp. 81-88, 2001.
12. J. C. Gower, “Generalized Procrustes analysis,” Psychometrika, vol. 40, pp. 33-50, 1975.
13. C. Goodall, “Procrustes methods in the statistical analysis of shape,” Journal of the Royal Statistical Society, Series B, vol. 53, no. 2, pp. 285-339, 1991.
14. H. Hotelling, “Relations between two sets of variables,” Biometrika, vol. 28, pp. 321-377, 1936.
15. J. M. F. ten Berge, “Orthogonal Procrustes rotation for two or more matrices,” Psychometrika, vol. 42, pp. 267-276, 1977.
Kernel Fisher for Shape Based Classification in Epilepsy

N. Vohra1, B. C. Vemuri1, A. Rangarajan1, R. L. Gilmore2, S. N. Roper3, and C. M. Leonard4
1 CISE Department, University of Florida, Gainesville, FL 32601, USA
2 Dept. of Neurology, University of Florida, Gainesville, FL 32601, USA
3 Dept. of Neurosurgery, University of Florida, Gainesville, FL 32601, USA
4 Dept. of Neuroscience, University of Florida, Gainesville, FL 32601, USA
Abstract. In this paper, we present the application of the kernel Fisher discriminant to the statistical analysis of shape deformations that might indicate the hemispheric location of an epileptic focus. The scans of two classes of patients with epilepsy, those with a right and those with a left medial temporal lobe focus (RATL and LATL), as validated by clinical consensus and subsequent surgery, were compared to a set of age and sex matched healthy volunteers using both volume and shape based features. Shape based features are derived from the displacement field between the left and right hippocampi of a healthy subject/patient. The results show a significant improvement in distinguishing between the controls and the rest (RATL and LATL) using only shape based features as opposed to volume based features. We also achieve a reasonable improvement in the ability to distinguish between RATL and LATL based on shape in comparison to volume information. It should be noted that automated identification of hemispheric foci of epilepsy has not been previously reported.
1 Introduction
Statistical analysis of shape deformations, such as those likely to occur in epilepsy and other neurological disorders, necessitates both global and local parameter based characterization of the object under study. The most popular approach has been size and volume based analysis. However, this captures only one of the aspects necessary for a complete characterization, while shape based analysis gives much more information, which can be combined with the former to help understand the anatomical structures better. In this paper, we focus on developing an automatic technique which can aid in distinguishing between controls and patients with epilepsy, and can indicate the hemispheric location of an epileptic focus (right medial temporal lobe or left medial temporal lobe) in the patients. It should be noted that this work does not attempt to determine the precise coordinates of the epileptic focus in the patients.
1.1 Literature Review
The use of classification techniques such as Support Vector Machines (SVMs) and the Fisher discriminant for the statistical analysis of anatomical shape differences between populations has been the focus of various researchers in the recent past. The choice of feature vectors that capture maximum information plays an important part in such studies. Gerig and Styner [3] proposed the use of both volume measurements and shape based features (mean square distance) to detect group differences in hippocampal shape in schizophrenia. The class differences are then accounted for by using an SVM, followed by performance evaluation using the leave-one-out technique. From the results reported, it can be concluded that shape alone could not capture the class differences. This failure can be attributed either to weak shape features or to the fact that the nature of the groups under study is such that shape alone cannot represent the entire class character. Joshi et al. [4] used high dimensional transformations of a brain template to compare the hippocampal volume and shape characteristics in schizophrenia and control subjects. Linear discriminant analysis was used to measure the performance of the selected feature vector. The fluid flow model used there is, however, computationally expensive. Recently, Golland et al. [5] have studied hippocampal shape differences between schizophrenia patients and normal controls. The shapes were represented using implicit functions and classified using an SVM with a Gaussian kernel. Marginal improvements over volume based methods were reported using this technique.
1.2 Overview of Our Algorithm
In this paper, we demonstrate the application of the kernel Fisher algorithm to the shape based classification of hippocampal shapes in controls and epilepsy patients. Given a pair of sparse sets of data points corresponding to the outlines of the left and right hippocampi of a subject, appropriate shape features are extracted by first fitting a model to the data sets using a deformable pedal surface [2]. This is followed by a rigid and a non-rigid registration of the left and right hippocampi using the Iterative Closest Point (ICP) algorithm [1] and a level-set method [7], respectively. The local deformations obtained by non-rigid registration are then fed into the kernel Fisher classifier to capture the statistical difference between the three known classes in epilepsy. The choice of kernel Fisher as the classifier has been motivated by the fact that it can separate the classes in a very high or infinite dimensional space using a linear classifier and is simple to implement. The kernel Fisher training algorithm used in this work does not require non-linear optimization as SVMs do, and hence is computationally more efficient; note that SVM training requires solving a constrained quadratic programming problem, which is computationally demanding. Kernel Fisher has shown results comparable to SVMs in various other applications [6].
1.3 Organization of the Paper
The rest of the paper is organized as follows: Section 2 describes the snake pedal model used for fitting a model to given data points, followed by the procedure for selecting shape based features. In Section 3, we summarize the kernel Fisher method as a classifier. Section 4 presents the experimental results, followed by conclusions in Section 5.
2
Shape Extraction and Features
In this section we discuss the scheme used to segment the region of interest, followed by the methods employed for rigid and non-rigid registration of the corresponding hippocampi of a given subject.
2.1 Overview of the Shape Modeling Scheme
In order to segment the region of interest in the given image, we use a deformable pedal surface described in [2]. Pedal curves/surfaces are defined as the loci of the feet of the perpendiculars to the tangents of a fixed curve/surface from a fixed point called the pedal point [2]. A large class of shapes exhibiting both global and local deformations can be synthesized by varying the position of the pedal point. Physics-based control is introduced by using a snake to represent the position of this varying pedal point. The model is thus called a "snake pedal" and allows interactive manipulation through forces applied to the snake. The model also allows the representation of global deformations such as bending and twisting without introducing additional parameters. To fit a model to a given set of data points in 2D/3D, a non-linear optimization scheme is employed, using the Levenberg-Marquardt (LM) method in the outer loop for estimating the global parameters and the Alternating Direction Implicit (ADI) method in the inner loop for estimating the local parameters of the model [2].
2.2 Shape Registration
Shape registration, in general, is required at both global and local levels. In the present work we use the Iterative Closest Point algorithm proposed in [1] to determine the rotation and translation between a subject's left and right hippocampus. The choice of the ICP algorithm is motivated by the fact that snake-pedal based model fitting yields an extrinsic parameterization which is not suitable for finding corresponding points between the left and right hippocampi. The corresponding left and right hippocampi of a subject may differ by a global scaling factor, which is accounted for by approximating each shape by the smallest enclosing ellipsoid and then equalizing their corresponding eigenvalues. The problem of finding the non-rigid alignment can be formulated as a motion estimation task, in particular, estimation of the displacement field between the two given shapes. We use the level-set formulation described in [7] to estimate the displacement field, which leads to the following governing equation:

$$\vec{V}_t = \left[ d_2(X) - d_1(\vec{V}(X)) \right] \frac{\nabla \left( G_\sigma * d_1(\vec{V}(X)) \right)}{\left\| \nabla \left( G_\sigma * d_1(\vec{V}(X)) \right) \right\| + \alpha} + \lambda \begin{pmatrix} \Delta u \\ \Delta v \\ \Delta w \end{pmatrix} \qquad (1)$$

with $\vec{V}(X, 0) = \vec{0}$, where $d_1$ and $d_2$ denote the signed distance images of the source and target shapes, $\Delta$ denotes the Laplacian operator applied to the components $(u, v, w)$ of $\vec{V}$, $G_\sigma$ is a Gaussian kernel, and $\alpha$ is a small
positive number called the stabilizing factor. The above differential equation can be solved using the numerical implementation described in [7]. Note that the signed distance images can be obtained by using the Fast Marching Method (FMM) described in Sethian [8].
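The rigid alignment step is compact enough to sketch in code. The following is a minimal Python/NumPy illustration of one ICP-style loop in the spirit of Besl and McKay [1]; it is not the authors' implementation, and it omits the ellipsoid-based scale equalization described above. The point sets are assumed to be N x 3 arrays.

```python
import numpy as np
from scipy.spatial import cKDTree

def icp(source, target, n_iter=50, tol=1e-6):
    """Minimal ICP: rigidly aligns `source` (N x 3) to `target` (M x 3)."""
    src = source.copy()
    R_total, t_total = np.eye(3), np.zeros(3)
    tree = cKDTree(target)
    prev_err = np.inf
    for _ in range(n_iter):
        # 1. Closest-point correspondences.
        dist, idx = tree.query(src)
        corr = target[idx]
        # 2. Optimal rotation/translation in closed form (SVD / Procrustes).
        mu_s, mu_c = src.mean(0), corr.mean(0)
        H = (src - mu_s).T @ (corr - mu_c)
        U, _, Vt = np.linalg.svd(H)
        D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
        R = Vt.T @ D @ U.T          # proper rotation (det = +1)
        t = mu_c - R @ mu_s
        # 3. Apply and accumulate the transform.
        src = src @ R.T + t
        R_total, t_total = R @ R_total, R @ t_total + t
        err = dist.mean()
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return R_total, t_total, src
```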
3
Kernel Fisher
The classification problem can be approached in two ways, supervised and unsupervised, with the discriminant function being linear or non-linear. The classical approach begins with the optimal Bayes classifier; assuming a normal distribution for the classes and using linear discriminant analysis leads to the Fisher algorithm. The Fisher approach is based on projecting d-dimensional data onto a line with the hope that the projections are well separated by class; the line is thus oriented to maximize this class separation. However, the features in the input space may not possess sufficient discriminatory power for separating the classes via linear projection techniques. This problem can be tackled by mapping the input data into a very high dimensional space and using a linear classifier in this new feature space, thereby giving an implicit non-linear classification in the input space. This is the basic idea behind the kernel Fisher algorithm. Let φ be a non-linear mapping to some feature space F. Separation in the new feature space can then be found by maximizing

$$J(w) = \frac{w^T S_B^{\phi} w}{w^T S_W^{\phi} w} \qquad (2)$$
where $w \in F$, and $S_B^{\phi}$ and $S_W^{\phi}$ are defined as follows:

$$S_B^{\phi} = (m_1^{\phi} - m_2^{\phi})(m_1^{\phi} - m_2^{\phi})^T, \qquad S_W^{\phi} = \sum_{i=1,2} \sum_{x \in \chi_i} (\phi(x) - m_i^{\phi})(\phi(x) - m_i^{\phi})^T \qquad (3)$$
with $m_i^{\phi} = \frac{1}{l_i} \sum_{j=1}^{l_i} \phi(x_j^i)$. Eqn. (2) can be solved by formulating it in terms of dot products $(\phi(x) \cdot \phi(y))$ of the training patterns [6], which can then be evaluated using Mercer kernels $(k(x, y) = (\phi(x) \cdot \phi(y)))$ [6]. As explained in [6], using the kernel theory, (2) can be rewritten as

$$J(\alpha) = \frac{\alpha^T M \alpha}{\alpha^T N \alpha} \qquad (4)$$
where $M = (M_1 - M_2)(M_1 - M_2)^T$ with

$$(M_i)_j = \frac{1}{l_i} \sum_{k=1}^{l_i} k(x_j, x_k^i), \qquad N = \sum_{j=1,2} K_j (I - 1_{l_j}) K_j^T, \qquad K_j : l \times l_j \text{ matrix s.t. } (K_j)_{nm} = k(x_n, x_m^j) \qquad (5)$$

where $I$ is the identity matrix and $1_{l_j}$ is the $l_j \times l_j$ matrix with all entries $1/l_j$,
and $\alpha$ is the vector of coefficients corresponding to the training patterns such that $w = \sum_{i=1}^{l} \alpha_i \phi(x_i)$ [6]. The optimal direction of projection can be found by taking the leading eigenvector of $N^{-1} M$. This approach is called the Kernel Fisher Discriminant (KFD) [6]. The projection of a new vector $x$ onto $w$ can be obtained by

$$(w \cdot \phi(x)) = \sum_{i=1}^{l} \alpha_i k(x_i, x) \qquad (6)$$
The proposed setting is ill-posed, since $l$-dimensional covariance structures are estimated from $l$ samples, which can cause the matrix $N$ to be non-positive [6]. The problem can be solved by adding a multiple of the identity matrix to $N$ [6] such that

$$N_\mu = N + \mu I \qquad (7)$$
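For concreteness, the two-class KFD computation of Eqns. (4)-(7) can be written compactly in terms of the kernel matrix. The following Python/NumPy sketch illustrates the formulation of [6]; it is not the authors' code, and the RBF kernel width `gamma` and the regularizer `mu` are illustrative choices.

```python
import numpy as np

def rbf_kernel(X, Y, gamma=1.0):
    d2 = ((X[:, None, :] - Y[None, :, :]) ** 2).sum(-1)
    return np.exp(-gamma * d2)

def kfd_train(X1, X2, gamma=1.0, mu=1e-3):
    """Two-class Kernel Fisher Discriminant, Eqns. (4)-(7).
    X1, X2: training patterns of the two classes (rows are samples)."""
    X = np.vstack([X1, X2])
    l, l1, l2 = len(X), len(X1), len(X2)
    K = rbf_kernel(X, X, gamma)            # l x l kernel matrix
    # (M_i)_j = (1/l_i) sum_k k(x_j, x_k^i): class-wise kernel means.
    M1 = K[:, :l1].mean(axis=1)
    M2 = K[:, l1:].mean(axis=1)
    M = np.outer(M1 - M2, M1 - M2)         # between-class matrix
    # N = sum_j K_j (I - 1_{l_j}) K_j^T: within-class matrix, Eqn. (5).
    N = np.zeros((l, l))
    for Kj, lj in [(K[:, :l1], l1), (K[:, l1:], l2)]:
        N += Kj @ (np.eye(lj) - np.full((lj, lj), 1.0 / lj)) @ Kj.T
    N += mu * np.eye(l)                    # regularization, Eqn. (7)
    # Leading eigenvector of N^{-1} M gives the expansion coefficients alpha.
    evals, evecs = np.linalg.eig(np.linalg.solve(N, M))
    alpha = np.real(evecs[:, np.argmax(np.real(evals))])
    return X, alpha

def kfd_project(x, X_train, alpha, gamma=1.0):
    """Projection of a new pattern onto w, Eqn. (6)."""
    return rbf_kernel(x[None, :], X_train, gamma) @ alpha
```

A threshold on the projected value (e.g. at the midpoint of the projected class means) then turns the discriminant into a classifier.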
4
Experimental Results and Validation
In this section, we present the experimental results obtained by testing the performance of the linear Fisher and kernel Fisher classifiers on the hippocampal shapes of 25 control subjects, 11 LATL and 14 RATL patients. Given the point sets for the left and right hippocampi of a healthy subject/patient, we begin with model fitting using the snake pedal model. A 21 x 40 mesh superimposed on the fitted model gives a new point set of size 840 x 3 (each point is in 3D). The fitted point sets are first registered globally using the ICP algorithm and the ellipsoid based technique. This is followed by local registration using the level-set method, which uses signed distance images of size 128 x 128 x 128 obtained by the Fast Marching method. The displacement field obtained for the fitted point sets is then used to form two types of shape based features: the first is called the sign of displacement and the other the direction of the displacement vector. The sign of the displacement is defined as follows. Given the displacement vector for a point on the zero-set of the source image, determine the cube in which the displaced point falls in the source image. Depending on the signs of the vertices of the enclosing cube (each vertex was assigned a +/- sign while forming the distance image), assign a sign to the magnitude of the displacement. The direction vector is obtained by finding the unit vector corresponding to the displacement vector at each point on the zero set. We have chosen this as our feature vector since we believe that the displacement vector direction allows us to capture the differences between the two classes. However, the issue of including the magnitude information of the displacement field at each point on the zero set is an important one, and we hope to investigate it in future work. The feature vector for the sign of displacement is of length 762, while the direction vector is of length 762 x 3 (each point has x, y, z components of displacement). These numbers are derived from the fact that there are 840 points on the zero set and the first and last rows of the 21 x 40 mesh represent the north and south poles as described in [2]. In the present
study, we did not include feature pruning. Since the feature vector dimensionality far exceeds the number of retrospective patient studies, feature selection and pruning using principal component analysis (PCA) or related strategies can play an important role in improving the generalization performance. Our preliminary forays into feature pruning are very promising, and we plan to vigorously pursue this line of investigation in future work. The shape based results are also compared to those obtained by using the volume information only, with L/R and (L-R)/(L+R) as the feature vector, where L and R are the volumes of the left and right hippocampi.
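A hedged sketch of how the two shape features could be computed is given below, assuming a displacement field sampled at the zero-set points and a signed distance volume `d_src` on a regular grid. Here the sign of the enclosing cube is approximated by the sign of the trilinearly interpolated signed distance at the displaced point, which is one reasonable reading of the procedure described above, not necessarily the authors' exact rule.

```python
import numpy as np
from scipy.ndimage import map_coordinates

def sign_of_displacement(points, disp, d_src):
    """points: (N, 3) zero-set coordinates in voxel units;
    disp: (N, 3) displacement vectors; d_src: signed distance volume.
    Returns the signed displacement magnitudes (length-N feature vector)."""
    displaced = points + disp
    # Interpolate the signed distance at the displaced locations; its sign
    # stands in for the +/- signs of the enclosing cube's vertices.
    vals = map_coordinates(d_src, displaced.T, order=1, mode='nearest')
    return np.sign(vals) * np.linalg.norm(disp, axis=1)

def direction_vectors(disp, eps=1e-12):
    """Unit direction of the displacement at each zero-set point (N x 3)."""
    return disp / (np.linalg.norm(disp, axis=1, keepdims=True) + eps)
```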
Fig. 1. CTRL vs Rest. (a) Linear Fisher, Fvec: volume based; (b) Linear Fisher, Fvec: sign of displacement; (c) Kernel Fisher, Fvec: sign of displacement.
Table 1. Controls vs Rest (Controls=24, Rest=25)

Classifier     |  Volume              |  Sign of Displacement  |  Direction Vector
               |  Training   Testing  |  Training   Testing    |  Training   Testing
Linear Fisher  |  64.43%     61.22%   |  95.92%     87.76%     |  96.5%      85.71%
KF-Poly (d=2)  |  55.2%      55%      |  100%       91.84%     |  100%       95.9%
KF-RBF         |  64.93%     61.22%   |  100%       89.98%     |  100%       93.8%
Fig. 1 shows the results of the linear Fisher and kernel Fisher (using a radial basis function kernel) classifiers with feature vectors based on volume and shape for controls vs patients. It can be seen from Fig. 1a that using just the volume information does not distinguish between the subjects who need surgery and those who do not. Fig. 1b and Fig. 1c show considerable improvement with the sign of displacement as the feature vector, in particular with kernel Fisher as the classifier. Plots using the direction vector are similar to those obtained using the sign of displacement. Table 1 summarizes the training set accuracy and the cross-validation accuracy using leave-one-out for the feature vectors and classifiers considered. Given the classification between controls and patients, the next task is to identify the side of focus. This is again done by comparing the shape features for RATL and LATL. Fig. 2 shows the separation between the two classes using volume and shape features. It is clear that volume (Fig. 2a) cannot distinguish between them easily. The sign of displacement using linear Fisher (Fig. 2b) also does not show a good separation. However, kernel Fisher with shape features
(Fig. 2c) is able to capture it much better. Table 2 summarizes the training and leave-one-out accuracy for RATL and LATL using volume and shape features.
Fig. 2. LATL vs RATL. (a) Linear Fisher, Fvec: volume based; (b) Linear Fisher, Fvec: sign of displacement; (c) Kernel Fisher, Fvec: sign of displacement.

Table 2. LATL vs RATL (LATL=11, RATL=14)
Classifier     |  Volume              |  Sign of Displacement  |  Direction Vector
               |  Training   Testing  |  Training   Testing    |  Training   Testing
Linear Fisher  |  66.88%     64%      |  74.88%     64%        |  88%        68%
KF-Poly (d=2)  |  66.88%     64%      |  100%       72%        |  100%       72%
KF-RBF         |  66.88%     64%      |  100%       72%        |  100%       68%
It can be seen that we are able to distinguish between controls and patients with a much higher accuracy than between RATL and LATL. This can be due to various reasons. The shape differences among the pathologies may be highly correlated, making it difficult to separate them. Also, the number of data samples for the patients with pathology is quite small, which hinders a sufficient representation of the population. This is also reflected in the high training accuracy but low test accuracy among patients.
5
Discussion and Conclusion
Our entire approach is predicated on using shape based features for discriminating between normal controls and subjects diagnosed with epilepsy, as well as for indicating the hemispheric location of the epileptic focus in the patients. It should be noted that the work does not attempt to determine the precise coordinates of the focus of epilepsy in the patients. Since the feature vectors may not be linearly separable, we embarked upon a kernel Fisher strategy in which the patterns are first mapped to a very high (possibly infinite) dimensional space before computing the Fisher discriminant. The choice of kernel is crucial for achieving good generalization. This issue requires much more empirical testing and validation in order to determine the best kernel for the task. Unfortunately, the deeper and more fundamental relationship between the feature vector density function and the
choice of the kernel mapping function cannot be empirically explored, since data are available only for a small number of subjects. The choice of feature vector is key to achieving good training and generalization performance. We have shown that the sign of the displacement vector can capture some of the shape differences between the two classes of subjects. Based on our empirical results, we conclude that control subjects and subjects with pathology can be discriminated using shape features. However, the same shape features are less successful in the inter-hemispheric discrimination between subjects with pathology. We expect to improve the classification performance in this area by i) increasing the number of patient studies, ii) better feature selection and pruning, and iii) improving the classifier.
Acknowledgment This research was in part funded by the NSF grant IIS-9811042 and NIH RO1RR13197.
References
1. Besl, P.J., McKay, N.D.: A Method for Registration of 3-D Shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 14, No. 2 (1992) 239-255
2. Vemuri, B.C., Guo, Y.: Snake Pedals: Compact and Versatile Geometric Models with Physics-based Control. IEEE Trans. on Pattern Analysis and Machine Intelligence, Vol. 22, No. 5 (2000) 445-459
3. Gerig, G., Styner, M., Shenton, M.E., Lieberman, J.A.: Shape versus Size: Improved Understanding of the Morphology of Brain Structures. MICCAI (2001) 24-32
4. Csernansky, J.G., Joshi, S., Wang, L., Haller, J.W., Gado, M., Miller, J.P., Grenander, U., Miller, M.I.: Hippocampal morphometry in schizophrenia by high dimensional brain mapping. Proc. Natl. Acad. Sci. USA, Vol. 95, Issue 19 (1998) 11406-11411
5. Golland, P., Grimson, W.E.L., Shenton, M.E., Kikinis, R.: Small Sample Size Learning for Shape Analysis of Anatomical Structures. MICCAI, LNCS 1935 (2000) 72-82
6. Mika, S., Rätsch, G., Weston, J.: Fisher Discriminant Analysis with Kernels. Neural Networks for Signal Processing IX, IEEE (1999) 41-48
7. Vemuri, B., Ye, J., Chen, Y., Leonard, C.: A Level-set based Approach to Image Registration. Workshop on Mathematical Methods in Biomedical Image Analysis, June 11-12 (2000) 86-93
8. Sethian, J.A.: Level Set Methods and Fast Marching Methods: Evolving Interfaces in Computational Geometry, Fluid Mechanics, Computer Vision, and Materials Science. Cambridge University Press (1999)
A Noise Robust Statistical Texture Model

Klaus B. Hilger, Mikkel B. Stegmann, and Rasmus Larsen

Informatics and Mathematical Modelling, Technical University of Denmark, DTU, Richard Petersens Plads, Building 321, DK-2800 Kgs. Lyngby
{kbh,mbs,rl}@imm.dtu.dk, http://www.imm.dtu.dk

Abstract. This paper presents a novel approach to the problem of obtaining a low dimensional representation of the texture (pixel intensity) variation present in a training set after alignment using a Generalised Procrustes analysis. We extend the conventional analysis of training textures in the Active Appearance Models segmentation framework. This is accomplished by augmenting the model with an estimate of the covariance of the noise present in the training data. The result is a more compact model maximising the signal-to-noise ratio, thus favouring subspaces rich in signal but low in noise. Differences between the methods are illustrated on a set of left cardiac ventricles obtained using magnetic resonance imaging.
1
Introduction
Over the past few years, models capable of synthesising complete images of objects have proven very useful for interpreting images. One example is the Active Appearance Models (AAMs) [1,2]. Applications of AAMs include recovery and variation analysis of anatomical structures in medical images, such as magnetic resonance images (MRIs) [3], radiographs [4,5] and ultrasound images [6]. Images can be synthesised in many ways; e.g. [7] uses a linear combination of shape-compensated training images. To reduce dimensionality, AAMs use a Principal Component (PC) analysis of the training set to synthesise new images. By maximising variance only, the PC analysis models any noise present in the training set along with the uncontaminated hidden image data. In this paper, we propose to extend the AAM framework by augmenting the image representation with noise characteristics. This is accomplished by applying the Minimum Noise Fraction (MNF) transformation [8]. The ancestor of AAMs, the Active Shape Models [9], have previously been extended by means of a variant of MNF in the analysis of shapes, see [5]. Here, we extend this work to pixel intensities, henceforth denoted texture. The MNF extracts important, otherwise occluded information in the correlation structures of the data, and aims at obtaining a low dimensional model representation. As opposed to the PC transform, the MNF transform takes the spatial nature of the image into account. Whereas the PC transform only requires knowledge of the dispersion (covariance) matrix, the MNF transform requires an estimate of the dispersion matrix of the noise structure as additional information.
The MNF transform was originally proposed as a transformation for ordering multispectral data in terms of image quality with applications for noise removal. This paper is organised as follows. Section 2 summarises AAMs and describes the applied statistical models. Section 3 describes the data analysed, and Section 4 presents a comparative study of the PC and MNF. In Section 5 we summarise and give some concluding remarks.
2
Methods
In the following, AAMs are summarised along with a description of the traditional AAM texture model, the PC transform, and the proposed alternative, the MNF transform.
2.1 Active Appearance Models
Active Appearance Models [1,2] establish a compact parameterisation of object variability, as learned from a training set by estimating a set of latent variables. From these quantities new images similar to the training set can be generated. Objects are defined by marking up each example with points of correspondence over the set, either by hand or by semi- to completely automated methods. Exploiting prior knowledge about the local nature of the optimisation space, these models can be rapidly fitted to unseen images, given a reasonable initialisation. Shape and texture variability is conventionally modelled by means of PC transforms. Let there be given P training examples for an object class, and let each example be represented by a set of N landmark points and M texture samples. The P shape examples are aligned to a common mean using a Generalised Procrustes analysis. The Procrustes shape coordinates are subsequently projected into the tangent plane of the shape manifold, at the pole denoted by the mean shape. The P textures are warped into correspondence using a suitable warp function and subsequently sampled from this shape-free reference. Typically, this geometrical reference is the Procrustes mean shape. Let s and t denote a synthesised shape and texture and let $\bar{s}$ and $\bar{t}$ denote the corresponding means. New instances are now generated by adjusting the PC scores, $b_s$ and $b_t$, in

$$s = \bar{s} + \Phi_s b_s, \qquad t = \bar{t} + \Phi_t b_t \qquad (1)$$
where $\Phi_s$ and $\Phi_t$ are eigenvectors of the shape and texture dispersions estimated from the training set. To regularise the model and improve speed and compactness, $\Phi_s$ and $\Phi_t$ are truncated, usually such that a certain amount of variance in the training set is explained. To obtain a combined shape and texture parameterisation, c, the values of $b_s$ and $b_t$ over the training set are combined into

$$b = \begin{pmatrix} W_s b_s \\ b_t \end{pmatrix} = \begin{pmatrix} W_s \Phi_s^T (s - \bar{s}) \\ \Phi_t^T (t - \bar{t}) \end{pmatrix} \qquad (2)$$

Notice that a suitable weighting between pixel distances and pixel intensities is done through the diagonal matrix $W_s$. To recover any correlation between shape and texture, a third PC transform is applied,
$$b = \Phi_c c \qquad (3)$$

obtaining the combined appearance model parameters, c, that generate new object instances by

$$s = \bar{s} + \Phi_s W_s^{-1} \Phi_{c,s} c, \qquad t = \bar{t} + \Phi_t \Phi_{c,t} c, \qquad \Phi_c = \begin{pmatrix} \Phi_{c,s} \\ \Phi_{c,t} \end{pmatrix} \qquad (4)$$

The object instance, (s, t), is synthesised into an image by warping the pixel intensities of t into the geometry of the shape s. Given a suitable measure of fit, the model is matched to an unseen image using an iterative updating scheme based on a fixed Jacobian estimate [10,11] or a reduced rank regression [2].
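As an illustration of Eqns. (1)-(4), the following Python/NumPy sketch generates a new shape and texture instance from a combined parameter vector c. It is a schematic under the notation above; the truncated eigenvector matrices, means, and weighting matrix are assumed to be available from training, and this is not the authors' implementation.

```python
import numpy as np

def synthesize(c, s_mean, t_mean, Phi_s, Phi_t, W_s, Phi_cs, Phi_ct):
    """Generate a shape/texture instance from combined AAM parameters c.
    Phi_s, Phi_t: truncated shape/texture eigenvectors (columns);
    W_s: diagonal weighting matrix; Phi_cs, Phi_ct: the split rows of Phi_c."""
    b_s = Phi_cs @ c                                  # shape part of b = Phi_c c
    b_t = Phi_ct @ c                                  # texture part
    s = s_mean + Phi_s @ np.linalg.solve(W_s, b_s)    # Eqn. (4), shape
    t = t_mean + Phi_t @ b_t                          # Eqn. (4), texture
    return s, t
```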
2.2 Principal Components Transformation
Consider a set of P texture vectors $\{t_i\}_{i=1}^{P}$ laid out as a set of P shape-free images with grey levels $r_i(x)$, $i = 1, \cdots, P$, where x is the coordinate vector denoting the grid point of the sample. Let $r(x) = [r_1(x) \cdots r_P(x)]^T$ and assume first and second order stationarity, i.e. $E\{r(x)\} = 0$ and $D\{r(x)\} = \Sigma$. The PC transformation thus chooses P linear transformations $z_i(x) = a_i^T r(x)$, $i = 1, \cdots, P$ such that the variance of $z_i(x)$ is maximum among all linear transforms orthogonal to $z_j(x)$, $j = 1, \cdots, i-1$. The variance in the ith PC is given by

$$\mathrm{Var}\{a_i^T r\} = \lambda_i = a_i^T \Sigma a_i \qquad (5)$$
We see that the basis for the PCs is identified as the conjugate eigenvectors of the dispersion matrix. Let $\lambda_1 \ge \cdots \ge \lambda_P \ge 0$ be the eigenvalues with the corresponding conjugate eigenvectors $A = [a_1 \cdots a_P]$. Above, the PC problem is solved in Q-mode. Using the Eckart-Young Theorem, the R-mode solution becomes $\Phi_t = R^T \Lambda^{-1/2} A$, where $R = [r_1 \cdots r_M]$ with $r_j$ containing spatially corresponding intensities over the training set, and $\Lambda$ is a diagonal matrix of the eigenvalues.
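Since the number of training textures P is much smaller than the number of texture samples M, the eigenvectors are computed on the small P x P dispersion in Q-mode and mapped to R-mode as described above. The following NumPy sketch shows this standard small-sample PCA computation (up to the normalisation convention of $\Lambda$); it is an illustration, not the authors' code.

```python
import numpy as np

def texture_pca(R):
    """R: P x M matrix; row i is the i-th shape-free texture (mean removed).
    Returns eigenvalues and R-mode eigenvectors Phi_t (M x P)."""
    P = R.shape[0]
    Sigma_q = R @ R.T / P                    # small P x P dispersion (Q-mode)
    lam, A = np.linalg.eigh(Sigma_q)
    order = np.argsort(lam)[::-1]            # sort descending by variance
    lam, A = lam[order], A[:, order]
    # Eckart-Young: map the conjugate eigenvectors to R-mode and normalise.
    Phi_t = R.T @ A / np.sqrt(np.maximum(lam, 1e-12) * P)
    return lam, Phi_t
```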
2.3 Minimum Noise Fractions Transformation
Consider the random signal variable r(x) from above. Assuming that an additive noise structure applies, $r(x) = s(x) + n(x)$ with $\mathrm{Corr}\{s(x), n(x)\} = 0$, the dispersion structure can be separated into

$$D\{r(x)\} = \Sigma = \Sigma_s + \Sigma_n \qquad (6)$$
The Minimum Noise Fractions transformation chooses P linear combinations $z_i(x) = a_i^T r(x)$, $i = 1, \cdots, P$ which maximise the signal-to-noise ratio (SNR) for the ith component,

$$\mathrm{SNR}_i = \frac{V\{a_i^T s(x)\}}{V\{a_i^T n(x)\}} = \frac{a_i^T \Sigma a_i}{a_i^T \Sigma_n a_i} - 1 \qquad (7)$$
and the problem reduces to solving the generalised eigenproblem $\Sigma a_i = \lambda_i \Sigma_n a_i$. Let $\lambda_1 \ge \cdots \ge \lambda_P$ be the eigenvalues of $\Sigma$ with respect to $\Sigma_n$ with the corresponding conjugate eigenvectors $a_1, \cdots, a_P$. Then $z_i(x)$ is the ith MNF. A high
order component has a high noise fraction and thus little signal. A low order component has a high SNR, hence the name Minimum Noise Fraction transform. The central issue in obtaining good MNF components is the estimation of the dispersion matrix of the noise. Using the difference between a pixel and its neighbours as a noise estimate, the MNF maximises the spatial autocorrelation. Let $\Delta^T = [\Delta_1\ \Delta_2]$ represent a spatial shift. Introducing $\Sigma_\Delta = D\{r(x) - r(x + \Delta)\}$, which, when considered as a function of $\Delta$, is a multivariate variogram, and assuming a proportional covariance model [12], the covariance of the noise can be estimated by $\Sigma_n = \Sigma_\Delta / 2$. When the covariance structure of the noise is proportional to the identity matrix, the MNF transform reduces to the PC transform. In [13] several other models are presented for estimating image noise. When maximising autocorrelation, the MNF analysis qualifies as an Independent Components Analysis (ICA) similar to the Molgedey-Schuster algorithm [14], see [5]. A comparative study of the PC and MNF can be found in [15,16].
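A minimal sketch of the MNF computation for a set of shape-free training images follows, using one-pixel horizontal neighbour differences to estimate $\Sigma_n$ and SciPy's generalised symmetric eigensolver. This illustrates the construction of [8,12] under the assumptions above; it is not the authors' implementation, and the noise-shift choice is one of several possible.

```python
import numpy as np
from scipy.linalg import eigh

def mnf(images):
    """images: list of P 2-D arrays of identical shape (shape-free textures).
    Returns the MNF eigenvalues and basis (P x P), highest SNR first."""
    r = np.stack([im.ravel() for im in images])      # P x M "band" matrix
    r = r - r.mean(axis=1, keepdims=True)            # enforce E{r(x)} = 0
    Sigma = r @ r.T / r.shape[1]                     # D{r(x)}
    # Noise estimate: Sigma_n = Sigma_Delta / 2 from a one-pixel shift.
    d = np.stack([(im[:, :-1] - im[:, 1:]).ravel() for im in images])
    Sigma_n = d @ d.T / (2.0 * d.shape[1])
    # Generalised eigenproblem: Sigma a = lambda Sigma_n a.
    lam, A = eigh(Sigma, Sigma_n)
    order = np.argsort(lam)[::-1]                    # max SNR first
    return lam[order], A[:, order]
```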
3
Data
Short-axis, end-diastolic cardiac MRIs were selected from 28 subjects. MRIs were acquired using a whole-body MR unit (Siemens Impact) operating at 1.0 Tesla. The chosen MRI slice position represented low and high morphologic complexity, judged by the presence of papillary muscles. Images were acquired using an ECG-triggered breath-hold fast low angle shot (FLASH) cinematographic pulse sequence. Slice thickness=10 mm; field of view=263x350 mm; matrix 256x256. The endocardial and epicardial contours of the left ventricle were annotated manually by placing 33 landmarks along each contour, see Figure 3.
4
Results and Discussion
Noise is added to the training data simulating different SNRs, i.e. different quality of the MRIs due to inter-patient and inter-operator variation, etc. This is done in order to examine the robustness of the texture representation in the MNF basis compared to the PC basis. Gaussian noise is applied with a standard deviation randomly chosen to produce training images with an SNR down to 6 dB. This knowledge of the noise structure is not used in the subsequent analyses.
4.1 Learning Based Image Representation
To examine the robustness of the MNF transform, 101 leave-one-out studies were carried out: one on the uncorrupted and 100 on the noise degraded shape-free sets of 28 MRIs. Results of the cross-validation analyses are presented in Figure 1. The left plot corresponds to the uncorrupted MRIs and the right to a randomly chosen analysis on a degraded training set. The curves with o/x symbols mark the performance of the MNF/PC models and provide the mean squared texture error (MSE) as a function of the model rank. For the scenario without noise, the performance of the MNF and the PC transform is very similar. Notice, however, that the MNF is better for almost all numbers of modes. The general trend for the noise degraded data is reflected in the MSE curves in Figure 1 (right). The MNF
and PC are competing for low rank models, but for an intermediate number of modes the MNF outperforms the PC transform. The MNF thus does a better job of separating important signal from noise in the training data. Figure 2 shows the PC and the MNF eigenvectors (the $\Phi_t$'s) of the mean-shape-aligned 28 noise degraded cardiac data for which the leave-one-out texture representation curve in Figure 1 (right) was generated. All images in Figure 2 are stretched between the mean ±3 std. The top four rows correspond to the PC eigenvectors and the bottom four rows to the MNFs. The components are ordered row-wise according to the amount of variance/SNR they explain. The last component in both shows the mean texture sample, $\bar{t}$. Notice that the MNF gives a better ordering of components in terms of texture quality. A higher degree of speckle noise is present in all PC components compared to the MNF components. Moreover, the last components of the PC analysis appear to include a relatively higher amount of auto-correlated signal. This explains the better performance of the MNF representation in the cross-validation study.
Fig. 1. Leave-one-out study on cardiac MRIs. Without noise (left). With noise (right).
4.2 Cardiac Segmentation
Hitherto, the PC and the MNF transform have been evaluated w.r.t. representation. To assess the transforms' capabilities in a de facto segmentation setting, a cross-validation study was carried out on the cardiac data set. To maximise the effective size of the training set, validation was performed using a leave-one-out evaluation on the set of 28 short-axis cardiac MRIs. A total of 56 AAMs were built on noise-contaminated versions of the 28 cardiac MRIs: 28 PC AAMs and 28 MNF AAMs.
Fig. 2. PC (top) and MNF (bottom) decomposition of noise degraded cardiac MRIs.
Fig. 3. Example annotation of the left ventricle using 66 landmarks (left). Segmentation result on noise contaminated cardiac MRI (right).
In both transforms, the largest 14 texture modes were included in the models. This model rank was chosen as half the maximum basis size, producing a cut-off point where an average of 85% of the total amount of variation is explained. Each model was initialised on the image that was left out, in its mean configuration (i.e. mean shape and mean texture), and displaced ±8 pixels from the ground truth centre in image coordinates. From this position the AAM search was started. Refer to Figure 3 (right) for an example segmentation. Two performance measures were evaluated: normalised texture error (MSE) and mean point-to-point distance between corresponding landmarks of the model and the ground truth. Segmentations with a pt.-pt. distance larger than ten pixels were deemed outliers and removed. The PC/MNF AAMs yielded a mean normalised MSE of 3.55 ± 3.35 / 3.43 ± 2.67 and a pt.-pt. landmark error of 5.03 ± 1.60 / 4.79 ± 1.51 pixels, respectively. In the two PC/MNF runs, 2 / 1 outliers were removed. Thus, a modest improvement in both performance measures and corresponding uncertainties is observed for the MNF AAMs. Notice the rather high MSE standard deviations, due to the large inhomogeneity in the noise characteristics.
5
Conclusion
We have shown that a more compact representation of texture can be obtained by extending the PC to the MNF transformation in the AAM framework. The novel approach shows better performance in leave-one-out representation studies, both on the original and on noise degraded cardiac MRIs. Thus, by separating important signal from noise, the MNF transform generalises better than the PC transform. The MNF texture representation is applied in a leave-one-out AAM segmentation study in comparison to a conventional PC basis of equal rank. Even though the MNF extension only affects the texture and not the shape representation, and the texture model rank is chosen relatively high compared to the amount of noise present in the data, improvements in both landmark and texture error and the corresponding uncertainties are observed for the MNF AAMs. In contrast to the PC analysis, the new approach, by maximising the SNR, is invariant to linear transformations such as scaling of the individual components in the training set. As a consequence, the MNF decomposition is expected to be useful in future AAM studies involving data fusion of multiple features of different nature measured at different scales. This includes derived physiological measures, textural quantities, and multiple imaging modalities. Moreover, the MNF analysis in itself can be applied as a data driven method probing for uncorrelated modes of biological variation in non-Euclidean space, and can thus constitute a useful tool in exploratory analysis of medical data.
Acknowledgments MRIs were provided by M.D., Jens Chr. Nilsson and M.D., Bjørn A. Grønning, Danish Research Centre of Magnetic Resonance, H:S Hvidovre Hospital.
References
1. Edwards, G., Taylor, C.J., Cootes, T.F.: Interpreting face images using active appearance models. In: Proc. 3rd IEEE Int. Conf. on Automatic Face and Gesture Recognition, IEEE Comput. Soc (1998) 300-305
2. Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. In: Proc. European Conf. on Computer Vision. Volume 2., Springer (1998) 484-498
3. Mitchell, S., Lelieveldt, B., van der Geest, R., Bosch, H., Reiber, J., Sonka, M.: Multistage hybrid active appearance model matching: segmentation of left and right ventricles in cardiac MR images. IEEE Transactions on Medical Imaging 20 (2001) 415-423
4. Stegmann, M.B., Fisker, R., Ersbøll, B.K.: Extending and applying active appearance models for automated, high precision segmentation. In: Proc. 12th Scandinavian Conf. on Image Analysis. Volume 1. (2001) 90-97
5. Larsen, R., Eiriksson, H., Stegmann, M.B.: Q-MAF shape decomposition. In: Medical Image Computing and Computer-Assisted Intervention - MICCAI 2001, 4th International Conference, Utrecht, The Netherlands. Volume 2208 of Lecture Notes in Computer Science., Springer (2001) 837-844
6. Bosch, H., Mitchell, S., Lelieveldt, B., Nijland, F., Kamp, O., Sonka, M., Reiber, J.: Active appearance-motion models for endocardial contour detection in time sequences of echocardiograms. Proceedings of SPIE 4322 (2001) 257-268
7. Jones, M., Poggio, T.: Multidimensional morphable models: a framework for representing and matching object classes. International Journal of Computer Vision 29 (1998) 107-131
8. Green, A.A., Berman, M., Switzer, P., Craig, M.D.: A transformation for ordering multispectral data in terms of image quality with implications for noise removal. IEEE Transactions on Geoscience and Remote Sensing 26 (1988) 65-74
9. Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J.: Active shape models - their training and application. Comp. Vision and Image Understanding 61 (1995) 38-59
10. Cootes, T.F., Edwards, G.J., Taylor, C.J.: Active appearance models. IEEE Trans. on Pattern Analysis and Machine Intelligence 23 (2001) 681-685
11. Cootes, T.F., Taylor, C.J.: Statistical Models of Appearance for Computer Vision. Tech. Report, Oct 2001, University of Manchester (2001)
12. Switzer, P., Green, A.A.: Min/max autocorrelation factors for multivariate spatial imagery. Technical Report 6, Dept. of Statistics, Stanford University (1984)
13. Olsen, S.I.: Estimation of noise in images: An evaluation. Graphical Models and Image Processing 55 (1993) 319-323
14. Molgedey, L., Schuster, H.G.: Separation of a mixture of independent signals using time delayed correlations. Physical Review Letters 72 (1994) 3634-3637
15. Nielsen, A.A.: Analysis of Regularly and Irregularly Sampled Spatial, Multivariate, and Multi-temporal Data. PhD thesis, Department of Mathematical Modelling, Technical University of Denmark, Lyngby (1994)
16. Hilger, K.B.: Exploratory Analysis of Multivariate Data, Unsupervised Image Segmentation and Data Driven Linear and Nonlinear Decomposition. PhD thesis, Informatics and Mathematical Modelling, Technical University of Denmark, Kgs. Lyngby (2001)
A Combined Statistical and Biomechanical Model for Estimation of Intra-operative Prostate Deformation

Ashraf Mohamed1,2, Christos Davatzikos1,2, and Russell Taylor1

1 CISST NSF Engineering Research Center, Department of Computer Science, Johns Hopkins University
[email protected], http://cisstweb.cs.jhu.edu/
2 Center for Biomedical Image Computing, Department of Radiology, Johns Hopkins University School of Medicine
{ashraf,hristos}@rad.jhu.edu, http://cbic.jhoc1.jhmi.edu/
Abstract. An approach for estimating the deformation of the prostate caused by transrectal ultrasound (TRUS) probe insertion is presented. This work is particularly useful for brachytherapy procedures, in which planning for radioactive seed insertion is performed on preoperative scans, and significant deformation of the prostate can occur during the procedure. The approach uses a patient-specific biomechanical model to run simulations of TRUS probe insertion, extracts the main modes of deformation of the prostate, and uses this information to establish a deformable registration between two orthogonal cross-sectional ultrasound images and the preoperative prostate. In the work presented here, the approach is tested on an anatomy-realistic biomechanical phantom of the prostate, and results are reported for 5 test simulations. More than 73% of the maximum deformation of the prostate was recovered, with the estimation error mostly attributed to the relatively small number of biomechanical simulations used for training.
1
Introduction
Transrectal ultrasound (TRUS) guided brachytherapy is one of the common therapy alternatives for prostate cancer. The goal of the procedure is to insert a number of radioactive seeds at specific locations in the prostate tissue using surgical needles. The locations and number of seeds within the prostate gland are decided by means of surgical planning software that makes use of preoperative volumetric scans of the prostate, typically CT or MRI. During brachytherapy, several factors can cause deformation of the prostate gland from its preoperative shape. These factors include insertion of the ultrasound probe inside the rectum, insertion of the surgical needles, edema, and change in the patient's posture between the preoperative and the intraoperative conditions [1]. This deformation of the prostate from the preoperative condition induces uncertainties in the radioactive seed insertion locations, which are the result of planning on preoperative images. Thus, deformation of the prostate can
affect the dose distribution in and around the prostate and therefore adversely affect the outcome of the procedure [2]. The goal of the work reported here is to describe a framework for estimating the deformation of the prostate based on sparse data from 2D ultrasound images that can be obtained during a typical brachytherapy procedure. Such an estimator can update the preoperative plan to account for deformations, thereby reducing the uncertainty in the radioactive seed insertion locations. Our approach builds upon [3], in which a general framework for statistical estimation of intra-operative deformations was presented. The problem of estimating the deformed shape of the prostate at each probe location can therefore be cast as a deformable 2D/3D registration problem [4], i.e. registration of a 3D preoperative image to two cross-sections of the deformed volume of the same patient. To obtain the main modes of deformation of the prostate under TRUS probe insertion, a patient-specific biomechanical model is constructed from segmented preoperative images. The main modes of deformation are extracted by performing Principal Component Analysis (PCA) on a small number of deformed shapes resulting from simulations of TRUS probe insertion, which are run on the biomechanical model. Each of the simulations corresponds to specific insertion angles and insertion depth of the TRUS probe. In order to further simplify the estimator, we derive an analytic representation of the principal modes of deformation (coefficients of the principal modes) as a function of the probe insertion angles and insertion depth. Our goal is to develop a fast, statistically based model which can be used in real time to track deformations. This model is trained on computationally intensive biomechanical simulations, which are performed preoperatively. In Section 2, the construction of the prostate phantom, the deformable prostate model, and the estimator are described. In the preliminary work of this paper, the proposed approach is tested on 5 biomechanical simulations that are not used for training. Simulated ultrasound prostate contours are obtained from the deformed prostate and are used by the estimator to estimate those deformed shapes. The results reported in Section 3 indicate good accuracy in the estimation of deformed shapes. In Section 4, we discuss how the current work can be extended to deal with real patient data and with different subjects.
2
Methods
In this section, the construction of the estimator of the deformed prostate shapes is detailed. First, the biomechanical model used for simulation of prostate deformation is described. This biomechanical model is patient-specific and is constructed from the patient's segmented preoperative CT or MRI scan. In the work presented here, a biomechanical model of an anatomy-realistic prostate phantom is used instead. This provides a means for validating the estimates of the deformed prostate by comparing them to the true deformed shapes of the prostate, which are not usually available for real patients unless intra-operative imaging is used. We use the patient-specific biomechanical model to run a
number of biomechanical simulations with different insertion depths and entry angles of the TRUS probe, thereby constructing a number of deformed shapes of the same prostate. The simulations vary the entry angles of the probe to account for misalignment between the axes of the rectum and the probe. From the simulated deformed shapes, we extract the principal modes of deformation for that prostate under TRUS probe insertion. Noting the dependency between the modes of deformation and the parameters of the simulations (the insertion depth and angles of the TRUS probe), a functional approximation is obtained by fitting a 3rd-degree Bernstein polynomial. Therefore, any deformed prostate shape can be described in terms of the principal modes of deformation and their corresponding principal components, which are in turn directly related to the insertion angles and insertion depth of the probe. From transaxial and sagittal sections of the prostate obtained through the TRUS probe, a deformable 2D/3D registration is established between the estimated deformed prostate shapes and the images obtained intra-operatively.
2.1 Biomechanical Model
A patient-specific biomechanical model is needed for estimating the range of deformations of the prostate. Finite element biomechanical meshes can be automatically generated from segmented images of the patient (e.g. [5,6]). For the model to capture the deformation of the prostate accurately, it should include structures such as the prostate, the surrounding tissues, the rectum, and the surrounding bony structures (sacrum and pubic arch) that control the boundary conditions. In the preliminary work reported here, we used an anatomy-realistic 3D phantom of the prostate and the surrounding structures for the reasons stated earlier. A side view and a 3D rendered view of the phantom are shown in Figure 1. The phantom is composed mainly of a block of soft tissue of dimensions 12 cm × 16 cm × 12 cm. The prostate is modeled as an egg-shaped structure of dimensions 3 cm × 3 cm × 3.5 cm. The rectum is modeled as a straight cylinder with circular cross-section of radius 0.5 cm that runs 0.25 cm below the lower surface of the prostate. The surfaces of the sacrum and pubic arch were generated using a spline curve extruded in 3D. The sacrum and the pubic arch are assumed to be pinned and therefore define the boundary conditions for the problem. No other boundary conditions are imposed on any other structure. A tetrahedral mesh was automatically generated for the prostate and surrounding soft tissue within the Abaqus CAE Finite Element (FE) environment [7], which is also used for solving the biomechanical FE model. Even though there is evidence that most soft tissues exhibit non-linear material behavior, a linear material model was used in many studies dealing with the biomechanical behavior of soft tissue (e.g. [5]). A linear material model is typically chosen because it produces faster results than a non-linear material model and is easier to implement. The values of the material parameters (e.g. Young's modulus and Poisson's ratio for a linear material model) vary widely from one tissue type to another and, for the same soft tissue type, from one person to another, especially in the presence of tissue anomalies such as cancer.
Fig. 1. The biomechanical prostate phantom. (a) 2D profile. (b) 3D wireframe with no-displacement boundary conditions imposed on the sacrum and the pubic arch.
To our knowledge, the material parameter values for the prostate have not been determined, experimentally or otherwise, for the in-vivo human prostate. In [1] a linear material model was used for the prostate, with different stiffness values for the central gland and the peripheral zone. A linear material model is only valid for small deformations and therefore offers limited accuracy in problems that involve large deformations. In the work presented here, since the expected deformation is large, we used a homogeneous Mooney-Rivlin non-linear material model for the prostate tissue, with an initial Young's modulus (stiffness) value E = 2 kPa and an almost incompressible behavior. This is consistent with the values used in [1] for the peripheral zone of the prostate. For the soft tissue surrounding the prostate, a Mooney-Rivlin material model was also assumed, but with an initial stiffness 10 times as large as that assumed for the prostate tissue. Such values of the material parameters produced deformations that are consistent with deformations observed in real TRUS images. Recently, Magnetic Resonance Elastography has been proposed for in-vivo estimation of material parameter values [8]. If accurate patient-specific material parameter values are known, they can be used directly in the model. In Section 4 we discuss how our approach can be extended to deal with deformations even if the material parameters are not known accurately, but are known to lie in a certain range. It is important to note that we do not need exact knowledge of the elastic parameters, since our goal is to develop a statistical prior model that will follow the actual deformation in TRUS images rather than totally predict the deformation of the prostate. During TRUS-guided prostate brachytherapy, the ultrasound probe is inserted at increasing depths with known constant displacements in between. This causes the dilation of the rectum and the exertion of pressure on the surrounding tissues, including the prostate. The displacement of the probe along its axis, measured from the start of the rectum as a reference point, is denoted by the variable u. The angles φ2 and φ3 denote the rotation angles around the 2nd and 3rd coordinate axes, respectively (see Figure 1), and are referred to here as the entry angles of the probe.
Fig. 2. The mean shape of the statistical model is shown in the middle, with added -3 standard deviations (left) and +3 standard deviations (right) of the 2nd mode of the deformation.
2.2 Prostate Deformable Model
For training the deformable statistical model of the prostate, simulations of TRUS probe insertion with different entry angles spanning the range −4 to 4 degrees were performed. A total of 25 such simulations were performed, each with 9 corresponding probe displacements that simulate imaging of the whole prostate in 2D cross-sections. Successive probe displacements were 0.5 cm apart, which is consistent with the stepping in available TRUS systems used for brachytherapy. A total of 225 deformed prostate shapes were therefore available from these simulations. For each simulation, the coordinates of the finite element node locations of each deformed shape were assembled into a vector q that represents the deformed shape. Principal Component Analysis (PCA) [9] was performed on the deformed prostate shapes to obtain the main modes of deformation of the prostate. Therefore, any of the simulated deformed shapes can be approximated by

$$q = \mu + \sum_{i=1}^{M} \alpha_i x_i \qquad (1)$$
where $\mu$ is the mean shape of the deformed prostate, $x_i$ are the principal modes of deformation, $\alpha_i$ are the expansion coefficients, and M is the number of retained modes of deformation. More than 99% of the variation in the training samples was explained by only the first 6 modes of the deformation, and therefore M = 6 was used for the results reported in this work. Some of the modes of deformation were highly correlated with the physical parameters of the biomechanical simulations (modes 1, 2 and 4). In Figure 2, the second mode of deformation of the prostate is shown. The second principal component correlates well (correlation coefficient of 0.88) with the displacement of the TRUS probe, u. Similarly, modes 1 and 4 correlated highly (correlation coefficients ≥ 0.67) with φ2 and φ3, respectively. Given this observation, a functional relationship was assumed between the principal components of the deformation and the biomechanical simulation parameters, i.e.
$$\alpha_i = f_i(u, \phi_2, \phi_3), \qquad 1 \le i \le M \qquad (2)$$
Linear least-squares fitting was used to approximate each $f_i$ by fitting 3rd-degree Bernstein polynomials [10] for each of the principal components in terms of the simulation parameters. Therefore, a deformed shape is related to the biomechanical simulation parameters by
$$q = G(u, \phi_2, \phi_3) = \mu + \sum_{i=1}^{M} f_i(u, \phi_2, \phi_3)\, x_i \qquad (3)$$
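A schematic Python/NumPy sketch of this training stage (Eqns. (1)-(3)) follows, assuming the simulated node-coordinate vectors are stacked in a matrix `Q` with one row per deformed shape and that a tensor-product cubic Bernstein basis in (u, φ2, φ3) is used; the basis construction and variable ranges are illustrative choices, not a reproduction of the authors' fit.

```python
import numpy as np
from math import comb

def bernstein_basis(t, n=3):
    """Degree-n Bernstein basis evaluated at t in [0, 1]."""
    return np.array([comb(n, k) * t**k * (1 - t)**(n - k) for k in range(n + 1)])

def design_row(u, phi2, phi3, ranges):
    """Tensor-product cubic Bernstein features for one simulation setting."""
    ts = [(v - lo) / (hi - lo) for v, (lo, hi) in zip((u, phi2, phi3), ranges)]
    bu, b2, b3 = (bernstein_basis(t) for t in ts)
    return np.einsum('i,j,k->ijk', bu, b2, b3).ravel()    # 64 features

def train_model(Q, params, ranges, M=6):
    """Q: S x 3N deformed node coordinates (one row per simulation);
    params: list of (u, phi2, phi3). Returns mean, modes, fitted coefficients."""
    mu = Q.mean(axis=0)
    _, _, Vt = np.linalg.svd(Q - mu, full_matrices=False)
    X = Vt[:M]                              # principal modes x_i (rows)
    alpha = (Q - mu) @ X.T                  # PC scores for each simulation
    B = np.stack([design_row(*p, ranges) for p in params])
    # Least-squares fit of f_i(u, phi2, phi3) for each mode, Eqn. (2).
    C, *_ = np.linalg.lstsq(B, alpha, rcond=None)
    return mu, X, C

def deformed_shape(u, phi2, phi3, mu, X, C, ranges):
    """Eqn. (3): q = G(u, phi2, phi3) = mu + sum_i f_i(u, phi2, phi3) x_i."""
    f = design_row(u, phi2, phi3, ranges) @ C
    return mu + f @ X
```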
To evaluate the error introduced by fitting Bernstein polynomials for the functions $f_i$ of equation (3), true deformed shapes that resulted from biomechanical simulations were compared to the deformed shapes reconstructed by equation (3) for the same simulation parameter values. The maximum error was 0.09 cm, while the maximum deformation encountered in the simulated shapes was 0.7 cm. Therefore, a maximum error of 12.9% was introduced in the training samples by the approximation of equation (3) and the use of a finite number of deformation modes (M = 6).
2.3 Estimation of Deformed Shapes
During prostate brachytherapy, and before inserting any radioactive seeds, a number of 2D ultrasound images are usually obtained to cover the whole prostate. The displacements between the locations at which the images are obtained are known, since a mechanical stepper is typically used to advance the TRUS probe. The goal is to estimate the deformed shape of the prostate at each of those probe locations. If the known displacement between consecutive probe locations is denoted by u, then the deformed shapes are given by

$$q_j = G(u_o + u(j-1), \phi_2^o, \phi_3^o), \qquad 1 \le j \le K \qquad (4)$$
where K is the number of probe locations, $u_o$ is the displacement for the first probe location, and $\phi_2^o$ and $\phi_3^o$ are the insertion angles of the probe. Thus, if $u_o$, $\phi_2^o$ and $\phi_3^o$ were known, then the whole set of deformed shapes at the different locations of the probe would be available. In the work presented here, it is assumed that at each location of the probe, 2 orthogonal images of the prostate are available. These images are readily obtained by most TRUS probes currently in use for brachytherapy. From each image, the coordinates of points on the surface of the deformed prostate can be extracted using manual or automatic outlining. Let the 3D coordinates of the points obtained at the jth location of the probe, relative to the ultrasound crystal, be denoted by $V_j$, where $1 \le j \le K$. A coordinate frame transformation relates the coordinate frame of the crystal to the coordinate frame in which the simulations were performed. This transformation can be computed in terms of the geometry of the probe, the parameters u, $\phi_2$, and $\phi_3$, and $t_o$, an unknown translation between the coordinate frames. Let this frame transformation for the jth location
of the probe be denoted by $T_j$. Given an estimate of $T_j$, let $V_j^t$ denote the points $V_j$ transformed into the simulation coordinate frame by $T_j$. Also, let the sum of squared distances between the points $V_j^t$ and their closest corresponding points on the deformed surface $q_j$ be denoted by $E_j(u_o, \phi_2^o, \phi_3^o, t_o)$. Therefore, we seek the values $\hat{u}_o$, $\hat{\phi}_2^o$, $\hat{\phi}_3^o$, and $\hat{t}_o$ that minimize the sum of squared errors:

$$(\hat{u}_o, \hat{\phi}_2^o, \hat{\phi}_3^o, \hat{t}_o) = \arg\min \sum_{j=1}^{K} E_j(u_o, \phi_2^o, \phi_3^o, t_o) \qquad (5)$$
Using $\hat{u}_o$, $\hat{\phi}_2^o$, $\hat{\phi}_3^o$ in equation (4) yields the estimates of the deformed shapes, $\hat{q}_j$, $1 \le j \le K$. The optimization problem is solved using the Nelder-Mead non-linear optimization method [11] from within the Matlab environment. Similar to the approach in [4], the optimization over the parameters $u_o$, $\phi_2^o$, $\phi_3^o$, and $t_o$ is performed in 2 different alternating steps for deformable 2D/3D registration and pure translation.
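A hedged sketch of this estimation step (Eqn. (5)) using SciPy's Nelder-Mead routine is given below (the paper used Matlab; this is a Python illustration, not the authors' code). The helper `make_transform` is a hypothetical, simplified stand-in for the probe-geometry frame computation described above, and the deformation model is assumed to be a callable such as `deformed_shape` from the previous sketch.

```python
import numpy as np
from scipy.optimize import minimize
from scipy.spatial import cKDTree

def make_transform(u, phi2, phi3, t_o):
    """Illustrative rigid transform from the probe frame to the simulation
    frame: rotations phi2/phi3 about the 2nd/3rd axes, plus a translation u
    along the probe axis and the unknown offset t_o (a simplification of the
    true probe geometry)."""
    c2, s2, c3, s3 = np.cos(phi2), np.sin(phi2), np.cos(phi3), np.sin(phi3)
    R2 = np.array([[c2, 0, s2], [0, 1, 0], [-s2, 0, c2]])
    R3 = np.array([[c3, -s3, 0], [s3, c3, 0], [0, 0, 1]])
    R = R3 @ R2
    t = np.array([u, 0.0, 0.0]) + t_o
    return lambda V: V @ R.T + t

def estimation_cost(theta, contours, step_u, model):
    """Sum over probe locations of squared closest-point distances, Eqn. (5).
    theta = (u_o, phi2_o, phi3_o, tx, ty, tz); contours[j]: (N_j, 3) points
    from the two orthogonal TRUS images at probe location j, in probe frame."""
    u_o, phi2, phi3 = theta[:3]
    t_o = theta[3:]
    cost = 0.0
    for j, V in enumerate(contours):
        u_j = u_o + step_u * j
        Vt = make_transform(u_j, phi2, phi3, t_o)(V)
        q = model(u_j, phi2, phi3).reshape(-1, 3)   # predicted surface nodes
        d, _ = cKDTree(q).query(Vt)                 # closest surface points
        cost += np.sum(d ** 2)
    return cost

# Example usage (assuming `contours` and a model wrapping Eqn. (3) exist):
# theta0 = np.array([1.0, 0.0, 0.0, 0.0, 0.0, 0.0])
# res = minimize(estimation_cost, theta0, args=(contours, 0.5, model),
#                method='Nelder-Mead')
```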
3
Results
Five simulations of TRUS probe insertion were performed at parameter values $u_o$, $\phi_2^o$, and $\phi_3^o$ that were different from those used for training but within the range of the training values. A pair of orthogonal simulated TRUS prostate image contours was generated at each location of the probe. The estimator described above was then used to obtain the deformed shapes of the prostate at each location of the probe. We computed the estimation error, defined as the difference between the estimated deformed prostate shape and the true deformed shape obtained by biomechanical simulation:

$$\hat{e}_j = \hat{q}_j - q_j, \qquad 1 \le j \le K \qquad (6)$$

We also computed the reconstruction error for the deformed shapes, defined as the difference between a deformed shape and its best possible reconstruction in the space spanned by the retained principal modes of the deformation:

$$\check{e}_j = \check{q}_j - q_j, \qquad 1 \le j \le K \qquad (7)$$
where $\check{q}_j = \mu + \sum_{i=1}^{M} \check{\alpha}_i x_i$ and the $\check{\alpha}_i$ are obtained by projecting the deformed shape $q_j$ onto the orthogonal principal modes $x_i$. The estimation error can therefore be decomposed into 2 orthogonal components [12]:

$$\hat{e}_j = \check{e}_j + \tilde{e}_j \qquad (8)$$

The reconstruction error $\check{e}_j$ is due to the inability to represent the deformed shape $q_j$ as the sum of the mean and a linear combination of the principal modes of deformation, while the error $\tilde{e}_j$ is due to the inability to estimate the deformed shape perfectly from the 2D information provided by the TRUS images, and to the approximation of equation (3). The maximum estimation error and reconstruction error for each of the simulations are shown in Figure 3. In the worst test case (case number 4), the maximum estimation error was 26.7% of the maximum
Fig. 3. The computed maximum estimation error, reconstruction error, and deformation of the prostate for 5 different simulations of TRUS probe insertion.
However, the reconstruction error accounted for more than 57% of the estimation error in this case. The availability of more training samples obtained from more biomechanical simulations would reduce this error, at the expense of an increased computational burden.
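For concreteness, here is a small sketch of the reconstruction error of equation (7): project a deformed shape onto the retained principal modes and measure what the truncated basis cannot represent. The names `mu` and `modes` are illustrative; they are assumed to come from PCA on the training simulations.

```python
# Sketch: reconstruction error in a truncated PCA mode space, cf. equation (7).
import numpy as np

def reconstruction_error(q, mu, modes):
    """Return q_check - q, with `modes` an (n x M) matrix of orthonormal modes x_i."""
    alpha = modes.T @ (q - mu)      # coefficients by orthogonal projection
    q_check = mu + modes @ alpha    # best reconstruction in the mode space
    return q_check - q
```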
4 Summary and Future Work
We presented an approach that combines biomechanical and statistical modeling for estimating the shape of the prostate as it deforms during TRUS probe insertion. Our approach makes use of a patient-specific biomechanical model constructed from a segmented volumetric scan of the patient's prostate. Since it is virtually impossible to perform biomechanical simulations for every possible value of probe displacement and entry angles, only a small number of biomechanical simulations are used to extract the modes of deformation of the prostate. The coefficients of those modes were then related to the parameters of the biomechanical simulation, namely the insertion angles and insertion depth of the TRUS probe. This enabled the parameterization of the deformed prostate shape in terms of the biomechanical simulation parameters, and therefore provided a means for the combined estimation of a set of deformed prostate shapes given sparse 2D ultrasound images. The framework of [3] can be used to extend the approach presented here to a deformable model of the prostate that includes the modes of deformation as well as the modes of shape. Such a model could be constructed from several subjects instead of using a patient-specific biomechanical model. Another possible extension to
the work presented here is the treatment of the material parameters as another simulation variable related to the modes of deformation, and the estimation of those values as part of the optimization step.
Acknowledgement The work reported in this paper was supported in part by the National Science Foundation under Engineering Research Center grant EEC9731478, and by the National Institutes of Health grant R01NS42645.
References
1. Bharatha, A., Hirose, M., Hata, N., Warfield, S., Ferrant, M., Zou, K.H., Suarez-Santana, E., Ruiz-Alzola, J., D'Amico, A., Cormack, R.A., Kikinis, R., Jolesz, F.A., Tempany, C.M.: Evaluation of Three-Dimensional Finite Element-Based Deformable Registration of Pre- and Intraoperative Prostate Imaging. Medical Physics 28(12) December (2001) 2551–2560
2. Booth, J.T., Zavgorodni, S.F.: Set-up Error and Organ Motion Uncertainty: a Review. Australas. Phys. Eng. Sci. Med., Jun; 22(2) (1999) 29–47
3. Davatzikos, C., Shen, D., Mohamed, A., Kyriacou, S.K.: A Framework for Predictive Modeling of Anatomical Deformations. IEEE Transactions on Medical Imaging 20(8) August (2001) 836–843
4. Fleute, M., Lavallée, S.: Nonrigid 3-D/2-D Registration of Images Using Statistical Models. Lecture Notes in Computer Science, Vol. 1679. Medical Image Computing and Computer Assisted Intervention 1999. Springer-Verlag, Berlin Heidelberg New York (1999) 138–147
5. Ferrant, M., Warfield, S.K., Guttmann, C.R., Mulkern, R.V., Jolesz, F.A., Kikinis, R.: 3D Image Matching Using a Finite Element Based Elastic Deformation Model. Lecture Notes in Computer Science, Vol. 1679. Medical Image Computing and Computer Assisted Intervention 1999. Springer-Verlag, Berlin Heidelberg New York (1999) 202–209
6. Sullivan, J.M., Charron, G., Paulsen, K.D.: A Three-Dimensional Mesh Generator for Arbitrary Multiple Material Domains. Finite Elements in Analysis and Design 25 (1997) 219–241
7. Abaqus version 6.1. Hibbitt, Karlsson, and Sorensen, Inc., USA, 2000.
8. Weaver, J.B., Van Houten, E.E., Miga, M.I., Kennedy, F.E., Paulsen, K.D.: Magnetic Resonance Elastography Using 3D Gradient Echo Measurements of Steady-State Motion. Medical Physics 28(8) August (2001) 1620–1628
9. Jolliffe, I.T.: Principal Component Analysis. Springer-Verlag, Berlin Heidelberg New York (1986)
10. Farin, G.: Curves and Surfaces for Computer Aided Geometric Design. Academic Press Limited, London, UK (1997)
11. Nelder, J.A., and Mead, R.: A Simplex Method for Function Minimization. Computer J. 7 (1965) 308–313
12. Mohamed, A., Kyriacou, S.K., Davatzikos, C.: A Statistical Approach for Estimating Brain Tumor Induced Deformation. Mathematical Models in Biomedical Image Analysis (2001) 52–59
“Gold Standard” 2D/3D Registration of X-Ray to CT and MR Images
Dejan Tomaževič, Boštjan Likar, and Franjo Pernuš
University of Ljubljana, Faculty of Electrical Engineering, Tržaška 25, 1000 Ljubljana, Slovenia
{dejan.tomazevic, bostjan.likar, franjo.pernus}@fe.uni-lj.si
Abstract. Validation of the registration techniques needed for image-guided surgery is an important problem that has received little attention in the literature. In this paper we address the challenging problem of generating a reliable gold standard for evaluating the accuracy of surgical 2D/3D registrations. We have devised a cadaveric lumbar spine phantom with fiducial markers and established highly accurate correspondence between 3D CT and MR images and 18 2D X-ray images. The expected target registration errors are in the order of 0.2 mm for CT to X-ray registration and in the order of 0.3 mm for MR to X-ray registration. As such, the gold standard images, which are available on request from the authors, are useful for testing 2D/3D registration methods in image-guided surgery.
1 Introduction
In image-guided orthopedic surgery, 3D preoperative medical data, such as CT and MRI, are commonly used to plan, simulate, guide, or otherwise assist a surgeon in performing a medical procedure. The plan, specifying how tasks are to be performed during surgery, is developed in the coordinate system of the preoperative images. To monitor and guide a surgical procedure, the preoperative image and plan need to be transformed into physical space, i.e. a patient-related coordinate system. The spatial transformation is obtained by acquiring intraoperative data and registering them to data extracted from the preoperative images [1]. More recent and promising approaches to obtaining the spatial transformation rely on intraoperative X-ray projections acquired with a calibrated X-ray device. The location and orientation of a structure in a 3D CT or MR image with respect to the geometry of the X-ray device is determined by 2D/3D registration [2-7]. A necessary step, required before widespread clinical use of any novel registration technique, is the evaluation and validation of the method. While several researchers have addressed the validation problem in the context of particular methods [2-7], very little formal research has been done in this area. One difficulty in evaluating a registration technique is the need for a highly accurate gold standard. Because it is practically impossible to establish gold standard registration with real patient data, simulated data or phantoms have to be considered. In this paper, we report on the creation of a cadaveric lumbar spine phantom to which fiducial markers were attached. 3D CT and MR and 2D X-ray images were acquired, and an accurate gold standard rigid registration between the 3D and 2D images was established by means of the fiducial markers. The accuracy of the gold standard registration was assessed by the target registration error [8].
2 Phantom Creation
A cadaveric lumbar spine of an 80-year-old female, comprising vertebrae L1-L5 with some soft tissue, was placed into a plastic tube and tied with thin nylon strings (Fig. 1, top-left). The tube was filled with water to simulate soft tissue and, therefore, to obtain more realistic MR, CT, and X-ray images. Six fiducial markers were rigidly attached to the outside of the plastic tube (Fig. 1, bottom-left). Each fiducial marker had two parts: a base that could be screwed to a rigid body, and a replaceable marker. Different markers were used for MR and for CT and X-ray imaging. Markers containing a metal ball (1.5 mm in diameter) were used for CT and X-ray imaging, while markers with a spherical cavity (2 mm in diameter) filled with a water solution of Dotarem contrast agent (Gothia) were used for MR.
Fig. 1. The spine fastened in a plastic tube (top-left), final phantom with fiducial markers attached to the plastic tube (bottom-left), CT image (top-center), MR image (top-right), AP X-ray image (bottom-center), and lateral X-ray image (bottom-right).
3 Image Acquisition
The CT image (Fig. 1, top-center) was obtained with a General Electric HiSpeed CT/i scanner at 100 kV. Axial slices were taken with an intra-slice resolution of 0.27x0.27 mm and 1 mm inter-slice distance. For MR imaging, a Philips Gyroscan NT Intera 1.5 T scanner and a T1 protocol were used (Fig. 1, top-right). Axial slices were obtained with 0.39x0.39 mm intra-slice resolution and 1.9 mm inter-slice distance. The acquired MR image was retrospectively corrected for intensity inhomogeneity by the information minimization method [9]. X-ray images (Fig. 1) were captured by a PIXIUM 4600 (Trixell) digital X-ray detector. The detector had a 429x429 mm active surface, with 0.143x0.143 mm pixel size and 14-bit dynamic range. To simulate C-arm acquisition, the X-ray source and sensor plane were fixed while the spine phantom was rotated on a turntable (Fig. 2, left). In this way, mechanical distortion due to gravitational force and other mechanical imperfections of C-arms were avoided, which resulted in a more precise acquisition. By rotating the spine phantom around its long axis in steps of 20°, 18 X-ray images were acquired. The X-ray images were filtered by a 3x3 median filter and then sub-sampled by a factor of two in order to remove dead-pixel artifacts and to reduce the resolution.
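This preprocessing step is straightforward to reproduce; a minimal sketch, assuming 2-D images stored as NumPy arrays:

```python
# Sketch: 3x3 median filtering followed by subsampling by a factor of two.
from scipy.ndimage import median_filter

def preprocess_xray(image):
    """Suppress dead-pixel artifacts, then halve the resolution."""
    return median_filter(image, size=3)[::2, ::2]
```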
4 Finding Centers of Fiducial Markers
In all 3D and 2D images, a rough position p_m of each fiducial marker was first defined manually. Next, an intensity threshold I_T that separated a marker from the surrounding tissues was selected for each marker. Finally, the center p_c of each marker was defined as:

$p_c = \frac{\sum_{p \in \Omega} (I(p) - I_T)\, p}{\sum_{p \in \Omega} (I(p) - I_T)}$  (1)

where I(p) is the intensity at point p and Ω is a small neighborhood around point p_m. By this method, centers of markers may be found to sub-pixel or sub-voxel accuracy. Let X_MR and X_CT be 3x6 matrices, each containing six 3D vectors representing the centers of the fiducial markers found in MR and CT, respectively:

$X_{MR} = [r_1^{MR}, r_2^{MR}, \ldots, r_6^{MR}], \quad X_{CT} = [r_1^{CT}, r_2^{CT}, \ldots, r_6^{CT}]$  (2)

where $r = (x, y, z)^T$. Similarly, let X_φ be a 2x6 matrix containing six 2D points representing the centers of the markers found in the X-ray image obtained after rotating the phantom by φ degrees (φ = 0°, 20°, ..., 340°):

$X_\varphi = [p_{1\varphi}, p_{2\varphi}, \ldots, p_{6\varphi}]$  (3)

where $p = (x, y)^T$.
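A minimal sketch of the sub-voxel localization of equation (1) in Python follows; the neighborhood radius and the clipping of negative weights (voxels below I_T contribute nothing) are assumptions of this sketch, not statements from the paper.

```python
# Sketch: intensity-weighted centroid of a marker near a rough position p_m.
import numpy as np

def marker_center(image, p_m, threshold, radius=5):
    """Weighted centroid of above-threshold intensities near p_m (3-D image)."""
    lo = np.maximum(np.array(p_m) - radius, 0)
    hi = np.minimum(np.array(p_m) + radius + 1, image.shape)
    region = image[lo[0]:hi[0], lo[1]:hi[1], lo[2]:hi[2]]
    weights = np.clip(region - threshold, 0.0, None)   # (I(p) - I_T), clipped at 0
    coords = np.indices(region.shape).reshape(3, -1)
    w = weights.reshape(-1)
    center_local = (coords * w).sum(axis=1) / w.sum()
    return center_local + lo                            # back to image coordinates
```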
Fig. 2. X-ray image acquisition (left) and reconstruction of 3D marker position (right).
5 X-Ray Setup Calibration
The X-ray setup was calibrated retrospectively using the centers X_φ of the fiducial markers found in the X-ray images and the corresponding centers X_CT of the markers found in the CT volume. Calibration of the acquisition setup (Fig. 2) required the determination of the X-ray projection geometry and of the rotation between the coordinate system of the phantom and the coordinate system of the X-ray system. This involved the determination of 12 geometrical parameters, 3 intrinsic (w_I) and 9 extrinsic (w_E), denoted by the calibration parameter vector $w = (w_I^T, w_E^T)^T$. The intrinsic parameters describe the X-ray projection geometry while the extrinsic parameters describe the rotation
between the coordinate system of the phantom and the coordinate system of the X-ray setup. The intrinsic parameters $w_I = (x_s, y_s, z_s)^T$ define the position of the X-ray source r_s in the coordinate system S_s of the sensor plane and, therefore, define the projection P_S(w_I) of any 3D point described in the sensor coordinate system S_s onto the 2D sensor plane. Nine extrinsic parameters w_E are needed to describe the rotation between the phantom and the X-ray system. Four parameters define the axis of rotation in the coordinate system S_v of the phantom; we have chosen the coordinate system of the CT volume for S_v. The axis of rotation is defined by the point (t_xv, t_yv), which is the intersection of the axis with the x-y coordinate plane of S_v, and by the rotation angles (ω_xv, ω_yv) of the axis around the x and y axes of S_v. Similarly, four parameters (t_xs, t_ys) and (ω_xs, ω_ys) define the same axis of rotation in the coordinate system S_s of the X-ray sensor plane. The additional parameter, needed to determine the relation between S_s and S_v on the rotation axis, is the distance d_vs between the two points of intersection (t_xv, t_yv) and (t_xs, t_ys). The extrinsic parameters $w_E = (t_{xv}, t_{yv}, \omega_{xv}, \omega_{yv}, d_{vs}, t_{xs}, t_{ys}, \omega_{xs}, \omega_{ys})^T$ define the transformation T_VS(φ, w_E) that maps, for a given rotation φ of the phantom, any 3D point in coordinate system S_v to a 3D point in coordinate system S_s:
$T_{VS}(\varphi, w_E) = T_{RS}(t_{xs}, t_{ys}, \omega_{xs}, \omega_{ys}) \cdot T(d_{vs}) \cdot R(\varphi) \cdot T_{VR}(t_{xv}, t_{yv}, \omega_{xv}, \omega_{yv})$  (4)
where T_VR is the transformation from coordinate system S_v to the axis of rotation, R(φ) is the rotation around the rotation axis, T(d_vs) is the translation along the rotation axis, and T_RS is the final transformation to the coordinate system S_s. By combining the projection P_S(w_I) and the transformation between the coordinate systems T_VS(φ, w_E), the projection P_VS(φ, w) of a 3D point defined in the coordinate system S_v to a 2D point lying in the sensor plane of S_s can be obtained for any rotation φ:
$P_{VS}(\varphi, w) = P_S(w_I)\, T_{VS}(\varphi, w_E)$  (5)
To calibrate the X-ray acquisition system, we thus need to determine the 12 geometrical parameters w of the projection P_VS(φ, w). The optimal calibration parameters w are the ones that bring the fiducial markers X_CT in the CT volume into the best correspondence with the corresponding fiducial markers X_φ in the X-ray images. To find the optimal parameters, we project the centers of the fiducial markers X_CT in the CT volume onto the sensor plane and compute the root mean squared (RMS) distance E_calib to the corresponding centers of the fiducial markers X_φ in the X-ray images:
$E_{calib}(w) = \frac{1}{MN} \sum_{\varphi \in \Phi} \sum_{i=1}^{N} \left( p_{i\varphi} - P_{VS}(\varphi, w)\, r_i^{CT} \right)^2$  (6)
where N and M stand for the number of fiducial markers and the number of X-ray images, respectively, and Φ = {φ_1, φ_2, ..., φ_M} defines the X-ray images taken at different phantom rotations. To find the optimal calibration parameters w, we used nine X-ray images, Φ = {0°, 40°, ..., 320°}, and iterative optimization, which resulted in a minimum RMS distance E_calib of 0.31 mm. The small RMS indicates that the calibration was performed well, and it reflects the uncertainty of fiducial marker localization in the CT and X-ray images.
6 Reconstruction of 3D Markers from Calibrated X-Ray Images
Once the X-ray acquisition system was calibrated, the positions of the X-ray fiducial markers in 3D could be reconstructed from the 2D X-ray images. Each point p_iφ, representing the center of the i-th fiducial marker in the X-ray image taken at rotation φ, was back-projected to the X-ray source r_s, which yielded the projection line L_iφ (Fig. 2, right). Line L_iφ defines the perspective projection of a 3D marker to the 2D X-ray plane. The projection line L_iφ can be expressed in the coordinate system S_v of the phantom by mapping the X-ray source r_s to the point c_φ:

$c_\varphi = T_{VS}(\varphi, w_E)^{-1}\, r_s$  (7)
and by expressing the line direction in S_v as:

$v_{i\varphi} = \frac{T_{VS}(\varphi, w_E)^{-1} (p_{i\varphi} - r_s)}{\left\| T_{VS}(\varphi, w_E)^{-1} (p_{i\varphi} - r_s) \right\|}$  (8)
where r_s and p_iφ are points defined in the sensor coordinate system S_s. We reconstructed a 3D marker position from the X-ray images by finding the position of the point r_i^R in the coordinate system S_v that minimizes the RMS distance E_rec from r_i^R to all lines L_iφ. E_rec can be expressed by vector products:
$E_{rec}(r_i^R) = \frac{1}{M} \sum_{\varphi \in \Phi} \left\| (r_i^R - c_\varphi) \times v_{i\varphi} \right\|^2$  (9)
Reconstruction of the 3D positions of the six fiducial markers from the nine X-ray images Φ = {20°, 60°, ..., 340°}, which were not used for calibration, by iterative minimization of E_rec yielded an RMS of less than 0.06 mm for each of the six fiducial markers. The reconstructed fiducial markers from the X-ray images were collected in a 3x6 matrix $X_R = [r_1^R, r_2^R, \ldots, r_6^R]$. By using different sets of X-ray images for reconstruction and calibration, we were able to validate the calibration procedure. The small RMS of 0.06 mm indicated that the uncertainty of fiducial marker localization in the X-ray images was smaller than in the CT images and that the calibration had been performed well. The major source of calibration uncertainty is therefore the uncertainty of fiducial marker localization in CT images; however, its effect on the calibration precision is evidently very small.
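The minimizer of equation (9) has a well-known closed form (the least-squares intersection of a set of lines), so the iterative minimization can also be replaced by solving the normal equations directly; a minimal sketch, assuming unit direction vectors:

```python
# Sketch: least-squares point closest to a set of lines (c_phi, v_phi),
# i.e. the closed-form minimizer of equation (9): sum(I - v v^T)(r - c) = 0.
import numpy as np

def triangulate(line_points, line_dirs):
    """Return the point minimizing summed squared distance to all lines."""
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for c, v in zip(line_points, line_dirs):
        P = np.eye(3) - np.outer(v, v)   # projector orthogonal to the line
        A += P
        b += P @ c
    return np.linalg.solve(A, b)
```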
7 Gold Standard Registration
After calibrating the X-ray acquisition system and reconstructing the 3D markers X_R from the X-ray images, we were able to establish gold standard registration between the X-ray and CT images, and between the X-ray and MR images, in the coordinate system S_v of the phantom. This was achieved by the rigid 3D/3D transformation T that minimizes the RMS distance E_reg between the reconstructed fiducial markers X_R from the X-ray images and the marker points X_CT from CT or X_MR from MR images:
$E_{reg}(T) = \frac{1}{N} \sum_{i=1}^{N} \left\| r_i^R - T\, r_i \right\|^2$  (10)
where r_i stands for the points r_i^CT or r_i^MR. The closed-form solution of this minimal RMS problem is known [8]. The rigid transformation T can be decomposed into the rotation component R and the translation vector t:

$T\, r = R\, r + t$  (11)

The optimal solution for the translation component is given as:
$t = \bar{r}^R - R\, \bar{r}$  (12)

where $\bar{r}^R$ and $\bar{r}$ stand for the mean positions of the point sets X_R and X, respectively, and where the set X can be either X_CT or X_MR. The optimal solution for the rotation component is given as:

$R = B A^T$  (13)
A and B are two orthogonal matrices obtained by singular value decomposition (SVD) of the matrix:
$\bar{X}_R \bar{X}^T = A D B^T$  (14)
where D is a diagonal matrix and $\bar{X}_R$ and $\bar{X}$ are the point sets X_R and X centered at their corresponding mean positions $\bar{r}^R$ and $\bar{r}$, respectively. Rigid registration of the point sets (X_CT, X_R) and (X_MR, X_R) resulted in a minimum RMS distance E_reg of 0.27 mm for CT and 0.44 mm for MR to X-ray registration. The higher RMS for MR than for CT can be attributed to three reasons: first, CT was used in the calibration; second, the intra- and inter-slice resolution of the MR images was lower than that of CT, which resulted in higher fiducial localization uncertainty; and third, MR images suffer from non-rigid spatial distortion.
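A minimal sketch of this closed-form construction (the classical SVD-based rigid registration) follows. The determinant check for reflections is an addition of this sketch; the text above does not discuss it explicitly.

```python
# Sketch: closed-form rigid registration of two 3xN point sets, cf. eqs (10)-(14).
import numpy as np

def rigid_register(X_moving, X_fixed):
    """Return R (3x3) and t such that R @ X_moving + t approximates X_fixed."""
    mu_m = X_moving.mean(axis=1, keepdims=True)
    mu_f = X_fixed.mean(axis=1, keepdims=True)
    H = (X_fixed - mu_f) @ (X_moving - mu_m).T      # cf. X_R X^T = A D B^T
    A, D, Bt = np.linalg.svd(H)
    R = A @ Bt                                       # R = B A^T in the text's notation
    if np.linalg.det(R) < 0:                         # guard against a reflection
        A[:, -1] *= -1
        R = A @ Bt
    t = mu_f - R @ mu_m
    return R, t.ravel()
```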
8 Gold Standard Validation
The minimum RMS distance E_reg is also known as the fiducial registration error (FRE) and can be used to evaluate the accuracy of point-based rigid registration [8]. Knowing the FRE, we can determine the target registration error (TRE), which is the distance between the true but unknown position of the target and the target position obtained by registration. The expected TRE of a target point r can be estimated from the FRE [8]:

$TRE(r)^2 = \frac{FLE^2}{N} \left( 1 + \frac{1}{3} \sum_{k=1}^{3} \frac{d_k^2}{f_k^2} \right)$  (15)

where f_k is the RMS of the projections of the fiducial markers onto the k-th principal axis of the marker configuration, d_k is the projection of the target point r onto principal axis k, N is the number of fiducial markers, and FLE is the fiducial localization error obtained from the FRE:

$FLE^2 = FRE^2\, \frac{N}{N-2}$  (16)
Using the above formulation, we validated the gold standard registration by manually defining eight target points, four per pedicle (Fig. 3), in each of the five vertebrae and computing the mean TRE for each vertebra. The results of the gold standard validation for CT to X-ray and MR to X-ray registration are given in Table 1. The
expected target registration errors for the pedicles are in the order of 0.2 mm for CT to X-ray registration and in the order of 0.3 mm for MR to X-ray registration.
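A minimal sketch of the TRE estimate of equations (15)-(16); the principal axes and per-axis RMS spreads are obtained from the SVD of the centered marker configuration.

```python
# Sketch: expected TRE at a target point from the FRE of the fiducial fit [8].
import numpy as np

def expected_tre(target, fiducials, fre):
    """fiducials: (N, 3) marker positions; target: (3,); fre: scalar (mm)."""
    N = len(fiducials)
    fle2 = fre**2 * N / (N - 2)                      # equation (16)
    centered = fiducials - fiducials.mean(axis=0)
    _, s, Vt = np.linalg.svd(centered, full_matrices=False)
    f = s / np.sqrt(N)                               # per-axis RMS spread f_k
    d = Vt @ (target - fiducials.mean(axis=0))       # target projections d_k
    tre2 = fle2 / N * (1 + np.sum(d**2 / f**2) / 3)  # equation (15)
    return np.sqrt(tre2)
```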
Fig. 3. The position of eight target points (◆) on the pedicle borders.
Table 1. The expected RMS TREs for gold standard registration in mm.

Vertebra   L1     L2     L3     L4     L5
CT         0.2    0.1    0.1    0.1    0.2
MR         0.35   0.25   0.29   0.36   0.42

9 Discussion and Conclusion
We have devised a lumbar spine phantom and obtained and validated a gold standard rigid 2D/3D registration with the aim of testing the performance of methods for 2D/3D registration of X-ray to CT and MR images. The phantom data are composed of CT and MR volumes of the lumbar spine and a set of 18 X-ray projection images. X-ray images were obtained by rotating the phantom in steps of 20° around its principal axis, which mimics intraoperative acquisition with a C-arm. As such, the phantom is useful for testing 2D/3D registration methods devised for intraoperative image-guided surgery. The X-ray acquisition system was calibrated retrospectively by matching the projections of the CT markers with the corresponding markers in the X-ray images. Calibration with CT markers is generally superior to calibration with MR markers because CT offers better resolution and spatial stability. This observation was confirmed experimentally, as CT-based calibration yielded a smaller calibration error E_calib of 0.31 mm, versus 0.47 mm found with MR-based calibration. CT-based calibration of the X-ray image acquisition setup already provides a registration of CT to X-ray images but does not give any indication of the registration accuracy. We therefore reconstructed the 3D positions of the markers from the calibrated 2D X-ray images, which allowed us to perform 3D/3D registration between the reconstructed markers and those found in the CT and MR volumes. The result of such a registration reflects: a) the uncertainty of marker localization in 2D X-ray images, b) the uncertainty of marker localization in 3D CT or MR images, c) the uncertainty of the X-ray acquisition calibration, and d) the uncertainty of marker reconstruction. Altogether, these uncertainties caused the fiducial registration error (FRE) of the 3D/3D registration, which was used to evaluate the target registration error (TRE) of the gold standard CT to X-ray and MR to X-ray registration by the theory developed in [8]. The results in Table 1 indicate that the gold standard registration is highly accurate and therefore useful for testing 2D/3D registration methods. However, it should be stressed that the expected TREs for CT to X-ray gold standard registration may be slightly larger than those presented in Table 1. This is because the same CT markers were used for X-ray system calibration and for CT to X-ray registration, which could have introduced the same bias into the calibration and the registration. Nevertheless, if we assume that the localization errors for CT markers are much smaller than for MR markers, the expected TREs for CT to X-ray gold standard registration should be
close to those given in Table 1, and they are certainly not larger than the TREs for MR to X-ray registration. The gold standard image data are available on request from the authors, who believe they will prove useful for validating newly developed methods on the same data and therefore for comparing different registration methods, especially given the lack of publicly available gold standards for 2D/3D registration.
Acknowledgements. The authors would like to thank Laurent Desbat, Markus Fleute, and Raphael Martin, University Joseph Fourier, Grenoble, France, François Estève of Rayonnement Synchrotron et Recherche Médicale, Grenoble, France, and Uroš Vovk of the University of Ljubljana for their generous help and support in the acquisition of images. This work was supported by the IST-1999-12338 project, funded by the European Commission, and by the Ministry of Education, Science and Sport, Republic of Slovenia.
References
1. R. L. Galloway, "The process and development of image-guided procedures," Annual Rev. Biomed. Eng., vol. 3, pp. 83-108, 2001.
2. S. Lavallée and R. Szeliski, "Recovering the position and orientation of free-form objects from image contours using 3D distance maps," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 17, pp. 378-390, 1995.
3. A. Guéziec, P. Kazanzides, B. Williamson and R. H. Taylor, "Anatomy-based registration of CT-scan and intraoperative X-ray images for guiding a surgical robot," IEEE Transactions on Medical Imaging, vol. 17, pp. 715-728, 1998.
4. L. Lemieux, R. Jagoe, D. R. Fish, N. D. Kitchen, D. G. T. Thomas, "A patient-to-computed-tomography image registration method based on digitally reconstructed radiographs," Medical Physics, vol. 21, pp. 1749-1760, 1994.
5. J. Weese, G. P. Penny, P. Desmedt, T. M. Buzug, D. L. G. Hill, and D. J. Hawkes, "Voxel-based 2-D/3-D registration of fluoroscopy images and CT scans for image-guided surgery," IEEE Transactions on Information Technology in Biomedicine, vol. 1, pp. 284-293, 1997.
6. G. P. Penny, J. Weese, J. A. Little, P. Desmedt, D. L. G. Hill, and D. J. Hawkes, "A comparison of similarity measures for use in 2-D-3-D medical image registration," IEEE Transactions on Medical Imaging, vol. 17, pp. 586-595, 1998.
7. D. LaRose, J. Bayouth, and T. Kanade, "Transgraph: interactive intensity-based 2D/3D registration of X-ray and CT data," Medical Imaging 2000, San Diego, USA, K. M. Hanson (ed), SPIE Press 3979:385-396 (2000).
8. J. M. Fitzpatrick, J. B. West, and C. R. Maurer, "Predicting error in rigid-body point-based registration," IEEE Transactions on Medical Imaging, vol. 17, pp. 694-702, 1998.
9. B. Likar, M. A. Viergever, and F. Pernuš, "Retrospective correction of MR intensity inhomogeneity by information minimization," IEEE Transactions on Medical Imaging, vol. 20, pp. 1398-1410, 2001.
A Novel Image Similarity Measure for Registration of 3-D MR Images and X-Ray Projection Images Torsten Rohlfing and Calvin R. Maurer, Jr. Image Guidance Laboratories, Department of Neurosurgery, Stanford University 300 Pasteur Drive, MC 5327, Room S-012, Stanford, CA 94305-5327, USA {rohlfing,calvin.maurer}@igl.stanford.edu
Abstract. Most existing methods for registration of three-dimensional tomographic images to two-dimensional projection images use simulated projection images and either intensity-based or feature-based image similarity measures. This paper suggests a novel class of similarity measures based on probabilities. We compute intensity distributions along simulated rays through the 3-D image rather than ray sums. Using a finite state machine, we eliminate background voxels from the 3-D image while preserving voxels from air filled cavities and other low-intensity regions that are part of the imaged object (e.g., bone in MRI). The resulting tissue distributions along all rays are compared to the corresponding pixel intensities of the real projection image by means of a probabilistic extension of histogram-based similarity measures such as (normalized) mutual information. Because our method does not compute ray sums, its application, unlike DRR-based methods, is not limited to X-ray CT images. In the present paper, we show the ability of our similarity measure to successfully identify the correct position of an MR image with respect to a set of orthogonal DRRs computed from a co-registered CT image. In an initial evaluation, we demonstrate that the capture range of our similarity measure is approximately 40 mm with an accuracy of approximately 4 mm.
1 Introduction
Most current methods for registering three-dimensional (3-D) tomographic images to two-dimensional (2-D) projection images (e.g., X-ray fluoroscopy, electronic portal images (EPIs) in radiation therapy) make use of digitally reconstructed radiographs (DRRs) computed from CT images. The physical foundations of 3-D CT and 2-D X-ray projection imaging are very similar [1]. Therefore, by casting virtual rays through a CT image, one can compute simulated projection images that resemble actual X-ray images (likewise for EPI) of the same patient in the appropriate pose. These simulated projections are compared to the real projections using standard intensity-based image similarity measures in order to achieve registration of projections and 3-D volume [2,3]. Other approaches use geometrical features, such as edges [4] or point-based landmarks
(either anatomical or artificial) that are back-projected from the 2-D projections into 3-D space and registered using 3-D point-based algorithms [5,6]. Methods based on artificial landmarks (fiducials) are necessarily invasive. Anatomical landmarks, on the other hand, are hard to identify reliably, especially in multimodal images and 2-D projections. Our group has recently introduced a third class of approaches to the registration of 3-D volumetric images and 2-D projections that is based neither on intensities nor on features, but instead on probabilities [7]. Using probabilistic DRRs (pDRRs) and a probabilistic extension of histogram-based image similarity measures, we were able to preserve the spatial information present in volumetric images and use it during registration. An additional advantage is that pDRR computation is not based upon the physical interpretation of voxel intensities as X-ray attenuation coefficients. The method can therefore be applied in a meaningful way to tomographic images other than X-ray CT. In the present paper, we apply our probabilistic similarity measure based on pDRRs to the registration of 3-D MR images to standard DRR projection images computed from CT. Here, the deterministic DRR images serve as a model for real X-ray projection images, but with a highly accurate known pose, thanks to CT-to-MR co-registration. We also describe a method of distinguishing bone in MRI from image background on-the-fly while iterating over the voxel samples along a ray. In summary, this work is, as far as we are aware, the first to introduce a direct way of registering MR images with projection images without any segmentation or other pre-processing.
2 Methods
Probabilistic DRR. For the ray associated with the detector position x_det we define the probabilistic DRR (pDRR) as the distribution P of intensities µ sampled discretely at N uniformly-spaced locations x_i along this ray:

$pDRR(x_{det}, c) = P[\mu(x_i) = c \mid 0 \le i < N]$  (1)
In order to save computation time, the range of samples visited along the ray is restricted to the actual intersection of the ray and the 3-D image. This is achieved by computing the index I_in of the entry point of the ray into the volume and the index I_out of the exit point, which can be done efficiently by solving a system of inequalities, as originally described in an algorithm for 3-D line clipping on viewport boundaries by Liang and Barsky [8]. The probabilistic DRR can thus be equivalently rewritten as

$pDRR(x_{det}, c) = P[\mu(x_i) = c \mid 0 \le I_{in} \le i \le I_{out} < N]$  (2)
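A minimal sketch of the entry/exit computation follows; it uses the slab method, which solves the same system of inequalities as the Liang-Barsky clipper. It assumes samples lie at origin + i * direction for i = 0, ..., N-1, with `direction` the per-sample step.

```python
# Sketch: restrict ray samples to indices I_in..I_out inside the volume.
import numpy as np

def clip_ray(origin, direction, shape, n_samples):
    """Return (I_in, I_out), or None if the ray misses the volume."""
    t0, t1 = 0.0, float(n_samples - 1)
    for axis in range(3):
        d = direction[axis]
        if abs(d) < 1e-12:
            if not (0.0 <= origin[axis] <= shape[axis] - 1):
                return None            # parallel to this slab and outside it
            continue
        ta = (0.0 - origin[axis]) / d
        tb = (shape[axis] - 1 - origin[axis]) / d
        t0, t1 = max(t0, min(ta, tb)), min(t1, max(ta, tb))
        if t0 > t1:
            return None                # the intersection interval is empty
    return int(np.ceil(t0)), int(np.floor(t1))
```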
For a particular pose (position and orientation) of a CT image, we compute a pDRR by generating a histogram of CT intensities along each projection ray. Each pixel in the pDRR image therefore corresponds to a distribution of CT values along the ray that resulted in the projection value at that pixel.
Fig. 1. Registration of MRI and projection image (e.g., fluoroscopy, EPI, DRR) using pDRR and pMI. For each projection pixel, the distribution of MR intensities along the corresponding ray is computed. The histogram is normalized to total mass 1, and added to the row in the 2-D histogram that is indexed by the intensity of the current pixel in the projection image.
In order to avoid interpolation, the original intensities along the ray are entered into the histogram. The proximity of each voxel to the ray is taken into account by adding to the respective histogram bin only a fractional value between 0 and 1, identical to the weight that would otherwise be used for this voxel in the interpolation (partial volume integration [9]). We will apply the same principle later in this paper in order to handle non-scalar, in our particular case probabilistic, data during the computation of histogram-based similarity measures.

Probabilistic Mutual Information. The mutual information (MI) image similarity measure [9] has been used with great success in the registration of 3-D to 3-D images [10] (single or multi modality). Based on our previous experience, we usually apply the normalized mutual information (NMI) [11] image similarity measure, which is derived from MI and appears to be less susceptible to changes in mutual image overlap. Both measures are usually computed from discrete 2-D histograms. A 2-D histogram is a matrix H in which each row corresponds to a range of voxel intensities of one of the two images, and each column corresponds to a range of voxel intensities of the other image. A pair of corresponding voxels under the current coordinate transformation therefore indexes one of the matrix entries. The 2-D histogram defined by two images and a particular transformation is the matrix in which every entry has a value equal to the number of corresponding voxel pairs indexing that entry. In 3-D to 3-D image registration, the voxel intensities of one of the two images (the "floating" or "interpolation" image) need to be determined at the voxel locations of the other image (the "reference" image). Different methods can be used to enter the resulting voxel pairs into the 2-D histogram. The most straightforward techniques involve computing an interpolated intensity value
from the intensities of the eight voxels enclosing the respective location. Let, for example, r be the intensity of a particular reference image voxel and f_i, for i = 0, ..., 7, the intensities of the eight enclosing voxels in the floating image. Then one may increment the histogram bin indexed by r and the interpolated floating voxel intensity f, producing an updated histogram H':

$H'_{r,f} = H_{r,f} + 1 \quad \text{where} \quad f = \sum_i w_i f_i$  (3)
The two most commonly used interpolation schemes, nearest neighbor and trilinear interpolation, are both special cases of this expression, each with a specific way of computing the interpolation coefficients w_i. However, Maes et al. [9] suggest a technique called partial volume integration that completely avoids intensity interpolation. Instead of applying an interpolation scheme such as the one outlined above to the voxel intensities, each of them is entered into the histogram with a weight that is determined by the trilinear interpolation coefficients that would be applied in the particular situation. As the histogram is 2-D, this means that each of the eight values is paired with the single value taken from the other image, and all pairs are entered into the histogram with the respective weights:

$H'_{r,f_i} = H_{r,f_i} + w_i \quad \text{for all } i$  (4)
This behavior can be understood as adding to the matrix H the result of the outer product of two vectors. One of the vectors is the unit column vector d_r^T indexing the r-th row of H, while the second vector is the distribution of weights assigned to the columns of H:

$H' = H + d_r^T \left( \sum_{i=0}^{7} w_i\, d_{f_i} \right)$  (5)
Here and in all following equations we assume that the respective vector dimensions match the number of rows and columns of H. The interpolation weights w_i are all between 0 and 1 with a total sum of 1. They can therefore easily be re-interpreted as probabilities in a distribution of discrete values (see Fig. 1). We refer to the similarity measures MI and NMI computed from the histograms thus generated as probabilistic MI (pMI) and probabilistic NMI (pNMI), respectively.
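To make the partial volume update of equation (4) concrete, here is a minimal sketch; it assumes the floating image already holds histogram bin indices, which is a simplification of this sketch.

```python
# Sketch: partial volume histogram update; each of the eight corner voxels is
# entered with its trilinear weight instead of interpolating an intensity.
import numpy as np

def pv_update(hist, r_bin, floating, pos):
    """Add one sample at continuous position `pos` against reference bin r_bin."""
    base = np.floor(pos).astype(int)
    frac = pos - base
    for corner in range(8):
        offset = np.array([(corner >> k) & 1 for k in range(3)])
        w = np.prod(np.where(offset == 1, frac, 1.0 - frac))  # trilinear weight
        f_bin = int(floating[tuple(base + offset)])
        hist[r_bin, f_bin] += w        # equation (4): H'_{r,f_i} = H_{r,f_i} + w_i
```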
Background and Air vs. Bone Detection. Clinical images usually show the region of interest of the patient's body embedded in air. This is useful to ensure that the image boundaries do not crop the presentation of the patient, which would lead to incorrect computation of projections due to missing data. From an image processing point of view, the object of interest is thus surrounded by more or less extended regions of image background, easily detected by its low pixel intensities. For standard DRR computation, values close to zero have no substantial effect on the result.
Fig. 2. Finite state machine to distinguish image background from air-filled cavities and surface folds. The lower object voxel threshold is denoted by T , the intensity of the next voxel along the ray is denoted by v. The inequalities over each arrow indicate the condition that leads to the respective state transition. The textual description under the arrows is the operation performed upon this transition.
However, when computing the distribution of intensities along a ray, the background pixels do have a substantial impact on the result. On the other hand, one cannot simply ignore all voxels identified as background by an intensity threshold, as this would also remove voxels that represent air-filled cavities inside the patient's body or surface folds. Both obviously carry important information about the shape and distribution of tissues inside the patient. When considering MR images, not abandoning voxels below a certain threshold becomes even more essential, since bony structures, from which most information in X-ray projection images originates, would also be removed by such an operation. Instead of simple thresholding, we have implemented a finite state machine (FSM) to distinguish between air-filled cavities and bone inside the patient, whose voxels are included in the resulting tissue distribution, and image background, whose voxels are discarded. The FSM is illustrated in Fig. 2. Its fundamental principle of operation is to enter voxels encountered along the ray into either the ray histogram ("Hray") or a temporary histogram ("Htemp"), depending on which state the FSM is in. The temporary histogram stores below-threshold voxels, which are moved to the main histogram when the next above-threshold voxel is encountered.
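A minimal sketch of this three-state machine follows, with state names taken from Fig. 2; it assumes integer intensities that serve directly as histogram bin indices, which is a simplification of this sketch.

```python
# Sketch: FSM of Fig. 2. Below-threshold voxels are buffered and committed to
# the ray histogram only if another above-threshold voxel proves they were
# interior (a cavity); leading and trailing background is discarded.
def ray_histogram(intensities, T, n_bins):
    h_ray = [0] * n_bins
    h_temp = [0] * n_bins
    state = "background"
    for v in intensities:
        if v >= T:
            if state == "cavity":                 # buffered voxels were interior
                for b, n in enumerate(h_temp):
                    h_ray[b] += n
                h_temp = [0] * n_bins
            h_ray[v] += 1
            state = "foreground"
        elif state in ("foreground", "cavity"):   # could be cavity or background
            h_temp[v] += 1
            state = "cavity"
        # else: still in leading background, discard v
    return h_ray                                   # trailing h_temp is background
```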
3 Results
We have computed the pNMI image similarity measure between probabilistic DRRs computed from a 3-D MR image and a DRR computed from a co-registered CT image¹. The results are visualized in Fig. 4. For translations of up to 40 mm in either direction along the x, y, and z axes, we found a peak of the similarity measure at the known correct pose (translation in the x and z directions), or at least close to it (within 4 mm in the y direction).

¹ The registration transformation between CT and MRI was computed using an intensity-based algorithm based on NMI [12]. Our algorithm has been validated to achieve better than 1 mm accuracy for CT to MR registration using the Vanderbilt image data [10].
Fig. 3. DRR images (top row) and spatially equivalent MR ray sum images (bottom row), shown for the AP and lateral views. The 3-D CT and MR images were aligned using an intensity-based rigid-body image registration algorithm.
4 Discussion
This paper has presented a novel approach to the registration of 3-D tomographic images with 2-D projections. Our method is based on probabilities rather than intensities or geometric features. We have described an extension and re-interpretation of histogram-based similarity measures that allows us to compute them between probabilistic, non-scalar images. We have also introduced a probabilistic extension of DRR computation that is not based on the physical interpretation of voxel intensities as X-ray attenuation. Therefore, this extension and the subsequent computation of entropy-based similarity measures can be applied to imaging modalities other than CT. In particular, we have demonstrated the capability of our similarity measure to identify the correct pose of an MRI volume with respect to two orthogonal DRR images. It is worth noting that the described procedure of computing pMI (pNMI) from pDRRs is fundamentally equivalent to back-projecting the real projection image into 3-D space and computing standard MI (NMI) between the 3-D image and this back-projection. This observation may provide some justification for our method and explain to some extent how and why it works. In comparison, however, our method avoids problems resulting from the non-orthogonal grid of the back-projected data when working with the common projection geometries. Furthermore, our approach allows for an easy detection of background vs. bone and air-filled cavities along each ray, and the integration of fuzzy-segmented X-ray projections [13] is straightforward. Obviously, the problem of registering MRI to real, especially intraoperative, X-ray projections is substantially harder than registering to DRRs, due to noise, the presence of surgical instruments, and possibly geometrical distortions. We are therefore currently acquiring multi-modal 3-D image data (CT and MRI) and 2-D flat-panel X-ray images of patient anatomy with implanted markers that will provide gold-standard pose information against which to validate our similarity measure.
Fig. 4. Probabilistic normalized mutual information (pNMI) image similarity measure, plotted against translation in mm (curves: translation X, frontal view; translations Y and Z, lateral view). Probabilistic DRRs were computed from MRI for different poses and compared to a single DRR image computed from a co-registered CT image. For translations along the x axis, image similarity was computed from the AP (frontal) projection image, since due to the near-parallel projection geometry there was no sufficient perspective scaling of the lateral projection images.
Acknowledgments TR was supported by the National Science Foundation under Grant No. EIA0104114. We acknowledge support for this research provided by CBYON, Inc., Mountain View, CA.
References
1. G. T. Herman, Image Reconstruction from Projections, Academic Press, 1980.
2. G. P. Penney, P. G. Batchelor, D. L. G. Hill, D. J. Hawkes, and J. Weese, "Validation of a two- to three-dimensional registration algorithm for aligning preoperative CT images and intraoperative fluoroscopy images," Med Phys 28, pp. 1024-1032, June 2001.
3. G. P. Penney, J. Weese, J. A. Little, P. Desmedt, D. L. G. Hill, and D. J. Hawkes, "A comparison of similarity measures for use in 2D-3D medical image registration," IEEE Trans Med Imaging 17, pp. 586-595, Aug. 1998.
4. D. Tomaževič, B. Likar, and F. Pernuš, "Rigid 2D/3D registration of intraoperative digital X-ray images and preoperative CT and MR images," in Medical Imaging: Image Processing, M. Sonka and J. M. Fitzpatrick, eds., vol. 4684 of Proceedings of SPIE, Feb. 2002. In print.
5. M. J. Murphy, J. R. Adler, M. Bodduluri, J. Dooley, K. Forster, J. Hai, Q.-T. Le, G. Luxton, D. Martin, and J. Poen, "Image-guided radiosurgery for the spine and pancreas," Comput Aided Surg 5, pp. 278-288, 2000.
6. J. R. Adler, M. J. Murphy, S. D. Chang, and S. L. Hancock, "Image-guided robotic radiosurgery," Neurosurgery 44, pp. 1299-1307, June 1999.
7. T. Rohlfing, D. B. Russakoff, M. J. Murphy, and C. R. Maurer, Jr., "An intensity-based registration algorithm for probabilistic images and its application for 2-D to 3-D image registration," in Medical Imaging: Image Processing, M. Sonka and J. M. Fitzpatrick, eds., vol. 4684 of Proceedings of SPIE, Feb. 2002. In print.
8. Y.-D. Liang and B. Barsky, "A new concept and method for line clipping," ACM Transactions on Graphics 3, pp. 1-22, Jan. 1984.
9. F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens, "Multimodality image registration by maximisation of mutual information," IEEE Trans Med Imaging 16(2), pp. 187-198, 1997.
10. J. B. West, J. M. Fitzpatrick, M. Y. Wang, B. M. Dawant, C. R. Maurer, Jr., R. M. Kessler, R. J. Maciunas, C. Barillot, D. Lemoine, A. Collignon, F. Maes, P. Suetens, D. Vandermeulen, P. A. van den Elsen, S. Napel, T. S. Sumanaweera, B. Harkness, P. F. Hemler, D. L. G. Hill, D. J. Hawkes, C. Studholme, J. B. A. Maintz, M. A. Viergever, G. Malandain, X. Pennec, M. E. Noz, G. Q. Maguire, Jr., M. Pollack, C. A. Pelizzari, R. A. Robb, D. Hanson, and R. P. Woods, "Comparison and evaluation of retrospective intermodality brain image registration techniques," J Comput Assist Tomogr 21(4), pp. 554-566, 1997.
11. C. Studholme, D. L. G. Hill, and D. J. Hawkes, "An overlap invariant entropy measure of 3D medical image alignment," Pattern Recognit 33(1), pp. 71-86, 1999.
12. T. Rohlfing, J. B. West, J. Beier, T. Liebig, C. A. Taschner, and U.-W. Thomale, "Registration of functional and anatomical MRI: Accuracy assessment and application in navigated neurosurgery," Comput Aided Surg 5(6), pp. 414-425, 2000.
13. D. B. Russakoff, T. Rohlfing, and C. R. Maurer, Jr., "Fuzzy segmentation of fluoroscopy images," in Medical Imaging: Image Processing, M. Sonka and J. M. Fitzpatrick, eds., vol. 4684 of Proceedings of SPIE, Feb. 2002. In print.
Registration of Preoperative CTA and Intraoperative Fluoroscopic Images for Assisting Aortic Stent Grafting
Hiroshi Imamura¹, Noriaki Ida¹, Naozo Sugimoto¹, Shigeru Eiho¹, Shin-ichi Urayama², Katsuya Ueno³, and Kanji Inoue³
¹ Graduate School of Informatics, Kyoto University, Uji-city, Kyoto, Japan 611-0011
{imamura,nrak,sugi,eiho}@image.kuass.kyoto-u.ac.jp
² National Cardiovascular Center Research Institute, Suita-city, Osaka, Japan 565-8565
[email protected]
³ Takeda Hospital, Kyoto-city, Kyoto, Japan 600-8558
Abstract. We investigated a method for registering preoperative 3D-CTA to intraoperative fluoroscopic images during intervention. Our final goal is assisting endovascular stent grafting for aortic aneurysm. In our method, DRRs (Digitally Reconstructed Radiographs) are generated by voxel projection of the 3D-CTA after extracting the aorta region. By increasing/decreasing the CT values in the aorta region of the CTA, DRRs with/without contrast media injection are obtained. We then iteratively calculate matching measures between the DRRs and the fluoroscopic image while changing the imaging parameters, and the DRR most similar to the fluoroscopic image is selected. We investigated the characteristics of several matching measures using simulated fluoroscopic images. Based on the simulation results, we use an M-estimator of the residual in our method. In an application to clinical data, registration using the M-estimator of the residual was successful.
1 Introduction
Endovascular stent grafting is a minimally invasive treatment of aortic aneurysm [1]. Currently, 2D fluoroscopic images are used to visualize the position of the lesion or the interventional device. A disadvantage of fluoroscopic images is the lack of information about the 3D structure of the object. To recover this information, registration of the preoperative 3D CT angiogram (3D-CTA) and the intraoperative 2D fluoroscopic image is useful; 3D-2D registration has therefore been investigated by several groups [2][3][4][5]. As an application to intervention, Penney et al. developed new intensity-based similarity measures, pattern intensity and gradient difference [5], and reported that both measures are robust to soft tissue deformation and to the presence of an interventional device. In their method they use a perspective model to project the 3D CT image onto the 2D fluoroscopic image. Therefore, this method has to search
many imaging parameters (10 parameters), and it has the problems of a long calculation time and a narrow capture range. They reported that the estimation error in the direction perpendicular to the projection plane was significantly bigger than in the other directions. For an intervention such as stent graft placement, a CTA (with intravenous contrast injection) is usually taken as the preoperative data, and fluoroscopic images with/without contrast agent injection as the intraoperative data. Penney et al. did not, however, investigate the influence of contrast agent injection. In this paper we evaluate several intensity-based measures including gradient difference. To reduce the calculation time and obtain a sufficient capture range, we use a parallel projection model to project the 3D CT image onto the 2D fluoroscopic image and reduce the number of imaging parameters to 4. By using parallel projection, it is impossible to estimate the position in the direction perpendicular to the projection plane. However, we consider estimating the rotation angles much more important in aortic stent grafting than estimating the position in the perpendicular direction. We also investigate the influence of contrast agent in both the preoperative and the intraoperative images.
2 Materials and Methods
2.1 Imaging Geometry and Image Specification
Fig. 1 shows the coordinate system used in our method. We assume 4 imaging parameters: position (x, z) and angle (rotation, angulation). For the simulation study described in Sects. 2.3 and 3.1, a CTA covering a wide area (from thorax to abdomen; matrix size: 512x512x313 pixels, voxel size: 0.664x0.664x1.250 mm) is used. A set of preoperative CTA (matrix size: 512x512x153 pixels, voxel size: 0.625x0.625x1.50 mm) and intraoperative fluoroscopic images (matrix size: 450x450 pixels, pixel size: 0.390x0.390 mm) of the same patient is also used in Sects. 2.4 and 3.2. These images were taken for placement of a stent graft for an abdominal aneurysm. Figures 2 and 3 show CTA images (axial, sagittal, and coronal slices), and Fig. 4 shows one of the fluoroscopic images. The rectangular area on the fluoroscopic image is used for the matching process described in the following subsection.
Fig. 1. Imaging geometry. Fig. 2. CTA for simulation. Fig. 3. CTA for clinical application. Fig. 4. Clinical fluoroscopic image.
2.2 Method
Generation of Digitally Reconstructed Radiographs. First, the aorta region is extracted from the CTA using the extraction method we previously developed [6]. Subsequently, by increasing/decreasing the voxel values in the aorta region, two kinds of CT data are produced: one resembles a CT with intra-aortic contrast injection, and the other resembles a CT without contrast injection. Digitally reconstructed radiographs (DRRs) with/without contrast injection are then produced by parallel voxel projection of the CT image with/without the high-contrast aorta. By changing the imaging parameters (rotation and angulation in Fig. 1), many DRRs are produced, and the DRR most similar to a given fluoroscopic image is searched among them.

Detecting Contrast Media Injection in the Fluoroscopic Image. To detect contrast media injection in a fluoroscopic image, we accumulate the pixel values inside the ROI of the fluoroscopic image (Fig. 4). If the sum of the pixel values is larger/smaller than a threshold, we judge the fluoroscopic image to be contrasted/non-contrasted and use the DRR with/without contrast media injection for registration.

Matching Measures

Residual. The residual between the fluoroscopic image (I_fl) and the DRR (I_DRR) is defined as follows. The residual can be calculated easily, but it depends on changes in the brightness and contrast of the image. By using the sequential similarity detection algorithm (SSDA) [7], the calculation time needed to find the minimum value of this measure can be reduced significantly.

$R = \sum_{j=1}^{N} \sum_{i=1}^{M} \left| I_{fl}(i,j) - I_{DRR}(i,j) \right|$
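A minimal sketch of the SSDA idea follows: the accumulation of absolute differences is abandoned as soon as the partial sum exceeds the best residual found so far, so losing candidates are rejected cheaply during the exhaustive parameter search. The chunked accumulation scheme is an assumption of this sketch.

```python
# Sketch: residual with SSDA-style early termination [7].
import numpy as np

def residual_ssda(fluoro, drr, best_so_far=np.inf, chunk=4096):
    """Sum of absolute differences, abandoned early if it exceeds best_so_far."""
    diff = np.abs(fluoro.ravel().astype(np.float64) - drr.ravel())
    total = 0.0
    for start in range(0, diff.size, chunk):
        total += diff[start:start + chunk].sum()
        if total > best_so_far:
            return np.inf               # this candidate DRR cannot win
    return total
```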
M-estimator. Robust estimation is a statistical method that is robust to noise included in only one of the images. In an interventional procedure, instruments such as a stent or catheter appear only in the fluoroscopic image, and there is the possibility that a mismatched pair of images (a contrasted/non-contrasted fluoroscopic image with a non-contrasted/contrasted DRR) is used. The M-estimator is one of the most popular criteria in robust estimation:

$M = \sum_{i,j} \frac{\left| I_{fl}(i,j) - I_{DRR}(i,j) \right|^2}{\sigma^2 + \left| I_{fl}(i,j) - I_{DRR}(i,j) \right|^2} \qquad (\sigma: \text{constant})$
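The measure is simple to implement; a minimal sketch:

```python
# Sketch: M-estimator of the residual. Large residuals (catheter, stent,
# contrast mismatch) saturate toward 1 instead of dominating the sum.
import numpy as np

def m_estimator(fluoro, drr, sigma):
    d2 = (fluoro.astype(np.float64) - drr) ** 2
    return np.sum(d2 / (sigma**2 + d2))
```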
Gradient Difference. Penney et al. [5] presented a similarity measure for 3D-2D registration based on the residual of gradient images: the gradient difference. They show that it is robust to soft tissue deformation, because low-frequency components are already filtered out in the gradient image, and that it is also robust to the presence of linear high-intensity regions such as a stent or a catheter.
G=
Av
i,j
Av + {IdiffV (i, j)}2
+
i,j
Ah Ah + {IdiffH (i, j)}2
dIDRR dIf l −s , Av , Ah : constants, IdiffV (i, j) = di di dIf l dIDRR IdiffH (i, j) = −s dj dj
With the above formula, an SSDA-like fast algorithm cannot be utilized. However, the equation can be transformed as follows:

$G = \sum_{i,j} \left( 1 - \frac{\{I_{diffV}(i,j)\}^2}{A_v + \{I_{diffV}(i,j)\}^2} \right) + \sum_{i,j} \left( 1 - \frac{\{I_{diffH}(i,j)\}^2}{A_h + \{I_{diffH}(i,j)\}^2} \right) = \sum_{i,j} 2 - \left( \sum_{i,j} \frac{\{I_{diffV}(i,j)\}^2}{A_v + \{I_{diffV}(i,j)\}^2} + \sum_{i,j} \frac{\{I_{diffH}(i,j)\}^2}{A_h + \{I_{diffH}(i,j)\}^2} \right)$

Maximizing G gives the same result as minimizing the second term in the above formula. The second term is a distance measure, so an SSDA-like fast algorithm can be used for minimizing it. In this paper we minimize the second term, which corresponds to an M-estimator of the residual of the gradient images. We also investigated three further measures: the residual of the gradient image, the correlation coefficient, and mutual information. We do not show results for these measures in this paper, because the results for the residual of the gradient image, cross correlation, and mutual information resembled those for the gradient difference, the residual, and the M-estimator of the residual, respectively.

Optimization. We used multi-resolution analysis to search for the optimal imaging parameters. In this paper, we used data at three resolutions.

2.3 Simulation Study
To investigate the characteristics of the matching measures, a simulation study was performed. In this study, simulated fluoroscopic images were produced in almost the same way as the DRRs (Sect. 2.2); however, perspective projection was used instead of parallel projection, and S-shaped and pincushion distortions were added (Fig. 5). An artificial high-intensity line was then added as a simulated catheter. Figure 6 shows a final simulated fluoroscopic image. To investigate the influence of rotation and angulation, we produced images from three imaging orientations (anterior, rotated, and angulated). In each case, both thoracic and abdominal images, with and without contrast agent, were generated; as a result, 12 fluoroscopic images were used in our study. Using these images and the DRRs in Fig. 7, we investigated the characteristics of the matching measures described in Sect. 2.2. In this study we determined A_v and A_h for calculating the gradient difference in the same way as described by Penney et al. [5].
Fig. 5. Added geometric distortion to simulated fluoroscopy.
Fig. 6. Simulated fluoroscopic images (left: contrasted, right: non-contrasted). Fig. 7. Generated DRRs with/without contrast agent injection.
Table 1. Average and standard deviation of the estimation error. cFL/ncFL: contrasted/non-contrasted fluoroscopic image; cDRR/ncDRR: contrasted/non-contrasted DRR.

Measure                   Pair of images   Rotation [deg]    Angulation [deg]   X [mm]            Z [mm]
residual                  cFL-cDRR         -0.67 ± 0.94      0.00 ± 1.16        0.45 ± 0.99       0.67 ± 1.89
                          cFL-ncDRR        0.67 ± 8.72       -6.33 ± 7.90       -13.79 ± 23.01    -1.33 ± 12.70
                          ncFL-cDRR        7.00 ± 9.77       3.67 ± 6.70        -13.12 ± 21.86    25.79 ± 63.12
                          ncFL-ncDRR       0.33 ± 0.75       0.67 ± 2.49        4.00 ± 3.75       0.00 ± 2.23
M-estimator of residual   cFL-cDRR         -0.33 ± 0.82      0.33 ± 2.34        0.45 ± 1.38       -0.45 ± 2.85
                          cFL-ncDRR        -1.00 ± 5.33      -1.67 ± 9.59       -5.34 ± 6.64      12.23 ± 15.75
                          ncFL-cDRR        -11.33 ± 10.63    -1.67 ± 7.31       -30.02 ± 56.91    53.36 ± 34.58
                          ncFL-ncDRR       -0.33 ± 1.97      1.00 ± 2.45        -0.22 ± 0.50      -2.89 ± 4.75
gradient difference       cFL-cDRR         -2.00 ± 0.00      -1.00 ± 1.67       5.67 ± 0.69       0.00 ± 3.13
                          cFL-ncDRR        -8.00 ± 8.67      5.67 ± 7.94        -71.15 ± 12.56    40.24 ± 51.99
                          ncFL-cDRR        -3.00 ± 3.52      -4.33 ± 7.74       -27.57 ± 39.87    9.34 ± 42.67
                          ncFL-ncDRR       1.00 ± 6.42       3.67 ± 8.34        -10.89 ± 21.48    -4.89 ± 5.30
−10.89 ± 21.48 −4.89 ± 5.30
Application to Clinical Data
We tested our algorithm on contrasted and non-contrasted clinical fluoroscopic images. First, we produced DRRs at three resolutions. We then detected contrast media injection in the fluoroscopic image and registered it against the DRRs, starting at the low resolution. For the lowest- and medium-resolution DRRs we used cross correlation as the matching measure; for the highest-resolution data we used the M-estimator of the residual.
3 Results

3.1 Simulation Study
To examine the accuracy of parameter estimation, we calculated the average and standard deviation of the estimation error for each parameter (Table 1). The distributions of the matching measures around the peak point were calculated one-dimensionally along a chosen variable. The residual, the M-estimator of residual, and the gradient difference are shown in Fig. 8 (a), (b), and (c), respectively. In these figures, one imaging parameter is varied while the other parameters are fixed at their values at the peak point.
Fig. 8. One-dimensional profiles of the matching measure distributions around the peak point: (a) residual; (b) M-estimator of residual; (c) gradient difference (second term).
Fig. 9. Results of application to clinical data: (a) contrasted; (b) non-contrasted.
3.2 Application to Clinical Data
Experimental results of registering clinical fluoroscopic images with DRRs are shown in Fig. 9 (a) and (b). In each figure, the left image is the original fluoroscopic image and the right image is the DRR rendered with the estimated imaging parameters.
4 Discussion
Table 1 shows, regarding the influence of contrast injection, that the pair of contrasted fluoroscopic image and contrasted DRR is best, that the non-contrasted fluoroscopic image with non-contrasted DRR is second best, and that the remaining pairs are poor, their averages and standard deviations of estimation error being much larger. The standard deviation for the gradient difference is much larger than for the other matching measures in the non-contrasted fluoroscopic image with non-contrasted DRR case (ncFL-ncDRR): when the edge of the catheter in the non-contrasted fluoroscopic image matches the edge of a rib in the non-contrasted DRR, an incorrect DRR is selected. An example of such a case is shown in Fig. 10, where the edge of a rib and of a catheter are indicated inside the white ellipse regions of the left and right images. The simulation study shows that the residual of gradient, the M-estimator of residual, the gradient difference, and mutual information have sufficiently sharp peaks, but that they also have several local optima. On the other hand, the residual and cross correlation have broad peaks around the ground truth. We therefore apply cross correlation to the low-resolution data and the M-estimator of residual to the high-resolution data. The clinical application suggests that appropriate imaging parameters are estimated.
5 Conclusion
In this paper, we investigated a registration method between preoperative 3D CT angiography (3D-CTA) and intraoperative fluoroscopic images (with/without contrast injection) for assisting endovascular stent grafting. In particular, we examined the influence of the contrast agent in both the preoperative and intraoperative images. Simulation results and application to clinical data show that the M-estimator of residual is a suitable matching measure.
Fig. 10. Incorrectly selected differentiated DRR (Left, Middle) and differentiated simulated fluoroscopic image (Right).
Acknowledgements. This research was partially supported by a Grant-in-Aid for Scientific Research (C)(2) (No. 13680935) from the Japan Society for the Promotion of Science (JSPS).
References
1. K. Inoue, H. Hosokawa, T. Iwase, M. Sato, Y. Yoshida, K. Ueno et al.: Aortic Arch Reconstruction by Transluminally Placed Endovascular Branched Stent Graft. Circulation 100 (1999) 316–321.
2. S. Lavallée and R. Szeliski: Recovering the Position and Orientation of Free-form Objects from Image Contours Using 3-D Distance Maps. IEEE Trans. PAMI 17 (1995) 378–390.
3. A. Guéziec, P. Kazanzides, B. Williamson, and R. H. Taylor: Anatomy Based Registration of CT-scan and X-ray Images for Guiding a Surgical Robot. IEEE Trans. Med. Imag. 17 (1998) 715–728.
4. L. Zöllei, E. Grimson, A. Norbash, W. Wells: 2D-3D Rigid Registration of X-Ray Fluoroscopy and CT Images Using Mutual Information and Sparsely Sampled Histogram Estimators. IEEE CVPR (2001).
5. G. P. Penney, J. Weese, J. A. Little, P. Desmedt, D. L. G. Hill, and D. J. Hawkes: A Comparison of Similarity Measures for Use in 2-D-3-D Medical Image Registration. IEEE Trans. Med. Imag. 17 (1998) 586–595.
6. H. Imamura, N. Sugimoto, S. Eiho, S. Urayama, K. Ueno, K. Inoue: Extraction and Quantitative Analysis of Aneurysmal Aorta for Aiding Endovascular Stent Grafting. IEICE J84-D-II (2001) 2468–2476.
7. D. I. Barnea and H. F. Silverman: A Class of Algorithms for Fast Digital Image Registration. IEEE Trans. Comput. C-21 (1972) 179–186.
Preoperative Analysis of Optimal Imaging Orientation in Fluoroscopy for Voxel-Based 2-D/3-D Registration
Yoshikazu Nakajima1, Yuichi Tamura2, Yoshinobu Sato1, Takahito Tashiro1, Nobuhiko Sugano3, Kazuo Yonenobu4, Hideki Yoshikawa3, Takahiro Ochi2, and Shinichi Tamura1
1 Division of Interdisciplinary Image Analysis, 2 Division of Computer Integrated Orthopaedic Surgery, 3 Department of Orthopaedic Surgery, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
4 Osaka Minami National Hospital, Kawachinagano, Osaka 586-8521, Japan
Abstract. We have developed a system for the 3-D localization of anatomical structures without the need for surgical exposure by using multiple-view fluoroscopy images. In this paper, we describe the system and evaluate its application to the estimation of optimal imaging orientations in fluoroscopy. For positional measurement, a voxel-based 2-D/3-D registration technique was employed. Since the measurement condition depends on the object shape, the spatial distribution of X-ray absorption, and the overlap of organs or structures (which differs at each imaging position), determining the optimal combination of fluoroscopy orientations is significant. We propose a system for preoperative determination of the optimal imaging orientation by using the accuracy estimation of stereo localization from single-plane localization results. In an experiment, the computation time needed was 10 hours, about 1/14 of the time required for a full search of imaging orientation combinations.
1 Introduction
Two-dimensional (2-D)/three-dimensional (3-D) registration between an intraoperative fluoroscopy image and a preoperative 3-D CT image [1] [2] is an effective means of organ localization without surgical exposure. Early research in 2-D/3-D registration for medical applications was based on the contour-surface matching technique [3] [4] [5]. The contour-based method is, however, not stable with respect to false contours, and exact extraction of bone edges is not always feasible because of material heterogeneity and overlap of peripheral organs. On the other hand, voxel-based registration methods, which use digitally reconstructed radiographs (DRRs) generated from a 3-D CT image, are generally robust compared with the contour-based method [6] [7] [8] [9] [10]. For these reasons, we have employed a voxel-based method [6] for our purpose of vertebra localization.
For accurate pose measurement, registration from two stereo images has been proposed [3] [11]. The object shape, distribution of X-ray absorption, and overlap of peripheral organs or structures — which change along with the imaging orientation — cause spatial heterogeneity of the pose measurement accuracy. Therefore, to improve the localization accuracy, it is significant to optimize the imaging orientations. Some approaches to this have been reported [12] [13] [14] [15], but optimal view analysis for registration has not hitherto been proposed. Here, we discuss determination of the optimal imaging orientation in fluoroscopy for voxel-based 2-D/3-D registration.
2 Methods
2.1 Voxel-Based 2-D/3-D Registration
The algorithm for voxel-based 2-D/3-D registration [6] [7] consists of three major components: 2-D image (DRR) generation from a 3-D CT image, similarity evaluation between a DRR and a fluoroscopy image, and optimization. In the DRR generation process, segmentation of the anatomical structure of interest in the pre-operative 3-D CT image and pseudo projections to generate a DRR (Fig. 1 (a)) from the segmented CT volume are performed. In the similarity evaluation, the gradient correlation is employed for single-plane pose measurement, which is given by

GC = \frac{1}{2}\,\frac{\sum_{(i,j)\in T_{\partial i}} F_i D_i}{\sqrt{\sum_{(i,j)\in T_{\partial i}} F_i^2 \;\sum_{(i,j)\in T_{\partial i}} D_i^2}} \;+\; \frac{1}{2}\,\frac{\sum_{(i,j)\in T_{\partial j}} F_j D_j}{\sqrt{\sum_{(i,j)\in T_{\partial j}} F_j^2 \;\sum_{(i,j)\in T_{\partial j}} D_j^2}},   (1)

where

F_i = \frac{\partial I_{fl}(i,j)}{\partial i} - \overline{\frac{\partial I_{fl}}{\partial i}}, \quad F_j = \frac{\partial I_{fl}(i,j)}{\partial j} - \overline{\frac{\partial I_{fl}}{\partial j}}, \quad D_i = \frac{\partial I_{DRR}(i,j)}{\partial i} - \overline{\frac{\partial I_{DRR}}{\partial i}}, \quad D_j = \frac{\partial I_{DRR}(i,j)}{\partial j} - \overline{\frac{\partial I_{DRR}}{\partial j}},

and I_{fl} and I_{DRR} are the pixel intensities of the fluoroscopy image and DRR, respectively. ∂I_{fl}/∂i, ∂I_{fl}/∂j, ∂I_{DRR}/∂i, and ∂I_{DRR}/∂j are the pixel intensities of the horizontal and vertical gradient images of the fluoroscopy image and DRR, respectively. For stereo imaging measurement, the evaluation function — which is the sum of the gradient correlations of both imaging positions — is used. Estimation of the CT volume pose is realized by an optimization approach based on the Powell method [6] [7]. In our experiments, an anatomical coordinate system of the vertebra of interest (Fig. 2 (b)) [16] is determined by manually specifying surface points on the backface and topface (Fig. 2 (a)) in the 3-D CT image.
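As a concrete reading of Equation (1), the numpy sketch below computes the gradient correlation as the average normalized cross correlation of the horizontal and vertical gradient images. The choice of gradient operator and the omission of the template masks T_{∂i} and T_{∂j} are our simplifications.

```python
import numpy as np

def gradient_correlation(i_fl: np.ndarray, i_drr: np.ndarray) -> float:
    """Mean NCC of the horizontal and vertical gradient images (cf. Eq. (1))."""
    def ncc(a, b):
        a = a - a.mean()  # mean subtraction corresponds to F and D above
        b = b - b.mean()
        return float((a * b).sum() / (np.sqrt((a * a).sum() * (b * b).sum()) + 1e-12))
    fi, fj = np.gradient(i_fl.astype(np.float64))   # gradients of the fluoroscopy image
    di, dj = np.gradient(i_drr.astype(np.float64))  # gradients of the DRR
    return 0.5 * ncc(fi, di) + 0.5 * ncc(fj, dj)
```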
2.2 Estimation of Optimal Imaging Orientation
We propose a method of estimating the optimal imaging orientation in fluoroscopy for stereo 2-D/3-D registration. Since the cost of error computation for 2-D/3-D registration is relatively high, a method of estimating the stereo localization accuracy from single-plane localization results is employed. An overview of the system is shown in Fig. 3. From a CT volume and position parameters, an
Fig. 1. Digitally reconstructed radiographs (DRRs). (a) DRR from segmented CT image of L1 vertebra. (b),(c),(d) DRR from original CT image. The imaging angles are 0 deg, 45 deg, and 90 deg, respectively. Overlaps of vertebrae, ribs and soft tissues change along with the imaging angle.
Fig. 2. Anatomical coordinate system of a vertebra. (a) Sampling points on the vertebra surface. A (set of four white points) determines the origin and the backface. B (set of gray points) determines the topface. (b) Anatomical coordinate system determined by the origin, backface, and topface.
X-ray fluoroscopy image is generated for the computer simulation. The CT volume of the vertebra of interest is segmented in the original CT volume. Initial estimates of the position parameters, which are (t_x, t_y, t_z) for translation and (θ_x, θ_y, θ_z) for rotation, are made by adding random noise to the true position parameters. Then, 2-D/3-D registration is performed. By changing the position parameters of the CT volume relative to the imaging system, an error profile of single-plane localization is computed. Errors of translation and rotation are described as a covariance matrix, and the calculated covariance matrix is transformed into the local coordinate system of the anatomical structure. The transformed matrix \tilde{C}_{single\,plane} in the local coordinate system of the anatomical structure is given by

\tilde{C}_{single\,plane} = (M^T M)^{-1} M^T \, C_{single\,plane} \, ((M^T M)^{-1} M^T)^T,   (2)

where C_{single\,plane} is the covariance matrix of rotation and translation errors, and M is the Jacobian of the transformation between the local coordinate system of the target anatomical structure and the coordinate system of the imaging plane.
The error of stereo localization, determined by using a combination method of error distributions [17], is given by

C_{stereo} = \tilde{C}_{single\,plane2} \, (\tilde{C}_{single\,plane1} + \tilde{C}_{single\,plane2})^{-1} \, \tilde{C}_{single\,plane1},   (3)

where C_{stereo} is the covariance matrix of the stereo measurement and \tilde{C}_{single\,plane1} and \tilde{C}_{single\,plane2} are the covariance matrices of the single-plane measurements. Then, to project the error onto the local coordinate axes of the anatomical structure for clinical evaluation, it is approximately evaluated by fitting a 3-D Gaussian function G(x, y, z; σ_x, σ_y, σ_z) whose axes correspond to the local coordinate system of the anatomical structure. When the number of imaging orientations for the optimal orientation analysis of single-plane localization is n, the number of imaging orientation combinations for the optimal orientation analysis of stereo localization is \frac{1}{2}n(n+1). Since the proposed method only requires the accuracy of single-plane localization, the number of imaging orientations for accuracy analysis is n and the computation time of an iteration is half that required for stereo localization. In the case of 15-degree resolution (13 positions) of imaging orientation and 50 iterations at each orientation, the computation time needed for the proposed method was 10 hours (with a Pentium Xeon 1.7 GHz, 2 CPUs, and 2 GB memory), which is 1/14 that required for a full search over imaging orientation pairs.
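A minimal sketch of Equations (2) and (3) follows; the Jacobian M and the covariance matrices are assumed to be given as numpy arrays.

```python
import numpy as np

def to_local(c_single: np.ndarray, m: np.ndarray) -> np.ndarray:
    """Equation (2): transform a covariance into the anatomical coordinate system."""
    p = np.linalg.inv(m.T @ m) @ m.T   # left pseudo-inverse of the Jacobian M
    return p @ c_single @ p.T

def stereo_covariance(c1: np.ndarray, c2: np.ndarray) -> np.ndarray:
    """Equation (3): combine two single-plane error covariances."""
    return c2 @ np.linalg.inv(c1 + c2) @ c1

# A full stereo search needs n*(n+1)/2 orientation pairs, while the estimation
# above needs only the n single-plane error profiles:
n = 13                       # 15-degree resolution, as in the paper
print(n * (n + 1) // 2)      # 91 pairs
```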
Fig. 3. Accuracy estimation of stereo localization.
3 Experiment
3.1 Effects of Overlap
We assessed the effects of organ or other structure overlap. Since our 2-D/3-D registration method registers a segmented CT volume of the vertebra of interest,
overlap of upper and lower vertebrae, ribs, and soft tissues causes pose estimation error. The CT data used in the experiment sections were a set of abdominal images acquired for clinical diagnosis of intestinal disease. The FOV, slice thickness, and matrix size were 360 mm, 2.0 mm, and 512×512×128 pixels, respectively. The images included the lower thoracic and lumbar spine (from T10 to L4). In the system, the CT volume was segmented into the vertebra of interest (the L1 vertebra), the other vertebrae, the ribs, and soft tissues. By controlling their visibility, we evaluated the effect of organ and other structure overlap. The experimental conditions are shown in Table 1. In the experiment, the matrix size of the generated 2-D images was 128 × 128 pixels. Let σ_error0, σ_error1, and σ_error2 be the standard deviations of the error in conditions 0, 1, and 2, respectively, and σ_effect0, σ_effect1, and σ_effect2 the overlap effects of each organ or structure. The overlap effects are given by
σ_{effect0} = σ_{error0},   (4)

σ_{effect1} = \sqrt{σ_{error1}^2 - σ_{error0}^2},   (5)

σ_{effect2} = \sqrt{σ_{error2}^2 - σ_{error1}^2},   (6)
respectively.

Table 1. Experimental Conditions for Error Analysis of Organ or Other Structure Overlap

Segmented part         Condition 0  Condition 1  Condition 2
L1 vertebra            visible      visible      visible
vertebrae              invisible    visible      visible
ribs and soft tissues  invisible    invisible    visible
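Under the reading of Equations (4)-(6) above, in which the error sources add in quadrature, the overlap effect of each added structure can be isolated as in this small sketch:

```python
import numpy as np

def overlap_effects(sigma_error0: float, sigma_error1: float, sigma_error2: float):
    """Equations (4)-(6): per-structure error contributions, in quadrature."""
    effect0 = sigma_error0                                # L1 vertebra alone
    effect1 = np.sqrt(sigma_error1**2 - sigma_error0**2)  # added upper/lower vertebrae
    effect2 = np.sqrt(sigma_error2**2 - sigma_error1**2)  # added ribs and soft tissues
    return effect0, effect1, effect2
```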
The results are shown in Fig. 4. The error of localization of the L1 vertebra was 0.43 ± 0.45 degree and 1.27 ± 0.70 mm. The error caused by the overlap of the upper and lower vertebrae was 0.76 ± 0.29 degree and 1.15 ± 0.60 mm. The error caused by the overlap of ribs and soft tissues was 0.01 ± 0.11 degree and 0.11 ± 0.38 mm, which was smaller than the other errors. In a clinical CT image for spine surgery, the imaging volume is limited to the area around the surgical site and does not include all of the ribs and soft tissues. The above results showed the feasibility of preoperative estimation of the optimal imaging orientation using a segmented CT image of vertebrae in clinical use.
3.2 Accuracy Computation of Single-Plane Localization
The localization accuracy of the single-plane measurements was validated. The results are shown in Fig. 5. In the figure, the optimal orientation was ±60 degrees of y-axis rotation. For x-axis rotation, tilting of less than ±15 degrees might be acceptable; the results for over ±15 degrees of tilt were affected by the overlap of the upper/lower vertebrae.
Fig. 4. Effects of tissue overlap. (a) Rotation error. (b) Translation error.
Fig. 5. Localization accuracy of single-plane measurement. The horizontal axis shows the imaging angle of the fluoroscope in the vertebra coordinate system. The vertical axis shows the root mean square (RMS) errors of the localized pose of the target vertebra. (a) Rotation around the y-axis. (b) Rotation around the x-axis.
3.3 Estimation of Optimal Imaging Orientations for Stereo Fluoroscopy
The optimal imaging orientations for stereo fluoroscopy were estimated. The accuracy estimation results are shown in Fig. 6. Panels (a) and (c) respectively depict the rotational errors of the estimation using single-plane localization accuracy as described in Section 2.2, and the error measurements obtained by using a full-search simulation. Panels (b) and (d) respectively depict the translational errors of the estimation and the full-search error measurements. Similar error tendencies were observed in both sets of results. With respect to the rotation accuracy, the optimal pairs of imaging points were {0, 60} in the estimation and {0, 75} in the full-search simulation. Errors of {0, 60} in the estimation were 0.53 and 0.54 mm, respectively, while the error of {0, 75} in the full-search simulation was 0.44 mm. In the case of the translation accuracy, the optimal imaging points were {0, 90} in both results. In situations where the fluoroscopy geometry was restricted, the preoperative analysis of the optimal orientation was effective. For example, when the geometry of fluoroscopy was restricted to ±30 degrees, the optimal combination of imaging orientations was {0, 30} and not {-30, 30}.
Fig. 6. Results of estimated accuracy of stereo measurement. Horizontal axes show the imaging positions of each fluoroscopy. Vertical axes show the root mean square (RMS) errors of the localized pose of the target vertebra. (a) Estimated rotation error. (b) Estimated translation error. (c) Simulated rotation error. (d) Simulated translation error.
4 Discussion and Conclusions
In this paper, a method of estimating the optimal imaging orientation in fluoroscopy for voxel-based 2-D/3-D registration is proposed. The optimal imaging orientation pairs estimated experimentally were {0, 60} for rotation and {0, 90} for translation. Using optimal view estimation of stereo imaging from the registration accuracy of single-plane imaging, the system computed the optimal orientation in about 10 hours, a computation cost 14 times lower than that required by the full-search method. The method was validated for optimal view determination of stereo localization from the accuracy of single-plane localization. In this method, the stereo localization accuracy is estimated by using pseudo X-ray images (DRRs). Although this is convenient for preoperative analysis, DRR and X-ray fluoroscopy images have different pixel intensities, and their spatial resolutions also differ. These problems are discussed in [9]. As our next step, we intend to validate the method with respect to the limitations of its application in clinical use. The results reported here were good enough to estimate optimal imaging orientations, but were not adequate for estimating registration accuracy.
In the future, we will take up the challenge of evaluating the effect of imaging parameters (the CT slice thickness, etc.) and integrating error components to estimate the registration accuracy and evaluate the optimal imaging parameters. Acknowledgement: This work was partly supported by the Japan Society for the Promotion of Science (JSPS) Research for the Future Program JSPSRFTF99I00903 and the JSPS Grant-in-Aid for Scientific Research (Encouragement of Young Scientists (B) 14780281).
References
1. A. Hamadeh and P. Cinquin: "Kinematic Study of Lumbar Spine Using Functional Radiographies and 3D/2D Registration", CVRMed-MRCAS '97, pp. 109-118, 1997.
2. K. Takayanagi, K. Takahashi, M. Yamagata, H. Moriya, H. Kitahara, T. Tamaki: "Using Cineradiography for Continuous Dynamic-Motion Analysis of the Lumbar Spine", SPINE, 26(17), pp. 1858-1865, 2001.
3. S. Lavallée, R. Szeliski: "Recovering the Position and Orientation of Free-Form Objects from Image Contours Using 3D Distance Maps", IEEE Trans. on PAMI, 17(4), pp. 378-390, 1995.
4. S.A. Banks, W.A. Hodge: "Accurate Measurement of Three-Dimensional Knee Replacement Kinematics Using Single-Plane Fluoroscopy", IEEE Trans. on Biomedical Engineering, 43(6), pp. 638-649, 1996.
5. S. Zuffi, A. Leardini, F. Catani, S. Fantozzi, A. Cappello: "A Model-Based Method for Reconstruction of Total Knee Replacement Kinematics", IEEE Trans. on Medical Imaging, 18(10), 1999.
6. J. Weese, P. Penney, P. Desmedt, T.M. Buzug, D.L.G. Hill, D.J. Hawkes: "Voxel-Based 2-D/3-D Registration of Fluoroscopy Images and CT Scans for Image-Guided Surgery", IEEE Trans. on Information Technology in Biomedicine, 1(4), pp. 284-293, 1997.
7. G.P. Penney, J. Weese, J.A. Little, P. Desmedt, D.L.G. Hill, D.J. Hawkes: "A Comparison of Similarity Measures for Use in 2-D-3-D Medical Image Registration", IEEE Trans. on Medical Imaging, 17(4), pp. 586-595, 1998.
8. A. Guéziec, P. Kazanzides, B. Williamson, R.H. Taylor: "Anatomy-Based Registration of CT-Scan and Intraoperative X-Ray Images for Guiding a Surgical Robot", IEEE Trans. on Medical Imaging, 17(5), pp. 715-728, 1998.
9. P. Penney: "Registration of Tomographic Images to X-ray Projections for Use in Image Guided Interventions", PhD thesis, University of London, 1999.
10. L. Zöllei: "2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images", PhD thesis, Massachusetts Institute of Technology, 2001.
11. B. You, P. Siy, W. Anderst, and S. Tashman: "In vivo Measurement of 3-D Skeletal Kinematics from Sequences of Biplane Radiographs: Application to Knee Kinematics", IEEE Trans. on Medical Imaging, 20(6), pp. 514-525, 2001.
12. A.C.M. Dumay, J.H.C. Reiber, and J.J. Gerbrands: "Determination of Optimal Angiographic Viewing Angles: Basic Principles and Evaluation Study", IEEE Trans. on Medical Imaging, 13(1), pp. 13-24, 1994.
13. Y. Sato, T. Araki, M. Hanayama, H. Naito, S. Tamura: "A Viewpoint Determination System for Stenosis Diagnosis and Quantification in Coronary Angiographic Image Acquisition", IEEE Trans. on Medical Imaging, 17(1), pp. 121-137, 1998.
14. D.L. Wilson, D.D. Royston, J.A. Noble, J.V. Byrne: "Determining X-ray Projections for Coil Treatments of Intracranial Aneurysms", IEEE Trans. on Medical Imaging, 18(10), pp. 973-980, 1999.
15. A.S. Talukdar and D.L. Wilson: "Modeling and Optimization of Rotational C-Arm Stereoscopic X-ray Angiography", IEEE Trans. on Medical Imaging, 18(7), pp. 604-616, 1999.
16. M.M. Panjabi, T. Tanaka, V. Goel, D. Federico, T. Oxland, J. Duranceau, and M. Krag: "Thoracic Human Vertebrae (Quantitative Three-Dimensional Anatomy)", SPINE, 16(8), pp. 888-901, 1991.
17. W. Hoff and T. Vincent: "Analysis of Head Pose Accuracy in Augmented Reality", IEEE Trans. on Visualization and Computer Graphics, 6(4), pp. 1-15, 2000.
A New Similarity Measure for Nonrigid Volume Registration Using Known Joint Distribution of Target Tissue: Application to Dynamic CT Data of the Liver
Jun Masumoto1, Yoshinobu Sato1, Masatoshi Hori2, Takamichi Murakami2, Takeshi Johkoh2, Hironobu Nakamura2, and Shinichi Tamura1
1 Division of Interdisciplinary Image Analysis, 2 Department of Radiology, Osaka University Graduate School of Medicine, Suita, Osaka 565-0871, Japan
Abstract. A new similarity measure for volume registration is proposed, which uses the assumption that the joint distribution of a target tissue is known. This similarity measure is designed so that it can deal with the tissue slide that occurs at boundaries between the target tissue and other tissues. Pre-segmentation of the target tissue is unnecessary. We intend to apply the proposed measure to registering volumes acquired at different time-phases in dynamic CT scans of the liver using contrast materials. In order to derive the similarity measure, we first formulate the ideal case where the joint distributions of all the tissues are known, after which we derive the measure for a realistic case where only the joint distribution of the target tissue is known. We applied the proposed measure experimentally to eight dynamic CT data sets of the liver. After describing a practical method for estimating the joint distribution of the liver from real CT data, we show that the problem of tissue slide is effectively dealt with using the proposed measure.
1 Introduction
Dynamic contrast-enhanced CT scans are an effective means of disease diagnosis and surgical planning for the liver. In a dynamic CT study, several CT volumes are typically acquired at different time-phases, not in a single breath-hold. Hence, these volumes are not guaranteed to be registered between different time-phases due to respiratory motion. Their registration by post-processing is highly desirable on account of the following advantages: (1) Accurate correlation between different time-phase images can be performed. (2) In 3D rendering of the liver, portal/hepatic veins and tumors, which are enhanced at different phases, can be registered more accurately. (3) Time-density curves can be estimated at every voxel, which should eventually permit automatic cancer characterization [1]. In this paper, we address the problem of nonrigid registration between volumes acquired at different time-phases of dynamic CT scans of the liver. An important issue in registration of the liver is tissue slide, which occurs along
boundaries between the liver and other tissues, resulting in discontinuities in the 3D vector field describing the nonrigid deformation [2] [3]. Previous attempts to deal with this problem have required pre-segmentation of the liver region [4] or specification of places where tissue slide may occur before registration [5]. However, because segmentation of the liver from CT data is a far from easy task [6], the ability to employ direct registration between raw CT volumes without segmentation is desirable in the clinical environment. As a means of dealing with tissue slide without pre-segmentation, we propose a new similarity measure for volume registration. In dynamic CT, the tissue contrast during scans at different time-phases changes differently depending on the particular tissue involved. Thus, unlike a cross-correlation measure, a new similarity measure should also be able to cope with differences in contrast between volumes to be registered. Although mutual information [7] (or the entropy correlation coefficient: ECC [8]) is known to be useful as a similarity measure in such a case [9] [10], we instead employ the following assumption: "The joint distribution of a target tissue is known." The use of the known joint distribution was originally suggested by Leventon et al. [11]. The main difference between their method and ours is that we utilize the known joint distribution of only the target tissue while they use that of the entire volume. Thus, our method tries to register only the target tissue, for example, the liver, but ignores non-target tissues. By taking this approach, we effectively cope with tissue slide, which is inevitable in registration of the abdominal domain.
2 Theory

2.1 Ideal Case: Assuming Joint Distributions of All Tissues Are Known
We consider the joint distribution P_o(I, J) of two volumes whose intensity values are represented by I and J, respectively. These two volumes are assumed to be correctly registered. If we assume that the volume consists of the tissue set Γ = {γ_1, γ_2, ..., γ_n} and the joint conditional distribution for each tissue is known, P_o(I, J) can be decomposed into

P_o(I, J) = \sum_{\gamma\in\Gamma} P(I, J, \gamma) = \sum_{\gamma\in\Gamma} P(I, J|\gamma) \cdot P(\gamma),   (1)

where \sum_{\gamma\in\Gamma} P(\gamma) = 1. In this case, an optimal similarity measure, B(X), should be maximum when X = P_o(I, J) is satisfied. Here, we introduce a concept that we call "exclusive". We define P(I, J|γ_i) as being "exclusive" if P(I, J|γ_i) satisfies the following conditions for all γ_i (1 ≤ i ≤ n):

\forall (I_0, J_0) \in \{(I_0, J_0) \,|\, P(I_0, J_0|\gamma_i) \neq 0\}: \quad \sum_{\gamma_k\in\Gamma,\, k\neq i} \sum_J P(I_0, J|\gamma_k) = 0 \;\wedge\; \sum_{\gamma_k\in\Gamma,\, k\neq i} \sum_I P(I, J_0|\gamma_k) = 0.   (2)

Using the "exclusive" condition, B(X) can be decomposed into
B(X) = \sum_{\gamma\in\Gamma} W_\gamma \cdot T_\gamma(X) = W_{\gamma_1} T_{\gamma_1}(X) + W_{\gamma_2} T_{\gamma_2}(X) + \cdots + W_{\gamma_n} T_{\gamma_n}(X),   (3)

where T_{γ_i}(X) is maximum when γ_i is correctly registered (that is, when X = P(I, J|γ_i)), and W_{γ_i} is its weight coefficient satisfying \sum_{\gamma\in\Gamma} W_\gamma = 1. By substituting P_o for X in B(X), we have

B(P_o) = \sum_{\gamma\in\Gamma} W_\gamma \cdot T_\gamma(P_o) = W_{\gamma_1} T_{\gamma_1}(P_o) + \cdots + W_{\gamma_i} T_{\gamma_i}(P_o) + \cdots + W_{\gamma_n} T_{\gamma_n}(P_o).   (4)

Using the exclusive condition, T_{γ_i}(P_o) is described as

T_{\gamma_i}(P_o) = T_{\gamma_i}(P(I, J|\gamma_1) \cdot P(\gamma_1)) + T_{\gamma_i}(P(I, J|\gamma_2) \cdot P(\gamma_2)) + \cdots + T_{\gamma_i}(P(I, J|\gamma_i) \cdot P(\gamma_i)) + \cdots + T_{\gamma_i}(P(I, J|\gamma_n) \cdot P(\gamma_n)) = 0 + 0 + \cdots + P(\gamma_i) \cdot T_{\gamma_i}(P(I, J|\gamma_i)) + \cdots + 0 = P(\gamma_i) \cdot T_{\gamma_i}(P(I, J|\gamma_i)).   (5)

Therefore, T_{γ_i}(X) should be maximum when X = P(I, J|γ_i).

2.2 Realistic Case: Assuming Joint Distribution of One Target Tissue Is Known
Here, we consider a more realistic case. We assume that our target for registration is only liver tissue. Let the tissue set Γ consist of only two tissues, liver (L) and others (O), where O represents all tissues except the liver. When the occurrence probability of liver is P(L) = α, that of the others is P(O) = 1 − α. Using Equation (1), we therefore have

P_r(I, J) = \alpha \cdot P(I, J|L) + (1 - \alpha) \cdot P(I, J|O).   (6)

As a practical supposition, we assume that the joint conditional distribution of liver tissue, P(I, J|L), is known, while P(I, J|O) is unknown. By assuming that P(I, J|L) and P(I, J|O) are exclusive, we have

B(X) = W_L \cdot T_L(X) + W_O \cdot T_O(X).   (7)

By substituting P_r for X,

B(P_r) = W_L \cdot T_L(P_r) + W_O \cdot T_O(P_r) = \alpha W_L \cdot T_L(P(I, J|L)) + (1 - \alpha) W_L \cdot T_L(P(I, J|O)) + \alpha W_O \cdot T_O(P(I, J|L)) + (1 - \alpha) W_O \cdot T_O(P(I, J|O)).   (8)
Since P(I, J|L) and P(I, J|O) are exclusive, T_L(X) and T_O(X) should satisfy the following conditions:
1. T_L(X) is zero when X = P(I, J|O).
2. T_O(X) is zero when X = P(I, J|L).
3. T_L(X) is maximum when X = P(I, J|L).
4. T_O(X) is maximum when X = P(I, J|O).
It should be noted here that P(I, J|O) is unknown. Thus, condition 4 above should be satisfied for any possible P(I, J|O). Our aim is to derive a similarity measure satisfying the above four conditions.

2.3 Derivation of Similarity Measure for the Realistic Case
We assume that P(I, J|L) is well-approximated by the Gaussian function given by

P(I, J|L) = \frac{1}{2\pi|\Sigma|} \, e^{-\frac{1}{2}\left((I,J) - \overline{(I,J)}\right)\Sigma^{-1}\left((I,J) - \overline{(I,J)}\right)^T},   (9)

where \overline{(I,J)} and Σ are the average values and covariance matrix, respectively. In order to obtain an approximated similarity measure satisfying the above four conditions, we use

T_L(X) = \sum_{I,J} \{F_L(I, J) \cdot X(I, J)\}, \qquad T_O(X) = \sum_{I,J} \{F_O(I, J) \cdot X(I, J)\},   (10)

where

F_L(I, J) = e^{-\frac{1}{2}\left((I,J) - \overline{(I,J)}\right)\Sigma^{-1}\left((I,J) - \overline{(I,J)}\right)^T},   (11)

F_O(I, J) = 1 - \max\left(e^{-\frac{(I - \bar{I})^2}{2\sigma_I}}, \; e^{-\frac{(J - \bar{J})^2}{2\sigma_J}}\right),   (12)

in which \bar{I} and \bar{J} are the average values of the projections of P(I, J|L) onto the I-axis and J-axis, and σ_I and σ_J are their variances. Finally, we have the similarity measure B(X) given by

B(X) = \beta \cdot T_L(X) + (1 - \beta) \cdot T_O(X).   (13)
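The numpy sketch below evaluates Equations (10)-(13) on a joint histogram X(I, J) of the two volumes; the histogram binning and the handling of intensity coordinates are simplifying assumptions of this sketch.

```python
import numpy as np

def similarity(joint_hist, mean, cov, sigma_i, sigma_j, beta=0.5):
    """B(X) = beta*T_L(X) + (1-beta)*T_O(X), Eqs. (10)-(13)."""
    n_i, n_j = joint_hist.shape
    I, J = np.meshgrid(np.arange(n_i), np.arange(n_j), indexing="ij")
    d = np.stack([I - mean[0], J - mean[1]], axis=-1).astype(np.float64)
    maha = np.einsum("...k,kl,...l->...", d, np.linalg.inv(cov), d)  # squared Mahalanobis distance
    f_l = np.exp(-0.5 * maha)                                        # Equation (11)
    f_o = 1.0 - np.maximum(np.exp(-(I - mean[0]) ** 2 / (2 * sigma_i)),
                           np.exp(-(J - mean[1]) ** 2 / (2 * sigma_j)))  # Equation (12)
    t_l = (f_l * joint_hist).sum()                                   # Equation (10)
    t_o = (f_o * joint_hist).sum()
    return beta * t_l + (1 - beta) * t_o                             # Equation (13)
```

F_L rewards histogram mass that falls on the known liver distribution, while F_O rewards mass that avoids the liver's marginal intensity ranges, which is what allows non-target tissues to be ignored.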
3 Experiments

3.1 Method for Estimating Joint Distribution of the Liver
We have assumed that the joint distribution of a target tissue is known. To apply the theory described in the previous section, a practical method of estimating the
Fig. 1. Method for estimating the joint distribution of the liver. (a) Volume of interest (VOI) used for the estimation. (b) Estimated F_L(I, J) (Equation (11)). (c) Estimated F_O(I, J) (Equation (12)).
joint distribution of a target tissue from two unregistered volumes is necessary. The field of view (FOV) for abdominal CT scans is usually set based on the spine position. We set the volume of interest (VOI) so that it would be mostly occupied by liver tissue (Fig. 1(a)). The position of the VOI could be fixed for each patient since the position of the liver relative to the spine was not greatly different in each case. We estimated the averages \overline{(I,J)} and covariance matrix Σ of the joint probability distribution P(I, J|L) of Equation (9) by analyzing the joint histogram inside the VOI of the two volumes. \overline{(I,J)} and Σ were estimated from the histogram region whose center was the mode of the joint histogram and whose horizontal and vertical widths were three times the full width half maximum (FWHM) values of the 1D histograms projected onto the I- and J-axes, respectively. Although the two volumes were not registered at this point, it still gave a good approximation. Figure 1 shows an example of the above estimation.
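A minimal sketch of this estimation follows: build the joint histogram inside the VOI, find its mode, and compute the weighted mean and covariance over a window scaled by the FWHM of the projected 1-D histograms. The bin count and the exact window scaling are assumptions of this sketch.

```python
import numpy as np

def estimate_joint_distribution(vol1_voi, vol2_voi, bins=128):
    hist, ei, ej = np.histogram2d(vol1_voi.ravel(), vol2_voi.ravel(), bins=bins)
    ci, cj = 0.5 * (ei[:-1] + ei[1:]), 0.5 * (ej[:-1] + ej[1:])  # bin centres
    mi, mj = np.unravel_index(np.argmax(hist), hist.shape)        # histogram mode

    def fwhm_bins(h):  # full width at half maximum, in bins
        above = np.flatnonzero(h >= 0.5 * h.max())
        return max(above[-1] - above[0], 1)

    wi, wj = 3 * fwhm_bins(hist.sum(1)), 3 * fwhm_bins(hist.sum(0))
    si = slice(max(mi - wi, 0), mi + wi + 1)
    sj = slice(max(mj - wj, 0), mj + wj + 1)
    w = hist[si, sj]                                              # weights: counts near the mode
    II, JJ = np.meshgrid(ci[si], cj[sj], indexing="ij")
    mean = np.array([np.average(II, weights=w), np.average(JJ, weights=w)])
    d = np.stack([II - mean[0], JJ - mean[1]], axis=-1)
    cov = np.einsum("ij,ijk,ijl->kl", w, d, d) / w.sum()          # weighted covariance
    return mean, cov
```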
3.2 Registration Method
Nonrigid volume registration methods are typically comprised of three steps: definition of the similarity measure, representation of the deformation, and maximization of the defined similarity measure. With respect to the latter two steps, we employed an existing nonrigid registration method using free-form deformation by a hierarchical B-spline grid proposed by Rueckert et al. [2]. The hierarchical grid consisted of three levels: 42 mm, 21 mm, and 10.5 mm. We embedded the proposed similarity measure and the entropy correlation coefficient (ECC), which is essentially equivalent to normalized mutual information [8], into the registration method, and compared these two different similarity measures. The parameter value employed in Equation (13) was β = 0.5.
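For reference, the ECC used for comparison can be computed from a joint histogram as in the sketch below; the binning and the use of the natural logarithm are assumptions of this sketch.

```python
import numpy as np

def ecc(joint_hist: np.ndarray) -> float:
    """Entropy correlation coefficient: 2*MI / (H(I) + H(J))."""
    p = joint_hist / joint_hist.sum()
    pi, pj = p.sum(axis=1), p.sum(axis=0)  # marginal distributions

    def entropy(q):
        q = q[q > 0]
        return float(-(q * np.log(q)).sum())

    h_i, h_j, h_ij = entropy(pi), entropy(pj), entropy(p.ravel())
    mi = h_i + h_j - h_ij                  # mutual information
    return 2.0 * mi / (h_i + h_j)
```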
3.3 CT Data Sets
Eight data sets of dynamic CT scans of the liver acquired at Osaka University Hospital and the National Cancer Center were used for performance evaluation.
Fig. 2. Illustrative examples of registration results. (a) Left: ECC (which is equivalent to normalized mutual information). Right: Proposed similarity measure. (b) Left: ECC. Right: Proposed similarity measure.

Table 1. Summary of evaluation results. The quality of the registration results is ranked into five groups based on the visually observed discrepancy: A (discrepancy 0–2 mm), B (2–4 mm), C (4–6 mm), D (6–8 mm), E (8– mm).

Case #  Thickness (Interval)  FOV          Phase 1         Phase 2  Initial  Proposed  ECC
1       2.5 mm (1.25 mm)      34 × 34 cm²  early arterial  portal   E        A         C
2       2.5 mm (1.25 mm)      34 × 34 cm²  early arterial  portal   C        A         B
3       2.5 mm (1.25 mm)      34 × 34 cm²  early arterial  portal   E        A         C
4       2.5 mm (1.25 mm)      34 × 34 cm²  early arterial  portal   C        A         A
5       2.0 mm (1.0 mm)       28 × 28 cm²  pre-contrast    portal   D        A         C
6       2.0 mm (1.0 mm)       32 × 32 cm²  pre-contrast    portal   C        A         B
7       2.0 mm (1.0 mm)       32 × 32 cm²  pre-contrast    portal   B        A         A
8       2.0 mm (1.0 mm)       32 × 32 cm²  pre-contrast    portal   E        E         E
Results
Table 1 summarizes the evaluation results for the eight data sets. In the evaluation, we classified the quality of the registration results into five groups (see caption of Table 1) based on visually observed discrepancy throughout the volumes. Registration error was reduced from the initial states in the both proposed and ECC measures, but the proposed similarity measure was more effective. Fig-
A New Similarity Measure for Nonrigid Volume Registration
499
ure 2 shows illustrative examples of comparisons between the proposed measure and ECC. Two volumes are displayed using the checker-board method. In Fig. 2(a), tissue slide between the liver and the gallbladder is evident. Using the proposed method, the boundaries of the liver are continuous in the checker-board display, which means they are well-registered, whereas the boundaries are not well-registered in the two volumes using ECC. In Fig. 2(b), the ribs are wellregistered but the liver is not using ECC. In this case, even though the ribs and liver are in close proximity, their motions were largely dissimilar and there is discontinuity in the deformation field between them. The liver is well-registered using the proposed measure since it tracked only liver tissue. However, it should be noted that the boundary of the ribs is not well-registered since the joint distribution of the ribs (bone tissue) differs from the known distribution.
4 Discussion
For application to dynamic CT data of the liver, the proposed measure showed better results than ECC. The reason is considered to be that the new method can more effectively deal with tissue slide. Using the proposed measure, the registration process does not try to register the entire volume but only those regions having the known joint distribution. It simply ignores non-target tissues. Consequently, it is not affected by discontinuities in the deformation field that occur at boundaries of two tissues. In fact, the rib boundary (Fig. 2(b)) was not well-registered using the proposed method, but this is not considered disadvantageous because the aim is to register only the target (i.e. liver) tissue. The proposed similarity measure assumes that the joint distribution of the target tissue is known. One problem is how this should be estimated. The method using histogram analysis of the fixed VOI, explained above in section 3.1, was quite effective so long as the relative position of the target tissue in the FOV was roughly determined. We applied the method to CT data sets acquired at two hospitals and confirmed that the liver region inside the VOI was more than 50% of the entire VOI. The estimation was successful with all the data sets used in our experiments. It should be noted that the boundaries of the target tissue are appropriately registered based on the proposed similarity measure, though the information provided on intensity patterns may be insufficient for registering the inner part of the tissue. The deformation field is considered to be estimated mostly based on B-spline interpolation in the inner part. One approach to addressing this problem would be to use a biomechanically appropriate interpolation method rather than B-splines.
5 Conclusion
We have proposed a novel similarity measure for volume registration when the joint distribution of a target tissue is known. Application of the proposed measure to dynamic CT data sets of the liver confirmed that it could effectively deal with
tissue slide without the need for any pre-segmentation or manual interaction. We further showed a method for estimating a good approximation of the joint distribution of the target tissue from two unregistered volumes. The proposed measure works well for registering the boundaries of the target tissue, while the registration of the inner part of the tissue is estimated mostly based on B-spline interpolation. Future problems include quantitative evaluation of the proposed similarity measure and developing a post-processing method able to register the inner part of a tissue by taking intensity patterns into account.
Acknowledgements This work was partly supported by JSPS Research for the Future Program JSPSRFTF99I00903 and JSPS Grant-in-Aid for Scientific Research (B)(2) 12558033.
References
1. Andres Carrillo, Jeffrey L. Duerk, Jonathan S. Lewin, and David L. Wilson: Semiautomatic 3-D Image Registration as Applied to Interventional MRI Liver Cancer Treatment. IEEE Trans. Med. Imaging, 19(3):175-185, 2000.
2. D. Rueckert, L.I. Sonoda, C. Hayes, D.L.G. Hill, M.O. Leach, D.J. Hawkes: Nonrigid Registration Using Free-Form Deformations: Application to Breast MR Images. IEEE Trans. Med. Imaging, 18(8):712-721, 1999.
3. H. Lester and S.R. Arridge: A survey of hierarchical non-linear medical image registration. Pattern Recognition, 32:71-86, 1999.
4. Mei Chen, Takeo Kanade, Dean Pomerleau, Jeff Schneider: 3D Deformable Registration of Medical Images Using a Statistical Atlas. Lecture Notes in Computer Science, 1679 (MICCAI'99): 621-630, 1999.
5. Yongmei Wang, Lawrence H. Staib: Physical model-based non-rigid registration incorporating statistical shape information. Medical Image Analysis, 4:7-20, 2000.
6. Andrea Schenk, Guido Prause, and Heinz-Otto Peitgen: Efficient Semiautomatic Segmentation of 3D Objects in Medical Images. Lecture Notes in Computer Science, 1935 (MICCAI 2000): 186-195, 2000.
7. William M. Wells III, Paul Viola, Hideki Atsumi, Shin Nakajima and Ron Kikinis: Multi-modal volume registration by maximization of mutual information. Medical Image Analysis, 1(1):35-51, 1996.
8. Josien P.W. Pluim, J.B. Antoine Maintz, and Max A. Viergever: Interpolation Artifacts in Mutual Information-Based Image Registration. Computer Vision and Image Understanding, 77:211-232, 2000.
9. Mark Holden, Derek L.G. Hill, Erika R.E. Denton, Jo M. Jarosz, Tim C.S. Cox, Torsten Rohlfing, Joanne Goodey, David J. Hawkes: Voxel Similarity Measures for 3-D Serial MR Brain Image Registration. IEEE Trans. Med. Imaging, 19(2):94-102, 2000.
10. Alexis Roche, Grégoire Malandain, Nicholas Ayache, and Sylvain Prima: Towards a Better Comprehension of Similarity Measures Used in Medical Image Registration. Lecture Notes in Computer Science, 1679 (MICCAI'99): 555-566, 1999.
11. Michael E. Leventon and W. Eric L. Grimson: Multi-Modal Volume Registration Using Joint Intensity Distributions. Lecture Notes in Computer Science, 1496 (MICCAI'98): 1057-1066, 1998.
2D-3D Intensity Based Registration of DSA and MRA – A Comparison of Similarity Measures
John H. Hipwell1, Graeme P. Penney1, Tim C. Cox2, James V. Byrne3, and David J. Hawkes1
1 Division of Radiological Sciences, UMDS, Guy's & St Thomas' Hospitals, London SE1 9RT, UK {[email protected]}
2 National Hospital for Neurology and Neurosurgery, Department of Radiology, Queens Square, London, WC1N 3BG, UK
3 Department of Radiology, University of Oxford, The Radcliffe Infirmary, Oxford, OX2 6HE, UK
Abstract. We have compared the performance of six similarity measures for registration of three-dimensional (3D) magnetic resonance angiography (MRA) to two-dimensional (2D) x-ray angiography images of the cerebral vasculature. The accuracy and robustness of each measure was investigated using a ground truth registration of a neuro-vascular phantom which was obtained using fiducial markers, and using “gold-standard” registrations of four clinical data sets calculated using manual alignment by a neuro-radiologist. Of the six similarity measures, pattern intensity, gradient difference and gradient correlation performed consistently accurately and robustly for all data sets. Using these similarity measures, and for starting positions within 8◦ rotation, 3mm in-plane translation and 50mm out-of-plane translation from the gold-standard/ground-truth positions, we obtained a success rate of greater than 80% for the clinical data sets, whilst none of the phantom registrations failed. The root-mean-square (rms) target reprojection error was less than 1.3mm for the clinical data sets. The rms target reprojection error for the phantom images was less than 1mm when using the most accurate similarity measures.
1 Introduction
Registration of interventional digital subtraction angiography (DSA) to pre-operative magnetic resonance angiograms (MRA) can greatly enhance visualisation during minimally invasive neuro-interventions and introduces potentially useful complementary information such as three-dimensional (3D) blood flow. Whilst there have been a number of papers describing 2D-3D registrations of MRA and x-ray images, these studies have tended to favour a feature based approach in which, for instance, 2D and 3D vascular skeletons are extracted and matched using a suitable distance metric [3,4,5]. In this paper we apply the intensity-based registration of Penney et al. [6] to the registration of MRA and DSA of the cerebral vasculature. In order to determine the most appropriate similarity measure for this new application, we compare the performance of six measures when applied to the registration of both a physical phantom and routinely acquired clinical images.
2 The Registration Algorithm

Six rigid-body extrinsic parameters describe the position (X, Y, Z) and orientation (θx, θy, θz) of the 3D data set. These parameters are iteratively varied, and digitally reconstructed radiographs (DRRs) are generated and compared to the DSA image using a suitable similarity measure. DRRs of the vasculature are produced by casting rays through the segmented 3D data set, from the x-ray source to each pixel location in the DSA image. As each ray passes through a spherical volume of interest (VOI) selected by the user, the intensities of the intercepted voxels are integrated and projected onto the imaging plane to produce a DRR. A gradient descent search strategy [6] is used to search the extrinsic parameter space to optimise the similarity measure. To reduce processing time and improve the robustness of the algorithm, a multi-resolution strategy has been adopted and a pair of concentric regions of interest (ROI) are specified. The smaller of the two ROIs is used to obtain an initial approximate match between the images and the larger is used to refine this registration. The radii of these ROIs are set to a quarter and a half of the projected VOI radius. We have compared the performance of normalised cross correlation (CC), gradient correlation (Grad. CC), entropy of the difference image (Entropy), mutual information (Mut. Info.), pattern intensity (Pat. Int.), and gradient difference (Grad. Diff.) when used to quantify the similarity of the DRR and DSA images. Please refer to [6] for an overview of these measures.
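The concentric ROI construction can be sketched as below; the circular geometry and pixel-grid handling are assumptions of this illustration.

```python
import numpy as np

def roi_masks(shape, centre, voi_radius_px):
    """Concentric circular ROIs with radii 1/4 and 1/2 of the projected VOI radius."""
    yy, xx = np.mgrid[:shape[0], :shape[1]]
    r = np.hypot(yy - centre[0], xx - centre[1])   # distance to projected VOI centre
    inner = r <= 0.25 * voi_radius_px              # initial approximate match
    outer = r <= 0.50 * voi_radius_px              # refinement stage
    return inner, outer
```

The similarity measure is then evaluated only over the masked pixels, first with the inner mask and subsequently with the outer one.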
Fig. 1. The neuro-vascular phantom. Left: surface rendering of a thresholded CT scan of the phantom with the positions of the eight fiducial markers clearly visible. Middle: the maximum opacity image from the DSA sequence showing the concentric ROI masks used. Right: DRR corresponding to a registration with 0.5mm target reprojection error.
3 Phantom Experiment
In order to assess the accuracy of the algorithm we have applied it to images acquired from a physical silicone neuro-vascular aneurysm phantom (middle cerebral artery bifurcation aneurysm, figure 1). This phantom was made by Professor D. Rufenacht and Dr. K. Tokunaga of the Division of Neuro-Radiology, Geneva University Hospital, Switzerland. It was mounted in a perspex box and filled with a 15% (by weight) aqueous solution of gelatin to give realistic x-ray attenuation and scatter.
Fig. 2. Bottom: DRRs of the gold standard registrations of the clinical data sets (left to right, patients 1 to 4). Top: The maximum opacity images of the DSA sequence (inverted) to which the DRRs are registered showing the concentric ROI masks used.
DSA images were acquired using an Advantx DX (GE Medical Systems) x-ray set (two views). The PAL composite video output from the x-ray set was digitised via a Pulsar frame capture card (Matrox Imaging) and saved via a PC workstation as 512 × 512 pixel matrix, 8 bit grey-level images. The x-ray tube voltage was 85 kV and the phantom was placed in the isocentre of the x-ray system. Two views were acquired with the C-arm orientated at approximately ±45◦ to the vertical. A distortion-correction phantom and software were used to correct for pincushion distortion in the fluoroscopy images. Phase-contrast MR angiography of the phantom was performed using a GE Medical Systems Signa Echospeed 1.5T. The acquired image contained 256 × 256 × 124 voxels, each with dimensions 0.86×0.86×1.0 mm. Blood vessels were segmented as described in [1] to produce a binary image (the 3D phantom model). The intrinsic perspective parameters of the “ground truth” registration were calculated from images of a 60mm acrylic calibration cube in which 14 radio-opaque ball-bearings are embedded at each of the vertices and at the centers of each face. The extrinsic rigid-body parameters were calculated using eight fiducial markers attached to the perspex box containing the phantom. The fiducial markers consisted of a post to which two different types of acrylic imaging caps could be attached. The MR imaging cap contained a void which was filled with contrast fluid (0.5 mM Gadolinium). The x-ray imaging cap contained a divot which contained a 3mm diameter steel ball bearing. The caps have been accurately manufactured so that the centre of the ball coincides with the centre-of-gravity of the contrast fluid.
4 Clinical Validation
We obtained clinical MRA and DSA images from three patients undergoing treatment for cerebral aneurysms and one patient with an arteriovenous malformation (AVM).
Table 1. Displacements of the test registration starting positions from the ground truth (phantom) or gold standard (clinical data) registrations, in terms of the extrinsic parameters X, θx, θy and θz. Also given are the mean target reprojection errors for these starting positions.

Start Position  δX       δθx   δθy   δθz   No. of Reg'ns  Mean Reproj. Error (mm)
1               ±25 mm   ±4°   ±4°   ±4°   16             2.4
2               ±50 mm   ±8°   ±8°   ±8°   16             4.7
3               ±75 mm   ±12°  ±12°  ±12°  16             6.9
4               ±100 mm  ±16°  ±16°  ±16°  16             9.1
Digital subtraction angiograms were obtained for all patients (two views per patient) using a GE Medical Systems Advantx DX x-ray set. These images were acquired using a Matrox Meteor II frame grabber, captured at half second intervals and saved as 512×512 pixel matrix, 8 bit grey-level images. A distortion-correction phantom and software were used to correct for pincushion distortion of the acquired images. Phase-contrast MR angiography was performed using a Siemens Magnetom Vision 1.5T. The acquired images contained 256 × 256 × 64 voxels with a resolution of 0.8 × 0.8 × 0.5 mm. Blood vessels were segmented from three of these data sets as described in [1] and from the fourth as described in [2] to produce binary images (the clinical 3D models). No perspective calibration cube images were available for these clinical data sets, so the four intrinsic parameters were estimated from the known focal length and image resolution of the x-ray set. This estimation is not expected to introduce significant errors into the registration; however, any errors that are present will be included in the estimated target reprojection error calculation (section 5). The extrinsic parameters of the gold standard registrations were generated via manual manipulation of a surface rendering of the 3D model using an interactive graphical tool. To assess the reproducibility of the gold standard registrations, two of the data sets were chosen and eight additional manual registrations were carried out by two observers. The first observer was a consultant neuro-radiologist (JVB) and the second a research fellow in medical imaging science (JHH). The mean reprojection error (calculated over the points described in section 5) between the gold standard and these manual registrations was calculated to be 1.7mm (standard deviation 0.4mm).
5 Registration Accuracy and Robustness Experiments
From the phantom ground truth registration, and the gold standard registrations for the clinical data, a total of 64 starting positions were generated by altering the positions of the 3D data sets using the perturbations given in table 1. The in-plane translation (δY or δZ) is assumed to be eliminated using a trivial manual alignment procedure; however, we simulate errors in this alignment by introducing a random perturbation (3 mm standard deviation) of the in-plane (Y and Z) position.
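The following sketch generates the 16 perturbed poses of one start position in Table 1; the pose parameterisation as a flat tuple and the random seed are assumptions of this illustration.

```python
import itertools
import numpy as np

def start_positions(gold, dx_mm, dtheta_deg, rng=np.random.default_rng(0)):
    """gold = (X, Y, Z, theta_x, theta_y, theta_z); yields the 16 sign combinations."""
    for sx, srx, sry, srz in itertools.product([+1, -1], repeat=4):
        x, y, z, rx, ry, rz = gold
        yield (x + sx * dx_mm,
               y + rng.normal(0.0, 3.0),   # simulated in-plane alignment error
               z + rng.normal(0.0, 3.0),
               rx + srx * dtheta_deg,
               ry + sry * dtheta_deg,
               rz + srz * dtheta_deg)

# e.g. start position 2: +/-50 mm in X and +/-8 degrees about each axis
poses = list(start_positions((0, 0, 0, 0, 0, 0), dx_mm=50, dtheta_deg=8))
assert len(poses) == 16   # matches "No. of Reg'ns" in Table 1
```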
Fig. 3. Robustness results for the phantom (left) and clinical data sets (right), comparing the performance of the six similarity measures when registering the segmented MRA to the maximum opacity DSA images.
high vessel curvature. For each registration, the mean target reprojection error of these points was calculated with respect to the corresponding ground truth or gold standard registration. If this mean error was greater than 4 mm, the registration was deemed to have failed. Two images were generated from each image sequence: the first was a single, approximately mid-sequence frame exhibiting good opacity of all arterial blood vessels; the second was a maximum opacity image in which the intensity of each pixel was set equal to the maximum opacity achieved during the DSA sequence.
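A minimal sketch of this evaluation criterion is given below; project is a hypothetical pinhole projection supplied by the calibration, not a function defined in the paper.

```python
import numpy as np

def mean_reprojection_error(targets_3d, pose_est, pose_gold, project):
    """Mean 2-D distance between target points projected with the two poses."""
    p_est = np.array([project(t, pose_est) for t in targets_3d])
    p_gold = np.array([project(t, pose_gold) for t in targets_3d])
    return float(np.linalg.norm(p_est - p_gold, axis=1).mean())

def registration_failed(targets_3d, pose_est, pose_gold, project, tol_mm=4.0):
    """Failure criterion used in the paper: mean error above 4 mm."""
    return mean_reprojection_error(targets_3d, pose_est, pose_gold, project) > tol_mm
```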
6 Results

The results for registering the phantom MRA to the maximum opacity DSA images are summarised in figure 3 (left) and table 2 (top). 100% success rates have been obtained for the two closest starting positions using pattern intensity, gradient correlation and gradient difference. These measures fail more often than entropy, however, as the starting position is moved further from the ground truth. Mutual information performs less well than these measures but correlation is the least successful. Pattern intensity and gradient difference are the most accurate of the similarity measures, both achieving target reprojection errors of less than 1mm for all successful registrations. The results for registering the clinical MRA data sets to the maximum opacity DSA images for all four patients are summarised in figure 3 (right) and table 2 (bottom). There
Table 2. Target reprojection error results for the phantom (top) and clinical data sets (bottom), comparing the performance of the six similarity measures when registering the segmented MRA to the maximum opacity DSA image.

Phantom Reprojection Errors in mm (SD).
Start Posn.  CC           Entropy      Pat. Int.    Grad. CC     Grad. Diff.  Mut. Info.
1            1.59 (0.01)  1.08 (0.02)  0.89 (0.05)  0.90 (0.03)  0.88 (0.02)  1.25 (0.07)
2            1.66 (0.13)  1.14 (0.01)  0.88 (0.05)  1.25 (0.36)  0.90 (0.05)  1.38 (0.06)
3            2.10 (0.25)  1.10 (0.06)  0.94 (0.05)  1.10 (0.03)  0.99 (0.08)  1.45 (0.35)
4            2.13 (0.39)  1.20 (0.07)  0.94 (0.00)  1.42 (0.13)  0.91 (0.05)  1.11 (0.08)

Clinical Reprojection Errors in mm (SD).
Start Posn.  CC           Entropy      Pat. Int.    Grad. CC     Grad. Diff.  Mut. Info.
1            1.26 (0.01)  1.69 (0.02)  1.12 (0.04)  1.18 (0.04)  1.28 (0.02)  1.15 (0.05)
2            1.38 (0.00)  1.99 (0.07)  1.15 (0.04)  1.08 (0.04)  1.26 (0.04)  1.20 (0.05)
3            1.67 (0.03)  2.17 (0.20)  1.24 (0.04)  1.26 (0.03)  1.39 (0.02)  1.36 (0.25)
4            1.64 (0.00)  2.02 (0.06)  1.35 (0.08)  1.34 (0.10)  1.58 (0.01)  1.74 (0.03)
are a number of differences between these results and those obtained for the phantom. The first is that entropy performs markedly worse than all the other measures for this clinical data, whereas it was at least as good if not better than the majority of the other measures for the phantom data. Of the other measures gradient difference performs consistently well for the clinical data sets, followed by pattern intensity, which out-performs gradient difference for start position 2, and gradient correlation. For these clinical data sets pattern intensity and gradient correlation achieve the lowest target reprojection errors of 1.1 to 1.4mm. The target reprojection errors of gradient difference and mutual information are less than 1.3mm for start positions 1 and 2 but rise more steeply for more distant start positions. Entropy produces the least accurate registrations. The results for registering to the single DSA images were similar to those obtained for registration to the maximum opacity images. In nearly all cases, however, registering to the maximum opacity DSA images resulted in a higher success rate (by up to 10%) compared to registering to the single mid-sequence frame. The maximum opacity registrations were also more accurate (by up to 0.5 mm in some cases). This result is not unexpected as these images will have higher contrast and signal-to-noise ratio than the single frames.
7 Discussion
We have found that gradient correlation, pattern intensity and gradient difference perform best of the six similarity measures compared. This is in agreement with the findings of Penney et al. [6] for the comparison of these measures when used to register computed tomography (CT) to fluoroscopy images of a spine phantom. This is despite the large differences in modality and anatomy between these two applications, and the fact that none of these similarity measures was specifically developed for application to MRA-DSA registration.
Fig. 4. Registration of patient 4, view 2. Left: Histogram of final values of the gradient difference similarity measure. Right: Comparison of registrations (with sub-regions enlarged). Central column: maximum opacity DSA image to which the MRA is registered. Left column: the gold-standard registration. Right column: the “best” gradient difference registration, i.e. that producing the highest value of the gradient difference similarity measure.
We have found that the success rate of these registrations falls off rapidly once the start position exceeds position 2, that is, 8◦ rotation, 3mm in-plane translation and 50mm out-of-plane translation from the gold-standard/ground-truth registration. This is representative of intensity-based registration algorithms, which have a “capture range” within which a certain fraction of corresponding features must be approximately aligned. However, we have found that manual alignment to within these tolerances can be rapidly and easily achieved using an interactive tool. The registration success rates varied significantly between the four clinical data sets. For patients 1, 2 and 3, for instance, 90% of the registrations obtained using the gradient difference similarity measure were successful. For the second view of patient 4, however, only 48% succeeded. The histogram of similarity measure values for these registrations of patient 4 (figure 4) reveals a small cluster of 9 of the 128 registrations that all have consistently high similarity measures and very similar extrinsic parameter values. This mean registration position differs from the gold-standard position by 10◦, 6◦ and 10◦ rotations about the x, y and z axes respectively. Visual comparison of these two registration positions (figure 4), however, suggests that the position found by the algorithm is much more accurate than the gold-standard position. The algorithm currently takes approximately 10 minutes running on a 1.2 GHz AMD processor PC with 1 GByte of RAM; however, considerable speed-up could be achieved using techniques such as shear-warp factorisation. The timing variation with different similarity measures was found to be negligible.
8 Conclusions
We have applied an intensity-based 2D-3D registration algorithm to the multi-modality alignment of MRA and DSA images. Of the six similarity measures compared, gradient
difference, pattern intensity and gradient correlation performed consistently accurately and robustly for all data sets. Using these similarity measures, and for starting positions within 8◦ rotation, 3mm in-plane translation and 50mm out-of-plane translation from the gold-standard/ground-truth positions, we obtained a success rate of greater than 80% (less than 4mm target reprojection error) for the clinical data sets, whilst none of the phantom registrations failed. The root-mean-square (rms) target reprojection error of the clinical registrations was less than 1.3mm (less than the 1.7mm reprojection error estimated for the gold standard registrations) and, for the phantom images, less than 1mm when using the most accurate similarity measures.
Acknowledgments
We would like to thank Kawaldeep Rhode, Robert McLaughlin, Albert Chung and Paul Summers for their assistance in acquiring the images used in this paper, and in particular Kawaldeep Rhode for distortion-correcting the DSA images. This research is funded by EPSRC grant GR/M55015.
References
1. A.C.S. Chung and J.A. Noble. Statistical 3D vessel segmentation using a Rician distribution. In Proc. MICCAI, pages 83–89, 1999.
2. A.C.S. Chung, J.A. Noble and P. Summers. Fusing speed and phase information for vascular segmentation in phase contrast MR angiograms. In Proc. MICCAI, pages 166–175, 2000.
3. J. Feldmar, G. Malandain, N. Ayache, S. Fernandez-Vidal, E. Maurincomme and Y. Trousset. Matching 3D MR angiography data and 2D x-ray angiograms. In Proc. CVRMed/MRCAS, pages 129–138. Berlin, Germany: Springer-Verlag, 1997.
4. Y. Kita, D.L. Wilson and J.A. Noble. Real-time registration of 3D cerebral vessels to x-ray angiograms. In Proc. MICCAI, pages 1125–1133, 1998.
5. A. Liu, E. Bullitt and S.M. Pizer. 3D/2D registration via skeletal near projective invariance in tubular objects. In Proc. MICCAI, pages 952–963, 1998.
6. G.P. Penney, J. Weese, J.L. Little, P. Desmedt, D.L.G. Hill, and D.J. Hawkes. A comparison of similarity measures for use in 2D-3D medical image registration. IEEE Transactions on Medical Imaging, 17(4):586–595, 1998.
7. G.P. Penney, P.G. Batchelor, D.L.G. Hill, D.J. Hawkes, and J. Weese. Validation of a two- to three-dimensional registration algorithm for aligning preoperative CT images and intraoperative fluoroscopy images. Medical Physics, 28(6):1024–1032, 2001.
Model Based Spatial and Temporal Similarity Measures between Series of Functional Magnetic Resonance Images

Ferath Kherif 1,2, Guillaume Flandin 1,3, Philippe Ciuciu 1,2, Habib Benali 2,4, Olivier Simon 2,5, and Jean-Baptiste Poline 1,2

1 Service Hospitalier Frédéric Joliot, CEA, 91401 Orsay, France
{kherif,poline}@shfj.cea.fr
2 Institut Fédératif de Recherche 49 (Imagerie Neurofonctionnelle), Paris, France
3 INRIA, Epidaure Project, Sophia Antipolis, France
4 INSERM U 494, CHU Pitié-Salpêtrière, Paris, France
5 INSERM U 334, CEA, 91401 Orsay, France

Abstract. We present a method that provides relevant distances or similarity measures between temporal series of brain functional images. The method allows a multivariate comparison between the data sets of several subjects to be performed in the time or in the space domain. These analyses are important for globally assessing the inter-subject variability before averaging subjects to draw conclusions at the population level. We adapt the RV-coefficient to measure meaningful spatial or temporal similarities and use multidimensional scaling for visualisation.
1 Introduction
Functional brain imaging has been an extremely active field of research during the last fifteen years, first with Positron Emission Tomography and more recently with the advent of functional Magnetic Resonance Imaging (fMRI), because of their potential for understanding the organisation of human brain functions. An fMRI experiment consists, for one subject, in the acquisition of a large number (100 to 1500) of 3D volumes (64x64x32) measuring a parameter related to the brain neural activity in each voxel. The subject is submitted to an experimental paradigm consisting of different conditions designed to study a particular brain system (e.g. memory, language, vision, ...). An entire study consists in the acquisition of data for approximately 10 to 30 subjects. The most challenging problem of the neuro-imaging field is to extract the relevant information from this vast amount of data. It is especially important to be able to draw conclusions from the study across subjects, and therefore at the population level. This is complex because of the anatomical and functional differences between subjects. A standard way to analyse multi-subject data is to summarise the relevant information per subject in one brain volume (for instance the average difference between conditions A and B) and use the inter-subject variance to infer results at the population level (the so-called random effect group analyses) [8].
These multi-subject analyses are based on the assumption that subjects are drawn from a single population, and therefore assume some homogeneity between subjects. Clearly, this assumption may not be verified. Subjects are not necessarily homogeneous in the spatial domain (different brain regions are activated) or in the time domain (the time courses of the brain responses to a common experimental paradigm differ across subjects). This non-homogeneity can be due to many factors, including different strategies across subjects, or acquisition differences that cannot be controlled. If the group studied is not homogeneous, this may lead at best to less efficient analyses and at worst to erroneous results and interpretations [8]. Although this topic is clearly of major importance for the analysis of fMRI data, it has so far received little attention. This is probably due to both the complexity and the amount of data to be analysed, which depend on the experimental paradigm and the noise characteristics. In this paper, we introduce a general technique to derive relevant distances between series of 3D fMRI brain images in order to assess their similarity in the time or spatial domain. The technique is based on the RV-coefficient, adapted for this purpose, and the use of multi-dimensional scaling to visualize the group structure. It is flexible, and allows data sets (e.g. subjects) to be compared in the light of a specific question relating to the experimental paradigm in a reasonable computational time. In the following, we first briefly review possible distances or similarity measures and discuss their pros and cons in relation to their application to neuroimaging data. In Section 2.2 we introduce the selected distance based on the RV-coefficient. In Section 2.4 we present the experimental data set and some results on the differences in the spatial and temporal domains between subjects for this fMRI study.
2 Methods

2.1 Candidate Similarity Measures or Distances Given the Data and Problem Specificities
In this section, we briefly present an overview of measures or distances that could be used for comparing series of images. We first review some characteristics of our data and some desirable features for the distance measures.
Data and Problem Specificities.
(C1) The data originating from different subjects may have different numbers of voxels. Conversely, if images are put in a common spatial reference (realigned to an atlas), the number of scans (time dimension) may not be the same across subjects.
(C2) When addressing the temporal (resp. spatial) aspect of the data, the mean image (data averaged across time, resp. the mean time course) does not convey any meaningful information.
(C3) Time series at different positions in the brain may have different variances due to physiological reasons. The distance looked for should be insensitive to a voxel-per-voxel variance scaling and to the overall data variance.
(C4) The estimated covariance structure in the time domain can be inverted (although this may not be advisable given the estimation noise). This inversion is not possible in the spatial domain.
(C5) Data may have non-Gaussian components.
(C6) Similarity measure computations have to be fast enough to be used by clinicians or brain scientists. This is a challenge given the size of the data.
(C7) The measure should be able to include information from the experimental paradigm and from the known noise characteristics of the data. This is mainly addressed in Section 2.3 through the modelling of the experimental variance due to the paradigm.
Some Candidate Similarity Measures. In this section, we only address the comparison in the time domain and mention the comparison in the space domain when the dimensions cannot simply be swapped. The presentation follows an “incremental” line of thought: the distance measures successively address different (more complex) aspects of the data. Let Y_i be the sample data for the i-th subject, represented as a matrix with n_i rows (voxel dimension) and t_i columns (scan or time dimension), with n_i ≫ t_i. The corresponding (time × time) sample covariance matrix is denoted Σ_i for the i-th subject, and Σ_{i+j} denotes the pooled covariance matrix of the i-th and j-th subjects.
• Mahalanobis distance D². This widely used distance is most meaningful when data are multi-normal and measures the weighted distance between the data set means [1]. It relies on the inversion of the (common) covariance matrix of the data and is computed as
$$D^2 = (\bar{Y}_i - \bar{Y}_j)^t \, \Sigma_{i+j}^{-1} \, (\bar{Y}_i - \bar{Y}_j).$$
The null hypothesis that the two groups have the same mean can be tested through an F-test. Clearly, this distance can only be computed in the time dimension (cf. (C4)) and cannot reflect complex links.
• Covariance equality test and distance. Once the data means have been compared, a likelihood ratio statistic such as Box's M can be used to test the hypothesis that the covariance matrices are equal [1]. The resulting coefficient B = e^{-M/df} can be used as a distance measure between covariance matrices: two data sets have similar density volumes if B is close to one. This test is meaningful with multi-normal data but is not robust otherwise (cf. (C5)).
• Canonical Correlation Analysis (CCA). CCA is used to identify linear relations between two data sets [1] and has been found to have an overall link with the covariance distance. CCA finds successive pairs of linear combinations (one canonical eigenvector per data set) that best explain this relation, and the corresponding canonical roots inform about the strength of the relation. It is therefore more general than the Mahalanobis distance, since the search for the linear link is done in a greater space. However, it relies on the computation of the inverse covariance
matrix of the data in the time and space domains, while the latter is not tractable (cf. (C4)).
• Krzanowski's method. One problem with CCA is that it explicitly searches for linear links between two data sets corrected for their covariance structures. This might not be the most relevant comparison between fMRI data series. An alternative can be found in the seminal work of Krzanowski [4], who suggests comparing data sets by computing and comparing the eigen-components of their covariance matrices. The comparison is performed by computing angles between the sub-spaces spanned by their first principal components. The drawbacks of this method are those of PCA analysis: it is not scale-invariant and depends strongly on the pre-processing steps (centering, normalisation, ...) (cf. (C3)).
• Distribution distance. While the previous methods generally hold under multi-normal assumptions and linear links, it is easy to define more general distances through the sampled distributions of the data. Several authors [5,7] have proposed such measures, related to mutual information, whose expression is simplified if the data are normal. For example, Matusita derives a separability measure between densities [7]. The difficulty lies in the efficient computation of the probability densities (C6). Nevertheless, we plan to investigate these similarity measures in the future.
2.2 The RV-Coefficient as a Similarity Measure
The RV-coefficient was first described by Robert [9] for evaluating multidimensional linear association between several data sets. For each data set, the matrix S_i = Y_i^t Y_i (a time-by-time t_i × t_i matrix) can be considered as a point in R^{t_i^2}. The comparison of two data sets i, j in this space can be made by computing the RV-coefficient as follows:
$$RV_{i,j} = \frac{\mathrm{trace}(S_i S_j^t)}{\sqrt{\mathrm{trace}(S_i S_i^t)\,\mathrm{trace}(S_j S_j^t)}} \qquad (1)$$
Escoufier [9] considers each S_i as an operator and derives an inner product (and a distance metric) based on the Hilbert-Schmidt norm |A|² = trace(A^t A) for a given matrix A. In this context, the RV-coefficient is seen as the cosine of the angle between S_i and S_j. The RV-coefficient can also be considered as a multivariate extension of the classical Pearson correlation coefficient. Lavit showed that if RV_{i,j} is one, then one can derive the eigen-components of data set i from data set j through a homothetic transformation [6]. For comparing fMRI data sets, it has several advantages. First, it reflects the linear link between data set covariances but is normalised for the absolute amount of variance in the data (C3). Second, it can be used in both the spatial and the temporal domain (C1) and (C2). Third, it is fast to compute (C6). Fourth, it does not necessarily require the inversion of a covariance structure, although such a normalisation can be included when possible (C4). Lastly, it should be robust with non-Gaussian data (C5), and is easily adaptable to compare data
sets considering a specific question (that can be put in the framework of standard fMRI analysis (C7)). This is the subject of the next section.
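For concreteness, a minimal sketch of Eq. (1) follows; it is our illustration (the paper provides no code) and assumes both data sets share the same number of time points:

```python
import numpy as np

def rv_coefficient(Y_i, Y_j):
    """RV-coefficient of Eq. (1) between two (voxels x time) data sets.

    S = Y^t Y is the (time x time) cross-product matrix, so the two
    data sets may have different numbers of voxels but must share the
    same number of time points.
    """
    S_i = Y_i.T @ Y_i
    S_j = Y_j.T @ Y_j
    num = np.trace(S_i @ S_j.T)
    den = np.sqrt(np.trace(S_i @ S_i.T) * np.trace(S_j @ S_j.T))
    return num / den
```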
2.3 Model Based RV-coefficient in fMRI Analysis
Adapting the RV-coefficient. The analysis of fMRI data generally relies on the specification of an a priori model describing the expected time courses derived from the experimental paradigm. The model consists of a time-by-parameter matrix X (t × p), assumed to explain all deterministic temporal variations of the data. An fMRI analysis consists in linearly regressing the model at each and every voxel, and in testing a contrast of the parameters reflecting the neuroscience question under study. So-called statistical parametric maps are constructed with the test statistic attributed to each voxel. The model X used to analyse the data can be introduced in the similarity measure. Rather than considering for each subject the covariance matrix estimated from the raw data, it is generally more meaningful to consider the covariance matrix between the data and the model (C7). This allows subjects' data sets to be compared according to how well the model X predicts the data. This is obtained through the modified RV-coefficient computed with S_i = Y_i^t X X^t Y_i, a (p × p) matrix. More often than not, only a sub-space G of the model X is of interest (for instance the subspace representing the difference between experimental conditions). In such a case, both model and data can be projected onto this subspace: the model X becomes X_G and the data Y becomes Y_G, leading to an RV-coefficient tuned for the specific question represented by G. The RV-coefficient allows the introduction of two metrics, M and N, for the column (temporal) and row (voxel) spaces respectively, defined by
$$M^{-\frac{1}{2}} = (X_G^t V X_G)^{-1/2} \qquad (2)$$
$$N^{-\frac{1}{2}} = \mathrm{diag}\{\hat{\sigma}_1^{-1}, \hat{\sigma}_2^{-1}, \cdots, \hat{\sigma}_n^{-1}\} \qquad (3)$$
The metric M corrects for the scaling differences in the model regressors and takes into account the temporal correlation represented by the (estimated or assumed) time-by-time matrix V. The diagonal elements of the metric N are the inverses of the square roots of the residual variances estimated for each voxel. This leads to computing S_i with Y_i = M^{-1/2} X_G^t Y_G N^{-1/2}.
Spatial and Temporal Similarity Measures. We have so far constructed the S_i matrices as time-by-time cross-product matrices. If all Y_i have an identical number of rows, the same computation can be made in the voxel space, considering S_i = Y_i Y_i^t, an n_i × n_i matrix. This leads to the same formulation of the RV-coefficient, and provides a similarity measure in the space domain.
Computational Cost. The method based on the RV-coefficient involves the computation of the trace of matrices. Due to the large amount of data in an fMRI
experiment, computation and data storage can be very cumbersome. Our implementation is designed to avoid direct computation of the products between the matrices (using the Hadamard product). Only one pass through the data, simultaneously for all the subjects, is needed. Whenever possible, computations are performed in the model parameter space, which considerably reduces the computational cost. For the data set presented in the following, computation time was of the order of a few minutes on a Sun workstation (0.8 GHz, 512 MB RAM).
Results Visualization. The two-by-two similarity measures R_{i,j} are first transformed into a distance measure with d_{i,j} = √(2(1 − R_{i,j})). A symmetric distance matrix (d_{i,j}) with k(k − 1)/2 distinct values is constructed and processed through Multidimensional Scaling (MDS) [2] to get the best Euclidean representation of these distances.
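As an illustration of the visualisation step, the sketch below assembles the distance matrix from pairwise RV-coefficients (reusing the rv_coefficient helper sketched after Section 2.2) and embeds it with scikit-learn's MDS; the library choice and helper names are our assumptions, not the paper's:

```python
import numpy as np
from sklearn.manifold import MDS

def mds_embedding(datasets):
    """2D MDS layout of subjects from RV-based distances (Section 2.3).

    datasets: list of (voxels x time) arrays, one per subject.
    """
    k = len(datasets)
    d = np.zeros((k, k))
    for i in range(k):
        for j in range(i + 1, k):
            rv = rv_coefficient(datasets[i], datasets[j])
            d[i, j] = d[j, i] = np.sqrt(2.0 * (1.0 - rv))
    mds = MDS(n_components=2, dissimilarity="precomputed", random_state=0)
    return mds.fit_transform(d)  # (k, 2) coordinates, one row per subject
```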
2.4 Experimental Paradigm and Data
Data were obtained from nine subjects who underwent a calculation task and a control task [10]. During the fMRI scanning, six blocks of 26s each, alternating computation and control tasks, were presented. Each subject performed two such sequences. A total of 186 scans (64x64x28 voxels per scan) were acquired per subject. The (linear) model used for analysing the data consisted of 3 regressors per condition (computation and control) derived from a standard hemodynamic response. Within this model, a sub-space of interest was formed to highlight activations induced by the calculation task relative to the control task.
3 Results and Conclusion
This section presents the results of the temporal and spatial comparisons for the nine subjects' data sets, using the adapted RV-coefficient to investigate inter-subject distances with respect to the comparison between activation and control. For this purpose, we use equation (1) and the formulas in Section 2.3 with a subspace G that spanned the expected activation space.
Temporal Distances. Figure 1 shows a 2D MDS representation of the temporal distance between subjects. In this case, we observe that although the subjects cannot easily be divided into more than one group, subjects 3 and 4 lie far apart from the group center of mass. This indicates a different temporal behaviour, such that these subjects should probably be considered outliers. These temporal differences between subjects are observed in figure 2. This figure shows the first components of the output of a Multivariate Linear Model (MLM) analysis described in [11]. The components summarise the temporal behaviour and are clearly seen to be similar for two subjects (8, 9) that are close on figure 1. Conversely, those patterns are clearly different from the component of subject 4, a subject that is also found far from subjects 8 and 9 on figure 1. This result is in accordance with
Fig. 1. Inter-subject variability in terms of temporal (left panel) and spatial (right panel) distances.
Fig. 2. Illustration of the temporal variability observed in figure 1 (left) with the first temporal MLM eigencomponents.
another study [3] that showed the particular temporal behaviour of those two subjects' data.
Spatial Distances. Figure 1 (right panel) shows a 2D MDS representation of the spatial distance between subjects. In this plot, part of the inhomogeneity found in the temporal domain is observed again. In particular, subjects 1, 3, and 4 are found to be the farthest from the group center. This spatial distance is illustrated on a statistical parametric map showing the activation effect in figure 3 (one axial slice for each subject). The distances between subject 4 and subjects 8 and 9 are mainly reflected by a greater activity in the left parietal lobe for subjects 8 and 9.
4 Conclusion
We have developed an easy-to-use, fast and flexible method to analyse the similarity of different subjects' fMRI time series in the temporal or spatial domain, taking into account the specificities of these complex data. The method has the potential to detect outliers (in the time or space domain, respectively) before performing any kind of group analysis, or to detect any particular
Fig. 3. Illustration of the spatial variability observed in figure 1 (right) on an axial slice.
grouping in the data that would invalidate such group analyses. In the future, the method will be coupled with clustering and outlier detection tests. The method is likely to find a number of applications in clinical contexts (e.g. helping with the diagnosis of psychiatric diseases) or neuroscience contexts (e.g. relating the distances to genetic or phenotypic information).
References
1. T.W. Anderson. Introduction to Multivariate Statistical Analysis. John Wiley, 1984.
2. C. Gower. Multidimensional scaling displays. In H.G. Law et al., editors, Research Methods for Multimode Data Analysis. Praeger, New York, 1984.
3. F. Kherif, J.B. Poline, G. Flandin, H. Benali, S. Dehaene, and K.J. Worsley. Multivariate model specification for fMRI data. NeuroImage, 2002 (submitted).
4. W.J. Krzanowski. Between-groups comparison of principal components. Journal of the American Statistical Association, 74:703–704, 1979.
5. S. Kullback and R.A. Leibler. On information and sufficiency. Annals of Math. Stats., 22:79–86, 1951.
6. C. Lavit. Analyse conjointe de tableaux quantitatifs. Masson, 1984.
7. K. Matusita. Decision rules based on the distance for problems of fit. Ann. Math. Statist., 26:631–640, 1955.
8. K.M. Petersson, T.E. Nichols, J.B. Poline, and A.P. Holmes. Statistical limitations in functional neuroimaging. II. Signal detection and statistical inference. Philos Trans R Soc Lond B Biol Sci, 354(1387):1261–81, Jul 1999.
9. P. Robert and Y. Escoufier. A unifying tool for linear multivariate statistical methods: the RV-coefficient. Applied Statistics, 25:257–265, 1976.
10. O. Simon, J.F. Mangin, L. Cohen, D. Le Bihan, and S. Dehaene. Topographical layout of hand, eye, calculation, and language-related areas in the human parietal lobe. Neuron, 33(3):475–87, 2002.
11. K.J. Worsley, J.B. Poline, K.J. Friston, and A.C. Evans. Characterizing the response of PET and fMRI data using multivariate linear models. NeuroImage, 6(4):305–19, Nov 1997.
A Comparison of 2D-3D Intensity-Based Registration and Feature-Based Registration for Neurointerventions

Robert A. McLaughlin 1, John Hipwell 2, David J. Hawkes 2, J. Alison Noble 1, James V. Byrne 3, and Tim Cox 4

1 Medical Vision Laboratory, Dept. Engineering Science, University of Oxford, Oxford, England
[email protected]
2 CISG, Division of Radiological Sciences, Guy's Hospital, King's College London, London, England
[email protected]
3 Department of Radiology, Radcliffe Infirmary, Oxford, England
4 National Hospital for Neurology and Neurosurgery, Queen Square, London, England
Abstract. Registration of 2D-3D data can improve visualisation during minimally-invasive neurointerventions. Using four clinical data sets, we quantitatively compared two approaches: an intensity-based algorithm and a feature-based algorithm. The intensity-based approach was found to be more accurate, with an average registration accuracy of 1.4mm, compared to the feature-based algorithm with an average accuracy of 2.3mm. The intensity-based algorithm was also found to be more reliable. Reliability of the feature-based algorithm was found to be more sensitive to the complexity of the vasculature structure.
1 Introduction
The registration of 2D-3D data sets is important in minimally invasive neurointerventions, such as the coiling of brain aneurysms or gluing of arteriovenous malformations (AVM). During such interventions a neuro-radiologist guides a catheter through the brain vasculature using 2D X-ray images. The 2D nature of the images can make it difficult to navigate and position the catheter accurately in a complicated 3D angioarchitecture. One solution would be to utilise a pre-operative phase contrast magnetic resonance angiography (PC-MRA) scan. Such a scan could be segmented [1][2] to produce a 3D model of the vasculature. By registering the intra-operative X-ray image with this 3D model, it would be possible to accurately display the position of the catheter relative to the 3D model. In this paper, we compare two approaches to 2D-3D registration: an intensity-based method [3] and a feature-based method [4]. We compare the accuracy and robustness of these two algorithms on four clinical data sets.
2 Method

2.1 Intensity-Based Registration
The intensity-based registration algorithm builds on the work in [3] and iteratively optimises the six rigid-body parameters describing the location and rotation of the 3D model. Digitally reconstructed radiographs (DRR) are generated by casting rays through the segmented volume, and are compared to the digital subtraction angiography (DSA) image using the gradient difference similarity measure [5]. Gradient images are computed for both the DSA image and the DRR using 3x3 Sobel templates. The gradient difference similarity measure minimises the difference between these gradient images; details are given in [3]. Some modifications to the algorithm of [3] were required to adapt it to work with segmented 3D data and DSA images, rather than unsegmented CT data and fluoroscopy images. The primary modification was the use of a spherical volume-of-interest (VOI), manually defined around the feature of interest (aneurysm or AVM). Only voxels lying within the VOI were used in the registration. The VOI was projected onto the DRR as a circular mask. A concentric circular mask with one quarter the radius was then defined, and pixels within this smaller mask were used in an initial registration. The radius of the smaller mask was then doubled and this larger mask was used to refine the registration. The centre of rotation for the volume was set to be the centre of the VOI. To reduce processing time at each stage, a multi-resolution strategy was adopted whereby the DRRs and DSA images were sub-sampled by a factor of four. These dimensions were subsequently doubled until the optimisation of the parameters was completed with both images at their full resolutions.
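The flavour of this measure can be sketched as follows; the constants a_v, a_h and the intensity scale factor are placeholders here, whereas the full formulation of [5] estimates them from the images:

```python
import numpy as np
from scipy import ndimage

def gradient_difference(dsa, drr, a_v=1.0, a_h=1.0, scale=1.0):
    """Simplified gradient difference similarity between a DSA image
    and a DRR (cf. [5]).  Vertical and horizontal Sobel gradients of
    the two images are compared; larger values indicate a better match."""
    diff_v = ndimage.sobel(dsa, axis=0) - scale * ndimage.sobel(drr, axis=0)
    diff_h = ndimage.sobel(dsa, axis=1) - scale * ndimage.sobel(drr, axis=1)
    return float(np.sum(a_v / (a_v + diff_v ** 2)) +
                 np.sum(a_h / (a_h + diff_h ** 2)))
```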
2.2 Feature-Based Registration
The feature-based registration algorithm [4] first skeletonises vessels in the DSA image, reducing the thickness of each to a single pixel. Blood vessels in the 3D model are also skeletonised by extracting the medial axis of each vessel. The algorithm registers the data sets by matching the skeletonised DSA image with a projection of the skeletonised 3D model. For each 3D point, the closest corresponding point in the skeletonised DSA image is found using a territory-based correspondence search as described in [4]. Using these pairs of points and the method outlined in [6], the algorithm finds the optimal rotation and translation of the 3D model to achieve a registration. The registration was performed in three stages, using the VOI defined in Section 2.1. Registration was initially performed with the small mask, refined using the larger mask and finally completed using the entire DSA image. A similar use of masks was presented in [7].
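For illustration, the optimal rotation and translation for a set of matched point pairs can be obtained in closed form via an SVD (the Kabsch/Procrustes solution); this generic sketch is ours and is not necessarily the exact method of [6]:

```python
import numpy as np

def rigid_fit(src, dst):
    """Least-squares rigid transform (R, t) mapping src -> dst.

    src, dst: (N, 3) arrays of matched 3D points (e.g. skeleton points
    and their territory-based correspondences).  Returns R (3x3) and
    t (3,) such that dst ~= src @ R.T + t."""
    c_src, c_dst = src.mean(axis=0), dst.mean(axis=0)
    H = (src - c_src).T @ (dst - c_dst)   # 3x3 cross-covariance matrix
    U, _, Vt = np.linalg.svd(H)
    D = np.diag([1.0, 1.0, np.sign(np.linalg.det(Vt.T @ U.T))])
    R = Vt.T @ D @ U.T                    # guard against reflections
    t = c_dst - R @ c_src
    return R, t
```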
3 Experiments

3.1 Data
Phase-contrast MRA (PC-MRA) scans were obtained for three patients with aneurysms (patients 1-3) and one patient with an AVM (patient 4). Scans were
acquired on a Siemens Magnetom Vision 1.5T with voxel size 0.78 × 0.78 × 1.5mm and image dimensions 256 × 256 × 64. The scans for patients 1, 2 and 4 contained flow speed information and were segmented as described in [1]. An improved scan was used for patient 3, giving both flow speed and flow direction information, and this extra information was used to give an improved segmentation [2]. Visualisations of the segmented MRA data sets are shown in Figure 1.
Fig. 1. 3D visualisations of segmented MRA scans (patients 1–4).
For each patient, two DSA runs at different orientations were acquired using a GE Medical Systems Advantx DX, and digitised from the PAL composite video signal at an image resolution of 512 × 512 pixels using a Matrox Meteor II framegrabber. For each DSA run, three to seven images were acquired at half-second intervals. These were used to generate two images: a maximal image, in which the images were combined so that the maximal level of contrast over the run is recorded for each pixel; and a single-frame image, for which an image with maximal opacification of the arterial system was chosen. A distortion-correction phantom and software were used to correct for pincushion distortion in the images [5]. Typical DSA images produced are shown in Figure 2.
3.2 Calculation of “Gold-Standard” Registration
Parameters for the gold-standard registration may be described as either intrinsic or extrinsic. Intrinsic parameters describe properties of the imaging system,
Fig. 2. The first of two DSA runs obtained for each patient. Patients 1 and 2 show single-frame DSA images, while patients 3 and 4 show maximal DSA images.
such as the perspective projection matrix. Extrinsic parameters describe the orientation (rotation) and position (translation) of the 3D model [5]. Intrinsic parameters for the gold-standard registration were computed from parameters obtained from the X-ray machine display during acquisition. As no fiducial markers were available in either the PC-MRA scans or the DSA images, the extrinsic parameters were obtained by a manual registration performed by JVB (neuro-radiologist). The manual registration was performed using 3D visualisation software which simulated X-ray images for a specified translation and rotation, allowing the neuro-radiologist to align clinically relevant points in the images. To test the stability of these manual results, registrations for patient 2, DSA run 1 and patient 4, DSA run 1 were each repeated eight times, and the variation in the results was computed using the reprojection distance described in the next section.
3.3 Experiments for Accuracy and Robustness
The segmented MRA data sets were registered with both maximal and single-frame DSA images. Starting positions for the registrations were chosen by perturbing the gold-standard values by set amounts; this methodology was used in [5]. Four experiments were performed, with the amount of perturbation increased each time, as shown in Table 1. For each experiment, different combinations of the four perturbations resulted in sixteen different starting positions. Note that there were no in-plane translations (δX or δY), as these can be accurately calculated by selecting a single corresponding point in both the DSA image and the DRR simulated from the MRA data. To measure accuracy, the reprojection distance was used, as defined by Masutani et al. [8]. A number of anatomically visible points on the segmented 3D model were chosen, along with the corresponding points in the DSA image. Using the rotation and translation matrix resulting from each registration, the positions of the 3D points were recomputed. The minimum distance (in mm) from each point to the ray passing from the X-ray source to the corresponding DSA image point was then calculated. This gave a measurement of the accuracy of the registration when projecting from 3D to 2D. A discussion of the measurement
Table 1. Perturbations of the starting positions from the gold standard for four of the six rigid-body parameters.

Experiment #   δZ        δθx    δθy    δθz
1              ±25 mm    ±4◦    ±4◦    ±4◦
2              ±50 mm    ±8◦    ±8◦    ±8◦
3              ±75 mm    ±12◦   ±12◦   ±12◦
4              ±100 mm   ±16◦   ±16◦   ±16◦
can be found in [5]. Finally, the average RMS error of all such points for each experiment was computed. If the average RMS error for a particular registration was less than 4 mm, the registration was judged to have succeeded.
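A sketch of this point-to-ray computation is given below; expressing the DSA image points in 3D world coordinates is our assumption, since the paper does not spell out the conventions:

```python
import numpy as np

def reprojection_distance(points_3d, points_2d, R, t, source):
    """Mean reprojection distance (cf. [8]) over a set of landmarks.

    points_3d: (N, 3) model points; R, t: registration rotation and
    translation; source: 3D position of the X-ray source; points_2d:
    (N, 3) corresponding DSA image points expressed in the same world
    frame.  For each transformed model point, the minimum distance to
    the ray from the source through the image point is computed."""
    p = points_3d @ R.T + t
    d = points_2d - source
    d = d / np.linalg.norm(d, axis=1, keepdims=True)  # unit ray directions
    v = p - source
    proj = np.sum(v * d, axis=1, keepdims=True) * d   # component along ray
    return float(np.linalg.norm(v - proj, axis=1).mean())
```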
4 Experiment and Results
Figure 3a plots registration accuracy for each algorithm, using both the maximal DSA images and the single-frame DSA images. These are plotted against the variability in the manual ’gold-standard’ registration, which was computed as 1.7mm. Figure 3b plots the percentage of successful registrations. Only successful registrations were used in computing the accuracies shown in figure 3a. Results of typical registrations are displayed in figure 4.
Fig. 3. (a) Registration accuracy (RMS error in mm) and (b) registration reliability (% successful registrations), plotted against experiment number for the feature-based and intensity-based algorithms on single-frame and maximal DSA images; the variability of the manual registration is shown for comparison in (a).
The percentage of successful registrations varied with the data sets, being notably higher with Patients 1 and 2 than with Patients 3 and 4. Graphs showing the percentage of successful registrations with each data set are shown in figure 5.
Fig. 4. Results of registration. Registered 3D vessels are overlaid in black. (a) Original DSA image for Patient 2. (b) Typical successful registration for patient 2. (c, d) Failed registrations for patient 2. (e) Original DSA image for Patient 4. (f) Typical successful registration for patient 4.
Fig. 5. Registration reliability (% successful registrations vs. experiment number) for each patient data set. (a) Feature-based algorithm. (b) Intensity-based algorithm.
5 Discussion
The intensity-based algorithm had the greater accuracy of the two algorithms, with an average accuracy of 1.4mm. This compared to an average value of 2.3mm
2D-3D Intensity-Based Registration versus Feature-Based Registration
523
for the feature-based algorithm. Recall that the feature-based algorithm registers a skeleton of the 3D model with a skeleton of the DSA image. It is thus sensitive to inaccuracies in the position of the 2D and 3D skeletonised points. This accounts for its lower accuracy when compared to the intensity-based algorithm, which registers the intensity values of every individual pixel. Note that the difference in image quality between the maximal and single-frame DSA images did not noticeably alter the accuracies for either algorithm. The intensity-based algorithm was also more robust. This is in contrast to the experimental results of [7], which found the feature-based algorithm to be more robust. The experiments in [7] were performed using the far simpler vasculature of an in-vitro silicon aneurysm phantom (middle cerebral artery bifurcation aneurysm). Our results suggest that while for simple angioarchitectures the feature-based approach may be more robust, in complicated situations the intensity-based approach is superior. The results in figure 5 support this conclusion. The robustness trends shown in the graphs fall into two distinct classes, with Patients 1 and 2 proving to be more robust than Patients 3 and 4. Recall that while the scans for Patients 1 and 2 contained only flow speed information, Patient 3 contained both flow speed and direction information. This led to a more complicated segmentation, with small vessels detected. Patient 4 was complex due to the angioarchitecture of the AVM. These results suggest that robustness of the feature-based approach could be greatly improved if, in some initial stage of processing, the vasculature in the 3D model and DSA image could be simplified to contain only the most significant vessels. An essential difference between the two algorithms lies in the method by which they combine conflicting information and iteratively improve the current state of the registration. The intensity-based algorithm tests each minor perturbation to the current rotation and translation, minimising a similarity measure that is summed over the entire data set. In contrast, in the feature-based algorithm each pair of matching points (one from the 3D skeleton and one from the skeletonised DSA image) specify an optimal change to the present rotation and translation. It is the average rotation and translation that is chosen. This method renders the algorithm sensitive to a misregistration of one or two erroneous vessels, as these will produce greatly different estimates for the rotation and translation. This suggests that the use of a robust fitting method such as RANSAC [9] may greatly improve the reliability of the feature-based algorithm. The computation time of the algorithms has important ramifications for the clinical suitability of either approach to registration. The feature-based algorithm is far less computationally intensive than the intensity-based algorithm, resulting in a much faster registration. This is because the feature-based algorithm operates on a small number of skeletonised points, rather than the exhaustive pixel-based approach of the intensity-based algorithm. In future work, we will seek to quantify these differences.
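As an illustration of that suggestion, a minimal RANSAC loop around the rigid fit sketched in Section 2.2 might look as follows; the iteration count and inlier tolerance are arbitrary placeholders:

```python
import numpy as np

def ransac_rigid_fit(src, dst, n_iters=200, inlier_tol=2.0, sample_size=4):
    """RANSAC wrapper around rigid_fit (see earlier sketch) to reject
    mismatched vessel points before estimating rotation/translation."""
    rng = np.random.default_rng(0)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(n_iters):
        idx = rng.choice(len(src), size=sample_size, replace=False)
        R, t = rigid_fit(src[idx], dst[idx])
        resid = np.linalg.norm(src @ R.T + t - dst, axis=1)
        inliers = resid < inlier_tol
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on the consensus set for the final estimate.
    return rigid_fit(src[best_inliers], dst[best_inliers])
```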
6 Conclusion
We have compared an intensity-based and a feature-based registration algorithm for the registration of 3D PC-MRA data to DSA images. The algorithms were tested using four clinical PC-MRA data sets and eight DSA runs. The intensity-based registration algorithm produced more accurate registrations, with an average RMS reprojection error of 1.4 mm; the feature-based algorithm was found to have an average RMS reprojection error of 2.3 mm. The intensity-based algorithm was also found to converge to the correct solution with greater reliability. Our results suggest that the reliability of the feature-based algorithm is more affected by the complexity of the angioarchitecture than is the intensity-based method. In future work we will explore whether the feature-based approach may be made more reliable by the incorporation of a robust fitting method.
Acknowledgements
We wish to thank Dr. G.P. Penney, K. Rhode, Dr. A.C.S. Chung and Dr. Y. Kita for their help in undertaking this research. This work was supported by EPSRC grants GR/M55008 and GR/M55015.
References
1. Chung, A.C.S., Noble, J.A.: Statistical 3D vessel segmentation using a Rician distribution. In: Proc. MICCAI. (1999) 82–89
2. Chung, A., Noble, J.: Fusing magnitude and phase information for vascular segmentation in phase contrast MR angiograms. In: Proc. MICCAI. (2000) 166–175
3. Penney, G.P., Weese, J., Little, J.A., Desmedt, P., Hill, D.L.G., Hawkes, D.J.: A comparison of similarity measures for use in 2D-3D medical image registration. IEEE Transactions on Medical Imaging 17 (1998) 586–595
4. Kita, Y., Wilson, D.L., Noble, J.A.: Real-time registration of 3D cerebral vessels to X-ray angiograms. In: MICCAI’98. (1998) 1125–1133
5. Penney, G.P.: Registration of Tomographic Images to X-ray Projections for Use in Image Guided Interventions. PhD thesis, University College London, CISG, Division of Radiological Sciences, Guy’s Hospital, King’s College London, London SE1 9RT, England (2000)
6. Heuring, J.J., Murray, D.W.: Visual head tracking and slaving for visual telepresence. In: Proc. of IEEE Int. Conf. on Robotics and Automation. (1996) 2908–2914
7. McLaughlin, R.A., Hipwell, J., Penney, G.P., Rhode, K., Chung, A., Noble, J.A., Hawkes, D.J.: Intensity-based registration versus feature-based registration for neurointerventions. In: Proceedings of Medical Image Understanding and Analysis (MIUA). (2001) 69–72
8. Masutani, Y., Dohi, T., et al.: Interactive virtualized display system for intravascular neurosurgery. In: CVRMed-MRCAS’97. (1997) 427–435
9. Fischler, M.A., Bolles, R.C.: Random sample consensus: A paradigm for model fitting with applications to image analysis and automated cartography. Communications of the ACM 24 (1981) 381–395
Multi-modal Image Registration by Minimising Kullback-Leibler Distance

Albert C.S. Chung 1, William M. Wells III 2,3, Alexander Norbash 2, and W. Eric L. Grimson 3

1 Dept. of Computer Science, Hong Kong University of Science & Technology, HK
2 Harvard Medical School, Brigham & Women’s Hospital, Boston, MA, USA
3 MIT Artificial Intelligence Laboratory, Cambridge, MA, USA
[email protected]
Abstract. In this paper, we propose a multi-modal image registration method based on the a priori knowledge of the expected joint intensity distribution estimated from aligned training images. The goal of the registration is to find the optimal transformation such that the discrepancy between the expected and the observed joint intensity distributions is minimised. The difference between distributions is measured using the Kullback-Leibler distance (KLD). Experimental results in 3D-3D registration show that the KLD based registration algorithm is less dependent on the size of the sampling region than the Maximum log-Likelihood based registration method. We have also shown that, if manual alignment is unavailable, the expected joint intensity distribution can be estimated based on the segmented and corresponding structures from a pair of novel images. The proposed method has been applied to 2D-3D registration problems between digital subtraction angiograms (DSAs) and magnetic resonance angiographic (MRA) image volumes.
1 Introduction
A key issue in the medical imaging field is multi-modal image registration. As the use of co-registration packages spreads, the number of aligned image pairs in image databases (aligned either by manual or automatic methods) increases dramatically. These image pairs can serve as a set of training data, in which the statistical joint intensity properties can be observed and learned in order to acquire useful a priori knowledge for future registration tasks. In this paper, we propose a multi-modal image registration method based on the a priori knowledge of the expected joint intensity distribution estimated from aligned training images. One of the key features is the use of the expected joint intensity distribution between two pre-aligned training images as a reference distribution. The goal is to align any two images of the same or different acquisitions such that the expected distribution and the observed joint intensity distribution are well matched. In other words, the registration algorithm aligns two different images based on the expected outcomes. The difference between distributions is measured using the Kullback-Leibler distance (KLD), which is a
frequently used information theoretic similarity measure in the machine learning and information theory fields. The KLD value tends to zero when the two distributions become equal. The registration procedure is an iterative process, and is terminated when the KLD value becomes sufficiently small. Experimental results in 3D-3D registration show that the KLD based registration algorithm is less dependent on the size of the sampling region than the Maximum log-Likelihood based method. We have also shown that, if manual alignment is unavailable, the expected joint intensity distribution can be estimated based on the segmented and corresponding structures from a pair of novel images. The proposed method has been applied to 2D-3D registration problems between DSAs and MRA image volumes.
2 Description of the Registration Algorithm

2.1 The Expected and Observed Joint Intensity Distributions
Expected joint intensity distribution: there are two ways of constructing the expected joint intensity distribution. Firstly, the joint distribution can be constructed by manual alignment, which can be done by experienced clinicians with the help of external or internal markers. Let I1 and I2 be the intensity values of two training images of the same or different acquisitions, and X1 and X2 be their image domains respectively. Assume that the values of image pixels are independent of each other. Since the two images have already been aligned, samples of intensity pairs Î = {i1(x), i2(x) | i1 ∈ I1, i2 ∈ I2} can be drawn from I1 and I2, where x are the pixel coordinates, x ∈ X and X = X1 = X2. The expected joint intensity distribution P̂(I1, I2) can be approximated by either Parzen windowing or histogramming [1]. Histogramming is employed in this paper because the approach is computationally efficient, and the intensity histogram size is practical (the histogram has only 2 dimensions in this case). To achieve sub-voxel accuracy, histogram partial volume (PV) interpolation [7] can be used. A smooth histogram can be obtained by convolving with a Gaussian density function, given by
$$G_\psi(z) = (2\pi)^{-\frac{n}{2}} \, |\psi|^{-\frac{1}{2}} \, e^{-\frac{1}{2} z^t \psi^{-1} z}, \qquad (1)$$
where ψ is the covariance of the Gaussian function and z can be a vector or a scalar value. If manual alignment is unavailable, a second method of constructing the expected joint intensity distribution is to perform segmentations separately in the two images, I1 and I2, such that the internal anatomical structures are labelled. Let sk, k = 1...M, be the internal structures, where M represents the number of anatomical structures. Then, samples of intensity pairs Î = {i1(x), i2(y) | i1 ∈ I1, i2 ∈ I2, x, y ∈ sk, k = 1...M} can be drawn if x and y belong to the same structure sk, where x and y are the pixel coordinates in X1 and X2 respectively. Similarly, the expected joint intensity distribution P̂(I1, I2) can be approximated by either Parzen windowing or histogramming.
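A minimal sketch of the histogramming route is given below (our illustration; the bin count and diagonal smoothing follow the choices reported in Section 3.1, and PV interpolation is omitted):

```python
import numpy as np
from scipy import ndimage

def expected_joint_distribution(img1, img2, bins=32, sigma=1.0):
    """Estimate the expected joint intensity distribution from a pair
    of pre-aligned training images (Section 2.1) by 2D histogramming
    followed by Gaussian smoothing (cf. Eq. 1 with diagonal covariance)."""
    hist, _, _ = np.histogram2d(img1.ravel(), img2.ravel(), bins=bins)
    hist = ndimage.gaussian_filter(hist, sigma=sigma)  # smooth the histogram
    return hist / hist.sum()                           # normalise to a pmf
```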
Observed joint intensity distribution: given a new image pair with a hypothesized transformation T, samples of intensity pairs Io = {i1(x), i2(T(x)) | i1 ∈ I1, i2 ∈ I2} can be drawn from I1 and I2, where x are the pixel coordinates, x ∈ Ω and Ω ⊂ X1 ∪ X2. This means Ω represents a sampling domain that is equal to or inside X1 ∪ X2. Note that the observed joint intensity distribution P_o^T(I1, I2) is dependent on the values of the transformation T and changes during the registration. The Parzen windowing or histogramming approach can also be used to estimate the distribution P_o^T.
2.2 Kullback-Leibler Distance (KLD)

Given the expected P̂ and observed P_o^T joint intensity distributions, the Kullback-Leibler distance between the two distributions is given by
$$D(P_o^T \,\|\, \hat{P}) = \sum_{i_1, i_2} P_o^T(i_1, i_2) \log \frac{P_o^T(i_1, i_2)}{\hat{P}(i_1, i_2)}. \qquad (2)$$
According to [3,5], D(P_o^T || P̂) has two important properties:
1. D(P_o^T || P̂) ≥ 0; and
2. D(P_o^T || P̂) = 0 iff P_o^T = P̂.
These properties show that, when the two images I1 and I2 are not perfectly registered, the value of the KLD, D, will be non-zero and positive because the observed and expected joint intensity distributions are not equal, P_o^T ≠ P̂. On the other hand, if the images are well registered, then the value of the KLD is equal to zero, i.e. D = 0.
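Computed on normalised joint histograms, Eq. (2) reduces to a few lines; the epsilon guard for empty bins is our assumption, as the paper does not specify how zero-probability bins are handled:

```python
import numpy as np

def kl_distance(p_obs, p_exp, eps=1e-10):
    """Kullback-Leibler distance D(P_o^T || P-hat) of Eq. (2) between
    the observed and expected joint intensity histograms."""
    p_obs = p_obs / p_obs.sum()
    p_exp = p_exp / p_exp.sum()
    return float(np.sum(p_obs * np.log((p_obs + eps) / (p_exp + eps))))
```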
2.3 Optimisation of the Transformation T
The goal of the registration is to find the optimal transformation T̂ by minimising the difference between the observed P_o^T and expected P̂, which is formulated as
$$\hat{T} = \arg\min_{T} D(P_o^T \,\|\, \hat{P}). \qquad (3)$$
The proposed method is conceptually different from the mutual information based registration method, which encourages functional dependence between the two image random variables, I1 and I2. The KLD based registration method guides the transformation T based on the difference between the expected P̂ and observed P_o^T joint intensity distributions, or, in other words, based on the expected outcomes learned from the training data. In this paper, the value of the KLD is minimised by Powell's method with a multi-resolution strategy [9] because it does not require calculations of the gradient and, hence, is simpler in terms of implementation. Powell's method iteratively searches for the minimum value of the KLD along each parameter axis of T (1D line minimisation) while the other parameters are kept constant. The search step δT is relatively large at a coarse resolution and decreases as the resolution gets higher; δT is set to 2, 1 and 0.5mm in this paper (in Section 3.2). The iteration process stops when the change in the KLD is sufficiently small (set to 0.001 in this paper).
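A minimal sketch of this optimisation using SciPy's Powell implementation follows; compute_observed is a hypothetical callback mapping a parameter vector to the observed joint histogram, and a multi-resolution loop would wrap this call in practice:

```python
import numpy as np
from scipy.optimize import minimize

def register(params0, compute_observed, p_exp):
    """Minimise the KLD of Eq. (3) over the six rigid-body parameters
    using Powell's method.  Uses the kl_distance helper sketched above."""
    cost = lambda params: kl_distance(compute_observed(params), p_exp)
    res = minimize(cost, np.asarray(params0), method="Powell",
                   options={"ftol": 1e-3})  # stop on small change in KLD
    return res.x
```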
Fig. 1. (a) T1 and (b) T2 images.
3 Experimental Results

3.1 T1–T2 (3D-3D) Registration
The T1 and T2 datasets are obtained from the BrainWeb Simulated Brain Database (277 × 241 × 181 voxels and 1 × 1 × 1mm3) [2], in which all the corresponding images have already been perfectly aligned and which can therefore be used as a testing platform for studying the performance of different objective functions. Maximum log-Likelihood (ML) [6] and Mutual Information (MI) [10] were compared with the KLD; their definitions are given by
$$ML = \sum_{x} \log \hat{P}(i_1(x), i_2(T(x))), \qquad (4)$$
and
$$MI = \sum_{i_1, i_2} P_o^T(i_1, i_2) \log \frac{P_o^T(i_1, i_2)}{P_o^T(i_1)\, P_o^T(i_2)}, \qquad (5)$$
respectively, where P_o^T(i1) and P_o^T(i2) are the marginal distributions, x are the pixel coordinates, x ∈ Ω and Ω ⊂ X1 ∪ X2. One of the pairs of 2D T1 and T2 image slices is shown in Figs. 1a and 1b respectively, with their intensity values and image domains represented by I1 and I2, and X1 and X2 respectively. Since these images in the datasets are aligned, the expected joint intensity distribution P̂(I1, I2) can be estimated based on the method described in Section 2.1 (only slices from positions 30 to 160 were used, in order to avoid the inherent image artifacts in the dataset). In order to study the performance of the objective functions, X2 was shifted horizontally and rotated, whereas the position and orientation of X1 were fixed. Given a transformation T, if any pixel x2 in X2 fell between the voxel positions of X1, then its corresponding intensity value i1 was computed by linearly interpolating the values of its four neighbouring pixels in X1 to achieve sub-voxel accuracy. The observed joint intensity distribution P_o^T was then estimated according to Section 2.1. In this paper, the number of bins was set to 32 and the covariance matrix ψ in Eq. 1 was a diagonal matrix DIAG(σ2, σ2) with σ2 = 1.
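For comparison purposes, Eq. (5) can be evaluated directly from a normalised joint histogram, as in this sketch of ours:

```python
import numpy as np

def mutual_information(p_joint, eps=1e-10):
    """Mutual information of Eq. (5) from a normalised joint histogram;
    used here only for comparison with the KLD and ML measures."""
    p_joint = p_joint / p_joint.sum()
    p1 = p_joint.sum(axis=1, keepdims=True)   # marginal over i2
    p2 = p_joint.sum(axis=0, keepdims=True)   # marginal over i1
    ratio = (p_joint + eps) / (p1 @ p2 + eps)  # joint over product of marginals
    return float(np.sum(p_joint * np.log(ratio)))
```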
Fig. 2. T1-T2 registration performance analysis. T2 image was shifted horizontally. The offset values range from −40mm to 40mm.
Fig. 3. T1-T2 registration performance analysis. T2 image was rotated. The offset values range from −40◦ to 40◦.
We set Ω = X2 for ease of implementation in this paper. If x2 fell outside the domain of image X1, then an arbitrary intensity value in the background of X1 was assigned to i1. As plotted in Figs. 2 and 3, the performances of the three different measures (KLD, ML and MI) are comparable when the T2 image (X2) was shifted horizontally between −40mm and 40mm, and was rotated between −40◦ and 40◦. However, it is also common to discard a sample (i1(x), i2(T(x))) if it falls outside the overlapping region, i.e. x ∉ X1 ∩ X2. As shown in Fig. 4, when Ω was set to X1 ∩ X2, the performance of ML was adversely affected when only samples drawn from the overlapping region were included in the calculation. As compared with ML, the figure shows that KLD and MI are less dependent on the size of the sampling region Ω. The major reason is that, from Eq. 4, the value of ML depends only on the observed samples x. Therefore, when the area of the overlapping region is small, fewer samples are obtained and thus the value of ML increases. In contrast, given the same set of observed samples, the value of the KLD consists of the contributions of the observed samples and, most importantly, the penalties of the unobserved samples from the expected joint intensity distribution P̂. Therefore, the entire distribution P̂ is utilised in the KLD measure. Finally, the value of MI depends mostly on the randomness of the observed samples. The decrease in overlapping area increases the sample randomness and, hence, the value of MI decreases. In terms of computational efficiency, comparing Eq. 2 with Eq. 4, it is observed that, since the KLD does not require the calculation of the marginal distri-
Fig. 4. T1-T2 registration performance analysis. T2 image was shifted horizontally. However, only samples which fell in the overlapping region of the two images were included in the calculations.
butions P_o^T(i1) and P_o^T(i2), it can be more computationally efficient than MI. From Eq. 4, the efficiency of ML is directly proportional to the number of samples drawn. On the other hand, the efficiency of the KLD is directly proportional to the product of the numbers of bins B1 and B2 partitioning I1 and I2 respectively. As such, the efficiencies of ML and KLD are related to different parameters, and their comparison is parameter dependent.
DSA - MRA (2D-3D) Registration
The proposed method was applied to 2D-3D registration problems and tested in two clinical datasets, which were acquired at the Department of Radiology, Brigham and Women’s Hospital, Boston, USA. Each dataset consists of a pre-interventional 3D magnetic resonance angiographic (MRA) image volume (256 × 256 × 60 voxels and 0.78 × 0.78 × 1.3mm3 ), and a 2D digital subtraction angiogram (DSA) during the interventional treatments. Figs. 5a and 5d show the two cropped DSAs. The DSAs were distortion corrected using a distortion correction object with a uniform grid pattern [4]. A maximum intensity projection (MIP) of each MRA volume was generated using the projective geometry and ray casting method [8,11], in which there were six rigid body transformation parameters (three translational and three rotational). The initial transformations were obtained from the machine readings of the C-arm X-ray systems, as shown in Figs. 5d and 5h. For each dataset, the expected joint intensity distribution was estimated based on the segmented and corresponding structures from the novel DSA and the initial non-registered MIP. These structures consist of vessel and background regions, in which each region was defined by a manually selected intensity range for the two datasets (more advanced methods can be applied but they are not the focus of this paper). The expected distribution Pˆ was estimated by randomly drawing samples of the same structures from the DSA and MIP, as described in the Section 2.1. Then, the observed distribution PoT was generated during the registration and used to guide the rigid body transformation using the KLD measure, as defined in the Eqs. 2 and 3. The optimal transformation was searched using Powell’s method with a multi-resolution strategy, as described in Section
Multi-modal Image Registration by Minimising Kullback-Leibler Distance
a.
b.
c.
d.
e.
f.
g.
h.
531
Fig. 5. 2D-3D registration results: (a,e) digital subtraction angiograms (DSA) (vessels are black in colour), (b,f) final image alignments, maximum intensity projections (MIP) of the magnetic resonance angiographic (MRA) image volumes (vessels are white in color and their intensity is directly proportional to the flow speed), (c,g) segmented MIPs are overlaid on their corresponding DSAs and (d,h) initial image alignments.
2.3. Figs. 5b and 5f show the MIPs of the registered MRA volumes and the results are promising. Segmented vessel regions of the MIPs are overlaid on the corresponding DSAs, as shown in Figs. 5c and 5g. Note that the remaining discrepancy between the DSA and MIP may be caused by (a) some vessels that are visible in one image and are not visible in another image due to different vessel delineation properties in different acquisitions and different regions of interest selected, (b) signal loss in the MRA images (e.g. turbulent or eddy flow), or (c) the geometric distortion due to the MR gradient field nonlinearity.
4
Summary and Conclusions
In this paper, we have proposed a multi-modal image registration method based on the a priori knowledge of the expected joint intensity distribution estimated from the aligned training images. The difference between the expected and observed joint intensity distributions is measured by the Kullback-Leibler distance (KLD), which has non-zero and positive value when there is any discrepancy be-
532
A.C.S. Chung et al.
tween the two distributions. The KLD-based registration algorithm guides the transformations by minimising the KLD value until the two datasets are aligned. The results based on T1-T2 (3D-3D) registration experiments show that, as compared with the Maximum log-Likelihood (ML) based registration method, the KLD-based registration algorithm is less dependent on the size of sampling region. In DSA-MRA (2D-3D) registration experiments, we have shown that the expected joint intensity distribution can also be estimated based on the segmented and corresponding structures (vessel and background regions) from the novel DSA and the initial non-registered MIP. The DSA-MRA registration results are promising and demonstrate the applicability of our method in 2D3D registration. Future work will include a further validation of the proposed algorithm by applying it to a large number of datasets.
Acknowledgements We would like to thank K. Rhode and D. Hawkes at Guy’s Hospital, London, U.K. for sharing the DSA image distortion correction software. W. M. Wells III would like to acknowledge support from the NSF ERC grant (JHU Agreement #8810-274) and the NIH (grant #1P41RR13218).
References 1. C.M. Bishop. Neural Networks for Pattern Recognition. Oxford U. Press, 1995. 2. D.L. Collins, A.P. Zijdenbos, and et al. Design and Construction of a Realistic Digital Brain Phantom. IEEE Trans. Med. Img., 17(3):463–468, 1998. 3. T.M. Cover and J.A. Thomas. Elements of Information Theory. John Wiley & Sons, Inc., 1991. 4. P. Haaker, E. Klotz, and et al. Real-time distortion correction of digital X-ray II/TV-systems: an application example for digital flashing tomosynthesis (DFTS). International Journal of Cardiac Imaging, 6(1):39–45, 1990-91. 5. S. Kullback. Information Theory and Statistics. Dover Publications, Inc., 1968. 6. M.E. Leventon and W.E.L. Grimson. Multi-Modal Volume Registration Using Joint Intensity Distributions. In MICCAI, pages 1057–1066, 1998. 7. F. Maes, A. Collignon, and et al. Multimodality Image Registration by Maximization of Mutual Information. IEEE Trans. Med. Img., 16(2):187–198, 1997. 8. G.P. Penney, J. Weese, and et al. A Comparison of Similarity Measures for Use in 2D-3D Medical Image Registration. IEEE Trans. Med. Img., 17(4):586–595, 1998. 9. W.H. Press, S.A. Teukolsky, and et al. Numerical Recipes in C, 2nd Edition. Cambridge University Press, 1992. 10. W.M. Wells, P. Viola, and et al. Multi-Modal Volume Registration by Maximization of Mutual Information. Medical Image Analysis, 1(1):35–51, 1996. 11. L. Z¨ ollei. 2D-3D Rigid-Body Registration of X-Ray Fluoroscopy and CT Images. MIT Masters Dissertation, 2001.
Cortical Surface Registration Using Texture Mapped Point Clouds and Mutual Information Tuhin K. Sinha, David M. Cash, Robert J. Weil, Robert L. Galloway, and Michael I. Miga Vanderbilt University, Nashville TN 37235, USA {tk.sinha,dave.cash,michael.i.miga}@vanderbilt.edu
[email protected] http://bmlweb.vuse.vanderbilt.edu
Abstract. An inter-modality registration algorithm that uses textured point clouds and mutual information is presented within the context of a new physical-space to image-space registration technique for imageguided neurosurgery. The approach uses a laser range scanner that acquires textured geometric data of the brain surface intraoperatively and registers the data to grayscale encoded surfaces of the brain extracted from gadolinium enhanced MR tomograms. Intra-modality as well as inter-modality registration simulations are presented to evaluate the new framework. The results demonstrate alignment accuracies on the order of the resolution of the scanned surfaces (i.e. submillimetric). In addition, data are presented from laser scanning a brain’s surface during surgery. The results reported support this approach as a new means for registration and tracking of the brain surface during surgery.
1
Introduction
Understanding the geometric characteristics and the impact of intraoperative surgical events upon the cortical brain surface has important implications in the development of image-guided surgery (IGS) systems. In recent studies [1], the need for brain shift compensation strategies to prevent compromising IGS navigation has become an important area of research [2]. When using a computational approach to correct for brain shift [3], capturing the geometric and visual changes of the brain surface due to deformation may be a valuable source of intra-operative data. To achieve this end, a laser range scanning system capable of capturing textured surfaces with sub-millimetric accuracy will be used. Using features from the cortical surface to register does have precedent. Nakajima et al. demonstrated an average of 2.3 ± 1.3 mm fiducial registration error (FRE) using cortical vessels for registration [4]. More recently, Nimsky et al. reported a deformable surface approach to quantify surface shifts using a variation on the iterative closest point (ICP) algorithm [1]. Also, some preliminary work utilizing a scanning based system for cortical surface registration has been reported but a systematic evaluation has not been performed to date [5]. The novelty of the approach reported here is that both vessel information and three-dimensional T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 533–540, 2002. c Springer-Verlag Berlin Heidelberg 2002
534
T.K. Sinha et al.
topography will be used as the basis of alignment. Furthermore, the scanner provides a highly accurate method for tracking the brain surface that can be used in the model-updating framework. As an initial step, an implementation has been developed using an iterative closest point (ICP) [6] framework with mutual information (MI) [7]. Although ICP and MI have been used extensively [8][9], previously published registration frameworks do not entirely apply to the unique data provided by the scanner or this particular registration approach. The data acquired by the scanner provides a one-to-one correspondence between contour point and image intensity. However, intensity correspondence between a three-dimensional MR surface and an intraoperatively acquired laser-scanned cortical surface is somewhat more elusive. The most similar work relating to this registration framework is that by Johnson and Kang [10] in which these investigators used an objective function for registration based on a combined Euclidean distance and color difference metric. Used primarily in a landscape alignment application, this technique would not be amenable to the alignment process here, since the intensity distribution between scanner and MR image data is fundamentally very different. To our knowledge, no registration algorithm has been developed that will register textured three-dimensional surfaces from two different imaging modalities within the context of cortical surface registration.
2
Methods
In the realization of this approach, a laser range scanning system (RealScan 3D, 3D Digital Corporation, Danbury, CT) capable of capturing three-dimensional textured surfaces to sub-millimeter accuracy has been utilized (see Figure 1). The scanner is lightweight, compact and has a standard tripod mount. The scanning field consists of 500 horizontal by 494 vertical points per scan and is accomplished in approximately 5 seconds. Extensive calibration and characterization has been performed by Cash et al. and has demonstrated the fidelity at which surface data can be acquired [11]. Additionally, the device is approved for use in neurosurgery by the Vanderbilt University Medical Center Institutional Review Board.
Fig. 1. Laser scanner used to acquire textured point clouds.
The registration framework involves two primary steps in its execution. The first step involves acquisition and preparation of the registration surfaces. With
Cortical Surface Registration
535
respect to laser scanned surfaces, the scanner is currently placed approximately 1-2 feet from the surface of interest (achieved either by passive arm or monopod for intraoperative use). The horizontal range of the scanner is established and a vertical laser stripe passes over the surface in approximately 5 seconds. The data acquired consists of a three-dimensional point cloud with each Cartesian coordinate color-encoded via texture mapping into a digital image that is acquired just after scanning. The texture-space to scanner-space registration is calibrated by the manufacturer. The MR-generated point cloud is prepared by segmenting the brain volume, followed by ray-casting to find surface points, and averaging subsequent voxels to generate gray-scale values for each surface point (Analyze AVW - Biomedical Imaging Resource). The final step in our approach is to perform surface registration using a twostage process. An iterative closest point (ICP) algorithm is performed initially to align the point clouds of interest (i.e. laser-scanned surface and/or MR surface). The second stage is a constrained intensity-based registration. The constraint requires the alignment transformation to only operate in spherical coordinates with known radius R; the radius is provided by sphere-fitting the target surface [12]. By enforcing this restriction on the transformation, the degrees of geometric freedom are reduced from six to three, i.e. elevation φ, azimuthal θ, and roll ψ. For the method of intensity-based registration, a maximization of normalized mutual information (NMI) [13] approach is conducted using Powell’s optimization algorithm [14]. Referred to as Surface MI in this work, the method aligns textured surfaces only and does not use volumetric image data. The results presented here do not reflect true cross-modality registration (i.e. scanner to MR).
3
Registration Experiments
To evaluate robustness and accuracy of Surface MI, an initial series of experiments was conducted using a spherical phantom with a heterogenous intensity pattern on the surface. The range scanned surface acquired for registration experiments occupied a solid angle of Ω = 1.2π steradians1 and contained 67257 points (see Figure 2). A known transformation was then applied to the target surface to generate the floating surface. The l imits for elevation, azimuthal and roll angle perturbations were ±13, ±13, and ±25 degrees, respectively (the radius of the spherical phantom was approximately 110 mm). The floating and target surfaces are then re-registered using Surface MI. Five hundred randomly distributed combinations of φ, θ, and ψ were tested for registration accuracy. The second series of experiments employed the point clouds generated from surface projections of the MR volume. The target surface that was generated using a clipping plane had a solid angle of approximately Ω = .38533π steradians and contained 48429 points (see Figure 3). Similar to the spherical phantom experiments, perturbations in φ, θ, and ψ were applied to the MR surface over 500 trials. The range for the parameters φ, θ, and ψ were the same as those for t he previous experiment with similar radius (R=105 mm). 1
The solid angle of a unit sphere Ω = 4π steradians.
536
T.K. Sinha et al.
Fig. 2. Sample textured point cloud generated using a laser range scanner.
Fig. 3. Sample textured point cloud generated using surface projection on a gadolinium enhanced MR volume.
Fig. 4. Use of a clipping plane to select a region of interest in the surface projection.
The last series of experiments evaluated the efficacy of the developed algorithm in registering surfaces across modalities. Inter-modality surfaces were simulated by inverting the texture of the point cloud. Five hundred trials registering a texture-inverted region of interest (ROI) to the original MR brain surface were performed with initial misregistrations comparable to the spherical phantom experiments. The ROIs were generated by varying the normal of the clipping plane used to create the target sur face between ±0.1 cm in the sagittal and coronal axis while holding the axial value at 1 cm (see Figure 4). To create the misregistration between the float and target surface, each surface was re-centered about it’s geometric centroid.
Cortical Surface Registration
4
537
Registration Results and Discussion
Since the same scan was used for both target and floating surfaces in the registrations process, the one-to-one correspondence in points was known. This allowed calculation of the mean target registration error (TRE) between point clouds as well as the global maximum for NMI. Sample registration results are presented for each experiment series (i.e. spherical phantom, intra-modality MR, simulated inter-modality MR) in Figure 5. In addition, a distribution of TREs for each series of experiments can be seen in Figure 6. Registration results from the 500 trials using the spherical phantom yielded a mean TRE of 11.38±28.75 mm (min.=0.04,max.=127.61 mm). Although this result is less than remarkable, it should be noted that 70% of the trials achieved a mean TRE of 0.20±0.05 mm (min.=0.04,max.=0.31 mm). Furthermore, the misalignment range during surgery is expected to be ±5 degrees within each angular coordinate. Within this range, the registration process achieved a 100% success rate (i.e. NMI optimization reached it’s global maximum). With respect to the intra-modality MR experiments, all 500 trials resulted in an ideal value of NMI. The mean TRE for the 500 trials was 0.14±0.04 mm (min.=0.04,max.=0.27 mm). The increased success rate of this series of experiments as compared to the previous trials is likely due to the differences in the geometric structure of the intensity information. Most of the intensity information of the spherical phantom is contained in the central region of the surface. In some cases, when the initial mis-registration of the spherical phantom caused sufficient non-overlap of the central area, the algorithm did not register the surfaces correctly. For the brain, the intensity pattern of the vessel structure occupies most of the surface. Thus, even though the brain’s surface occupies a smaller solid angle than that of the ball, the distribution of the intensity pattern allows the alignment of more severely misregistered surfaces. The last series of experiments simulating inter-modality registration generated a mean TRE of 3.38±7.18 mm (min.=0.07,max.=53.75 mm). Similar to the spherical phantom, 67% of these trials produced a mean TRE of 0.37±0.19 mm (min.=0.07,max.=1.00 mm). Analysis of the failed trials indicated that the spherical constraint prevented accurate registration. In general, the algorithm failed to register surfaces clipped from or containing the periphery of the surface projection, which contained a much higher surface curvature as compared to the target surface. This discrepancy in surface curvatures between target and floating surfaces caused the sub-optimal registrations. In general, the occurrence of curvature discrepancies intra-operatively will be limited since vessel landmarks will be used to provide an initial alignment for the Surface MI.
5
Conclusions and Future Work
The results of this paper show that the ICP and MI framework is a useful tool for cortical surface registration. Results of both intra- and inter-modality surface registration show sub-millimetric accuracies using a phantom. This paper outlines preliminary steps taken with the laser range scanner and the Surface
538
T.K. Sinha et al.
Fig. 5. Sample registration results. Top row, from left to right: on-axis view of misregistered and registered surfaces of the spherical phantom, off-axis view of misregistered and registered surfaces. Middle row: sample results of the intra-modality registration, presented similar to the top row. Bottom row from left to right: misregistered and registered surfaces from simulated inter-modality experiments.
MI algorithm. In vivo analysis of the registration results is currently in progress. Figure 7 shows intra-operative data of the cortical surface acquired by the laser range scanner. More quantitative studies of the laser range scanner and registration algorithm are also planned using an optical tracking system. Algorithmically, the ability to track and register cortical deformations is also being studied.
Acknowledgements The authors acknowledge Dr. Hill for his correspondence on MI. VTK (Kitware Inc.) and Analyze AVW (Mayo Clinic) provided software. A grateful acknowledgement to the VUMC Neurosurgical staff. This project is supported in part by the Vanderbilt University Discovery Grant Program.
Cortical Surface Registration
539
Fig. 6. Distribution of Target Registration Error (TRE) for each series of experiments.
Fig. 7. Example dataset taken with the laser range scanner in the operating room. Left, a CCD image of the surgical area. Right, a tessellated point cloud with texture mapped points on the right.
References 1. Nimsky, C., Ganslandt, O., Cerny, S., Hastreiter, P., Greiner, G., Fahlbusch, R.: Quantification of, visualization of, and compensation for brain shift using intraoperative magnetic resonance imaging. Neurosurgery 47 (2000) 2. Roberts, D., Miga, M., Hartov, A., Eisner, S., Lemery, J., Kennedy, F., Paulsen, K.: Intraoperatively updated neuroimaging using brain modeling and sparse data. Neurosurgery 45 (1999)
540
T.K. Sinha et al.
3. Miga, M., Paulsen, K., Lemery, J., Eisner, S., Hartov, A., Kennedy, F., Roberts, D.: Model-updated image guidance: Initial clinical experiences with gravity-induced brain deformation. IEEE: Trans. on Med. Img. 18 (1999) 4. Nakajima, S., H, H.A., Kikinis, R., Moriarty, T.M., Metcalf, D.C., Jolesz, F.A., Black, P.M.: Use of cortical surface vessel registration for image-guided neurosurgery. Neurosurgery 40 (1997) 5. Audette, M.A., Siddiqi, K., Peters, T.M.: Level-set surface segmentation and fast cortical range image tracking for computing intrasurgical deformations. LNCS: Med. Image Computing and Computer-Assisted Intervention 1679 (1999) 6. Besl, P.J., McKay, N.D.: A method for registration of 3-d shapes. IEEE Trans. on Pattern Analysis and Machine Intelligence 14 (1992) 7. Wells, W.M., Viola, P., Atsumi, H., Nakajima, S., Kikinis, R.: Multi-modal volume registration by maximization of mutual information. Med. Image Analysis 1 (1996) 35–51 8. Maes, F., Collignon, A., Vandermeulen, D., Marchal, G., Suetens, P.: Multimodality image registration by maximization of mutual information. IEEE Trans. on Med. Imag. 16 (1997) 187–198 9. Audette, M.A., Ferrie, F.P., Peters, T.M.: An algorithmic overview of surface registration techniques for med. imag.. Med. Image Analysis 4 (2000) 201–217 10. Johnson, A.E., Kang, S.B.: Registration and integration of textured 3d data. Image and Vision Computing 17 (1999) 135–147 11. Cash, D.M., Sinha, T.K., Chapman, W.C., Galloway, R.L., Miga, M.I.: Fast accurate surface acquisition using a laser scanner for image-guided surgery, SPIE: Med. Imag. 2002 (2002) 12. Ahn, S.J., Rauh, W., Warnecke, H.J.: Least-squares orthogonal distances fitting of circle, sphere, ellipse, hyperbola, and parabola. Pattern Recognition 34 (2001) 13. Studholme, C., Hill, D.L.G., Hawkes, D.J.: An overlap invariant entropy measure of 3d medical image alignment. Pattern Recognition 32 (1999) 71–86 14. Press, W.H., Teukolsky, S.A., Vetterling, W.T., Flannery, B.P.: Num. Rec. in C : The Art of Scientific Computing. Second edn. Cambridge University Press (1993)
A Viscous Fluid Model for Multimodal Non-rigid Image Registration Using Mutual Information E. D’Agostino, F. Maes , D. Vandermeulen, and P. Suetens Katholieke Universiteit Leuven Faculties of Medicine and Engineering Medical Image Computing (Radiology - ESAT/PSI) University Hospital Gasthuisberg, Herestraat 49, B-3000 Leuven, Belgium
[email protected] Abstract. We propose a multimodal free form registration algorithm based on maximization of mutual information. Images to be aligned are modeled as a viscous fluid that deforms under the influence of forces derived from the gradient of the mutual information registration criterion. Parzen windowing is used to estimate the joint intensity probability of the images to be matched. The method was verified by for registration of simulated T1-T1, T1-T2 and T1-PD images with known ground truth deformation. The results show that the root mean square difference being the recovered and the ground truth deformation is smaller than 1 voxel.
1
Introduction
Maximization of mutual information has been demonstrated to be a very general and reliable approach for affine registration of multimodal images of the same patient or from different patients, including atlas matching [7,9]. In applications where local morphological differences need to be quantified, affine registration is no longer sufficient and non-rigid registration (NRR) is required, aiming at finding a 3D vector field describing the deformation at each point. Applications for NRR include shape analysis (to warp all shapes to a standard space) and atlas-based segmentation (to compensate for gross morphological differences between atlas and study images). Different approaches have been proposed for extending the mutual information criterion to NRR. Spline-based approaches [8,6] can correct for gross shape differences, but a dense grid of control points is required to characterize the deformation at voxel level detail, implying high computational complexity. Block matching [4] or free-form approaches, using a non-parameterized expression for the deformation field, assign a local deformation vector to each voxel individually, but need appropriate constraints for spatial regularization of the resulting vector field. Elastic constraints are suitable when displacements can be assumed to be small, while for large magnitude deformations a viscous fluid model is more appropriate.
Frederik Maes is Postdoctoral Fellow of the Fund for Scientific Research - Flanders (FWO-Vlaanderen, Belgium).
T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 541–548, 2002. c Springer-Verlag Berlin Heidelberg 2002
542
E. D’Agostino et al.
Recently, a multimodal NRR algorithm was presented in [5], defining the forces driving the deformation at each voxel such that mutual information is maximized and using a regularization functional derived from linear elasticity theory. In this paper, we extend the approach of [5] by replacing the elastic model by the viscous fluid regularization model of Christensen et al. [3] and thus generalize the method of [3] to multimodal image registration based on maximization of mutual information. The Navier-Stokes equation modelling the viscous fluid is solved by iteratively updating the deformation field and convolving it with a Gaussian filter. The deformation field is regridded as needed during iterations as in [3] to assure that its Jacobian remains positive everywhere, such that the method can handle large deformations. We verified the robustness of the method by applying realistic known deformations to simulated multispectral MR images and evaluating the difference between the recovered and ground truth deformation fields in terms of displacement errors and of tissue classification errors when using the recovered deformation for atlas-based segmentation.
2 2.1
Method The Viscous Fluid Algorithm
We follow the approach of [3] to deform an template image F onto a target image G, using an Eulerian reference frame to represent the mapping T = x − u(x) of fixed voxel positions x in target space onto the corresponding positions x − u(x) in the original template space. The deforming template image is considered as a viscous fluid whose motion is governed by the Navier-Stokes equation of conservation of momentum. Using the same simplifications as in [3], this equation can be written as ∇2 v + ∇ (∇.v) + F (x, u) = 0 (1) with F (x, u) a force field acting at each position x that depends on the deformation u and that drives the deformation in the appropriate direction, and with v(x, t) the deformation velocity experienced by a particle at position x: ∂u ∂u du = + vi dt ∂t ∂xi i=1 3
v=
T
(2) T
with v = [v1 (x, t), v2 (x, t), v3 (x, t)] and u = [u1 (x, t), u2 (x, t), u3 (x, t)] . In section 2.2, we derive an expression for the force field F such that the viscous fluid flow maximizes mutual information between corresponding voxel intensities. When the forces are given, solving (1) yields deformation velocities, from which the deformation itself can be computed by integration over time. In [3] the Navier-Stokes equation is solved by Successive Over Relaxation (SOR), but this is a computationally expensive approach. Instead, we follow the approach of [2] and obtain the velocity field by convolution of the force field with a Gaussian kernel ψ: v =ψF (3)
A Viscous Fluid Model for Multimodal Non-rigid Image Registration
543
The displacement u(k+1) at iteration (k + 1) is then given by: u(k+1) = u(k) + R(k) .∆t
(4)
with R(k) the perturbation to the deformation field: (k) 3 (k) ∂u (k) (k) R =v − vi ∂xi i=1
(5)
The time step ∆t is constrained by ∆t ≤ max(R).∆u, with ∆u the maximal voxel displacement that is allowed in one iteration. To preserve the topology of the object image, the Jacobian of the deformation field should not become negative. When the Jacobian becomes anywhere smaller than some positive threshold, regridding of the deformed template image is applied as in [3] to generate a new template, setting the incremental displacement field to zero. The total deformation is the concatenation of the incremental deformation fields associated with each propagated template. 2.2
Force Field Definition
We define an expression for the force field F (x, u) in (1) such that the viscous fluid deformation strives at maximizing mutual information I(u) of corresponding voxel intensities between the deformed template image F(x − u) and the target image G(x). We adopt here the approach of [5] who derived an expression for the gradient ∇u I of I with respect to the deformation field u, modelling the F ,G (i1 , i2 ) of template and target images as a conjoint intensity distribution pu tinuous function using Parzen windowing. If the deformation field u is perturbed into u + h, variational calculus yields the first variation of I: F ,G pu+h (i1 , i2 ) ∂ ∂I(u + h) F ,G = (i1 , i2 ) log F di1 di2 p ∂ ∂ u+h p (i1 )pGu+h (i2 ) =0 =0 F ,G F ,G ∂pu+h (i1 , i2 ) (i1 , i2 ) pu = 1 + log F di1 di2 (6) ∂ p (i1 )pGu (i2 ) =0
The joint intensity probability is constructed from the domain of overlap V of both images (with volume V ), using the Parzen windowing kernel ψ(i1 , i2 ): 1 F ,G pu (i1 , i2 ) = ψ(i1 − F(x − u), i2 − G(x))dx (7) V V Inserting (7) in (6) and rearranging as in [5], yields ∂Lu 1 ∂I(u + h) ψ (F(x − u), G(x))∇F(x − u)h(x)dx (8) = ∂ V V ∂i1 =0 with Lu (i1 , i2 ) = 1 + log
F ,G (i1 , i2 ) pu F p (i1 )pGu (i2 )
(9)
544
E. D’Agostino et al.
Fig. 1. Left: T1 MPRAGE Patient image; Middle: CSF segmented using standard prior; Right: CSF segmented after non-rigid matching of the atlas. Table 1. Root mean square error ∆T in millimeter between ground thruth and recovered deformation fields within the brain region for different multimodal image combinations of BrainWeb simulated MR brain images at different noise levels. Case 1 2 3
T1/T1 T1/T2 T1/PD 0% 3% 7% 0% 3% 7% 3% 0.384 0.430 0.465 0.577 0.759 0.685 0.723 0.304 0.398 0.433 0.443 0.640 0.649 0.661 0.351 0.411 0.459 0.505 0.753 0.775 0.772
We therefore define the force field F at x to be equal to the gradient of I with respect to u(x), such that F drives the deformation to maximize I: ∂Lu 1 ψ (F(x − u), G(x))∇F(x − u) (10) F (x, u) = ∇u I = V ∂i1 2.3
Implementation Issues
The method was implemented in Matlab, with the image resampling and histogram computation coded in C. The histogram was computed using 128 bins for both template and target images. Parzen windowing was performed by convolution of the joint histogram with a 2D Gaussian kernel. The maximal displacement at each iteration ∆u was set to 0.3 voxels and regridding was performed when the Jacobian became smaller than 0.5. Iterations were continued as long as mutual information I(u) increased, with a maximum of 75 iterations. A multiresolution optimization strategy was adopted by smoothing and downsampling the images at 3 different levels of resolution, starting the process at the coarsest level and gradually increassing resolution as the method converged. Computation time for matching two images of size 128x128x80 is about 50 minutes.
A Viscous Fluid Model for Multimodal Non-rigid Image Registration
545
Table 2. Overlap coefficient for different tissue classes of tissue maps obtained with ground thruth and recovered deformation fields for different multimodal image combinations of BrainWeb simulated MR brain images. Noise level was 3% in each case. Case 1 2 3
3
T1/T1 WM GM CSF WM 0.9282 0.9179 0.8698 0.8579 0.9253 0.9277 0.8969 0.8595 0.9279 0.9260 0.8795 0.8460
T1/T2 T1/PD GM CSF WM GM CSF 0.8320 0.7645 0.8604 0.8454 0.7844 0.8463 0.7839 0.8564 0.8373 0.7818 0.8270 0.7579 0.8552 0.8028 0.7413
Experiments
The method was validated on simulated images generated by the BrainWeb MR simulator [1] with different noise levels. In all experiments the images were non-rigidly deformed by known deformation fields T ∗ . These were generated by using our method to match the T1 weighted BrainWeb image to real T1 weighted images of 3 periventricular leukomalacia patients, typically showing enlarged ventricles. We evaluate how well the recovered deformation T , obtained by matching the original T1 weighted BrainWeb image to the T1, T2 or proton density (PD) weighted images deformed by T ∗ , resembles the ground truth T ∗ . Both deformations were compared by their root mean square (RMS) error ∆T evaluated in millimeter over all brain voxels B:
1 ∆T = (|T (x) − T ∗ (x)|)2 (11) NB B
We also verified the impact of possible registration errors on atlas-based segmentation by comparing the (hard classified) tissue maps M and M ∗ , obtained by deforming the tissue maps of the original image using T and T ∗ respectively. We measure the difference between M and M ∗ by their overlap coefficient Oj (M, M ∗ ) for 3 tissue types j, white matter (WM), grey matter (GM) and cerebro-spinal fluid (CSF): Oj (M, M ∗ ) =
2Vj (M, M ∗ ) Vj (M ) + Vj (M ∗ )
(12)
with Vj (M, M ∗ ) the volume of the voxels that are assigned to class j in both maps and Vj (M ) and Vj (M ∗ ) the volume of the voxels assigned to class j in each map separately. Figure 1 shows the registration result of the BrainWeb T1 image to one of the patient images and the segmentation of CSF obtained using the method of [9] with affine and with our non-rigid atlas registration procedure. Note how the segmentation of the enlarged ventricles is much improved by using non-rigid atlas warping. Table 1 shows the RMS error ∆T computed for T1 to T1, T2 and PD registration of the BrainWeb images at different noise levels (each time identical for
546
E. D’Agostino et al.
Fig. 2. Left: Original BrainWeb T1 template; right: BrainWeb target image obtained by applying a known deformation; middle: template matched to target. Top: T1/T1 registration; middle: T1/T2; bottom: T1/PD.
object and target images), for 3 different ground truth deformations. All values are smaller than one voxel, with the most accurate results being obtained for T1/T1-matching. The overlap coefficients for WM, GM and CSF in the ground truth and recovered tissue maps are tabulated in table 2. The results are visualized in figure 2 and figure 3.
4
Discussion
We present an algorithm for non-rigid multimodal image registration using a viscous fluid model by defining a force field that drives the deformation such that mutual information of corresponding voxel intensities is maximized. Our method is in fact the merger of the mutual information based registration functional presented in [5] with the viscous fluid regularization scheme of [3]. The joint intensity probability of the images to be matched is estimated using Parzen windowing and is differentiable with respect to the deformation field. The size of the Parzen windowing kernel needs to be properly chosen such that the criterion is a more or less smooth function of the deformation field. This choice is related to the image noise. For all experiments described above, the same kernel was used, indepedently of the multispectral nature of the images. In the current implementation, the extension of the Parzen estimator is automatically computed using a leave k out cross validation technique maximizing an empirical likelihood of the marginal densities[10,11]. The impact of the Parzen windowing kernel on the registration process needs further investigation.
A Viscous Fluid Model for Multimodal Non-rigid Image Registration
547
Fig. 3. Misclassified WM (left), GM (middle) and CSF (right) voxels of recovered vs ground truth deformation using the results in figure 2. Top: T1/T1 registration; middle: T1/T2; bottom: T1/PD.
Another relevant implementation parameter is the time step ∆t or the maximal displacement ∆u allowed at each iteration that is specified to update the displacements after solving the Navier-Stokes equation. Selecting a larger value for ∆t will result in larger displacement steps and a more frequent regridding of the template as the Jacobian of the transformation is more likely to become non-positive. A smaller value of ∆t on the other hands implies a larger number of iterations for convergence. More experiments are needed to properly tune this parameter. We validated our algorithm using simulated T1, T2 and PD images from BrainWeb with different noise levels and different realistic ground truth deformations generated by registration of the simulated image with real patient images. Although the RMS error was found to be subvoxel small in all cases, T1/T1 registration gave more accurate results than T1/T2 or T1/PD registration. The contrast between gray and white matter especially is much better in T1 than in T2 or PD and the algorithm succeeds better at recovering the interface between both tissues in T1 than in T2 or PD. We also compared T1-to-T2 versus T2-toT1 registration and found that somewhat better results are obtained using T1 as the template image. This can be explained by the fact that the forces driving the registration depend on the gradient of the template image, which is better defined in T1 than in T2 at the interface between white and gray matter.
548
5
E. D’Agostino et al.
Conclusions
We have presented a multimodal free-from registration algorithm based on maximization of mutual information that models the images as a viscous fluid. The forces deforming the images are defined as the gradient of mutual information with respect to the deformation field, using Parzen windowing to estimate the joint intensity probability. We have validated our method for matching simulated T1-T1, T1-T2 and T1-PD images, showing that the method performs quite well in both mono and multi-modal conditions. Future work includes the introduction of more spatial information and more specific intensity models into the similarity criterion in order to make the registration more robust.
References 1. Available at http://www.bic.mni.mcgill.ca/brainweb/. 2. M. Bro-Nielsen, C. Gramkow. Fast Fluid Registration of Medical Images. Proc. Visualization in Biomedical Computing (VBC’96), Lecture Notes in Computer Science, vol. 1131, pp. 267-276, Springer, 1996. 3. G.E. Christensen, R.D. Rabitt, M.I. Miller. Deformable Templates Using Large Deformation Kinematics. IEEE Trans. Medical Imaging, 5(10):1435–1447, 1996. 4. T. Gaens, F. Maes, D. Vandermeulen, P. Suetens. Non-rigid multimodal image registration using mutual information. Proc. Medical Image Computing and Computer-Assisted Intervention (MICCAI98), Lecture Notes in Computer Science, vol. 1496, pp. 1099-1106, Springer, 1998. 5. G. Hermosillo, C. Chef d’Hotel, O. Faugeras. A Variational Approach to MultiModal Image Matching. INRIA Technical Report N. 4117, February 2001. 6. B. Likar, F. Pernus. A hierarchical approach to elastic registration based on mutual information. Image and Vision Computing, 19:33-44, 2000. 7. F. Maes, A. Collignon, D. Vandermeulen, G. Marchal, and P. Suetens. Multimodality image registration by maximization of mutual information. IEEE Trans. Medical Imaging, 16(4):187–198, 1997. 8. D. Rueckert, L.I. Sonoda, C. Hayes, D.L.G. Hill, M.O. Leach, D.J. Hawkes. Nonrigid registration using free-form deformation: application to breast MR images. IEEE Trans. Medical Imaging, 18(8):712–721, 1999. 9. K. Van Leemput, F. Maes, D. Vandermeulen, P. Suetens. Automated model-based tissue classification of MR images of the brain. IEEE Trans. Medical Imaging, 18(10):897–908, 1999. 10. G. Hermosillo Valadez. Variational methods for multimodal image matching. Doctoral Thesis, Universite de Nice, Sophia Antipolis, 138-141, 3 May 2002. 11. B. A. Turlach. Bandwidth selection in kernel density estimation: a review. Discussion Paper 9317, Institut de Statistique, UCL, Louvain La Neuve, 1993.
Non-rigid Registration with Use of Hardware-Based 3D B´ ezier Functions Grzegorz Soza1 , Michael Bauer1 , Peter Hastreiter1,2 , Christopher Nimsky2 , and G¨ unther Greiner1 1
2
Computer Graphics Group, University of Erlangen-Nuremberg Am Weichselgarten 9, 91058 Erlangen, Germany
[email protected] Neurocenter, Department of Neurosurgery, University of Erlangen-Nuremberg
Abstract. In this paper we introduce a new method for non-rigid voxelbased registration. In many medical applications there is a need to establish an alignment between two image datasets. Often a registration of a time-shifted medical image sequence with appearing deformation of soft tissue (e.g. pre- and intraoperative data) has to be conducted. Soft tissue deformations are usually highly non-linear. For the handling of this phenomenon and for obtaining an optimal non-linear alignment of respective datasets we transform one of them using 3D B´ezier functions, which provides some inherent smoothness as well as elasticity. In order to find the optimal transformation, many evaluations of this B´ezier function are necessary. In order to make the method more efficient, graphics hardware is extensively used. We applied our non-rigid algorithm successfully to MR brain images in several clinical cases and showed its value.
1
Introduction
Non-rigid registration and elastic warping of medical images have been addressed in numerous works. Bajcsy et al. [1] were first to demonstrate non-rigid registration of medical images. Generally, registration algorithms can be categorized into several different groups. The first group consists of pure voxel-based algorithms, where the computations are done analyzing only voxel grey-value information contained in the image datasets. The analysis is usually conducted according to some special similarity measures, like mutual information [14, 4], without any assumptions of external factors causing the deformation. Optical flow [7] and viscous fluid approaches [3] form another group. Further, there are physically-based methods, where the deformation of the soft tissue is described with physically motivated, mostly differential equations, that are discretized on a 3D grid and then approximately solved using finite element methods [5, 9]. In order to validate any of these registration algorithms, a precise quantification of occurring deformation is necessary [8]. The method we introduce is a novel voxel-based approach that combines elements of geometric transformations with computations done in graphics hardware in order to reduce computation time. Interpolation features of the hardware T. Dohi and R. Kikinis (Eds.): MICCAI 2002, LNCS 2489, pp. 549–556, 2002. c Springer-Verlag Berlin Heidelberg 2002
550
G. Soza et al.
are used in a special manner for the approximation of the deformation function with 3D piecewise linear patches. Our own experiences with rigid [6], non-rigid registration [13] and, generally, with appliance of graphics hardware [11] were extended in this work in order to allow for computation of Free-Form Deformations (FFD) [2, 12]. An optimal solution is searched for in the space of B´ezier functions, as they seem to be flexible enough and have some inherent elasticity, which makes them suitable for describing deformation of the brain tissue. The paper is divided into 4 sections. An introduction into the theory of B´ezier transformations is given in the Section 2. Subsequently, a general FreeForm Deformation approach and our hardware-based modification of the method are described. In Subsection 2.3 operations done in graphics hardware are explained. At the end of Section 2 the non-linear registration algorithm based on the modified FFD is described. In order to evaluate our registration algorithm we applied it to pre- and intraoperative MR images of the brain and summarized the results in Section 3 and 4.
2 2.1
Method Mathematical Background
Registration of medical images can be treated as a deformation of one of them in the way that the deformed image aligns with the reference image. We deform medical data using Free-Form Deformation (FFD). The idea is to warp the space surrounding an object that will be then warped implicitly. For the purpose of deformation of the space we take three-dimensional B´ezier functions, as they provide a mechanism for their modification and are characterized by intuitive behavior on their change. This kind of Free-Form Deformation contains inherent elasticity as well, which makes it a good choice for describing the movements of the soft tissue. Let us consider the object space OS parameterized with the function P : PS → OS leading from the parameter space PS being [0, 1]3 into this object space. The object space is associated with one of the datasets that will be transformed, the second dataset remains fixed. Let us introduce the deformation function D : PS → T S leading from the parameter space into the texture space T S (defined in the next section). We assume D is a B´ezier function, thus the shape of this deformation function is uniquely defined by the corresponding lattice of control points bi,j,k (i = 0, . . . , l, j = 0, . . . , m, k = 0, . . . , n) placed in the texture space. This deformation function can be expressed then as a trivariate tensor product: D(s, t, u) =
l m n
Bil (s)Bjm (t)Bkn (u)bi,j,k .
(1)
i=0 j=0 k=0
Movements of the control points bi,j,k in the lattice are followed by immediate changes in the form of the deformation function D. The basis functions Bil , Bjm , Bkn are Bernstein polynomials of order l, m and n, respectively.
Non-rigid Registration with Use of Hardware-Based 3D B´ezier Functions
2.2
551
Free-Form Deformation (FFD)
In order to accelerate the FFD we make extensive use of graphics hardware. In the first step the image volume from the object space is recomputed and loaded into the 3D texture memory of the graphics card, since we want to do the most expensive computations in the texture processing unit of graphics hardware. This image data is loaded into the texture memory only once, at the beginning. In order to perform texture mapping, the texture space T S being [0, 1]3 is associated with the texture memory. A single FFD of an object is divided into three steps. In the first step the object is embedded in the initial lattice of control points (the lattice lays in the texture space and the object lays physically in texture memory and in logical sense in the object space). In the next step control points are moved to their new locations in the texture space, thereby changing the control lattice. The movement is denoted by function M : T S → T S M(bx , by , bz ) = (tx , ty , tz ) .
(2)
In our approach we do not consider the absolute coordinates of the control points. We consider the free parameters of our B´ezier transformation to be offset vectors (tx , ty , tz ) from the initial control points positions (bx , by , bz ). Such a treatment allows us to view the occurring deformation as a change of a vector field that deforms an object placed within it, which is closer to the physical nature of the phenomenon. Initially the vector field is everywhere 0. For technical reasons the vector field is set to be 0 at the border during the whole registration process, thus only the inner control points of the B´ezier function are free for optimization. It is also well motivated in practice, because usually no deformation occurs at the boundary of a 3D image volume, as the interesting information is contained in the interior of medical images. After executing these steps, in classical FFD approaches [12] the new positions for every object point are explicitly calculated, based on the new locations of the control points. Instead, in our algorithm texture coordinates are computed with function D only for some uniform discrete sparse grid of points in the parameter space D(s, t, u) =
l m n
Bil (s)Bjm (t)Bkn (u)(bi,j,k + M(bi,j,k )) .
(3)
i=0 j=0 k=0
It should be mentioned, that this grid can be denser than the control lattice in order to get closer to the shape of the original 3D function. Having these texture coordinates on the sparse grid, we use them for approximation of the 3D B´ezier function with piecewise linear 3D patches. The motivation for such an approach is to make possible appliance of graphics hardware in order to optimize the execution of time consuming computations. An example presenting such an analogous approximation (only in 2D) is shown in Figure 1.
552
G. Soza et al.
1.0
1.0
1.0
1.0
0.75
0.75
0.75
0.75
0.5
0.5
0.5
0.5
0.25
0.25
0.25
0.25
0
0 0
0.25
0.5
a)
0.75
1.0
0 0
0.25
0.5
0.75
1.0
0 0
0.25
b)
0.5
c)
0.75
1.0
0
0.25
0.5
0.75
1.0
d)
Fig. 1. Subdivision of a slice into 2D piecewise linear patches. The B´ezier function is defined over a 3 × 3 lattice. Control point b1,1 was moved from its initial position (0.5,0.5) to (0.1,0.1), which resulted in D(0.5, 0.5) = (0.4, 0.4). a) Values from the image of function D on a uniform discrete grid 3 × 3. b) Resulting 2D piecewise linear subdivision of the slice. c) Values from the image of function D on a uniform 5 × 5 grid. d) Piecewise linear subdivision of the slice based on the values from c)
2.3
3D B´ ezier Function and 3D Textures
Based on the values of the function D on this sparse grid, the deformation is then propagated on the whole volume using trilinear interpolation. For accelerating this operation texture processing operations of graphics hardware are used. Using this approach less computational time is needed, as we do not need to process the whole 3D image voxel by voxel in software in order to obtain the new positions of the object points. For this purpose the uniform grid in the parameter space is sliced with planes parallel to one of the main axes in this space. The intersection points of the grid with the planes create a uniform quadrilateral structure in each slice. The number of resulting slices is equal to the resolution of the image volume in the direction perpendicular to the slices. For each slice a corresponding deformed slice in the texture space is computed (see Figure 2). Such a deformed slice consists of non-planar quadrilaterals whose vertices are defined by texture coordinates linearly interpolated from the values of function D on the sparse grid. For the purpose of rendering, the deformed texture coordinates are then assigned to their corresponding vertices in the parameter space. In order to avoid artifacts caused by an incorrect automatic triangulation in OpenGL the quadrilaterals are explicitly triangulated. Polygons are then rendered into the frame buffer. These polygons are texture mapped according to the computed texture coordinates and corresponding image information obtained after trilinear interpolation in graphics subsystem. 2.4
Registration with B´ ezier Functions
As an initial estimation for the non-rigid registration rigidly registered datasets are taken [6]. After the rigid registration also accelerated with graphics hardware, we consider one of the datasets, load it into the texture memory and embed it in a lattice of control points, thereby creating a structure allowing intuitive deformation of this image data. Initially the lattice has the form of a uniform
Non-rigid Registration with Use of Hardware-Based 3D B´ezier Functions
553
texture coordinates additional vertices for explicit triangulation
a)
b)
Fig. 2. Explicitly triangulated slice from the parameter space with corresponding texture coordinates: a) initially and b) after transformation of the texture coordinates
parallelepiped. The main idea of the non-rigid registration is to manipulate free control points in the lattice in such a way that the volume deformed with FFD (as described in Section 2.2) aligns with the reference volume. This makes the registration tantamount to a multidimensional optimization problem. The quality of the alignment is assessed based on mutual information. For the purpose of optimization Powell’s direction set method [10] is used. As we consider the occurring deformation as a deformation of a vector field, the degrees of freedom during the optimization are the translation vectors from the initial positions of the inner control points in the lattice. The control points on the lattice boundary remain fixed during the registration. In each optimization step a coordinate in one dimension of only one control point is changed, and the new volume obtained with FFD is computed. The procedure continues until the similarity measure computed between the deformed volume and the reference dataset reaches its optimum within a desired precision.
3
Results
We validated the algorithm in 7 clinical cases of patients with brain tumors. The experiments were carried out with pairs of MR T1-weighted scans of the head acquired before and during surgery on an open skull at the Department of Neurosurgery of the University of Erlangen-Nuremberg. All the scans were done with a Siemens Magnetom Open 0.2 Tesla scanner with resolution of 256 × 256 × 112 voxels and voxel size of 0.97 mm × 0.97 mm × 1.5 mm. Note the difference between pre- and intraoperative MR images, although the same pulse sequence was applied for both data. This is due to a special coil used for taking intraoperative images and to artifacts resulting from the operating environment. In all cases a significant brain shift effect had occurred. This phenomenon is
554
G. Soza et al.
influenced by a variety of factors, like gravity and leakage of cerebrospinal fluid. The effect was compensated for with our non-linear registration procedure. Each pair of the datasets was firstly registered rigidly and after that registered nonlinearly with our method, as already described. The experiments with the non-rigid registration method were conducted with a control lattice of 5 × 5 × 5 control points. However, in order to better approximate the corresponding B´ezier function, the function was sampled on a more dense grid of 9 × 9 × 9 points. This divided the object space uniformly into parallelepipedes of 3.12 cm × 3.12 cm × 2.40 cm. For achieving acceleration of the execution time we experimented also with downsampled data. The downsampling was done completely in hardware, therefore its computational cost was very low. The results so obtained were almost identical to the ones where original data was used, however, a significant acceleration was achieved. An average non-linear registration lasted between 6 and 7 minutes for one dataset. We would like to mention that no expensive and special hardware was needed for the computations. All computations can be done on a PC equipped with one of the commonly available graphics cards which support 3D texture mapping. Our experiments were executed on a PC with AMD Athlon 1.2 GHz processor and GeForce3 64MB graphics card. After the datasets had been successfully registered, the results were inspected visually by neurosurgeons. A good quality of the registration was observed, above all in the region of the brain surface (cortex), as presented in Figure 3. However, in the vicinity of the ventricles some small artifacts were seen. This could be explained with a quite sparse lattice of control points taken for the registration (5 × 5 × 5). This can be compensated for with a denser lattice of control points. However, as a trade it would result in a higher number of free parameters in the optimization and consequently longer computation times. Finally, we did a quantitative assessment of our algorithm to determine more precisely the quality of the method. The decisive evaluation criterion was the maximal extent of the brain shift measured at the cortex. We considered the magnitude of the brain shift (in mm) after a rigid registration only and after deforming the preoperative image with our approach in a non-linear way. The summarized results of the comparison are collected in Table 1. We can see from the table that the registration algorithm could compensate for the brain shift phenomenon with satisfying precision.
4
Conclusion
We presented a novel, non-linear registration approach based on Free-Form Deformation. In comparison to traditional approaches, the flexibility of B´ezier transformation is combined with the performance achieved applying graphics hardware in a special manner. Tests conducted with data from real patients showed the robustness and efficiency of the method. Despite of a poor contrast in the intraoperative images, the algorithm could correctly match them to the preoperative data in all cases.
Fig. 3. Results of rigid and non-linear registration of pre- and intraoperative MR scans of the brain. Top: axial, bottom: sagittal view. a) A preoperative slice image. b) The corresponding rigidly registered slice of the intraoperative image superimposed with the contours of the brain extracted from the preoperative scan. c) Slice from the non-linearly deformed image a). d) Slice of the intraoperative image overlaid with the contours of the brain from the deformed image.

Table 1. Quantitative results of the experiments

Patient No  Age (gender)  Diagnosis           Location of the tumor  Max shift at the cortex
                                                                     rigid      non-linear
1           50 (F)        astrocytoma WHO II  right ventricle        10.97 mm   1.80 mm
2           38 (F)        cavernoma           frontal                 9.47 mm   1.74 mm
3           67 (F)        metastasis          left frontal            7.11 mm   1.26 mm
4           59 (F)        glioblastoma        left frontal            6.67 mm   1.28 mm
5           54 (M)        glioblastoma        left temporal          10.53 mm   2.13 mm
6           53 (F)        metastasis          left frontal            7.26 mm   1.88 mm
7           50 (F)        metastasis          left frontal            8.02 mm   1.97 mm
For our experiments we segmented the brain in a semi-automatic way, since only the soft tissue undergoes deformation while the skull remains rigid. This preprocessing costs some time and is certainly a constraint of the procedure; however, numerous methods exist that allow a completely automatic segmentation of the brain. In this work we concentrated exclusively on the registration method, which in the presented approach is completely automatic and computationally very efficient. Moreover, the method is very flexible and can be used for the registration of images of different modalities, since it is based on a statistical similarity measure, namely mutual information. The method has been used to carry out successful experiments with the registration of CT and MR data.
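As background for this multi-modality claim, here is a minimal sketch (our own Python illustration, not the authors' implementation) of mutual information computed from a joint intensity histogram; real registration code would add interpolation and background masking:

```python
import numpy as np

def mutual_information(a, b, bins=64):
    """Mutual information of two images from their joint histogram."""
    joint, _, _ = np.histogram2d(a.ravel(), b.ravel(), bins=bins)
    pxy = joint / joint.sum()
    px = pxy.sum(axis=1, keepdims=True)
    py = pxy.sum(axis=0, keepdims=True)
    nz = pxy > 0                                   # avoid log(0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))
```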
Acknowledgments. We gratefully acknowledge the help of Joel Heersink in proofreading this paper. This work was funded by the Deutsche Forschungsgemeinschaft in the context of the project Gr 796/2-1.
Brownian Warps: A Least Committed Prior for Non-rigid Registration

M. Nielsen (1), P. Johansen (2), A.D. Jackson (3), and B. Lautrup (3)

1 IT U., Copenhagen, Denmark
2 DIKU, Copenhagen, Denmark
3 NBI, Copenhagen, Denmark
Abstract. Non-rigid registration requires a smoothness or regularization term for making the warp field regular. Standard models in use here include B-splines and thin-plate splines. In this paper, we suggest a regularizer which is based on first principles, is symmetric with respect to source and destination, and fulfills a natural semi-group property for warps. We construct the regularizer from a distribution on warps. This distribution arises as the limiting distribution for concatenations of warps, just as the Gaussian distribution arises as the limiting distribution for the addition of numbers. Through an Euler-Lagrange formulation, algorithms for obtaining maximum likelihood registrations are constructed. The technique is demonstrated using 2D examples.
1
Introduction
In any non-rigid registration algorithm, one must weigh the data confidence against the complexity of the warp field mapping the source image geometrically into the destination image. This is typically done through spring terms in elastic registration [3,8,7], through the viscosity term in fluid registration [5] or by controlling the number of spline parameters in spline-based non-rigid registration [1,20]. If non-rigid registration algorithms, symmetric in source and destination, can be constructed, many problems in shape averaging and shape distribution estimation can be avoided. The regularizer is not symmetric with respect to source and destination in the methods mentioned above. While symmetric regularizers can be constructed in most cases simply by adding a term for the inverse registration [6], this solution is not theoretically satisfactory. The aim of the present paper is to construct a regularizer that exhibits this symmetry inherently and that is also least committed in the same way as the Gaussian distribution is for the addition of numbers. This will be made precise later. Section 2 gives more concise definitions, motivation, and states the principles of the problem. Section 3 contains the solution to the problem formulated in Section 2 and some of its properties. Section 4 describes a gradient descent method for finding the optimal warp given a set of landmark matches. In this way, this paper offers both theoretical considerations and their application. Subsequent development will demonstrate this on real medical data.
2
Definitions and Motivation
A non-rigid registration may be modeled by a warp field $W : \mathbb{R}^D \to \mathbb{R}^D$ mapping points in one D-dimensional image into another D-dimensional image. We give the definition:

Definition 1 (Warp Field). A warp field $W(x) : \mathbb{R}^D \to \mathbb{R}^D$ maps all points in the source image $I_S(x) : \mathbb{R}^D \to \mathbb{R}$ into points of the destination image $I_D(x) : \mathbb{R}^D \to \mathbb{R}$ such that $I_S(W(x))$ is the registered source image. $W$ is invertible and differentiable (i.e., a diffeomorphism) and has everywhere a positive Jacobian $\det(\partial_{x_i} W^j)$.

Here, we have made the assumption that non-rigid registrations are invertible and differentiable. This seems valid in cases where images are created from similar structures. In some cases, such as separated bone fractures, this conjecture is not appropriate. However, in nearly all medical cases, a non-rigid registration is made on the basis of anatomical structures of identical topology, and the above definition will apply. A diffeomorphism will always have the same sign of the Jacobian everywhere. Our choice of positive Jacobian applies to those cases where the object is not geometrically mirrored.

The identification of a warp field on the basis of images is a matter of inference. Below we will apply the Bayes inference machine [13], but a similar formulation should appear when using information-theoretic approaches such as the minimum description length principle [17]. We wish to determine the warp field W that maximizes the posterior

$$p(W \mid I_S, I_D) = \frac{1}{Z}\, p(I_S, I_D \mid W)\, p(W) ,$$

where Z is a normalizing constant (sometimes denoted the partition function), $p(I_S, I_D \mid W)$ is the likelihood term, and $p(W)$ is the prior. The likelihood term is based on the similarity of the warped source and destination image and may, in this formulation, be based on landmark matches [4], feature matches [15,18], object matches [2], image correlation [15], or mutual information [21]. The subject of this paper is the prior $p(W)$, which expresses our belief in the regularity of the warp field prior to identifying the images. In specific medical applications, this may be based on active shape models [9,16]. However, to construct such models, homology must be created as an in-principle dense field, and the present work may also be used in this context. We wish the prior $p(W)$ to exhibit the specific properties that it is:
• derived from first principles,
• least committed,
• symmetric with respect to source and destination,
• invariant with respect to warps.
In the following section we will formalize these properties and derive a prior on warps that we will denote Brownian Warps, in analogy to Brownian motion.
3
Brownian Warps
We seek that distribution of warps which is the analogue of Brownian motion. We wish this distribution to be independent of warps performed earlier (i.e., invariant with respect to warps). This property is of fundamental importance particularly when determining the statistics of empirical warps, creating mean warps, etc. In such cases it is required by consistency in order to avoid the use of a fiducial pre-defined standard warp. We may formulate this as:

$$p(W = W_2 \circ W_1) = \int \delta(W_2 - W \circ W_1^{-1})\, p(W_2)\, dW_1 .$$

This corresponds to the semi-group property of Brownian motion: the distribution of positions after two moves corresponds to two independent moves and, through the central limit theorem, leads to a Gaussian distribution of positions. Since this also holds for a concatenation of many warps, we can construct a warp as

$$W_B = \lim_{N \to \infty} \mathop{\circ}_{i=0}^{N} W_i ,$$

where the $W_i$ are independent warps. This corresponds exactly to the definition of Brownian motion if the concatenation product is replaced by an ordinary sum. In order to find this limiting distribution when all $W_i$ are independent, we investigate motion in the neighborhood of a single point following along all the warps and make the following lemma:

Lemma 1 (Local structure). Let $J_W = \partial_{x_i} W^j$ be the local Jacobian of $W$. Then the Jacobian of a Brownian warp is

$$J_{W_B} = \lim_{N \to \infty} \prod_{i=0}^{N} J_{W_i} .$$

Proof. This is obviously true due to the chain rule of differentiation. □

Assume that an infinitesimal warp acts as the infinitesimal independent motion of points. In this case, all entries in the local Jacobian are independent and identically distributed. Hence, we may now model

$$J_{W_B} = \lim_{N \to \infty} \prod_{i=0}^{N} \left( I + \frac{\sigma}{\sqrt{N}} H_i \right) , \qquad (1)$$

where $H_i$ is a $D \times D$ matrix of independent identically distributed entries of unit spread. The denominator $\sqrt{N}$ is introduced to make the concatenation product finite, and σ is the spread or the “size” of the infinitesimal warps. To summarize, the limiting distribution of Eq. 1 is the distribution of the Jacobian of a Brownian Warp. In turn this defines the Brownian distribution on warps, as we have no reason to assume other structure in the distribution.
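The limiting distribution of Eq. 1 is easy to sample numerically. The following sketch (our own Python illustration; the paper does not give code, and the Gaussian choice for the entries of H_i is an assumption allowed by the unit-spread condition) draws one Monte Carlo sample of the 2D Brownian Jacobian:

```python
import numpy as np

def brownian_jacobian(sigma=0.5, n=10_000, rng=None):
    """One sample of the limiting Jacobian of Eq. (1): an ordered
    product of n factors (I + sigma/sqrt(n) * H_i) with i.i.d.
    unit-spread entries in each 2x2 matrix H_i."""
    rng = rng or np.random.default_rng()
    J = np.eye(2)
    s = sigma / np.sqrt(n)
    for _ in range(n):
        J = J @ (np.eye(2) + s * rng.standard_normal((2, 2)))
    return J
```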
Unfortunately, the solution to Eq. 1 is not given in the literature on random matrices. Gill and Johansen [10] solve the problem for matrices with positive entries, and Högnäs and Mukherjea [12] solve, among other cases, the situation where the matrices are symmetric. Recently, we have solved the case for two dimensions [14] and are presently considering the solution for three. Here, we present only the result:

Theorem 1 (2D Brownian Jacobian). The limiting distribution of Eq. 1, where the $H_i$ have independent entries of unit spread and $W : \mathbb{R}^2 \to \mathbb{R}^2$, is given as

$$p(J_{W_B}) = G(S/\sigma) \sum_{n=0}^{\infty} g_n(F/\sigma) \cos(n\theta) , \qquad (2)$$

where $G$ is the unit-spread Gaussian, the $g_n$ are related to the Jacobi functions, and the parameters are given as follows:

Scaling: $S = \log(\det(J_{W_B}))$
Skewness: $F = \dfrac{\|J_{W_B}\|_2^2}{2 \det(J_{W_B})}$
Rotation: $\theta = \arctan\left(\dfrac{j_{12} - j_{21}}{j_{11} + j_{22}}\right)$
It is shown in [14] that the limiting distribution does not depend on features of the infinitesimal distribution other than its spread, σ. This limiting distribution is thus least committed in the sense that it arises from the sole assumption of invariance under warps. The parameter σ may be viewed as a measure of rigidity or viscosity. The effects of the parameters are shown in Fig. 1.

Fig. 1. The independent action of the parameters on a unit square. Left to right: scaling (S ≈ 0.8, F = 1, θ = 0), skew (S = 0, F ≈ 2, θ = 0), and rotation (S = 0, F = 1, θ ≈ 0.5).
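The three parameters are direct functions of a 2 × 2 Jacobian. A minimal sketch (our own Python helper, with names assumed; arctan2 is used for numerical robustness of the rotation formula):

```python
import numpy as np

def warp_parameters(J):
    """Scaling S, skewness F, and rotation theta of a 2x2 Jacobian
    with positive determinant, following Theorem 1."""
    d = np.linalg.det(J)
    S = np.log(d)
    F = np.sum(J**2) / (2.0 * d)      # squared norm over 2 det(J)
    theta = np.arctan2(J[0, 1] - J[1, 0], J[0, 0] + J[1, 1])
    return S, F, theta
```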
Now, we prove that the above Brownian warp distribution is invertible and symmetric with respect to source and destination. Evidently, this is true by construction; one can simply invert the infinite multiplication sequence, since the final distribution depends only on the spread of the independent infinitesimal warps. Formally, however:
Theorem 2 (Invertibility). The distribution of warps given as spatially independent Jacobians, each distributed according to Eq. 2, has no folds with probability 1.

Proof. A fold implies that the local Jacobian of the warp is zero or less. The above distribution has a positive Jacobian with probability 1. □

Theorem 3 (Symmetry). The distribution of warps given as spatially independent Jacobians, each distributed according to Eq. 2, is invariant under inversion of the warp.

Proof. The inversion of a warp W → W⁻¹ makes the local Jacobians undergo an inversion, too: J → J⁻¹. Under this, the distribution parameters map as S → −S, F → F, and θ → −θ. Since the distribution is even in S and θ, it is unaltered under inversion of J. □

Theorem 4 (Euclidean invariance). The distribution of warps given as spatially independent Jacobians, each distributed according to Eq. 2, is invariant under Euclidean coordinate transformations of source and destination.

Proof. The individual Jacobians transform as J → RJR⁻¹, where R is a rotation matrix, under simultaneous and identical rotation and scaling of source and destination. This transformation leaves all three parameters S, F, θ invariant. □

One should notice that this invariance holds under simultaneous and identical scaling and rotation of source and destination. If one wants to incorporate independent similarity invariance, it is necessary to introduce uniformly distributed global bias parameters in S and θ, as done by Glasbey and Mardia [11]. For computational purposes it may be convenient to approximate the above distribution by a distribution which is also independent in F and θ. This can be done in many ways without losing the symmetry and Euclidean invariance. However, warp invariance will no longer hold exactly. We suggest the following approximation:

$$p(J) \approx G_\sigma(S)\, G_{\sigma/\sqrt{2}}(\theta)\, e^{-(F/\sigma)^{0.67}} , \qquad (3)$$

where $G_\sigma$ is a Gaussian of spread σ. This approximation has a relative error of less than 3% for all reasonable values of S, θ, F when σ > 0.4. In Figure 2 the joint distribution of F and θ is illustrated using this approximation and compared to the analytical expression approximated up to n = 14, for σ = 0.3, 0.6, 1.0. The primary oscillating error seen for small σ is due to the cut-off at n = 14 in the analytical expression.
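A sketch of the separable approximation (3), in log form and dropping J-independent normalisation constants (our own Python illustration, reusing warp_parameters from the earlier sketch):

```python
import numpy as np

def log_prior_approx(J, sigma=1.0):
    """Log of approximation (3) up to an additive constant:
    -S^2/(2 sigma^2) - theta^2/sigma^2 - (F/sigma)**0.67,
    where the theta term uses spread sigma/sqrt(2)."""
    S, F, theta = warp_parameters(J)
    return (-S**2 / (2.0 * sigma**2)
            - theta**2 / sigma**2
            - (F / sigma) ** 0.67)
```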
4
Implementation
In this section we show how the above distribution can be used for maximum a posteriori (MAP) estimation of the most probable warp given a set of landmark matches.
Fig. 2. The joint distribution of F and θ for the approximation (top) and pointwise relative difference to the analytical expression approximated up to n = 14 (below) for σ = 0.3, 0.6, 1.0.
We reformulate the MAP problem as an energy minimization approach by

$$E(W) = -\log p(W) + c = \int_\Omega \left( S^2 + 2\theta^2 + 2\sigma^{1.33} F^{0.67} \right) d\tilde{x} ,$$

where c is an arbitrary, irrelevant constant and $\tilde{x} = x \det(J)$ are integration variables invariant under the warp, chosen to ensure global as well as local warp invariance. Unfortunately, the related Euler-Lagrange equation is neither linear nor separable, and simple tricks such as eigenfunction expansions and derived linear splines are not possible. Therefore, we treat the energy minimization problem using a gradient back-projection scheme [19]:

$$\text{for } x \in \Omega : \quad \partial_t W = -\frac{\delta E}{\delta W} ,$$

where Ω is the image domain excluding the matched landmark points. This may easily be relaxed to matched curves without identified landmarks as in geometry-constrained diffusion [2]. We see from the energy formulation that the rigidity parameter determines the relative weight of the skewness term to the scaling and rotation terms. For illustration of the independent terms, see Fig. 3. For large deformations, the difference to spline-based methods becomes obvious, as for example thin plate splines can introduce folds in the warping (see Fig. 4).
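A discretized reading of this energy, as a sketch under assumptions (our own Python illustration; the per-cell Jacobian list and function name are not from the paper, and warp_parameters is reused from the earlier sketch):

```python
import numpy as np

def warp_energy(jacobians, sigma=1.0):
    """Discrete stand-in for E(W): sum of S^2 + 2*theta^2
    + 2*sigma**1.33 * F**0.67 over the per-cell Jacobians
    of the current warp estimate."""
    E = 0.0
    for J in jacobians:
        S, F, theta = warp_parameters(J)
        E += S**2 + 2.0 * theta**2 + 2.0 * sigma**1.33 * F**0.67
    return E
```

A gradient step ∂_t W = −δE/δW can then be approximated by finite differences of this energy with respect to the grid displacements, holding the matched landmark points fixed.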
5
Conclusion
We have introduced a prior for warps based on a simple invariance principle under warping. This distribution is the warp analogue of Brownian motion for additive actions.
Fig. 3. Illustration of the deformation of a regular grid. Two points in the center have been moved up and down respectively, while the corners are kept fixed. The scaling term (top left) aims at keeping the area constant. The skewness term (bottom left) aims at keeping the stretch equally large in all directions. Top right is a combination of scaling and skewness (σ = 1). Bottom right is a thin plate spline for comparison.
Fig. 4. Leftmost two images show large deformations: left is the maximum likelihood Brownian warp, right is a thin plate spline. Rightmost two images show two consecutive warps where the landmark motions are inverse: left is the Brownian warp, right is the thin plate spline. The Brownian warp does not give the exact inverse, due to numerical imprecision, but comes closer than the thin plate spline.
An estimation based on this prior guarantees an invertible, source–destination symmetric, and Euclidean-invariant warp. When computational time is of concern, approximations can be made which violate the basic warp invariance while maintaining invertibility, source–destination symmetry, and Euclidean invariance. We suggested one such approximation which is very close to the true distribution. For extremely fast implementations, we recommend an approximation including only the skewness term, as this has nice regularizing properties. We have shown computational examples on synthetic data. Future work includes applications to medical data, the development of algorithms using a mutual information data term, use as a null hypothesis in shape deformations, comparisons to flows in chaotic fluid dynamics, extensions to three dimensions, extensions to spatially higher-order correlated priors, and extensions to fractional Brownian warps.
References

1. A.A. Amini, R.W. Curwen, and J.C. Gore. Snakes and splines for tracking non-rigid heart motion. ECCV96, 1996, pp. II:251–261.
2. Per R. Andresen and Mads Nielsen. Non-rigid registration by geometry-constrained diffusion. Medical Image Analysis 6 (2000), 81–88.
3. R. Bajcsy and S. Kovacic. Multiresolution elastic matching. CVGIP 46 (1989), 1–21.
4. F.L. Bookstein. Morphometric tools for landmark data: Geometry and biology. Cambridge University Press, 1991.
5. M. Bro-Nielsen and C. Gramkow. Fast fluid registration of medical images. Proc. Visualization in Biomedical Computing (VBC'96) (1996), 267–276.
6. P. Cachier and D. Rey. Symmetrization of the non-rigid registration problem using inversion-invariant energies: Application to multiple sclerosis. MICCAI 2000 (Pittsburgh, Pennsylvania, USA) (A.M. DiGioia and S. Delp, eds.), Lecture Notes in Computer Science, vol. 1935, Springer, October 11-14, 2000.
7. G.E. Christensen and J. He. Consistent nonlinear elastic image registration. MMBIA01, 2001, pp. xx–yy.
8. G.E. Christensen, M.I. Miller, and M. Vannier. A 3D deformable magnetic resonance textbook based on elasticity. AAAI Spring Symposium Series (1994), 153–156, Stanford University.
9. T.F. Cootes, C.J. Taylor, D.H. Cooper, and J. Graham. Active shape models: Their training and application. CVIU 61 (1995), no. 1, 38–59.
10. R.D. Gill and S. Johansen. A survey of product-integration with a view toward application in survival analysis. The Annals of Statistics 18 (1990), no. 4, 1501–1555.
11. C.A. Glasbey and K.V. Mardia. A penalized likelihood approach to image warping. J. R. Statist. Soc. B 63 (2001), 465–514.
12. G. Högnäs and A. Mukherjea. Probability measures on semigroups. Plenum Press, 1995.
13. E.T. Jaynes. Probability theory: The logic of science. http://omega.albany.edu:8008/JaynesBook.html, fragmentary edition of June 1994.
14. B. Lautrup, A. Jackson, P. Johansen, and M. Nielsen. Random maps. Tech. report, Niels Bohr Institute, University of Copenhagen, 2001.
15. J. Maintz and M. Viergever. A survey of medical image registration. 1998.
16. B.S. Morse, S.M. Pizer, and A. Liu. Multiscale medial analysis of medical images. IVC 12 (1994), no. 6, 327–338.
17. Jorma Rissanen. Stochastic complexity in statistical inquiry. World Scientific Publishing Company, Singapore, 1989.
18. K. Rohr. Landmark-based image analysis: Using geometric and intensity models. Kluwer, 2001.
19. J.B. Rosen. The gradient projection method for nonlinear programming. Part I: Linear constraints. SIAM 8 (1960), no. 1, 181–217.
20. D. Rueckert, L.I. Sonoda, C. Hayes, D.L.G. Hill, M.O. Leach, and D.J. Hawkes. Nonrigid registration using free-form deformations: Application to breast MR images. MedImg 18 (1999), no. 8, 712–721.
21. P.A. Viola and W.M. Wells III. Alignment by maximization of mutual information. Ph.D. thesis, 1995.
Using Points and Surfaces to Improve Voxel-Based Non-rigid Registration

T. Hartkens, D.L.G. Hill, A.D. Castellano-Smith, D.J. Hawkes (1), C.R. Maurer Jr. (2), A.J. Martin (3), W.A. Hall, H. Liu, and C.L. Truwit (4)

1 Computational Imaging Sciences Group, Guy's Hospital, King's College London, UK
2 Department of Neurosurgery, Stanford University, Stanford, CA
3 Department of Radiology, University of California San Francisco, CA
4 Departments of Radiology and Surgery, University of Minnesota, Minneapolis, MN

Abstract. Voxel-based non-rigid registration algorithms have been successfully applied to a wide range of image types. However, in some cases the registration of quite different images, e.g. pre- and post-resection images, can fail because of a lack of voxel intensity correspondences. One solution is to introduce feature information into the voxel-based registration algorithms in order to incorporate higher-level information about the expected deformation. We illustrate, using one voxel-based registration algorithm, that the incorporation of features yields considerable improvement of the registration results in such cases.
1
Introduction
An increasing number of studies focuses on detecting temporal anatomical changes in the brain by non-rigidly registering tomographic images, e.g. [2][15][14]. The resulting deformation field is used to quantify the volume change or the displacement of the tissue. The applied registration algorithms can be divided into feature-based approaches, which use point, curve, and surface information to drive the registration, and voxel-based approaches, which operate directly on the image intensities and define a voxel similarity measure to compare the images. While feature-based registration algorithms can reliably align anatomical boundaries and therefore quantify the change of certain anatomical structures with high precision, voxel-based registration algorithms use the intensities throughout the whole images and therefore yield deformation values based on the image content even in regions where it is difficult to detect distinct features. We demonstrate in this paper that incorporating feature information into a voxel-based registration algorithm combines the advantages of both approaches. The voxel-based non-rigid registration algorithm with which we have the greatest experience works well on a wide variety of data, including pre- and post-contrast MR mammograms [11], serial MR images of the brain [7], and pre- and post-resection brain images [6]. However, the algorithm sometimes fails, especially when there are large changes between the images. In particular, in a series of 24 pre- and post-resection MR images of the brain, we found that 3 clearly failed on visual inspection to align corresponding features. The technique we propose enables the user to optionally provide additional constraints to improve algorithm accuracy using semi-automatic point and surface delineation. We envisage that this approach could be used either on images which are expected
to cause the algorithm difficulty, or could be used in cases where visual inspection shows the voxel-driven algorithm to have failed. Anatomical 3D landmarks and partial brain surfaces, respectively, are semi-automatically determined in regions where large deformations occurred. In comparison to an automatic approach, a semi-automatic approach has the advantage that the user can control the results and can introduce higher-order information about the expected deformation. The feature information is combined with a voxel-based non-rigid registration algorithm. The extended registration algorithm is applied to 3D image pairs with substantial differences, which are therefore difficult to register. We demonstrate on three example images that neither the purely voxel-based registration algorithm nor feature information alone yields satisfactory results, and that the combination of feature and voxel information improves the registration results considerably.

Previously, Johnson and Christensen [8] presented an intensity-based consistent thin-plate spline algorithm for 2D images using manually detected corresponding point landmarks and image intensities. In [1] an intensity-based thin-plate spline algorithm is applied to 3D images, matching curves which represent cortical sulci. Similar features are incorporated into an optical flow algorithm using squared differences in [5]. Another paper [12] introduces an attribute vector for each voxel which contains the intensity, edge type, and a set of geometric moment invariants. None of these papers presents results on images with substantial changes, for example images where the deformation is neither one-to-one nor continuous. Additionally, they consider squared differences of the voxel intensities as the similarity measure, which implies the same modality for the reference and source image, whereas we use Normalised Mutual Information.
2
Methodology
The non-rigid registration algorithm deforms a regular grid of control points in the reference image by moving the control points, while tissue motion is described by free-form deformation using B-spline interpolation between the control points [11]. Normalised Mutual Information (NMI) is used as the voxel-based similarity measure $C_{NMI}$ [7], which does not restrict the images to be from the same modality. The purely voxel-based algorithm was previously evaluated for the registration of 3D breast MR images [3], and it has been shown that the displacement vectors determined in deformed brain MR image data correspond in general well with manually determined measurements [6]. However, the algorithm sometimes fails to align anatomical structures correctly due to a lack of voxel correspondence near the changes; for example, the true deformation may be neither one-to-one nor continuous. Our new algorithm allows the user to overcome these difficulties by incorporating point or surface information in a semi-automated way.
2.1
Corresponding Point Landmarks
As corresponding points we consider point landmarks in the images, i.e., prominent points where the surface of anatomical structures is strongly curved, e.g. the tip of the frontal horn of the ventricular system within the human brain. Usually, such 3D landmarks are selected manually – a task which is tedious, time-consuming, and often lacks accuracy.
Fig. 1. (a) shows one corresponding point pair in the MNI atlas (left) and in the subject image (right). Altogether six 3D landmarks were detected in each image semi-automatically using the 3D operator (not displayed because they are in other slices). (b) shows the pre- (left) and post-interventional (right) MR images overlaid with a contour illustrating the segmentation obtained from the Brain Extraction Tool [13]. Only a part of the 3D surface of the post-interventional image (reference image) is considered by the registration process (displayed as a bold contour in the right image). This partial surface is determined manually via a GUI and extracted in a region where the displacement is substantial.
The alternative we follow is a semi-automatic procedure for landmark extraction [9], which has the advantage that the user can interactively control the results. First, a coarse position of a certain landmark is determined manually. Second, a 3D operator is applied within a volume-of-interest (VOI) centred at the coarse position to detect potential landmark candidates. Third, the user selects the most promising candidate. The 3D operator used in our investigation is based on the matrix $C = \nabla g (\nabla g)^T$, with the gradient $\nabla g = (g_x, g_y, g_z)^T$ of the image intensity function $g$, and is defined as [10]:

$$k = \frac{\det(C)}{\operatorname{trace}(C)} .$$

The extrema of the operator responses are considered as detections of the operator. In [4] it has been shown that this operator can reliably detect features including the tips of the lateral ventricles. The negative mean distance of the corresponding landmarks is taken as the similarity measure

$$C_F = -\frac{1}{n} \sum_{i=1}^{n} \left\| p_i^r - p_i^s \right\| ,$$

where $(p_i^r, p_i^s)$ is the $i$-th corresponding point pair in the reference and source image.
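A minimal Python sketch of both quantities (our own illustration; the window size and smoothing choice are assumptions, since the pointwise outer product is rank 1 and C must be averaged locally before k is meaningful):

```python
import numpy as np
from scipy.ndimage import uniform_filter

def landmark_operator(volume, size=5):
    """k = det(C)/trace(C) with C = grad(g) grad(g)^T, where C is
    averaged over a local window; high k marks landmark candidates."""
    gx, gy, gz = np.gradient(volume.astype(float))
    g = np.stack([gx, gy, gz], axis=-1)            # (..., 3)
    C = g[..., :, None] * g[..., None, :]          # (..., 3, 3)
    for a in range(3):
        for b in range(3):
            C[..., a, b] = uniform_filter(C[..., a, b], size=size)
    tr = np.trace(C, axis1=-2, axis2=-1)
    return np.linalg.det(C) / (tr + 1e-12)

def landmark_similarity(p_ref, p_src):
    """C_F: negative mean Euclidean distance of corresponding landmarks."""
    return -np.mean(np.linalg.norm(p_ref - p_src, axis=1))
```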
2.2
Surfaces
In cases with substantial brain shift, a partial brain surface is used to introduce prior displacement information. First, the brain was segmented using the publicly available
Brain Extraction Tool (http://www.fmrib.ox.ac.uk/fsl) [13], and the marching cubes algorithm was applied to this segmentation to obtain a 3D brain surface representation. Second, a region of interest from this surface was extracted manually in the reference image (see Fig. 1), supported by a 3D graphical user interface. The selected surface might, for example, be in the region of the resection where sub-dural air is present. Based on randomly chosen points $p_i^r$ on this partial surface, their closest points $p_i^s$ on the source surface are determined and established as corresponding point pairs $(p_i^r, p_i^s)$. The similarity of these point pairs is calculated as described in the previous section for the point landmarks. In contrast to the point landmarks, these point pairs are not fixed during the registration process but are newly established in every iteration. This allows the surfaces to slide over each other without changing the similarity measure, analogous to the way the iterative closest point (ICP) algorithm works.
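A sketch of this sliding-correspondence similarity (our own Python illustration; the k-d tree is an implementation convenience, not prescribed by the paper):

```python
import numpy as np
from scipy.spatial import cKDTree

def surface_similarity(ref_points, src_surface_points):
    """C_F for the partial-surface case: correspondences are
    re-established every iteration as closest points on the source
    surface, so the surfaces may slide over each other (ICP-like)."""
    tree = cKDTree(src_surface_points)
    dist, _ = tree.query(ref_points)   # nearest-neighbour distances
    return -dist.mean()
```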
2.3
Combining Voxel and Feature Similarity Measure
The voxel similarity measure $C_{NMI}$ is linearly combined with the feature similarity measure $C_F$:

$$C_{total} = C_{NMI} + \tau \cdot C_F .$$

The optimisation process determines in each step the gradient of the similarity $C_{total}$ with respect to the degrees of freedom $d_j$, i.e. the displacements of the control points,

$$\nabla_{d_j} C_{total} = \nabla_{d_j} C_{NMI} + \tau \cdot \nabla_{d_j} C_F ,$$

and maximises the similarity along this gradient. Typically, the absolute value of the voxel similarity gradient $\nabla_{d_j} C_{NMI}$ has a different order of magnitude than the absolute value of the feature similarity gradient $\nabla_{d_j} C_F$. Therefore we weight the feature similarity gradient with the factor $\tau$ to scale it to the same order of magnitude. The parameter $\tau$ is calculated automatically once at the beginning of the registration process, estimated as the ratio of the similarity derivatives in the direction of their gradients:

$$\tau = \frac{\partial C_{NMI} / \partial \nabla C_{NMI}}{\partial C_F / \partial \nabla C_F} .$$
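A sketch of the combination step (our own Python illustration; the paper estimates τ from directional derivatives, and approximating it here by the ratio of gradient magnitudes is our simplification):

```python
import numpy as np

def combined_gradient(grad_nmi, grad_f):
    """Combine voxel- and feature-similarity gradients with an
    automatically scaled weight tau (magnitude-ratio approximation)."""
    tau = np.linalg.norm(grad_nmi) / (np.linalg.norm(grad_f) + 1e-12)
    return grad_nmi + tau * grad_f
```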
2.4
∂CN M I ∂∇CN M I ∂CF ∂∇CF
Image Data
Our new algorithm was applied to images that had proven difficult to register in previous studies. The first image was one of a series of 11 subjects in a serial MR study that had involved non-rigid registration of the MNI atlas to each subject [7] (Fig. 1a) for the purpose of segmenting the lateral ventricle. Visual inspection had shown good registration for the other 10 subjects, but clear mis-registration in the vicinity of the lateral ventricles in this subject, who had an abnormal ventricular shape. We registered this subject using our new technique by semi-automatically detecting the tips of the frontal, occipital, and temporal horns of both lateral ventricles. The second and third images we registered were selected from a group of 24 patients imaged at the start and end of an MR-guided neurosurgical intervention. Two images for which the purely voxel-based algorithm failed were chosen for the new approach (designated resection8 and resection3, Fig. 2).
Fig. 2. Pre- and post-procedure images for the cases resection8 and resection3
Fig. 3. Subject overlaid with the contour of the registered MNI atlas. (a) rigid registration, (b) non-rigid registration using only voxel similarity, (c) approximating the FFD using only feature information, (d) non-rigid registration using both voxel similarity and feature information
3
Results
In each of Figs. 3, 4, and 5, the results are shown after rigid registration (a), after non-rigid registration using only voxel similarity (b), after approximating the FFD using only surface information (c), and after non-rigid registration using both voxel similarity and surface information (d). Fig. 3(d) shows that the boundary of the ventricle in the registered MNI atlas is well aligned with the ventricle of the subject image when voxel intensities and point landmark information are used. The purely voxel-based algorithm (Fig. 3(b)) yields reasonable results, but does not match the tip of the ventricle as well as the extended registration algorithm. The subtraction image in Fig. 4(b), after non-rigid registration without surface information, demonstrates that the algorithm aligns the ventricular system and other structures inside the brain well, but does not detect the brain shift at the surface.
Fig. 4. Subtraction of post- and registered pre-resection image (resection8); see description of Fig. 3
In contrast, approximating the FFD grid corresponding to the surface displacement (Fig. 4(c)) aligns the brain surfaces well, but does not improve the results deeper in the brain in comparison to the rigid registration, because of the local support of the B-splines. Introducing surface information into the registration algorithm improves the results both close to the brain surface and inside the brain (Fig. 4(d)). The pre-procedure brain surface in case resection3 (Fig. 5) is poorly aligned with the post-procedure brain surface by the non-rigid registration alone (Fig. 5(b)). The approximated FFD yields a clear displacement of the surfaces in the axial slice (Fig. 5(c)). The combination of voxel intensities and feature information (Fig. 5(d)) aligns the surface considerably better than the voxel-based algorithm.
4
Conclusion
Previous studies have successfully registered tomographic images for a wide range of applications and modalities using purely voxel-based non-rigid registration algorithms. For instance, in [6] a voxel-based algorithm reliably registers interventional MR images non-rigidly, even with considerable changes. However, in some cases the registration of quite different images can fail because of a lack of voxel intensity correspondences. In this work, we demonstrate that introducing feature information into a voxel-based algorithm using Normalised Mutual Information as a similarity measure can improve the registration results considerably in such cases. Either point landmarks or a partial brain surface was detected semi-automatically in the reference and source image, supported by a 3D graphical user interface. The mean distance of these features was taken as a similarity measure and was linearly combined with the voxel similarity measure by introducing the parameter τ. In contrast to previous approaches [8,1,5], which specify similar parameters manually, we automatically determine the parameter on the basis of the image data.
Fig. 5. Post-procedure image overlaid with the contour of the registered pre-procedure image (resection3); see description of Fig. 3
Additionally, our approach dynamically establishes point correspondences on the surface, permitting the surfaces to slide over each other. This might prove to be an advantage in cases where fixed point correspondences cannot be detected reliably in the images. Furthermore, the proposed semi-automatic procedure has the benefit that the user can control the results and introduce prior knowledge about the expected transformation. The extended registration algorithm has been applied to three 3D images with substantial differences. It turned out that using point and surface features in the voxel-based registration algorithm aligns the boundary of the structure of interest considerably better than a purely voxel-based algorithm. Since the results in other regions remain as good as those of the purely voxel-based algorithm, this initial study suggests that such an approach can combine the advantages of voxel-based and feature-based registration algorithms.
References

1. Pascal Cachier, Jean-Francois Mangin, Xavier Pennec, Denis Riviere, Dimitri Papadopoulos-Orfanos, Jean Regis, and Nicholas Ayache. Multisubject non-rigid registration of brain MRI using intensity and geometric features. In W. Niessen and M. Viergever, editors, MICCAI 2001, number 2208 in LNCS, pages 734–742. Springer-Verlag, 2001.
2. William R. Crum, A. Freeborough, and Nick C. Fox. The use of regional fast fluid registration of serial MRI to quantify local change in neurodegenerative disease. In D. Hawkes, D. Hill, and R. Gaston, editors, MICCAI99, pages 25–28, Oxford, July 1999. Springer-Verlag.
3. E.R.E. Denton, L.I. Sonoda, D. Rueckert, S.C. Rankin, C. Hayes, M. Leach, D.L.G. Hill, and D.J. Hawkes. Comparison and evaluation of rigid and non-rigid registration of breast MR images. JCAT, 23:800–805, 1999.
4. T. Hartkens, K. Rohr, and H.S. Stiehl. Performance of 3D differential operators for the detection of anatomical point landmarks in MR and CT images. In Proc. SPIE's International Symposium Medical Imaging, Image Processing, pages 32–43, San Diego, CA, USA, Feb 1998.
5. Pierre Hellier and Christian Barillot. Cooperation between local and global approaches to register brain images. In M.F. Insana and R.M. Leahy, editors, IPMI 2001, number 2082 in LNCS, pages 315–328. Springer-Verlag, 2001.
6. Derek L.G. Hill, Calvin R. Maurer, Alastair J. Martin, Saras Sabanathan, Walter A. Hall, David J. Hawkes, Daniel Rueckert, and Charles L. Truwit. Assessment of intraoperative brain deformation using interventional MR imaging. In Chris Taylor and Alan Colchester, editors, MICCAI'99, Lecture Notes in Computer Science 1679, pages 910–919, Cambridge, UK, September 1999. Springer-Verlag.
7. Mark Holden, Derek L.G. Hill, Erika R.E. Denton, Jo M. Jarosz, Tim C.S. Cox, Torsten Rohlfing, Joanne Goodey, and David J. Hawkes. Voxel similarity measures for 3D serial MR brain image registration. IEEE Transactions on Medical Imaging, 19(2):94–102, Feb 2000.
8. Hans J. Johnson and Gary E. Christensen. Landmark and intensity-based, consistent thin-plate spline image registration. In M.F. Insana and R.M. Leahy, editors, IPMI 2001, number 2082 in LNCS, pages 329–343. Springer-Verlag, 2001.
9. K. Rohr, H.S. Stiehl, R. Sprengel, T.M. Buzug, J. Weese, and M.H. Kuhn. Landmark-based elastic registration using approximating thin-plate splines. IEEE Trans. on Medical Imaging, 20(6):526–534, June 2001.
10. K. Rohr. On 3D differential operators for detecting point landmarks. Image and Vision Computing, 15(3):219–233, March 1997.
11. D. Rueckert, L.I. Sonoda, C. Hayes, D.L.G. Hill, M.O. Leach, and D.J. Hawkes. Non-rigid registration using free-form deformations: Application to breast MR images. IEEE Trans. Medical Imaging, 18(8):712–721, 1999.
12. Dinggang Shen and Christos Davatzikos. HAMMER: Hierarchical attribute matching mechanism for elastic registration. In Lawrence Staib, editor, IEEE Workshop on Mathematical Methods in Biomedical Image Analysis (MMBIA), pages 32–39. IEEE Computer Society, 2001.
13. S.M. Smith. Robust automated brain extraction. In Sixth Int. Conf. on Functional Mapping of the Human Brain, page 625, 2000.
14. C. Studholme, E. Novotny, I.G. Zubal, and J.S. Duncan. Estimating tissue deformation between functional images induced by intracranial electrode implantation using anatomical MRI. NeuroImage, 13:561–576, 2001.
15. Paul M. Thompson, Jay N. Giedd, Roger P. Woods, David MacDonald, Alan C. Evans, and Arthur W. Toga. Growth patterns in the developing brain detected by using continuum mechanical tensor maps. Nature, 404(9):190–193, March 2000.
Intra-patient Prone to Supine Colon Registration for Synchronized Virtual Colonoscopy

Delphine Nain (1), Steven Haker (2), W. Eric L. Grimson (1), Eric Cosman Jr (1), William W. Wells (1,2), Hoon Ji (2), Ron Kikinis (2), and Carl-Fredrik Westin (1,2)

1 Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge MA, USA
  {delfin,welg,ercosman,sw}@ai.mit.edu, http://www.ai.mit.edu
2 Surgical Planning Laboratory, Harvard Medical School and Brigham and Women's Hospital
  {haker,hooni,kikinis,westin}@bwh.harvard.edu, http://www.spl.harvard.edu
Abstract. In this paper, we present an automated method for colon registration. The method uses dynamic programming to align data defined on colon center-line paths, as extracted from the prone and supine scans. This data may include information such as path length and curvature as well as descriptors of the shape and size of the colon near the path. We show how our colon registration technique can be used to produce synchronized fly-through or slice views.
1
Introduction
Colorectal cancer is one of the most common forms of cancer, and is associated with high mortality rates. Various screening methods used to detect colorectal cancers and pre-cancerous polyps are available, each with its own costs and benefits. In particular, fiber-optic colonoscopy is a well-established and highly effective screening method, but is also invasive, expensive, time-consuming and uncomfortable for the patient. A more recently developed screening method is computed tomographic colonography. In this screening method, a radiologist views a sequence of CT images, typically from one or more axial volumetric scans, and inspects the colon wall for structures likely to be polyps. Keys to identifying these structures can include their shape and cross-sectional image intensity profiles. The doctor may also be presented with a 3D reconstructed view of the colon in a process known as virtual colonoscopy. The idea of this approach is to simulate, using computer graphics techniques, the appearance of the colon wall as it would be seen by a doctor performing fiber-optic colonoscopy. This simulation may include a virtual “fly-through” of the colon. Recent work has indicated that these methods have the ability to provide doctors with the information needed to detect small polyps [11]. As CT technology has improved, providing higher resolution images obtained in shorter periods of time and with lower radiation doses to the patient, these virtual methods have become more attractive as routine diagnostic procedures.
The presence of pseudo-polyps, material such as stool which may adhere to the colon wall and appear much like a polyp, can make the task of finding true polyps difficult. In order to better differentiate polyps from pseudo-polyps, and to better view the lumen surface in the presence of fluid, it is common practice to obtain two CT scans of the patient, one with the patient in the prone position and one in the supine. Fluid in the colon will naturally appear on the anterior colon wall when the patient is in the prone position, and on the posterior wall when in the supine. Pseudo-polyp material may also change its position between the two scans, allowing the radiologist to differentiate these structures from true polyps. Further, a second view of the colon after re-positioning may help the doctor determine if a structure is a polyp or simply a fold in the haustra [9]. Naturally, the ability to compare corresponding positions in the prone and supine scans is required before all the benefits mentioned above can be accrued. However, the change in shape and position of the colon within the body between the prone and supine scans can be surprisingly large. Peristaltic action and changes in pressures applied to the body are among the causes of these deformations. The colon, insufflated with room air or carbon dioxide, can behave more like a lightly filled bladder than a rigid structure. Registering the colonic wall between the two scans can therefore be quite challenging. In practice, the radiologist can attempt a manual registration by using anatomical landmarks such as spinal vertebrae to observe images through similar axial planes, and then scroll through adjacent slices to try to find similar structures in the colon wall. Such methods, however, are difficult, inaccurate, and time-consuming. Figure 1 shows an axial slice that intersects the same vertebra in both the prone and supine scans. As seen in the figure, the colon and other anatomical structures are deformed and shifted due to gravity and pressure factors, and look very different from one scan to the other. The deformation and shift between the colon and other anatomical structures is non-linear, so that surrounding structures are unlikely to provide enough information to match the two colons. An automatic volumetric deformable registration of one entire grayscale scan to the other would be desirable, but is an extremely difficult and time-consuming task.
Fig. 1. Axial slices through supine (left) and prone (right) scans. Although the same vertebra is pictured, the colon and other anatomical structures are not aligned.
2
Related Work
Some commercial tools used to view CT image slices have an option to display a supine and prone scan side-by-side, with the option to flip the prone scan so that images are presented in the same orientation. Effectively, the clinician has to register the two volumes manually in order to find interesting corresponding landmarks. Acar et al. [1] have developed an automatic method that registers the medial axes of supine and prone colon surface models using linear stretching and shrinking operations. They sample the centerlines of both colon models at 1 mm intervals and examine the inferior/superior coordinates of path points to find local extrema and path inflection points. In order to find corresponding points, they linearly interpolate between inflection points using path length information. This method only takes into account the local extrema located on the inferior/superior axis. If the colon shifts obliquely when the patient changes position, then some local extrema may not be accounted for. Further, the method does not allow for colon surface information to be taken into account. Information such as radial distance from the medial axis or circumference cannot be used in the matching process. Our method, by contrast, is designed to address these points.
3
Registration Methodology
In this section, we present our method for the registration of colon centerlines extracted from prone and supine colon scans. Our method uses dynamic programming and geometric information to find an optimal match between sampled centerpoints. Once the centerlines are matched, fly-throughs are generated for synchronized virtual colonoscopy by stepping incrementally through the matched centerpoints. 3.1
Motivation for Our Approach
The centerline through both colons can provide meaningful information for colon registration, such as length and curvature. Centerline registration is also a simpler and less computationally intensive problem to solve than volumetric registration, since there is only one dimension to match. However, using only centerline information such as length and curvature might not be enough for a robust solution in cases where the colon is severely stretched and deformed between the two scans. In this case, other important geometric quantities could help the centerline matching, such as radius, circumference, or surface curvature information. To resolve this issue, we have developed a dynamic programming algorithm for 1-dimensional point matching, and we incorporate 3-dimensional surface information into our metric, or objective function, to match each point of the centerline. Dynamic programming has the advantage of being a fast and efficient algorithm that finds a globally optimal matching of centerlines, with respect to the objective function, while preserving centerpoint ordering [2].
In Section 3.2 we briefly describe a method for automatically extracting centerlines from CT scans. In Section 3.3, we describe our centerline registration method using dynamic programming. Section 4 presents our results, including the use of the matching technique for synchronized virtual colonoscopy. Section 5 describes intended future work. 3.2
Centerline Extraction
In this section, we describe the method by which we obtain centerlines of a surface model. Our starting point is a grayscale volume that is segmented and then used to produce a triangulated surface model of the colon, of the kind obtained through the use of the Marching Cubes algorithm [12] or another similar isosurface extraction algorithm. Our method for centerline extraction is based on the following physical model. Let Σ denote the tubular colon surface with open ends described by two closed space curves σ0 and σ1. We suppose that these boundary curves are held at constant temperatures of 0 and 1 degrees respectively, and seek the steady-state distribution of temperature u across the surface. The standard theory of partial differential equations [7] tells us that this temperature distribution will vary smoothly between 0 and 1 degrees from end to end, and will be free of local maxima and minima away from the boundary curves. In fact, the function u will be harmonic, i.e. it will satisfy Laplace's equation Δu = 0, and each level set u⁻¹(t), t ∈ [0, 1], will consist of a loop around the colon surface. Our centerline is then formed by the centers of mass of these loops. The numerical method used to find the temperature distribution function is based on finite element techniques [5]. In [8] there is a closely related method for colon mapping. The function u may be found by solving a sparse linear system of equations using standard methods from numerical linear algebra. We have found that the solution of this system can be found in under 5 minutes on a single-processor Sun Ultra 10, for a surface consisting of 100,000 triangles. Once the solution u is found, the center points may be found simply by dividing the interval [0, 1] into a number of sub-intervals and calculating, for each sub-interval, the center of mass of the vertices with corresponding values of u. Since each centerpoint is associated with a loop around the colon surface, surface measures such as average radial distance, circumference, and curvatures can be mapped to the centerpoint for use in our dynamic programming matching technique.
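A sketch of the sparse solve, assuming a uniform (umbrella) graph Laplacian in place of the paper's cotangent finite-element weights (our own Python illustration; function and argument names are assumptions):

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def harmonic_field(n_verts, edges, fixed_idx, fixed_val):
    """Solve a graph-Laplacian analogue of Laplace's equation on a
    surface mesh: u is harmonic in the interior and takes the given
    values (0 and 1) on the two boundary loops."""
    W = sp.lil_matrix((n_verts, n_verts))
    for i, j in edges:
        W[i, j] = W[j, i] = 1.0
    L = sp.diags(np.asarray(W.sum(axis=1)).ravel()) - W.tocsr()
    free = np.setdiff1d(np.arange(n_verts), fixed_idx)
    u = np.zeros(n_verts)
    u[fixed_idx] = fixed_val
    rhs = -(L[free][:, fixed_idx] @ u[fixed_idx])   # move knowns to RHS
    u[free] = spla.spsolve(L[free][:, free].tocsc(), rhs)
    return u
```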
3.3
Dynamic Programming
Overview. Once we have the centerpoints extracted for both colons, we match them using dynamic programming. Dynamic programming solves optimization problems by finding and recursively combining the optimal solutions to subproblems. A dynamic programming algorithm is efficient since it solves every subproblem only once and caches the solution, thereby avoiding the work of recomputing the answer every time the subproblem is encountered [2]. Dynamic programming has been used in a variety of contexts including for DNA and protein sequence alignment [10] as well as special cases of pose estimation in object recognition [4],[6].
Registration Algorithm. In our case, we wish to align two sets of centerpoints: $P_1^N$, i.e. the centerline of the prone colon containing the points indexed from 1 to $N$, and $S_1^M$, the centerline of the supine colon indexed from 1 to $M$. The subproblems of this optimization task are all the pairwise matchings of the subsequences $P_1^i$ $(1 \le i \le N)$ and $S_1^j$ $(1 \le j \le M)$. We now describe the two steps of our centerline registration algorithm.

1. Recursive Definition of an Optimal Solution
Let $f(P_i, S_j)$ be a cost associated with matching centerpoint $P_i$ with $S_j$. Let us further assume that we already have an optimal minimal-cost solution for the matching of the pairs of subsequences $(P_1^i, S_1^{j-1})$, $(P_1^{i-1}, S_1^{j-1})$, and $(P_1^{i-1}, S_1^j)$. If we define $F$ to be a metric that evaluates the matching of the argument subsequences, we can find the optimal alignment of the centerlines $(P_1^i, S_1^j)$ by solving the recursion:

$$F(P_1^i, S_1^j) = f(P_i, S_j) + \min \begin{cases} F(P_1^i, S_1^{j-1}) & \text{(expansion at } P_i\text{)} \\ F(P_1^{i-1}, S_1^{j-1}) & \text{(no expansion/compression)} \\ F(P_1^{i-1}, S_1^j) & \text{(expansion at } S_j\text{)} \end{cases} \qquad (1)$$

With this recursive expression, we fill in an $N \times M$ matrix with entries $F(P_1^i, S_1^j)$ at $(i, j)$, along with pointers in the direction of the preceding sequence matching which led to this optimal value. It is important to note that dynamic programming allows many-to-many mappings, resulting in mappings that can locally be compressions or expansions. For example, if $F(P_1^i, S_1^{j-1})$ is chosen as the optimal subsequence of $F(P_1^i, S_1^j)$, then centerpoint $P_i$ will match both points $S_{j-1}$ and $S_j$, which means that in the locality of point $P_i$ the matching to the sequence $S_1^j$ is a compression. We chose to penalize the amount of stretching and compression allowed with a penalty function $g()$ added to the first and third line of Equation 1. We experimented with the values $g() = 0.0$ and $g() = 0.1$ and found experimentally that the latter gives a better match.

2. Extracting the Sequence Alignment
By following the pointers from entry $(N, M)$ to entry $(1, 1)$, we obtain a sequence of $(i, j)$ pairs that define a point-to-point correspondence between point $P_i$ and point $S_j$. A minimal sketch of both steps is given after the objective function below.

Objective Function $f(P_i, S_j)$. As mentioned in Section 2, we use both centerline and geometrical information to give a value to each centerpoint.
– The centerline information is the distance from the first centerpoint to the current centerpoint, normalized to the total length of the centerline.
578
D. Nain et al.
The objective function f (Pi , Sj ) evaluates how closely two centerpoints Pi and Sj match. We have defined it as: S 2 f (Pi , Sj ) = α(riP − rjS )2 + (1 − α)(dP i − dj )
(2)
where riP is the average radial distance at the ith centerpoint in the prone scan, S dP i is the distance along the path of the centerpoint, and similarly for rj and S dj . The parameter α is used to balance the two terms of the functional. The results for the colon registration presented in Section 4 are for α = 0.5. Other functionals incorporating other surface information can also be used.
4
Results
Here we show the results of our algorithm applied to supine and prone CT scans taken of the same patient 8 minutes apart. Both scans had 512 matrix size with slice thickness of 1mm, and 362 slices. Pixel size was 0.6mm.
Fig. 2. Sequence alignment of prone and supine centerpoints.
Figure 2 shows an objective value with α = 0.5, plotted for both colons as well as the sequence alignments found by our algorithm between points using a penalty for excessive stretching or compression (we used g() = 0.1). Each line in the middle of the figure shows a correspondence between a prone and supine point. As can be seen, there are areas of slight expansion and stretching that are detected. In order to have a preliminary evaluation of the fly-throughs produced by our algorithm, we recorded how many frames matched for different values of α out of the total number of frames (278). These results are presented in Table 1. From the results, we see that using the distance metric alone (α = 0) fails because the initial 6 centerpoints of the supine centerline do not exist on the prone
Table 1. Performance results with different objective functions.

α      % of matched frames
0      40
0.5    94
1      88
centerline. This causes a misalignment of frames throughout the fly-through. Using average radius information alone (α = 1) is considerably better, except at particular centerpoints where the colon collapses due to residual fluid. This causes a temporary misalignment until the next correct radius information is matched. We found that a combination of both metrics is optimal and gives a 94% frame match rate.

4.1 Synchronized Virtual Colonoscopy (SVC)
In order to visualize synchronized colonoscopies, we use a virtual endoscopy tool that we developed [3]. This tool displays the two surface models and the location of both virtual endoscopes relative to the surface models, as well as the views of each virtual endoscope, updated simultaneously to show the same location in the colon (see Figure 3). In addition, the user has the option to display a reformatted CT slice through each volume that moves along with the virtual endoscope and stays parallel to the view plane (Figure 3, right view). This functionality allows the user to compare the registered grayscale images alongside the colon surface data. During the fly-throughs, the positions of the virtual endoscopes
Fig. 3. Left: Virtual colonoscopy tool screenshot with surface models and endoscopic views. Right: Reformatted slice with colon model and centerpath.
are defined by the centerpoint sequence alignment. However, the rotation of the virtual endoscopes (the "ViewUp" vector) is not defined, so the two views could be rotated relative to each other. In practice, if the ViewUps of both virtual endoscopes are manually matched at the first corresponding centerpoint, then
the rotation difference between the two views stays small. In the future, we would like to find a corresponding ViewUp for both endoscopes by comparing slices perpendicular to the centerline at corresponding points and aligning them by moments.
5 Future Work
In the future, we will incorporate other surface information, such as curvature and moments, into our objective functional. We also plan to test our method in a clinical setting and obtain expert classification of matched/mismatched frames.
6 Conclusion
We presented a method for automatic supine and prone colon centerline registration based on the dynamic programming principle. Our method can include information such as path length and curvature, as well as descriptors of the shape and size of the colon near the path, to find an optimal matching. We showed that our colon registration technique can be used to produce synchronized fly-throughs.
References
1. B. Acar, S. Napel, D.S. Paik, P. Li, J. Yee, C.F. Beaulieu, R.B. Jeffrey. Registration of supine and prone CT colonography data: Method and evaluation. Radiological Society of North America 87th Scientific Sessions, 2001.
2. T. H. Cormen, C. E. Leiserson, and R. L. Rivest. Introduction to Algorithms. McGraw-Hill, New York, 1998.
3. D. Nain, S. Haker, R. Kikinis, W. Grimson. An interactive virtual endoscopy tool. Satellite Workshop at the Fourth International Conference on Medical Image Computing and Computer-Assisted Intervention (MICCAI'2001), 2001.
4. E. Cosman, W. Wells. Slice-wise, non-rigid volumetric image registration by dynamic programming. http://www.ai.mit.edu/people/ercosman, 2001.
5. T. Hughes. The Finite Element Method. Prentice-Hall, New Jersey, 1987.
6. A. L. Ratan. Learning visual concepts for image classification. Ph.D. Thesis, A.I. Lab, MIT, 1999.
7. J. Rauch. Partial Differential Equations. Springer-Verlag, New York, 1991.
8. S. Haker, S. Angenent, A. Tannenbaum, and R. Kikinis. Nondistorting flattening maps and the 3D visualization of colon CT images. IEEE Trans. on Medical Imaging, 19:665–670, 2000.
9. S.C. Chen, D.S. Lu, J.R. Hecht et al. CT colonography: value of scanning in both the supine and prone positions. Am. J. Rad., 172:595–599, 1999.
10. J. Setubal and J. Meidanis. Introduction to Computational Molecular Biology. PWS Publishing Co., New York, 1997.
11. D. Vining. Virtual endoscopy: Is it a reality? Radiology, 200:30–31, 1996.
12. W. Schroeder, H. Martin, and B. Lorensen. The Visualization Toolkit. Prentice-Hall, New Jersey, 1996.
Nonrigid Registration Using Regularized Matching Weighted by Local Structure

Eduardo Suárez1, Carl-Fredrik Westin2, Eduardo Rovaris1, and Juan Ruiz-Alzola1,2

1 Medical Technology Center, Univ. Las Palmas of GC & Gran Canaria Dr. Negrín Hospital, Spain
2 Laboratory of Mathematics in Imaging, Brigham and Women's Hospital and Harvard Medical School, USA
Abstract. We present a novel approach to nonrigid registration of volumetric multimodal medical data. We propose a new regularized template matching scheme, where arbitrary similarity measures can be embedded and the regularization imposes spatial coherence, taking into account the quality of the matching according to an estimation of the local structure. We propose to use an efficient variation of weighted least squares termed normalized convolution as a mathematically coherent framework for the whole approach. Results show that our method is both fast and accurate.
1 Introduction
Nonrigid registration is a crucial operation for image guided medicine. Image registration consists of putting into correspondence two or more datasets, possibly obtained with different imaging modalities. Its applications range from pathology follow-up, through a series of clinical studies, to image guided surgery, by registering pre-operative images onto intra-operative ones [1]. Moreover, nonrigid registration is also necessary in order to embed a priori anatomic knowledge into medical image processing algorithms and, especially, into segmentation schemes. In this case, a canonical atlas is usually registered onto patient-specific information to help classifiers know what the possible classes are for every voxel [2]. Datasets to be registered can therefore correspond to the same or to different patients (or even to an atlas) and can also be from the same or from different imaging modalities. Putting into correspondence two anatomies that can be topologically different (for example, in the case of pathology) and where the voxel intensities measure different physical magnitudes (multimodality) poses a serious challenge that has sparked intensive research over recent years [3]. A review of alternatives is beyond the scope of this paper; a good one can be found elsewhere [3], with a complete taxonomy of registration methods. Just to focus this work, we will mention that voxel-based registration methods, i.e. those directly using the full content of the image rather than simplifying it to a set of features to steer the registration, usually correspond to one of two important families: template matching and variational. The former was popular years
582
E. Su´ arez et al.
ago due to its conceptual simplicity [4]. Nevertheless, in its conventional formulation, it is not powerful enough to address the challenging needs of medical image registration. Variational methods rely on the minimization of a functional (energy) that is usually formulated as the sum of two terms: data coupling and regularization, the former forcing the similarity between both datasets (target, and source deformed with the estimated field) to be high, while the latter forces the estimated field to fulfill some constraint (usually enforcing spatial coherence, i.e. smoothness). As opposed to variational methods, template matching does not impose any constraint on the resulting fields, which, moreover, are discrete fields due to the discrete movement of the template. These facts have led to an increasing popularity of variational methods for registration, while template matching has been losing its place in this arena. In this paper we present a novel registration approach using template matching, where its major drawbacks have been explicitly addressed. The resulting method is reliable and fast, and it can be a feasible alternative to computationally expensive variational approaches. First, any similarity measure can easily be incorporated into the neighborhood comparison. Second, spatial regularization is imposed after template matching, by locally projecting the estimated field onto a vector space. Moreover, the quality of the matching is considered when doing the projection by means of the estimation of local structure. A very efficient variation of weighted least squares termed normalized convolution [5,6] provides a natural framework for our regularized matching approach. The structure of the paper is as follows: Section 2 presents the local structure estimation procedure. Section 3 describes the proposed nonrigid registration algorithm. Results are shown in Section 4 and conclusions in Section 5.
2 Local Structure
Our approach to nonrigid registration relies on template matching. Local structure measures the quantity of discriminant spatial information at every point of an image and is crucial for template matching performance: the higher the local structure, the better the result obtained in that region with template matching. In order to quantify local structure, a structure tensor is defined as T(x) = (∇I(x) · ∇I(x)^t)_σ, where the subscript σ indicates a local smoothing. The structure tensor is a symmetric positive-semidefinite 3 × 3 matrix that can be associated with an ellipsoid, i.e., its eigenvectors and eigenvalues correspond to the ellipsoid's axis directions and lengths, respectively. A scalar measure of the local structure can be obtained as [7,8,9,10]

$$\mathrm{structure}(x) = \frac{\det \mathbf{T}(x)}{\operatorname{trace} \mathbf{T}(x)} \tag{1}$$
Figure 1 shows an MRI T1-weighted axial slice of the brain and the estimated structure tensors overlaid as ellipsoids. Small eigenvalues indicate lack of gradient variation along the associated principal direction; therefore, high structure is indicated by big (large eigenvalues), round (no eigenvalue is small) ellipsoids.
Fig. 1. MRI T1-weighted axial slice of human brain and its structure tensors. The brighter the gray level of the ellipsoids the higher the structure.
Fig. 2. Top: MRI T1-weight cross-sections; Bottom: Local structure measure. Arrows point at higher structure regions.
The gray level coding represents the scalar structure measure, with brighter gray levels indicating higher structure. Figure 2 shows cross-sections of a T1-weighted MRI dataset of a human brain (top row) and the scalar measure of local structure obtained from them, represented with a logarithmic histogram correction (bottom row). Note how anatomical landmarks have the highest measure of local structure, corresponding to the points indicated by the arrows on the top row. Curves are detected with lower intensity than points, and surfaces have even lower intensity. Homogeneous areas have almost no structure.
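For readers who want to experiment, a minimal sketch of the structure measure of Eq. (1) is shown below; the gradient scheme and smoothing scale are illustrative choices, not those of the paper.

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def structure_measure(I, sigma=1.5):
    """Scalar local-structure measure det(T)/trace(T) of Eq. (1), where
    T = (grad(I) grad(I)^t)_sigma is the locally smoothed structure tensor
    of a 3-D image I. Sketch; sigma is an assumed value."""
    grads = np.gradient(I.astype(float))        # [Ix, Iy, Iz]
    # Build the symmetric 3x3 tensor per voxel, smoothed component-wise
    T = np.empty(I.shape + (3, 3))
    for a in range(3):
        for b in range(3):
            T[..., a, b] = gaussian_filter(grads[a] * grads[b], sigma)
    det = np.linalg.det(T)
    tr = np.trace(T, axis1=-2, axis2=-1)
    return np.where(tr > 0, det / np.maximum(tr, 1e-12), 0.0)
```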
Fig. 3. Algorithm pipeline for pyramidal level (i).
3 The Registration Algorithm

3.1 Algorithm and Multiresolution Pyramid
The algorithm works similarly to Kovačič and Bajcsy's elastic warping [11], in which images are decomposed into Gaussian multiresolution pyramids. On the highest level, the deformation field is estimated by regularized template matching steered by local structure (details in the sections below). On the next level, the source dataset is deformed with a deformation field obtained by spatial interpolation of the one obtained on the first level. The deformed source and the target datasets on the current level are then registered to obtain the deformation field corresponding to the current level of resolution. This process is iterated on every level. The algorithm implementation is summarized in figure 3; a schematic code sketch of this loop follows.
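The sketch below renders the coarse-to-fine pipeline of figure 3, assuming a `match_level` routine that implements the regularized matching of Sections 3.2–3.3. All names, the factor-2 pyramid, and the additive composition of small per-level displacements are our assumptions, not the authors' Matlab implementation.

```python
import numpy as np
from scipy.ndimage import zoom, map_coordinates, gaussian_filter

def register_pyramid(source, target, match_level, n_levels=3):
    """Coarse-to-fine loop of Fig. 3. match_level(src, tgt) is assumed to
    return a displacement field of shape src.shape + (3,); adding the
    per-level step approximates composing small deformations."""
    def pyramid_level(I, level):
        # Gaussian smoothing followed by factor-2**level downsampling
        f = 2 ** level
        return zoom(gaussian_filter(I, f / 2.0), 1.0 / f, order=1) if f > 1 else I

    def warp(I, d):
        # Deform image I with displacement field d (voxel units)
        coords = np.indices(I.shape) + np.moveaxis(d, -1, 0)
        return map_coordinates(I, coords, order=1)

    field = None
    for level in reversed(range(n_levels)):          # coarsest level first
        src, tgt = pyramid_level(source, level), pyramid_level(target, level)
        if field is None:
            field = np.zeros(src.shape + (3,))
        else:                                        # spatially interpolate previous field
            factors = np.array(src.shape) / np.array(field.shape[:-1])
            field = np.stack([factors[k] * zoom(field[..., k], factors, order=1)
                              for k in range(3)], axis=-1)
        field = field + match_level(warp(src, field), tgt)
    return field
```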
3.2 Template Matching
Template matching finds the displacement for every voxel in a source image by minimizing a local cost measure, obtained from a small neighborhood of the source image and a set of potential corresponding neighborhoods in a target image. The main disadvantage of template matching is that it estimates the displacement field independently at every voxel, and no spatial coherence is imposed on the solution. Another disadvantage is that it needs to test several discrete displacements to find a minimum. There exist some optimization-based template matching solutions that provide a real-valued solution for every voxel, though they are slow [12]. Therefore, most template matching approaches render discrete displacement fields. Another problem associated with template matching is commonly denoted the aperture problem in the computer vision literature [13]. This essentially consists of the inability to make a good match when no discriminant structure is available, such as in homogeneous regions, surfaces, and edges. When this fact is not taken into account, the matching process is steered by noise rather than by local structure, since the latter is not available. Our approach to nonrigid registration keeps the simplicity of template matching while it addresses its drawbacks. Indeed, the algorithm presented here consists
of a weighted regularization of the template matching solution, where weights are obtained from the local structure, in order to render spatially coherent, real-valued deformation fields. Thanks to the multiscale nature of our approach, only displacements of one voxel are necessary when matching the local neighborhoods.
3.3 Spatial Regularization
The objective in image registration is to find a one-to-one spatial mapping between points of two images. Template matching provides a discrete deformation field on which no spatial coherence constraints have been imposed. In this subsection this field is regularized so as to obtain a mathematically consistent continuous mapping. We will consider the deformation field to be a diffeomorphism, i.e. an invertible, continuously differentiable mapping. In order to be invertible, the Jacobian of the deformation field must be positive. On every scale level, the displacement is small enough to guarantee this condition. For every level of the pyramid the mapping is obtained by composing the transformation on the higher level with the one on the current level, so that the positive Jacobian condition is preserved. Spatial regularization is achieved by locally projecting the deformation field provided by template matching onto an appropriate signal subspace, simultaneously taking into account the quality of the matching as indicated by the scalar measure of local structure. We propose here to use normalized convolution [6,5], a popular refinement of weighted least squares that explicitly deals with the so-called signal/certainty philosophy. Essentially, the scalar measure of structure is incorporated as a weighting function in a least squares fashion. The field obtained from template matching is then projected onto a vector space described by a non-orthogonal basis, i.e., the dot products between the field and every element of the basis provide covariant components that must be converted into contravariant ones by an appropriate metric tensor. Normalized convolution provides a simple implementation of this operation. Moreover, an applicability function is enforced on the basis elements in order to guarantee proper localization and avoid high-frequency artifacts. This essentially corresponds to weighting each basis element with a Gaussian window. The desired transformation y(x) is related to the deformation field d(x) by the simple relation

$$y(x) = d(x) + x \tag{2}$$

where x and y denote coordinates in each dataset. Since the transformation is differentiable, we can write the function in different orders of approximation:

$$y(x) \approx y(x_0), \tag{3}$$
$$y(x) \approx y(x_0) + J(x_0)\,[x - x_0]. \tag{4}$$
Equations 3 and 4 correspond to linear decompositions on bases of 3 and 12 elements, respectively. We have not found relevant experimental improvement of the registration algorithm when using the linear approximation instead of
Fig. 4. Left: certainty; Center: discrete matching deformation; Right: weight filtered deformation.
the zero-order one, probably due to the local nature of the algorithm. The basis set used is then:

$$b_1 : \begin{cases} y_1(x) = 1 \\ y_2(x) = 0 \\ y_3(x) = 0 \end{cases} \qquad b_2 : \begin{cases} y_1(x) = 0 \\ y_2(x) = 1 \\ y_3(x) = 0 \end{cases} \qquad b_3 : \begin{cases} y_1(x) = 0 \\ y_2(x) = 0 \\ y_3(x) = 1 \end{cases} \tag{5}$$

Figure 4 shows a 2-d discrete deformation field that has been regularized using the certainty on the left side and a 2-d Gaussian applicability function with σ = 0.8.
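With the constant basis of Eq. (5), the normalized convolution reduces to a certainty-weighted Gaussian average of each field component. A minimal sketch (our naming; the structure measure plays the role of the certainty):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def regularize_field(d, c, sigma=0.8):
    """Zero-order normalized convolution of a displacement field d
    (shape (..., 3)) with certainty c: G_sigma*(c*d) / G_sigma*(c),
    i.e. weighted least squares projection onto the basis of Eq. (5)
    with a Gaussian applicability function. Illustrative sketch."""
    denom = np.maximum(gaussian_filter(c, sigma), 1e-12)
    out = np.empty(d.shape, dtype=float)
    for k in range(d.shape[-1]):
        out[..., k] = gaussian_filter(c * d[..., k], sigma) / denom
    return out
```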
3.4 Implementation
The algorithm has been written in Matlab, interfacing some external C libraries. To speed up the matching process, the whole dataset is split and processed in parallel using Parallel Matlab [14].
4 Results
In order to illustrate quantitatively the performance of our registration approach, a T1-weighted MRI of size 160 × 192 × 160, with 12-bit depth and an isotropic voxel size of 1 mm, has been deformed using a set of synthetic deformation fields with a spatial bandwidth of 1 cm−1 and variable amplitudes. The original and the deformed datasets have been registered with the SSD (Sum of Squared Differences) similarity measure and a Gaussian applicability function with σ = 1.5. The Root Mean Square (RMS) errors before and after registration are shown in figure 5 for 18 experiments with different maximum displacements. Notice how the difference (RMS error) between both datasets decreases after registration; the gain is obviously lower when the maximum displacement is bigger. Figure 6 shows, from left to right, the same sagittal slice of the original, the synthetically deformed version (maximum amplitude of 15 mm), and the original after the deformation field has been estimated and applied.
Fig. 5. RMS before and after registration of a 12 bits per voxel T1-weighted MRI 160 × 192 × 160 dataset with a synthetic deformation field of variable amplitude.
Fig. 6. Left: T1W MRI sagittal cross-section; Center: T1W MRI sagittal cross-section of the same dataset deformed with a synthetic field of 1 cm−1 of spatial bandwidth and 15 mm of maximum displacement; Right: T1W MRI sagittal cross-section of the registered dataset.
The registration of the volumetric datasets was done in 7 minutes on a cluster of eight workstations (hybrid Pentium III and UltraSPARC II). In order to illustrate qualitatively the performance of our approach for multimodal registration, two volumetric datasets of sizes identical to the previous ones, corresponding to a T1-weighted simulated brain image [15] and a T2-weighted patient image, have been registered using the correlation coefficient as similarity measure and a Gaussian applicability function with σ = 1.5. Both datasets were rigidly registered prior to the application of our algorithm. Figure 7 shows, from left to right, the T1-weighted, the T2-weighted, and the T1-weighted dataset deformed onto the T2-weighted one after the field was estimated. Notice, for example, how well the corpus callosum is deformed.
5 Conclusions and Future Work
We have presented a novel nonrigid registration scheme based on template matching, where arbitrary similarity measures can be considered and a regularization of the deformation field is also carried out. According to our experiments, our approach can be a feasible alternative to computationally more expensive variational methods, yet renders high registration accuracy even in multimodal cases.
Fig. 7. Left: T1W MRI sagittal cross-section from brainweb; Center: T2W MRI sagittal cross-section of a clinical patient; Right: T1W MRI sagittal cross-section of the registered dataset.
Our current implementation performs full registrations of volumetric MRI data in seven minutes using a cluster of eight conventional workstations. Nevertheless, some further improvement can be obtained with a full C implementation (currently we use Matlab and some external libraries). Moreover, other structure detectors, such as the Harris corner detector [9] and quadrature-filter-based structure tensors [16], should also be tested.
Acknowledgment. This research was supported by the Spanish FPI grant AP98-52835909, the NIH grant P41-RR13218, and the Spanish project TIC2001-3808-C02-01.
References
1. Bharatha, A., Hirose, M., Hata, N., Warfield, S.K., Ferrant, M., Zou, K.H., Suarez-Santana, E., Ruiz-Alzola, J., D'Amico, A., Cormack, R.A., Kikinis, R., Jolesz, F.A., Tempany, C.M.C.: Evaluation of three-dimensional finite element-based deformable registration of pre- and intraoperative prostate imaging. Medical Physics 28 (2001) 2551–2560
2. Warfield, S., Robatino, A., Dengler, J., Jolesz, F., Kikinis, R.: Chapter 14 in: Brain Warping. Nonlinear Registration and Template-Driven Segmentation. A. Toga (ed.). Academic Press (1996) 241–262
3. Maintz, J., Viergever, M.: A survey of medical image registration. Medical Image Analysis 2 (1997) 1–36
4. Duda, R.O., Hart, P.E.: Pattern Classification and Scene Analysis. John Wiley & Sons (1973)
5. Knutsson, H., Westin, C.F.: Normalized and differential convolution: Methods for interpolation and filtering of incomplete and uncertain data. In: Proceedings of Computer Vision and Pattern Recognition, New York City, USA, IEEE (1993) 515–523
6. Westin, C.F.: A Tensor Framework for Multidimensional Signal Processing. PhD thesis, Linköping University (1994)
7. Rohr, K.: On 3D differential operators for detecting point landmarks. Image and Vision Computing 15 (1997) 219–233
8. Ruiz-Alzola, J., Westin, C., Warfield, S., Nabavi, A., Kikinis, R.: Nonrigid registration of 3D scalar, vector and tensor medical data. In: Third International Conference On Medical Robotics, Imaging and Computer Assisted Surgery. (2000) 541–550
9. Harris, C., Stephens, M.: A combined corner and edge detector. In: Proceedings of the Fourth Alvey Vision Conference. (1988) 147–151
10. Ruiz-Alzola, J., Kikinis, R., Westin, C.F.: Detection of point landmarks in multidimensional tensor data. Signal Processing 81 (2001) 2243–2247
11. Kovačič, S., Bajcsy, R.: Chapter 3 in: Brain Warping. A. Toga (ed.). Academic Press (1996) 45–65
12. Suárez, E., Cárdenes, R., Alberola, C., Westin, C.F., Ruiz-Alzola, J.: A general approach to nonrigid registration: decoupled optimization. In: 23rd Annual International Conference of the IEEE Engineering in Medicine and Biology Society. (2000)
13. Poggio, T., Torre, V., Koch, C.: Computational vision and regularization theory. Nature (1985) 314–319
14. Kjems, U.: Parallel Matlab (2000) http://bond.imm.dtu.dk/plab/.
15. Cocosco, C., Kollokian, V., Kwan, R.S., Evans, A.: Brainweb: Online interface to a 3D MRI simulated brain database. In: Neuroimage. Volume 5 of 425., Copenhagen (1997)
16. Knutsson, H.: Representing local structure using tensors. In: The 6th Scandinavian Conference on Image Analysis, Oulu, Finland (1989) 244–251
Inter-subject Registration of Functional and Anatomical Data Using SPM

P. Hellier1, J. Ashburner2, I. Corouge1, C. Barillot1, and K.J. Friston2

1 Projet Vista, IRISA/INRIA-CNRS, Rennes, France
2 Functional Imaging Lab, Wellcome Department of Imaging Neuroscience, London, UK
Abstract. This paper is concerned with inter-subject registration of anatomical and functional brain data, and extends our previous work [7] on the evaluation of inter-subject registration methods. The paper evaluates the SPM spatial normalization method [1], which is widely used by the neuroscience community. This paper also extends the previous evaluation framework to functional MEG data. The impact of three different registration methods on the registration of somatosensory MEG data is studied. We show that the inter-subject functional variability can be reduced with inter-subject non-rigid registration methods, which is in accordance with the hypothesis that part of the inter-subject functional variability is encoded in the inter-subject anatomical variability. Keywords: Anatomical and functional atlases, non-rigid registration, spatial normalization, MR, MEG.
1 Introduction
This paper is concerned with inter-subject registration of anatomical and functional brain data. Traditionally addressed with paper-based atlases, the problem of inter-subject comparison can now be tackled with electronic brain atlases [5,10,14]. This has been made possible because digital images of the brain, either anatomical or functional, are now available, and also thanks to the development of computers, which can now cope with enormous datasets. Despite this progress, one still has to face the difficult problem of building such an atlas: the registration of brains of different subjects, usually achieved via registering MR brain images. Brain atlases classically rely on a template, which can also be a reference subject. Contrary to traditional paper-based atlases, electronic atlases can evolve, since new subjects can still be included in the atlas in the following way: once registered with the template, anatomical and functional data associated with the subject (i.e. segmentation maps, or functional data) can enrich the atlas characteristics (probability of an anatomical structure being present, or probability of functional activations, respectively). The quality of the atlas (in terms of accuracy and reliability) surely depends on the registration process, since the latter makes it possible to encode and decrease the anatomical inter-subject variability. Various registration methods abound in the literature, and the reader can find in [9] a comprehensive survey of these methods. Among them, SPM [1] is renowned and widespread (more than 1500 citations between 1990 and 1999 and more than 1000 SPM99 installed versions). SPM can be downloaded free of charge [3], and is a standard tool for researchers interested in neuroscience.
This article is an extension of our previous work on validation of inter-subject brain registration [7]. In [7], we designed global and local measures to assess the registration results of 6 methods on a database of 18 subjects. The current paper evaluates the SPM99 spatial normalization method, and also extends the evaluation framework to include a validation based on dipoles localized with functional MEG.
2 SPM Registration Method
The SPM spatial normalization approach estimates warps by matching each skull-stripped image to the skull-stripped reference. Registration involves minimizing the mean squared difference between the images, which had previously been smoothed by convolving with an isotropic 8 mm FWHM Gaussian kernel. The first step of each registration is to estimate a 12-parameter affine transformation, in which excessive zooms and shears are penalized by adding a regularization term to the cost function [2]. The next step involves nonlinear registration, which corrects for gross differences in brain shape that could not be accounted for by the affine registration alone. These warps are modeled by a linear combination of low-frequency cosine transform basis functions [1]. Displacements in each direction are parameterized by 392 basis function coefficients, making 1176 parameters in total. Regularization is obtained by minimizing the membrane energy of the warps. Other than matching skull-stripped images without any voxel-specific weighting, the default settings of SPM99 [3] are employed throughout. The database is composed of 18 subjects, among which one has been chosen to be the reference. The registration has been performed by matching each subject (source volume) to the reference (target volume).
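To illustrate the parameterization, the sketch below builds the 1-D discrete cosine transform (DCT) basis functions from which such low-frequency warps are assembled. This is a generic DCT-II construction, not SPM's code, and the 8 × 7 × 7 = 392 split per direction shown in the comment is our assumption for illustration only.

```python
import numpy as np

def dct_basis(n, k):
    """First k 1-D DCT-II basis functions sampled at n voxel positions
    (orthonormal columns). Sketch of a low-frequency warp basis."""
    x = (np.arange(n) + 0.5) * np.pi / n
    B = np.empty((n, k))
    B[:, 0] = 1.0 / np.sqrt(n)
    for m in range(1, k):
        B[:, m] = np.sqrt(2.0 / n) * np.cos(m * x)
    return B

# A separable low-frequency displacement in one direction is then
# d[x, y, z] = sum_{i,j,l} q[i, j, l] * Bx[x, i] * By[y, j] * Bz[z, l],
# e.g. with 8 * 7 * 7 = 392 coefficients q per direction (assumed split).
```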
3 Results on Anatomical Data
In this section, the SPM registration method is evaluated according to the global and local measures presented in our previous work [7]. For each measure, we will very briefly recall its principle. The SPM results will be discussed in relation to those previously obtained with other methods [7]. In order to assess the registration process, anatomical structures have been extracted for each subject of the database. The evaluation is based upon these structures, and more precisely upon how they are matched after registration. Two types of measures have been designed: global and local. Local measures focus on the matching of cortical sulci (extracted with the method described in [8]), which are particularly relevant for studying the functional organization of the brain. It should also be noted that the registration and evaluation processes are independent, leading to an objective evaluation.
3.1 Global Measures
Average Volume. Among the subjects of the database, one subject was chosen to be the "template", or reference subject. After registration, each subject was deformed toward the reference subject (using trilinear interpolation), and an "average" volume computed. Orthogonal sections through this average volume are presented in figure 1.
Fig. 1. Average volume for the SPM registration method, to be compared with the reference subject.

Table 1. Tissue overlap and correlation of Lvv after registration. The mean and standard deviation of each measure over the database of subjects are given.

                           Mean     St.Dev.
Overlap of grey matter     94.11    0.062
Overlap of white matter    95.71    0.038
Correlation of Lvv         0.246    0.0027
The value of the mean square error between the average volume and the reference subject is 956.1 (computed only for voxels within the brain of the reference)¹. Tissue Overlap and Correlation of Lvv Volumes. We designed measures based on the overlap of grey matter and white matter after registration (the total performance measure has been retained), as well as on the correlation of Lvv volumes after registration. The Lvv, which is related to curvature information, has proved to be related to the cortical anatomy. The results of these two measures are presented in table 1.
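As an illustration of the tissue-overlap idea (the exact "total performance" definition is given in [7]; the Dice coefficient below is a common stand-in, not necessarily the measure used here):

```python
import numpy as np

def dice_overlap(a, b):
    """Dice overlap of two binary tissue masks (e.g. grey matter of a
    registered subject vs. the reference): 2|A∩B| / (|A|+|B|).
    A generic overlap measure, shown for illustration only."""
    a, b = np.asarray(a, bool), np.asarray(b, bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
```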
3.2 Local Measures
In addition to global measures, we designed local measures based on the matching of cortical sulci after registration. Sulci were extracted using the method described in [8] and modeled as B-spline surfaces. They are relevant landmarks for studying the functional organization of the brain. Visualization of Deformed Sulci. We visualize in figure 2 how the sulci of each subject match the corresponding sulci of the reference subject after registration. The sulci in figure 2, one per subject, should ideally match the corresponding white sulcus. Numerical Evaluation. Beyond visualization, we can numerically assess how well sulci are matched after registration. Because sulci are defined by their control points, a distance between sulci can be defined as the distance between control points. Furthermore, we perform a Principal Component Analysis to characterize shape differences. These results are given in table 2.
¹ This measure is not objective, in the sense that it is related to image intensities that are used to drive the registration process.
[Fig. 2 panels: (a) central sulci, (b) superior frontal sulci, (c) lateral sulci]
Fig. 2. For a given sulcus, the corresponding sulcus of each subject is deformed toward the reference subject. The corresponding reference sulcus is shown in white. Neighboring sulci ((a): postcentral sulcus and precentral sulcus; (c): superior temporal sulcus) have also been represented to illustrate the order of magnitude of the variability after registration.
Table 2. Numerical evaluation of the distance between registered and corresponding reference sulci. The first column indicates the average distance over all sulci and all subjects (the distance is expressed in voxels, the spatial resolution of the voxels being 0.93 mm). The last three columns indicate the normalized trace of the covariance matrix for specific sulci: central sulci, superior frontal sulci and lateral (Sylvian) sulci. The Principal Component Analysis provides a metric for shape differences.

Mean distance    central    superior frontal    Sylvian
8.7              475        589                 930
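The covariance-trace statistic of Table 2 can be computed along these lines; this is a hedged sketch of the idea, since the paper's exact normalization is not specified here.

```python
import numpy as np

def shape_variability(ctrl_pts):
    """ctrl_pts: array (n_subjects, n_points * 3) of registered sulcal
    control points, one row per subject. Returns the trace of the
    covariance matrix, a scalar summary of residual shape variability;
    its eigenvectors would give the principal modes. Sketch only."""
    X = ctrl_pts - ctrl_pts.mean(axis=0)
    cov = X.T @ X / (len(ctrl_pts) - 1)
    return np.trace(cov)
```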
3.3 Partial Conclusion
For global measures, the results obtained with the SPM registration method are comparable to those obtained with methods having similar degrees of freedom [7] (that is to say, a similar number of estimated independent variables). Local measures have given significantly better results in terms of matching cortical sulci: the distance measure, as well as the shape metric, show that the SPM method is on average 13% better than the other non-rigid registration methods.
4 Results on Functional Data
Spatial normalization is a crucial step for performing measurements across or between subjects. Previous work [6] has already shown that higher-dimensional warping produces averaged activations with higher amplitude and more compact spatial localization. This work intends to give insight into how much of the inter-subject functional variability can be reduced with registration methods. The underlying assumption is that the functional inter-subject variability can be decomposed into an anatomical variability (which may be estimated with registration methods) and a residual functional variability. For this study, we have chosen one of the simplest MEG activation protocols, whose anatomical localization is well known. For each subject, the most significant dipole was retained and deformed toward the reference subject, according to the spatial transform
obtained by registering the MR images. We studied the variability of the dipole cloud after registration. It is difficult to have an idea of the initial variability, so we chose to present results for three registration methods that are either available on the web or straightforward to implement:
– Method M, a rigid transformation by maximization of mutual information [4,15]. This method is evaluated to provide a "baseline" of the inter-subject functional variability.
– Method P, the Talairach and Tournoux proportional squaring system [13]. This leads to a piecewise affine transformation, defined on 12 pieces. We will refer to this method as the T&T registration method.
– Method S, the SPM registration method [1].
4.1 Functional MEG Data
For all methods, the functional data to register are MEG dipoles corresponding to a somatosensory activation of the right-hand fingers (thumb, index, little finger), acquired for 15 volunteers out of the 18 subjects of our database, made up of 35 ± 10 year old healthy males, all right-handed. MEG current dipoles were reconstructed using a spatio-temporal algorithm [11] and selected by choosing the most significant one in the 45 ± 15 ms window. Thus, three dipoles, one per finger, are available for each subject. The somatosensory paradigm chosen here is a very simple, well-known one and is thus well suited to our study, since our objective is not to explain complex physiological processing but rather to study the impact of registration methods. Despite the simplicity of the protocol, reconstruction of the sources in MEG [11] and MEG/MRI registration [12] remain challenging and generate errors. Because we aim to compare deformed dipoles with the anatomy of the reference subject (in particular the sulci of the central region), we excluded dipoles that were not localized within the postcentral gyrus. This does not mean that we have eradicated reconstruction errors, but we can at least affirm that the original dipoles are correctly located. Therefore, we have kept 9 subjects for the little finger, 10 subjects for the index and 12 subjects for the thumb. As a consequence, the variability measured at the end of this process cannot be considered an "absolute" value, but is to be trusted when comparing methods.
4.2 Localization and Variability of Deformed Dipoles
We first visualize where deformed dipoles are located with respect to the anatomy of the reference subject. In figure 3, the sulci of the central region are shown, along with the deformed dipoles. We can numerically assess the variability of dipoles after registration by computing the covariance matrix and its determinant, since the latter expresses the entire variation. These numerical results are presented in table 3. Finally, we can combine visualization and numerical results in a compact and visual way: for each group of dipoles (one per method and per finger), we compute a "mean" dipole. The dispersion of dipoles can be represented around this mean dipole. Along each axis, we compute the empirical standard deviation σ of the dipole coordinates.
Fig. 3. For each method and for each finger (top: little finger; middle: index; bottom: thumb), the deformed dipoles can be compared with the anatomy of the reference subject (sulci of the central region).

Table 3. Numerical results on the dispersion of dipoles. For each group of dipoles (one group per method and per finger), the determinant of the covariance matrix expresses the entire variation.

Method    Little finger (det.)    Index (det.)    Thumb (det.)
M         49866                   140585          36782
P         23561                   52849           17512
S         44097                   58617           21216
Under the assumption of a Gaussian distribution, more than 99.7% of the dipoles fall within the interval [−3σ, 3σ]. Visually, this amounts to tracing an ellipsoid centered on the mean dipole, whose radius along each axis is three times the standard deviation of the dipole distribution along this axis. This is presented in Figure 4.
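Both statistics are direct to compute from the deformed dipole positions; a small sketch (our naming, for illustration):

```python
import numpy as np

def dipole_dispersion(pts):
    """pts: (n, 3) array of deformed dipole positions for one method and
    one finger. Returns the determinant of the 3x3 covariance matrix
    (the 'entire variation' of Table 3) and the per-axis 3*sigma ellipsoid
    radii drawn in Figure 4. Illustrative sketch."""
    C = np.cov(pts, rowvar=False)          # empirical covariance matrix
    radii = 3.0 * np.sqrt(np.diag(C))      # 3*sigma radius along each axis
    return np.linalg.det(C), radii
```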
5 Conclusion
This paper has extended our previous work [7] on the evaluation of inter-subject nonrigid registration methods. The SPM registration method [1], among the most popular
Fig. 4. For each method and each finger (top: little finger, middle: index; bottom: thumb), the variability of the deformed dipoles is represented by an ellipsoid. Under a Gaussian hypothesis, the probability of a deformed dipole being in the ellipsoid is more than 0.997.
and widespread in the neuroscience community, has been evaluated with the global and local measures of [7]. For global measures, the results of the SPM registration method are in accordance with the dimension of the transformation, compared to previously evaluated methods [7]. Local measures, based on the matching of cortical sulci, show that the SPM registration method performs well, the results being significantly better than those obtained with previous methods [7]. This paper also investigates the impact of spatial normalization methods on the registration of functional data. Somatosensory MEG data were acquired and deformed toward the reference subject according to the spatial registration results. The residual inter-subject variability can then be measured. In this study, a rigid registration method serves as a comparison basis for two registration methods: the T&T proportional squaring system and the SPM registration method. The underlying assumption is that part of the inter-subject functional variability is encoded by the inter-subject anatomical variability. The study shows that T&T and SPM reduce the inter-subject variability, compared to the rigid transformation. The T&T proportional scaling system seemed to be slightly more accurate than the SPM approach for registering functional data, at least in the central area (since the T&T registration is by construction most precise in this area).
References
1. J. Ashburner and K.J. Friston. Nonlinear spatial normalization using basis functions. Human Brain Mapping, 7(4):254–266, 1999.
2. J. Ashburner, P. Neelin, D.L. Collins, A. Evans, and K.J. Friston. Incorporating prior knowledge into image registration. Neuroimage, 6:344–352, 1997.
3. http://www.fil.ion.ucl.ac.uk/spm/.
4. A. Collignon, D. Vandermeulen, P. Suetens, and G. Marchal. 3D multi-modality medical image registration using feature space clustering. In Proc. of CVRMed, pages 195–204, Nice, France, 1995.
5. A. Evans, L. Collins, and B. Milner. An MRI-based stereotaxic atlas from 250 young normal subjects. Soc. Neuroscience abstract, 18:408, 1992.
6. J.C. Gee, D.C. Alsop, and G.K. Aguirre. Effect of spatial normalization on analysis of functional data. In K. M. Hanson, editor, Proc. SPIE MI 1997: IP, SPIE (3034), pp 550–560, 1997.
7. P. Hellier, C. Barillot, I. Corouge, B. Gibaud, G. Le Goualher, D.L. Collins, A. Evans, G. Malandain, and N. Ayache. Retrospective evaluation of inter-subject brain registration. Proc. of MICCAI, LNCS number 2208, pp 258–265, 2001.
8. G. Le Goualher, C. Barillot, and Y. Bizais. Modeling cortical sulci with active ribbons. IJPRAI, 8(11):1295–1315, 1997.
9. J. Maintz and M.A. Viergever. A survey of medical image registration. Medical Image Analysis, 2(1):1–36, 1998.
10. J. Mazziotta, A. Toga, A. Evans, P. Fox, and J. Lancaster. A probabilistic atlas of the human brain: theory and rationale for its development. Neuroimage, 2:89–101, 1995.
11. D. Schwartz, D. Badier, J.M. Bihoué, and A. Bouliou. Evaluation with realistic sources of a new MEG-EEG spatio-temporal localization approach. Brain Topography, 11(4):279–289, 1999.
12. D. Schwartz, E. Poiseau, D. Lemoine, and C. Barillot. Registration of MEG/EEG data with 3D MRI: Methodology and precision issues. Brain Topography, 9(2), 1996.
13. J. Talairach and P. Tournoux. Co-planar Stereotaxic Atlas of the Human Brain. Georg Thieme Verlag, Stuttgart, 1988.
14. P. Thompson, R. Woods, M. Mega, and A. Toga. Mathematical/computational challenges in creating deformable and probabilistic atlases of the human brain. Human Brain Mapping, 9:81–92, 2000.
15. P. Viola and W. Wells. Alignment by maximization of mutual information. In Proc. ICCV, pages 15–23, 1995.
Evaluation of Image Quality in Medical Volume Visualization: The State of the Art

Andreas Pommert and Karl Heinz Höhne

Institute of Mathematics and Computer Science in Medicine (IMDM), University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
{pommert,hoehne}@uke.uni-hamburg.de
Abstract. For applications of volume visualization in medicine, it is important to assure that the 3D images show the true anatomical situation, or at least to know about their limitations. In this paper, various methods for the evaluation of image quality are reviewed. They are classified based on the fundamental terms of intelligibility and fidelity, and discussed with respect to what clues they provide on how to choose parameters or improve imaging and visualization procedures.
1 Introduction
Volume visualization (VV) of tomographic volume data, as obtained in computed tomography (CT) or magnetic resonance imaging (MRI), is an important aid for diagnosis, treatment planning, surgery rehearsal, education, and research (fig. 1). For clinical applications, it is of course important to assure that the 3D images really show the true anatomical situation, or at least to know about their limitations. Unfortunately, the resulting images depend on a large number of parameters, including pixel size, filter kernel of the scanner, slice distance and thickness, interpolation method, threshold (or other segmentation parameters), and gradient operators. Variation of these parameters may result in very different images.
Fig. 1. Examples of volume visualization in craniofacial surgery (left, from CT), virtual colonoscopy (middle, from MRI) and psychiatry research (right, from MRI/PET).

But how good are these images?
In medical image computing, validation of image quality is a major concern, as has been pointed out in recent papers [13], panels [16], and dedicated meetings [7]. However, compared to the large number of papers dealing with methods and applications of VV, only a few papers focus on the resulting image quality. Moreover, this field is characterized by a multitude of definitions and measures of image quality, goals, investigation methods, and considered processing steps, such that the different approaches are often difficult to compare. In this paper, the state of the art in the evaluation of image quality in medical VV is reviewed. A classification of methods is developed, and methods are discussed with respect to what clues they provide on how to choose parameters or improve imaging and visualization procedures.
1.1 Aspects of Image Quality
When is an image good or bad? A straightforward definition of image quality is based on the question: how well does an image communicate the information required by an observer? This is called the intelligibility of the image [34]. For example, an image used in diagnostic imaging is good if it enables an observer to make the right diagnosis (diagnostic image quality). A more technical definition of image quality relates to the question: how much does an image deviate from an ideal image of the scene? This is called the fidelity of the image (technical image quality) [34]. Intelligibility and fidelity are determined by comparing the diagnosis or 3D image to an otherwise determined ground truth (figure 2). Both aspects of image quality are discussed in the following.
Fig. 2. Evaluation of intelligibility (diagnostic image quality) and fidelity (technical image quality) by comparison to a reference (dashed lines).
2 Intelligibility
In medical imaging, the intelligibility of an image relates mostly to a diagnostic task (fig. 2). Subjective studies include the comparison of different imaging and visualization techniques, such as CT and volume rendering, for applications e.g.
600
A. Pommert and K.H. H¨ ohne
in craniofacial or ENT surgery [1,35]. These papers emphasize the better understanding of spatial relations provided by VV, but give little further insight, since the true anatomical situation is generally not known. The same limitation applies to various papers comparing different processing parameters [26]. Objective investigations of intelligibility for diagnostic tasks are based on blind studies. Image quality can thus be measured in terms of diagnostic accuracy, sensitivity, specificity, or ROC index. With respect to VV, such studies were pioneered by Vannier et al. [38], who compared various imaging modalities and visualization techniques (e.g. x-ray, CT, 3D depth shaded, 3D gradient shaded, 3D volume shaded) for various tasks, including diagnosis of craniosynostosis and fractures [14,39]. In most cases, VV compared favourably. Furthermore, it could be shown that VV accelerated the speed of establishing a diagnosis and improved the localization accuracy of the findings. For screening applications, a high sensitivity is most critical. The sensitivity of virtual colonoscopy, based on CT data, is investigated e.g. in [15,25]. Both studies show a sensitivity similar to that of a real colonoscopy. In the clinical literature, the definition of image quality in terms of intelligibility is generally accepted [6]. However, from a more technical point of view, this definition has some problems. First, results strongly depend on factors outside the image, such as the observer's experience and the task. Second, no measures are at hand for application areas other than diagnostics, such as therapy planning or surgical planning, for which VV methods are most used. Third, observer studies are extremely costly. For certain tasks, such as the detection of small signals in nuclear medicine, mathematical model observers were developed [2]. However, the much more complex visual and cognitive tasks involved in understanding perspective 3D images are only little understood so far. Fourth, results of such studies give little or no clue on how to choose parameters or improve imaging and visualization procedures. An exhaustive testing of all possible settings is hardly feasible, due to the high costs.
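For reference, the blinded-study figures of merit mentioned above are computed from the reading outcomes as follows (generic definitions, not tied to any of the cited studies):

```python
def diagnostic_performance(tp, fp, tn, fn):
    """Standard figures of merit from a blinded reading study;
    tp/fp/tn/fn are the true/false positive/negative counts."""
    sensitivity = tp / (tp + fn)               # fraction of true findings detected
    specificity = tn / (tn + fp)               # negatives correctly ruled out
    accuracy = (tp + tn) / (tp + fp + tn + fn) # overall agreement with truth
    return sensitivity, specificity, accuracy
```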
3 Image Fidelity
In order to avoid these problems, the more technical definition of image fidelity is used (fig. 2). In a simple case, VV images created using different parameters are compared, without precise knowledge of the anatomical situation. This approach is found in many papers, e.g. [27]. To get at least comparable results, standardized datasets are used, which may be distributed over the Internet [31]. For more thorough investigations, experimental studies may be based on cadavers, phantoms, or simulated data with known properties. Furthermore, algorithmic or mathematical studies may be carried out.
3.1 Experimental Studies
Anatomical Specimen. In a classic paper [19], 3D images of bone from CT are compared to photographs of the specimen. The investigation also covers
variation of slice orientation, distance, thickness, and scanner type. This way, a first description of artifacts such as pseudoforamina and stairsteps could be obtained. More detailed studies, also including measurements of distances, are presented in [10,20,32]. All these papers also describe the effects of varying a (usually small) subset of the parameter space. Phantoms. Instead of specimens, artificial phantoms may be used, which can be designed for special purposes. In [40], a cone-shaped phantom is used to investigate step artifacts in spiral CT. In [9], a special phantom is used to investigate the visualization of a stenosis of variable size. Simulated Data. Simulated data can be adjusted to certain needs even more easily, at the cost of losing some realism. Furthermore, if the data are created from a symbolic description, a perfect reference is at hand, which can be used to create error images showing local deviations [36]. A first question is how the simulated data are to be created. In some papers, the data are designed to be demanding for visualization algorithms, e.g. by containing high spatial frequencies [22,24,29]. However, more realistic data are obtained by modeling the point spread function of a real tomographic scanner [36], or even the whole physics of image acquisition, as in the MNI Brainweb project [12]. Another question is what should be measured. In 2D medical imaging, aspects such as image resolution or signal-to-noise ratio are often used. In volume visualization, other measures such as the accuracy of surface position or surface normal vectors [24,28,36,37] seem to be more appropriate. This way, typical ranges of error, depending on the choice of parameters, could be estimated.
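One such fidelity measure, the angular error of estimated surface normals against the known ground truth, can be computed as below; this is a generic sketch, not the measure of any particular cited study.

```python
import numpy as np

def normal_error_deg(n_est, n_ref):
    """Angular error in degrees between estimated and ground-truth surface
    normals (arrays of shape (..., 3)); useful for error images on
    simulated data where the true surface is known. Sketch only."""
    dot = np.sum(n_est * n_ref, axis=-1)
    norms = np.linalg.norm(n_est, axis=-1) * np.linalg.norm(n_ref, axis=-1)
    cos = np.clip(dot / np.maximum(norms, 1e-12), -1.0, 1.0)
    return np.degrees(np.arccos(cos))
```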
3.2 Algorithmic and Mathematical Studies
Image Space. In some papers, visualization algorithms are studied in detail "on paper", using simple example input. This way, it could be shown that the order of processing steps in volume rendering (classification or interpolation first) has a major influence on the obtained accuracy [43]. A step beyond such qualitative approaches are simple (usually 1D) quantitative models which cover major processing steps such as image acquisition, interpolation, and thresholding [5,33]. This way, it could be shown that the error of surface localization is well below voxel size, provided that a suitable threshold value is used. Furthermore, it could be shown that a poor threshold will likely cause visible artifacts, which can be used for further adjustment [33]. Other approaches include the investigation of the asymptotic error of interpolation functions, based on a Taylor series expansion [3,28]. Frequency Space. In signal and image processing, it is often useful to investigate the response of a system in frequency space [8]. In [3], the quality of
different interpolation filters is studied by comparing their amplitude spectra to an ideal low-pass. This concept is extended in [24], where metrics for smoothing and aliasing are introduced, yielding quantitative (but not very intuitive) descriptions. This approach can also be used for gradient filters [4]. As a major drawback, there is no representation in frequency space for nonlinear operations such as thresholding, so this kind of analysis does not cover all parts of the VV pipeline.
3.3 Predefined Error Bounds
An attractive solution with respect to image fidelity are rendering methods which guarantee the visualization results to be within certain predefined error bounds. One such approach is controlled precision volume rendering [30,42]. However, the controlled precision relates only to a mathematical approximation of the volume rendering integral common to these algorithms, and says little about the overall quality of the visualization. With respect to the interpolation of volume data, a new class of interpolation filters is developed in [23]. Under certain assumptions about the data (which may not be met in every case), it is shown that the intensity difference between the original and the reconstructed function does not exceed a predefined error. While this filter has some practical problems, including high computational costs, it is a promising first step in this direction.
4 Related Fields
There are some other fields closely related to VV which might be of interest here. For the segmentation of clinical data, a method for validation without ground truth was developed which is based on a statistical analysis of the results of segmentation by several experts; roughly, if a segmentation algorithm is within this variation, its results are accepted [11]. A tool for validation is presented in [17]. MRI brain images segmented by experts are available from Harvard University [21]. Within the Insight project, there are also proposals to use the Visible Human data for this purpose [44]. A special situation occurs for the validation of methods for image registration: using prepared test datasets, the true transformation can be determined using external markers, which are later removed from the test data, as was done in a well-known study [41]. A comparable situation occurs for the validation of scaling methods [18].
5 Conclusions
In this paper, we presented a brief overview of the methods published so far for an evaluation of image quality in medical volume visualization. This field turned out to be very multifaceted and complex. Nevertheless, some conclusions seem obvious:
Evaluation of Image Quality
603
– Methods for (objective) evaluation of image intelligibility are the method of choice from a clinical point of view, but provide little help in optimizing visualization procedures, partly due to the high costs of such studies. Artificial model observers will very likely not be available in this field for some time.
– For experimental studies of image fidelity, simulated data appear promising, since they may easily be adjusted to certain needs, and provide a means to create error images precisely showing local deviations, at a rather low cost. Essential is of course a realistic simulation of the tomographic image acquisition.
– Mathematical studies, as well as processing algorithms with predefined error bounds, are currently available only for certain steps of the volume visualization pipeline, especially for interpolation.

So far, no investigations are available which determine how to choose parameters for all steps of the volume visualization pipeline in order to achieve certain visualization results, e.g. with respect to size or contrast of the depicted structures. Development of such a "best practice guide" will certainly be a major task in the future.
References

1. Alder, M. E., Deahl, S. T., Matteson, S. R.: Clinical usefulness of two-dimensional reformatted and three-dimensionally rendered computerized images: Literature review and a survey of surgeons’ opinions. J. Oral Maxillofac. Surg. 53, 4 (1995), 375–386.
2. Barrett, H. H., Yao, J., Rolland, J. P., Myers, K. J.: Model observers for assessment of image quality. Proc. Natl. Acad. Sci. USA 90 (1993), 9758–9765.
3. Bentum, M. J., Lichtenbelt, B., Boer, M. A., Nijmeijer, A. G., Bosma, M., Smit, J.: Improving image quality of volume rendered three-dimensional medical data. In Kim, Y. (Ed.): Medical Imaging: Image Display, Proc. SPIE 2707. Newport Beach, CA, 1996, 32–43.
4. Bentum, M. J., Malzbender, T., Lichtenbelt, B. B.: Frequency analysis of gradient estimators in volume rendering. IEEE Trans. Visualization Comput. Graphics 2, 3 (1996), 242–254.
5. Bosma, M. K., Smit, J., Lobregt, S.: Iso-surface volume rendering. In Kim, Y., Mun, S. K. (Eds.): Medical Imaging 1998: Image Display, Proc. SPIE 3335. San Diego, CA, 1998, 10–19.
6. Bowyer, K. W.: Validation of medical image analysis techniques. In Sonka, M., Fitzpatrick, J. M. (Eds.): Handbook of Medical Imaging, 2, SPIE Press, Bellingham, WA, 2000, ch. 10, 567–607.
7. Bowyer, K. W., Loew, M. H., Stiehl, H. S., Viergever, M. A. (Eds.): Methodology of Evaluation in Medical Image Computing. Dagstuhl Seminar Report 301, Internationales Begegnungs- und Forschungszentrum für Informatik, Schloss Dagstuhl, 2001. (ISSN 0940-1121).
8. Bracewell, R. N.: The Fourier Transform and its Applications. 3. ed. McGraw-Hill International Editions, Singapore, 2000.
9. Brink, A. J., Lim, J. T., Mang, G., Heiken, J. P., Deyoe, A. J., Vannier, M. W.: Technical optimization of spiral CT for depiction of renal artery stenosis: In vitro analysis. Radiology 194 (1995), 157–163.
10. Cavalcanti, M. G. P., Haller, J. W., Vannier, M. W.: Three-dimensional computed tomography landmark measurement in craniofacial surgery planning: Experimental validation in vitro. J. Oral Maxillofac. Surg. 57 (1999), 690–694.
11. Chalana, V., Kim, Y.: A methodology for evaluation of image segmentation algorithms on medical images. In Loew, M. H., Hanson, K. M. (Eds.): Medical Imaging 1996: Image Processing, Proc. SPIE 2710. Newport Beach, CA, 1996, 178–189.
12. Collins, D. L., Zijdenbos, A. P., Kollokian, V., Sled, J. G., Kabani, N. J., Holmes, C. J., Evans, A. C.: Design and construction of a realistic digital brain phantom. IEEE Trans. Med. Imaging 17, 3 (1998), 463–468.
13. Duncan, J. S., Ayache, N.: Medical image analysis: Progress over two decades and the challenges ahead. IEEE Trans. Pattern Anal. Machine Intell. 22, 1 (2000), 85–105.
14. Fox, L. A., Vannier, M. W., West, O. C., Wilson, A. J., Baran, G. A., Pilgram, T. K.: Diagnostic performance of CT, MPR, and 3DCT imaging in maxillofacial trauma. Comput. Med. Imaging Graph. 19, 5 (1996), 385–395.
15. Garry, J. L., Reed, J. E., Johnson, C. D.: Performance of computed tomographic colonography improved by total quality management techniques. In Chen, C.-T., Clough, A. V. (Eds.): Medical Imaging 2000: Physiology and Function from Multidimensional Images, Proc. SPIE 3978. San Diego, CA, 2000, 183–194.
16. Gee, J. C., Haralick, R. M., Clarke, L. P., Fitzpatrick, J. M., Haynor, D. R., Ramesh, V., Viergever, M. A.: Performance evaluation of medical image processing algorithms. In Hanson, K. M. (Ed.): Medical Imaging 2000: Image Processing, Proc. SPIE 3979. San Diego, CA, 2000.
17. Gerig, G., Jomier, M., Chakos, M.: Valmet: A new validation tool for assessing and improving 3D object segmentation. In Niessen, W. J., Viergever, M. A. (Eds.): Medical Image Computing and Computer-Assisted Intervention, Proc. MICCAI 2001, Lecture Notes in Computer Science 2208, Springer-Verlag, Berlin, 2001, 516–523.
18. Grevera, G. J., Udupa, J. K.: An objective comparison of 3-D image interpolation methods. IEEE Trans. Med. Imaging 17, 4 (1998), 642–652.
19. Hemmy, D. C., Tessier, P. L.: CT of dry skulls with craniofacial deformities: Accuracy of three-dimensional reconstruction. Radiology 157, 1 (1985), 113–116.
20. Hopper, K. D., Pierantozzi, D., Potok, P. S., Kasales, C. J., TenHave, T. R., Meilstrup, J. W., Van Slyke, M. A., Mahraj, R., Westacott, S., Hartzel, J. S.: The quality of 3D reconstructions from 1.0 and 1.5 pitch helical and conventional CT. J. Comput. Assist. Tomogr. 20, 5 (1996), 841–847.
21. Internet Brain Segmentation Repository. Massachusetts General Hospital, Harvard University, Boston, MA, 2000. http://neuro-www.mgh.harvard.edu/cma/ibsr/.
22. Kim, K., Wittenbrink, C. M., Pang, A.: Extended specifications and test data sets for data level comparisons of direct volume rendering algorithms. IEEE Trans. Visualization Comput. Graphics 7, 4 (2001), 299–317.
23. Machiraju, R., Yagel, R.: Reconstruction error characterization and control: A sampling theory approach. IEEE Trans. Visualization Comput. Graphics 2, 4 (1996), 364–377.
24. Marschner, S. R., Lobb, R. J.: An evaluation of reconstruction filters for volume rendering. In Bergeron, R. D., Kaufman, A. E. (Eds.): Proc. IEEE Visualization ’94. Tysons Corner, VA, 1994, 100–107.
25. McFarland, E. G., Brink, J. A., Heiken, J. P., Balfe, D. M., Hirselj, D. A., Pilgram, T. K., Argiro, V., Littenberg, B.: Spiral CT colonography (virtual colonoscopy): Multiobserver study of different image display techniques compared to colonoscopy. In Chen, C.-T., Clough, A. V. (Eds.): Medical Imaging 1999: Physiology and Function from Multidimensional Images, Proc. SPIE 3660. San Diego, CA, 1999, 106–108.
26. McFarland, E. G., Brink, J. A., Loh, J., Wang, G., Argiro, V., Balfe, D. M., Heiken, J. P., Vannier, M. W.: Visualization of colorectal polyps with spiral CT colography: Evaluation of processing parameters with perspective volume rendering. Radiology 205 (1997), 701–707.
27. Meißner, M., Huang, J., Bartz, D., Mueller, K., Crawfis, R.: A practical evaluation of popular volume rendering algorithms. In Proc. 2000 Symposium on Volume Visualization and Graphics. Salt Lake City, UT, 2000, 81–90.
28. Möller, T., Machiraju, R., Mueller, K., Yagel, R.: Evaluation and design of filters using a Taylor series expansion. IEEE Trans. Visualization Comput. Graphics 3, 2 (1997), 184–199.
29. Moorhead, R. J., Zhu, Z.: Signal processing aspects of scientific visualization. IEEE Signal Processing Magazine 12, 5 (1995), 20–41.
30. Novins, K. L., Arvo, J.: Controlled precision volume integration. In Proc. 1992 Workshop on Volume Visualization. Boston, MA, 1992, 83–89.
31. Pfister, H., Lorensen, B., Bajaj, C., Kindlmann, G., Schroeder, W., Avila, L. S., Martin, K., Machiraju, R., Lee, J.: The transfer function bake-off. IEEE Comput. Graphics Appl. 21, 3 (2001), 16–22.
32. Pommert, A., Höltje, W.-J., Holzknecht, N., Tiede, U., Höhne, K. H.: Accuracy of images and measurements in 3D bone imaging. In Lemke, H. U. et al. (Eds.): Computer Assisted Radiology, Proc. CAR ’91, Springer-Verlag, Berlin, 1991, 209–215.
33. Pommert, A., Tiede, U., Höhne, K. H.: Accuracy of isosurfaces in volume visualization. In Girod, B. et al. (Eds.): Vision, Modeling, and Visualization, Proc. VMV 2000, IOS Press, Amsterdam, 2000, 365–371.
34. Pratt, W. K.: Digital Image Processing. 2. ed. John Wiley and Sons, New York, 1991.
35. Remy-Jardin, M., Remy, J., Artaud, D., Fribourg, M., Duhamel, A.: Volume rendering of the tracheobronchial tree: clinical evaluation of bronchographic images. Radiology 208 (1998), 761–770.
36. Tiede, U., Höhne, K. H., Bomans, M., Pommert, A., Riemer, M., Wiebecke, G.: Investigation of medical 3D-rendering algorithms. IEEE Comput. Graphics Appl. 10, 2 (1990), 41–53.
37. Udupa, J. K., Gonçalves, R. J.: Imaging Transforms for Volume Visualization. In Taylor, R. H. et al. (Eds.): Computer Integrated Surgery: Technology and Clinical Applications, MIT Press, Cambridge, MA, 1995, ch. 3, 33–57.
38. Vannier, M. W.: Evaluation of 3D Imaging. Crit. Rev. Diagn. Imaging 41, 5 (2000), 315–378.
39. Vannier, M. W., Pilgram, T. K., Marsh, J. L., Kraemer, B. B., Rayne, S. C., Gado, M. H., Moran, C. J., McAlister, W. H., Shackelford, G. D., Hardesty, R. A.: Craniosynostosis: Diagnostic imaging with three-dimensional CT presentation. Am. J. Neuroradiology 15, 10 (1994), 1861–1869.
40. Wang, G., Vannier, M. W.: Stair-step artifacts in three-dimensional helical CT: An experimental study. Radiology 191 (1994), 79–83.
41. West, J., Fitzpatrick, J. M., Wang, M. Y., Dawant, B. M., Maurer, C. R., Kessler, R. M., Maciunas, R. J., Barillot, C., Lemoine, D., Collignon, A., Maes, F., Suetens, P., Vandermeulen, D., van den Elsen, P. A., Napel, S., Sumanaweera, T., Harkness, B., Hemler, P. F., Hill, D. L. G., Hawkes, D. J., Studholme, C., Maintz, J. B. A., Viergever, M. A., Malandain, G., Pennec, X., Noz, M. E., Maguire, G. Q., Pollack, M., Pelizzari, C. A., Robb, R. A., Hanson, D., Woods, R. P.: Comparison and Evaluation of Retrospective Intermodality Brain Image Registration Techniques. J. Comput. Assist. Tomogr. 21, 4 (1997), 554–566.
42. Williams, P. L., Max, N. L., Stein, C. M.: A high accuracy volume renderer for unstructured data. IEEE Trans. Visualization Comput. Graphics 4, 1 (1998), 37–54.
43. Wittenbrink, C. M., Malzbender, T., Goss, M. E.: Opacity-weighted color interpolation for volume sampling. In Lorensen, W., Yagel, R. (Eds.): Proc. 1998 IEEE Symp. Volume Visualization. Research Triangle Park, NC, 1998, 135–142.
44. Yoo, T. S., Ackerman, M. J., Vannier, M.: Toward a common validation methodology for segmentation and registration algorithms. In Delp, S. L. et al. (Eds.): Medical Image Computing and Computer-Assisted Intervention, Proc. MICCAI 2000, Lecture Notes in Computer Science 1935, Springer-Verlag, Berlin, 2000, 422–431.
Shear-Warp Volume Rendering Algorithms Using Linear Level Octree for PC-Based Medical Simulation

Zhenlan Wang¹, Chee-Kong Chui¹, Chuan-Heng Ang², Wieslaw L. Nowinski¹

¹ Biomedical Imaging Lab, Singapore
[email protected]
² School of Computing, National University of Singapore, Singapore

Abstract. We describe a new algorithm for 3D, and potentially higher-dimensional, volume rendering. This algorithm takes advantage of the shear-warp factorization technique to reduce matrix computation and of a hierarchical data structure for better utilization of spatial coherence. The algorithm represents an improvement over existing algorithms in terms of performance, with little compromise in image quality. The algorithm benefits 3D volume rendering as well as multi-modality volume rendering. Initial work on comparison and validation of the algorithm has been done. The results and analysis show that the algorithm is promising. Our preliminary work in using this method for medical simulation is also discussed. The method is particularly suitable for PC-based medical simulation, where the computation load is high and the raw computing power is limited.
1
Introduction
Approximately eighty percent of all information perceived by humans comes through the eyes, and the human visual system is the most complex of all sensory modalities [1]. In medicine, visual information plays an essential role in accurate diagnosis and effective therapy planning. Medical images such as computed tomography (CT), magnetic resonance imaging (MRI) and nuclear medicine imaging are increasingly used in medical simulation for pre-treatment planning and outcome prediction. There is a demand for medical simulation to be executed on cost-effective desktop workstations. The research work reported here represents our effort to develop efficient visualization solutions for PC-based medical simulation systems [2, 3]. In medical image applications, direct volume rendering is considered superior to surface rendering [4]. There are two broad categories of volume rendering algorithms, known as object and image space methods. The shear-warp algorithm [5] is widely regarded as the fastest volume rendering method to date. There are various methods to speed up rendering by using parallel/distributed approaches [6], hardware acceleration [7], and texture mapping techniques [8]. 4D rendering is relatively new, particularly in medicine. Recent advances, including dynamic MRI and multi-modality imaging, make new demands for highly sophisticated and efficient visualization techniques. Visualization of an actual procedure with patient-specific 4D volume datasets and visualization of multi-modal datasets of the same patient are critical for accurate simulation with high confidence. Existing techniques lack a high-speed, low-cost multi-modality volume renderer as well as the 4D rendering that will be important in future medical simulation. For example, VolumePro [7] and texture mapping are restricted to a single dataset (single modality). Texture mapping does not support extensive shading.
A new volume rendering algorithm, called the LLO-based shear-warp algorithm, is presented in this paper. Both a suitable data structure and an efficient algorithm are explored to achieve fast 3D volume visualization of medical images for simulation on a standard PC. The algorithm takes advantage of both the shear-warp factorization technique and a hierarchical data structure. In contrast to conventional hierarchical approaches, the algorithm avoids hierarchical traversal during rendering. The volume data are encoded so that only the 3D regions containing imaging information need to be accessed and processed. Regions of both low presence and low variation can be visualized efficiently. The paper is organized as follows. The data structure and the algorithm are described in sections 2 and 3, respectively. Initial results are given in section 4. The application of our method to medical simulation is discussed in section 5, followed by the conclusion.
2
Linear Level Octree (LLO)
2.1
Data Structure
Octrees can be labeled with different schemes. Linear Level Octree (LLO) [9], which extends the linear octree scheme, is a labeling scheme with many advantages. LLO inherits all the advantages of linear octree. For instance, only leaf nodes are stored, which greatly reduces not only memory consumption but also, more essentially, the number of nodes that need to be processed. It implicitly encodes the location and size of the nodes, and the path from the root to the nodes. Therefore, it is easy to determine the relations between arbitrary spatial points and LLO nodes at low computing cost. More benefits of LLO will be mentioned in the description of the algorithm. A volume of size 2^n × 2^n × 2^n is put into a coordinate system, where n is the resolution of the raster. The back-bottom-left corner of the volume is located at the origin, and its edges are aligned with the coordinate axes. Each octant is labeled with a unique code key (L_i, x_i, y_i, z_i), where L_i is the level of the node, and x_i, y_i, and z_i are the x, y, z level-coordinates of the node, respectively. The code key of the root node is defined as (0, 0, 0, 0). Assume an arbitrary node A at level L_i has code key (L_i, x_i, y_i, z_i). Then the back-bottom-left subnode of A at level L_i + ∆L (∆L = 1, 2, …) has code key (L_i0, x_i0, y_i0, z_i0), where L_i0 = L_i + ∆L, x_i0 = x_i · 2^∆L, y_i0 = y_i · 2^∆L and z_i0 = z_i · 2^∆L. The code keys of A's other subnodes at level L_i + ∆L are given by

(L_ik, x_ik, y_ik, z_ik) = (L_i0, x_i0, y_i0, z_i0) + (0, δx_k, δy_k, δz_k), k = 0, …, 7,

where the offsets (δx_k, δy_k, δz_k) are (0,0,0), (1,0,0), (0,1,0), (1,1,0), (0,0,1), (1,0,1), (0,1,1), (1,1,1), respectively.
We define the distance between two adjacent voxels as one unit. By exploiting the label scheme of LLO, we can infer the following properties of an arbitrary octant node A with code key (L, x, y, z):
• Location = (x · 2^∆L, y · 2^∆L, z · 2^∆L), where ∆L = n − L. This is the absolute location coordinate (not the level coordinate) of A's back-bottom-left corner voxel in the coordinate system. Normally, it is also regarded as the location of node A.
• Size = 2^∆L, where ∆L = n − L. This is the side length of node A, or in other words, the number of voxels contained in one dimension of A.
• The code key of A's upper level (ancestor level), say level L_w (L_w < L), is (L_w, x/2^∆L, y/2^∆L, z/2^∆L), where ∆L = L − L_w.
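As a concrete illustration of this code-key arithmetic, here is a minimal Python sketch implementing the three properties above; the function names are ours, not from [9].

```python
def location(code, n):
    """Absolute back-bottom-left corner voxel of node (L, x, y, z)."""
    L, x, y, z = code
    s = 2 ** (n - L)              # side length of the node in voxels
    return (x * s, y * s, z * s)

def size(code, n):
    """Side length of the node in voxels."""
    L, _, _, _ = code
    return 2 ** (n - L)

def ancestor(code, Lw):
    """Code key of the enclosing node at a higher level Lw < L."""
    L, x, y, z = code
    s = 2 ** (L - Lw)
    return (Lw, x // s, y // s, z // s)

# Example: in a 256**3 volume (n = 8), node (3, 5, 2, 7) starts at voxel
# (160, 64, 224), spans 32 voxels per side, and has level-1 ancestor (1, 1, 0, 1).
assert location((3, 5, 2, 7), 8) == (160, 64, 224)
assert size((3, 5, 2, 7), 8) == 32
assert ancestor((3, 5, 2, 7), 1) == (1, 1, 0, 1)
```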
2.2
Conversion of Volume to LLO
To take advantage of the spatial coherence in the volume, LLO is employed in our algorithm. Before rendering, the classified volume is converted into the LLO representation. We defined the following leaf criteria for LLO generation:
• The smallest octant is of size 2 × 2 × 2, i.e., each dimension of the octant contains at least 2 voxels.
• Let Max and Min be the maximum and minimum value/intensity of all the voxels contained in an octant, respectively. The condition Max − Min ≤ T must be satisfied, where T is a user-predefined threshold (T ≥ 0).
• If all the voxels contained in an octant are transparent, the octant is a “white” node and is not stored. Otherwise, it is a “black” node and is stored.
Besides the leaf node criteria, the information saved in each leaf node (i.e., the octant data structure) is also an important factor for efficient volume rendering. We save the following information with each leaf node:
• The code key of the leaf node, (L_i, x_i, y_i, z_i).
• The eight vertex voxels of the leaf node. We store with the leaf node the eight voxels located at the corners of the octant, called vertex voxels. This is based on the observation that since a group of neighboring voxels can be organized into a leaf octant, their values must be homogeneous (variation under a threshold value), so eight vertex voxels are enough to represent the sub-volume.
Thus, with the leaf node criteria and octant data structure, the whole volume is encoded into the LLO, and the original volume dataset is no longer needed.
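The following sketch illustrates how a cubic 2^n volume could be encoded into "black" LLO leaves according to these criteria; the [z, y, x] indexing convention, the transparency mask, and all names are our assumptions, not the authors' implementation.

```python
import numpy as np

def build_llo(vol, transparent, T, L=0, x=0, y=0, z=0, leaves=None):
    """Recursively encode one octant (vol, a numpy array indexed [z, y, x])
    into LLO leaf nodes. transparent: boolean mask of the same shape;
    T: homogeneity threshold from the leaf criteria above."""
    if leaves is None:
        leaves = []
    if transparent.all():
        return leaves                          # "white" node: not stored
    side = vol.shape[0]
    if side == 2 or int(vol.max()) - int(vol.min()) <= T:
        corners = vol[::side - 1, ::side - 1, ::side - 1]  # eight vertex voxels
        leaves.append(((L, x, y, z), corners.copy()))      # "black" leaf
        return leaves
    h = side // 2
    for dz in (0, 1):                          # recurse into the eight octants
        for dy in (0, 1):
            for dx in (0, 1):
                s = (slice(dz * h, (dz + 1) * h),
                     slice(dy * h, (dy + 1) * h),
                     slice(dx * h, (dx + 1) * h))
                build_llo(vol[s], transparent[s], T,
                          L + 1, 2 * x + dx, 2 * y + dy, 2 * z + dz, leaves)
    return leaves
```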
3
LLO-Based Shear-Warp Algorithm
3.1
Rendering Pipeline
Figure 1 shows the pipeline of our LLO-based shear-warp algorithm. An LLO can be either encoded from volume data or loaded directly from the storage media where a previously encoded LLO has been saved. The rendering procedure is enclosed in the dashed box. After the LLO is built, octants are output one by one according to the current view direction. A shear-warp renderer is employed for octant rendering. The contribution of each octant is composited into an intermediate image. After all the octants are processed, the intermediate image is warped to generate the final image.
Fig. 1. Rendering pipeline of the LLO-based shear-warp algorithm
3.2
Display Order and Traversal of LLO
In this algorithm, we take advantage of an object-order approach to accelerate image-order volume rendering. Octants instead of voxels become the primitive visualization elements, and the number of nonempty octants is far less than the number of voxels. Because the empty regions do not need to be processed or even checked, processing time can be reduced remarkably. This is based on the fact that, for a given set of parallel viewing directions, the octants can be processed in a specific order without affecting the resulting image, and there are only finitely many (eight) sets of such viewing directions. To distinguish the child nodes of a parent, a distinctive number is assigned to each of them, as shown in Figure 2. A vector (Vx, Vy, Vz) is used to represent view directions. The display orders of nodes for the different view directions are given in Table 1. Node numbers with the same display priority can be processed in different orders without affecting the resulting image.
Fig. 2. Code of child nodes
(Vx, Vy, Vz)      Display order
(<0, <0, <0)      7, 3, 5, 6, 1, 2, 4, 0
(>0, <0, <0)      6, 2, 4, 7, 0, 3, 5, 1
(<0, >0, <0)      5, 1, 4, 7, 0, 3, 6, 2
(>0, >0, <0)      4, 0, 5, 6, 1, 2, 7, 3
(<0, <0, >0)      3, 1, 2, 7, 0, 5, 6, 4
(>0, <0, >0)      2, 0, 3, 6, 1, 4, 7, 5
(<0, >0, >0)      1, 0, 3, 5, 2, 4, 7, 6
(>0, >0, >0)      0, 1, 2, 4, 3, 5, 6, 7

Table 1. Display orders of nodes
With the node-display-order table, the volume data encoded by the LLO can be traversed easily with a recursive algorithm. By further exploiting Table 1, we find that there are only four distinctive display orders; the others are simply the same orders reversed. Therefore, before rendering, the LLO can be traversed once for each of the four distinctive view directions, and pointers to the leaf nodes are stored in order in four arrays. During rendering, it is then no longer necessary to traverse the LLO: according to the current view direction, one of the four display order arrays is selected, and each octant is visualized from the beginning of the array to the end, or in reverse (Figure 1). Since the traversal of the hierarchical data structure is the most time-consuming operation, and is also regarded as the major disadvantage of hierarchical data structures [10], a substantial reduction of computing cost is achieved by this improvement. A sketch of this view-dependent ordering is given below.
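The sketch stores the Table 1 entries in a lookup keyed by the signs of the view vector; the octree node attributes are illustrative, and the four-array optimization described above is indicated in the comments.

```python
ORDERS = {  # signs of (Vx, Vy, Vz) -> child display order, from Table 1
    (-1, -1, -1): (7, 3, 5, 6, 1, 2, 4, 0),
    (+1, -1, -1): (6, 2, 4, 7, 0, 3, 5, 1),
    (-1, +1, -1): (5, 1, 4, 7, 0, 3, 6, 2),
    (+1, +1, -1): (4, 0, 5, 6, 1, 2, 7, 3),
    (-1, -1, +1): (3, 1, 2, 7, 0, 5, 6, 4),
    (+1, -1, +1): (2, 0, 3, 6, 1, 4, 7, 5),
    (-1, +1, +1): (1, 0, 3, 5, 2, 4, 7, 6),
    (+1, +1, +1): (0, 1, 2, 4, 3, 5, 6, 7),
}

def display_order(view):
    """Child display order for a parallel view direction (Vx, Vy, Vz).
    Only four leaf-pointer arrays need to be precomputed, since opposite
    view directions visit the leaves in reverse order (up to ties between
    children of equal display priority)."""
    signs = tuple(1 if c > 0 else -1 for c in view)
    return ORDERS[signs]

def traverse(node, view, visit):
    """Recursive traversal collecting octants in display order (node with
    assumed attributes is_leaf and children, a list of 8 nodes or None)."""
    if node is None:
        return
    if node.is_leaf:
        visit(node)
        return
    for k in display_order(view):
        traverse(node.children[k], view, visit)
```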
3.3
Visualization of Octants with Shear-Warp Rendering
The shear-warp algorithm is used for octant rendering. Octants are transformed into the sheared object space. Given the transformation of the parent node, the transformation of its sub-octants can be efficiently computed. In the sheared space, the octant is projected onto an intermediate image plane, which composites the projections of all the octants. The intermediate image is run-length encoded so that opaque pixels can be skipped quickly. However, run-length encoding of the volume is not needed, as only non-transparent sub-volumes are passed to the renderer. The shaded samples are interpolated within the octant. Since all the samples in one octant are interpolated from the same eight vertices, computing cost can be greatly reduced by incremental computation. For efficient gradient interpolation, the gradients at all eight vertices of each octant are computed and saved with the octant at a preprocessing stage. After all the octants have been processed, the intermediate image can be warped into the final image with an inexpensive bilinear filter. This completes the rendering procedure.
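For instance, sampling inside an octant reduces to trilinear interpolation from the eight stored vertex voxels; since every sample in the octant shares these vertices, the weights can also be updated incrementally along a ray. A plain, non-incremental sketch with an assumed corner indexing follows.

```python
def trilinear(c, u, v, w):
    """Interpolate a sample at local coordinates (u, v, w) in [0, 1]^3
    from the eight vertex voxels c[i][j][k] of an octant (i, j, k being
    the local corner indices; indexing convention is our assumption)."""
    return ((1-u)*(1-v)*(1-w)*c[0][0][0] + u*(1-v)*(1-w)*c[1][0][0]
          + (1-u)*v*(1-w)*c[0][1][0]     + u*v*(1-w)*c[1][1][0]
          + (1-u)*(1-v)*w*c[0][0][1]     + u*(1-v)*w*c[1][0][1]
          + (1-u)*v*w*c[0][1][1]         + u*v*w*c[1][1][1])
```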
3.4
Multi-modality Visualization
Our LLO-based shear-warp algorithm is extended to render multiple modalities simultaneously. Octrees are very efficient for set-theoretic operations. If multiple modalities have been encoded into LLOs, they can be easily integrated by simple octree “OR” operations. Once multiple modalities are integrated into one LLO, they can be easily visualized with the algorithm discussed in the previous section. The main advantage of this method is memory saving, because no additional copies of the volumes need to be kept. Multiple modalities could be integrated at the data pre-processing stage, the rendering stage, or the post-processing compositing stage. Therefore, an integration criterion, referred to as an integration function, must be defined. An integration function in the rendering stage can be defined as:
S_I = α · S_A + β · S_B
where S_A and S_B are samples drawn from volume data sets A and B, respectively, at the same virtual position, S_I is the final integrated sample value, and α and β are the integration
factors. A lookup table, as shown in Table 2, can be maintained for efficient multi-modality integration.

Table 2. Integration factor lookup table

Intensity value     α       β
1                   0.6     0.7
2                   0.2     0.5
…                   …       …
Samples are weighted differently according to their intensity values, so that multiple structures in different modalities can be flexibly classified and visualized together without confusion. In this method, a boundary check is necessary, because the integrated sample value may exceed the permitted maximum value, as in the sketch below.
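A minimal sketch of such an integration function with the required boundary check; the lookup values mirror Table 2, and the fallback weight and maximum value are our assumptions.

```python
ALPHA = {1: 0.6, 2: 0.2}   # integration factors for modality A (Table 2)
BETA = {1: 0.7, 2: 0.5}    # integration factors for modality B (Table 2)
DEFAULT = 0.5              # fallback weight for unlisted intensities (assumed)
MAX_VALUE = 255            # permitted maximum sample value (assumed)

def integrate(sa, sb):
    """S_I = alpha * S_A + beta * S_B, clamped to the permitted range."""
    si = ALPHA.get(sa, DEFAULT) * sa + BETA.get(sb, DEFAULT) * sb
    return min(si, MAX_VALUE)   # boundary check
```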
4
Results and Discussion
We compared the performance of our LLO-based ray-casting renderer with that of the conventional ray-casting algorithm. We observed significant speedup in comparison with the latter and almost no change in image quality. The shear-warp algorithm is under implementation, and we expect even better results. Our LLO-based ray-casting renderer has achieved an average speedup of 3.8 times with comparable image quality compared to the conventional ray-casting method, on the 4 CT, MRI, and rotational angiography datasets shown in Table 3. The results were obtained on a PC with a 1.4 GHz Pentium III CPU and 512 MB physical memory. The rendered images have a resolution of 256×256. We use a homogeneity threshold of 10 for LLO generation. From Table 3, the smallest dataset, Brain Vasculature, did not achieve the best speedup, whereas the CT Head – Skull and MR Brain datasets achieved the best performance. There are two reasons. Firstly, the brain vessel data are mostly semi-transparent, while the CT Head – Skull and MR Brain datasets have opaque surfaces, so early ray termination due to the opacity of these surfaces accelerated the ray-casting algorithm on the latter two datasets. Secondly, the non-transparent voxels of the brain vessel dataset are scattered throughout the volume space, while the non-transparent voxels of the CT Head – Skull and MR Brain datasets are concentrated in small regions. There are more large empty octants in the LLOs of the latter two datasets, which help the volume renderer skip empty space more quickly and efficiently. On the other hand, because the brain vessel dataset consists of many small octants, it attains very good image quality. We also found that both the highest and the lowest speedup are obtained from the same dataset, CT Head. By further examining the dataset, we found that there is a large noisy block located directly behind the head. In the Face image, the noisy block reduced the performance significantly. On the contrary, the noisy block is filtered out in the Skull image, so the remaining large empty space encoded by the LLO benefits the performance significantly. Therefore, the acceleration performance of LLO for volume rendering is often dataset-dependent, and the LLO-based rendered images are similar to the images rendered by brute force ray-casting. The preliminary results achieved so far do not demonstrate the full capability of the method; better performance can therefore be expected in the future.
Table 3. Rendering timings and speedup performance

Dataset                                   Speedup
Brain Vasculature (128 x 128 x 128)       3.2
CT Head – Face (256 x 256 x 113)          2.9
CT Head – Skull (256 x 256 x 113)         4.9
MR Brain (256 x 256 x 109)                4.2
Figure 3 demonstrates our LLO-based ray-casting algorithm for multi-modality rendering. The two datasets used are the VHD Male and a patient’s cerebral angiography. The former has a resolution of 256 x 256, involves a total of 85 slices with a 5 mm inter-slice gap, and was acquired by a multi-slice CT scanner. The latter is reconstructed rotational x-ray angiography (XRA) data from GE. The CT images and the reconstructed XRA are registered and visualized on a PC.
Fig. 3. Multi-modality rendering
5
Volume Rendering in Medical Simulation
The LLO-based volume renderer is part of our effort in developing a PC-based interactive medical simulator [11, 12]. Figure 4 shows volume rendered images of the cerebral vasculature and the GUI of a system developed to register the volume rendered images of a patient’s vasculature with the volume rendered CT images of the VHD Male dataset.
Fig. 4. PC-based medical simulator: (a) Cerebral vessels; (b) Cerebral vessels registered with VHD head rendered as a fluoroscopic image
6
Conclusion
In this paper, we present our initial work on a new shear-warp algorithm based on a spatial data structure. The proposed algorithm has several advantages. Firstly, because a shear-warp renderer is employed, each octant is considered only once during rendering, and only the relevant octant is considered each time. Secondly, the algorithm takes advantage of the 3D coherence of the volume data, rather than the 1D coherence exploited by the pure shear-warp algorithm, so that it works efficiently in 3D regions of both low presence and low variation. Additionally, because the samples in the same octant are computed by incremental interpolation among the same set of vertices, supersampling in three directions (x, y and z) can be performed at little cost compared with the pure shear-warp algorithm. Finally, this method keeps only the “black” octants of one LLO instead of the 3 copies of RLE-encoded data used in the shear-warp algorithm. We plan to have this new LLO-based shear-warp algorithm form the basis for multi-modal and 4D volume visualization in our PC-based simulator.
Acknowledgements Support of this research and development by the Agency for Science, Technology and Research is gratefully acknowledged.
References

1. Demiris, A. Mayer, H.P. Meinzer, 3-D Visualization in Medicine: An Overview, Contemporary Perspectives in Three-Dimensional Biomedical Imaging, C. Roux and J.-L. Coatrieux (Eds.), IOS Press, 1997, pp. 79-105.
2. J.H. Anderson, C.K. Chui, Y. Cai, Y. Wang, Z. Li, X. Ma, W.L. Nowinski, M. Solaiyappan, K. Murphy, A.C. Venbrux and P. Gailloud, Virtual reality training in interventional radiology, to appear in Seminars in Interventional Radiology, Thieme Medical, 2002.
3. Z. Li, C.K. Chui, J.H. Anderson, X. Chen, X. Ma, W. Huai, Q. Peng, Y. Cai, Y. Wang and W.L. Nowinski, Computer environment for interventional neuroradiology procedures, Simulation and Gaming, Vol. 32, No. 3, September 2001, pp. 405-420.
4. T.T. Elvins, A Survey of Algorithms for Volume Visualization, Computer Graphics, 26(3), 1992, pp. 194-201.
5. P. Lacroute and M. Levoy, Fast volume rendering using a shear-warp factorization of the viewing transformation, Proc. SIGGRAPH '94, 1994, pp. 451-458.
6. W.L. Nowinski, A hybrid parallel ray caster for medical imaging, Mathematical Research, 1994, pp. 305-316.
7. H. Pfister, J. Hardenbergh, J. Knittel, H. Lauer, and L. Seiler, The VolumePro real-time ray-casting system, Proc. SIGGRAPH '99, 1999, pp. 251-260.
8. M. Meissner, U. Hoffmann, and W. Strasser, Enabling Classification and Shading for 3D Texture Mapping based Volume Rendering using OpenGL and Extensions, IEEE Visualization, 1999, pp. 207-214.
9. C.K. Chui, Z.M. Yin, R.B. Shu, and K.F. Loe, An efficient algorithm for volume display by linear level octree, Proceedings of Seminar on Computer Graphics, DISCS/NUS, National University of Singapore, Singapore, November 1991, pp. 46-62.
10. R. Yagel, Efficient Techniques for Volume Rendering of Scalar Fields, Data Visualization Techniques, Edited by C. Bajaj, Chichester: Wiley, 1999, pp. 15-30.
11. C.K. Chui, Z. Li, J.H. Anderson, K. Murphy, A. Venbrux, X. Ma, Z. Wang, P. Gailloud, Y. Cai, Y. Wang and W.L. Nowinski, Training and Planning of Interventional Neuroradiology Procedures - Initial Clinical Validation, Proceedings of 10th Annual Medicine Meets Virtual Reality Conference (MMVR 2002), Newport Beach, USA, February 2002, pp. 96-102.
12. W.L. Nowinski and C.K. Chui, Simulation of interventional neuroradiology procedures, Proc. Medical Imaging and Augmented Reality (MIAR 2001), Hong Kong, June 2001, pp. 87-94.
Line Integral Convolution for Visualization of Fiber Tract Maps from DTI

T. McGraw¹, B.C. Vemuri¹, Z. Wang¹, Y. Chen², M. Rao², and T. Mareci³

¹ Dept. of CISE, University of Florida, Gainesville, Fl. 32611
² Dept. of Mathematics, University of Florida, Gainesville, Fl. 32611
³ Dept. of Biochemistry, University of Florida, Gainesville, Fl. 32610
Abstract. Diffusion tensor imaging (DTI) can provide the fundamental information required for viewing structural connectivity. However, robust and accurate acquisition and processing algorithms are needed to accurately map the nerve connectivity. In this paper, we present a novel algorithm for extracting and visualizing the fiber tracts in the CNS, specifically the spinal cord. The automatic fiber tract mapping problem is solved in two phases, namely a data smoothing phase and a fiber tract mapping phase. In the former, smoothing is achieved via a weighted TV-norm minimization which strives to smooth while retaining all relevant detail. For the fiber tract mapping, a smooth 3D vector field indicating the dominant anisotropic direction at each spatial location is computed from the smoothed data. Visualization of the fiber tracts is achieved by adapting a known computer graphics technique called line integral convolution, which has the advantage of being able to cope with singularities in the vector field and is a resolution-independent way of visualizing the 3D vector field corresponding to the dominant eigen vectors of the diffusion tensor field. Examples are presented to depict the performance of the visualization scheme on three DT-MR data sets: one from a normal and another from an injured rat spinal cord, and a third from a rat brain.
1
Introduction
Fundamental advances in understanding living biological systems require detailed knowledge of structural and functional organization. Recently, MR imaging has been used to study the structural connectivity within whole living organisms. The MR measurement of water translational self-diffusion provides a method that can be used to study structural connectivity with a ubiquitous indigenous material, water. In highly organized nervous tissue, like white matter, diffusion anisotropy can be used to visualize fiber tracts. Recently, MR measurements have been developed to measure the diffusion tensor. The development of diffusion tensor acquisition, processing, and analysis methods provides the framework for creating fiber tract maps based on this complete diffusion tensor analysis [9,10,12,13]. For automated fiber tract mapping, prior to estimating the diffusion tensor, the raw data must be smoothed while preserving relevant detail. The raw data
in this context consists of seven directional images acquired for varying magnetic field strengths. Note that at least seven values at each 3D grid point in the data domain are required to estimate the six unknowns in the symmetric, rank 2 tensor and one scale parameter. The data smoothing or de-noising can be formulated using variational principles, which in turn require solutions to partial differential equations (PDEs). We will limit ourselves to vector-valued image smoothing and refer the reader to [21,4] for scalar-valued image smoothing techniques. Whitaker and Gerig [23] introduced anisotropic vector-valued diffusion, which was a direct extension of the work by Perona and Malik [15] to vector-valued images. In [19], Sapiro et al. introduced a selective smoothing technique based on the Riemannian metric of the underlying manifold of the vector-valued function. This was applied to the restoration of color images. A very general flow called the Beltrami flow was introduced by Kimmel et al. [11], where it was shown that most flow-based smoothing schemes may be viewed as special cases of their framework. A generalization of the total variation (TV) norm to handle vector-valued image smoothing was presented by Blomgren and Chan [2]. For more on other flow-based smoothing methods, we refer the reader to a survey by Weickert [21] and also [4]. More recently, Poupon et al. [17] developed a Bayesian formulation of the fiber tract mapping problem. Prior to mapping the fibers, they use robust regression to estimate the diffusion tensor from the vector-valued image data. Note that no selective image smoothing is performed in their work prior to application of the robust regression for estimating the diffusion tensors. Previously, Westin et al. [22] presented a smoothing method applied strictly in the tensor domain, after diffusion tensors had been computed from noisy raw data. We propose a novel and efficient weighted TV-norm based image smoothing scheme wherein the raw image data (one image for each of the 7 directions) S is smoothed using a PDE which is obtained as a consequence of a weighted TV-norm minimization defined for vector-valued functions. The selective term in our work is based on the eigen values of a diffusion tensor D that can be computed initially from the raw image data using the relationship S = S_0 exp(−Σ_ij b_ij D_ij), where S is the vector of signal/image measurements taken along the seven directions X, Y, Z, XY, YZ, XZ, XYZ, S_0 is a constant, b_ij is the magnetic field strength (which is a constant for a given direction), and D_ij are the entries of the 3 × 3 matrix representing the diffusion tensor, measuring the diffusion of water inside the body being imaged. The selective term in this case is g(s) = 1/(1 + s), where s = FA is the fractional anisotropy as defined in [1]. This selection criterion preserves the dominant anisotropic direction while smoothing the rest of the data. Note that since we are only interested in the fiber tracts, which correspond to the streamlines of the dominant anisotropic direction, it is apt to choose such a selective term as opposed to one that preserves edges in signal intensity, as was done in [14]. Given the dominant eigen vector field of the diffusion tensor in 3D, tracking the fibers (space curves) is basically equivalent to finding the streamlines/integral curves in 3D of this vector field. Finding integral curves of vector fields is a well-researched problem in the field of fluid mechanics [8]. The simplest solution
would be to numerically integrate the given vector field using a stable numerical integration scheme such as a fourth-order Runge-Kutta integrator [18]. The problem with streamline-finding methods is that there is no clean way to deal with singularities in the vector field. We propose to use a technique from computer graphics called line integral convolution (LIC), which involves convolving a known texture with the vector field such that the result creates the perception of the flow/streamlines in the known texture. This may be achieved on a plane or on any desired manifold/surface. This scheme is resolution-independent and can easily cope with singularities in vector fields. Previously, Chiang et al. [7] have used LIC to render diffusion tensor images of the myocardium. We present the visualization of reconstructed 3D vector fields rendered using the LIC technique. Images corresponding to the dominant eigen vectors of the tensor field for a normal and an injured rat spinal cord, as well as a rat brain, are presented.
2
Image De-noising and Diffusion Tensor Computation
Smoothing the raw vector-valued image data is posed as a variational principle involving a first order smoothness constraint on the solution to the smoothing problem. Let Ŝ(X) be the vector-valued image that we want to smooth, where X = (x, y, z), and let S(X) be the unknown smooth approximation of the data that we want to estimate. We propose a weighted TV-norm minimization for smoothing the vector-valued image Ŝ. The variational principle for estimating a smooth S(X) is given by

min_S E(S) = ∫_Ω [ g(λ₊, λ₋) Σ_{i=1..7} |∇S_i| + (µ/2) Σ_{i=1..7} |S_i − Ŝ_i|² ] dx      (1)
where Ω is the image domain and µ is a regularization factor. The first term is the regularization constraint requiring the solution to have a certain degree of smoothness. The second term in the variational principle makes the solution faithful to the data to a certain degree. The gradient descent of the above minimization is given by

∂S_i/∂t = div( g(λ₊, λ₋) ∇S_i / |∇S_i| ) − µ(S_i − Ŝ_i),      i = 1, ..., 7      (2)

with the boundary condition ∂S_i/∂n = 0 on ∂Ω × ℝ₊ and the initial condition S(x, t = 0) = Ŝ(x). Note that we can prove the convergence of this PDE to the true solution without invoking viscosity methods. Existence and uniqueness of a solution have been worked out as well; however, such proofs are beyond the scope of this paper. The above nonlinear PDE is solved using an efficient and stable numerical scheme, namely the Crank-Nicolson scheme [16]. A simple explicit variant is sketched below.
2.1
Visualizing the Stream Lines
Once the diffusion tensor has been robustly estimated, the principal diffusion direction can be calculated by finding the eigen vector corresponding to the
dominant eigen value of this tensor. The fiber tracts may be mapped by visualizing the streamlines through the field of eigen vectors. LIC is a texture-based vector field visualization method suggested by Cabral et al. in [3]. The technique generates intensity values by convolving a noise texture with a curvilinear kernel aligned with the streamline through each pixel:
I(x₀) = ∫_{s₀−L}^{s₀+L} T(σ(s)) k(s₀ − s) ds      (3)
where I(x₀) is the intensity of the LIC texture at pixel x₀, k is a filter kernel of width 2L, T is the input noise texture, and σ is the streamline through point x₀. The streamline σ can be found by numerical integration, given the discrete field of eigen vectors. The result is a texture with highly correlated values between nearby pixels on the same streamline, and contrasting values for pixels not sharing a streamline. In our case, an FA value below a certain threshold can be a stopping criterion for the integration, since the diffusion field ceases to have a principal direction for low FA values. Stalling and Hege [20] achieve significant computational savings by leveraging the correlation between adjacent points on the same streamline. For a constant-valued kernel k, the intensity value I(σ(s + ds)) can be quickly estimated by I(σ(s)) + ε, where ε is a small error term which can be easily computed.
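A minimal 2D LIC sketch in the spirit of eq. (3), using a box kernel and first-order Euler streamline integration; an FA-based stopping test, as described above, could be added to the inner loop. All names and step sizes are illustrative, not the authors' implementation.

```python
import numpy as np

def lic_2d(texture, vx, vy, L=10, h=0.5):
    """Trace the streamline through each pixel in both directions and
    average the noise texture along it (box kernel). vx, vy: unit eigen
    vector field; texture: input noise image."""
    rows, cols = texture.shape
    out = np.zeros_like(texture, dtype=float)
    for r in range(rows):
        for c in range(cols):
            acc, n = float(texture[r, c]), 1
            for direction in (1.0, -1.0):      # forward and backward
                x, y = c + 0.5, r + 0.5
                for _ in range(L):
                    i, j = int(y), int(x)
                    if not (0 <= i < rows and 0 <= j < cols):
                        break
                    x += direction * h * vx[i, j]   # Euler step along the
                    y += direction * h * vy[i, j]   # local fiber direction
                    i, j = int(y), int(x)
                    if not (0 <= i < rows and 0 <= j < cols):
                        break
                    acc += texture[i, j]
                    n += 1
            out[r, c] = acc / n
    return out
```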
3
Experimental Results
In this section, we present three sets of experiments, two on rat spinal cord data sets and one on a rat brain. In all of the experiments, we first smooth the seven 3D directional images using the novel selective smoothing technique outlined in section 2. Following this, the diffusion tensor is estimated from the smoothed data using a standard least squares technique. The fractional anisotropy and the direction cosines of the eigen vector corresponding to the dominant eigen value are computed. The latter image depicts the standard axis (X, Y or Z) along which the direction of diffusion in the data is dominant, and the dominant eigen value corresponds to the magnitude of this dominant diffusion direction. Each direction is represented by a corresponding grey value. This coding indicates the standard direction (X, Y or Z) along which the dominant eigen vector has the strongest component. Images obtained as a result of these computations from raw data, from data smoothed using a competing method outlined in Parker et al. [14], and from data smoothed using our proposed method are depicted for the three data sets. In addition, fiber tracts estimated from the smoothed data using the proposed LIC method are depicted for the data sets. The results of smoothing for the three examples are shown in Figure 1, which is organized as follows: the first column contains images computed from raw (noisy) data, the second column contains images computed using the method in [14], and the third column contains images computed using the proposed smoothing technique. Note the superior performance
Fig. 1. Top and bottom panel of (a), (b) & (c): FA and direction cosines for normal spinal cord. (a) Results computed from raw data. (b) Results computed using Parker’s method. (c) Results from the proposed smoothing; (d),(e) & (f): similar results for injured cord; (g), (h), (i): similar results for rat brain.
Fig. 2. Fiber tracts computed from smoothed data. (a) Normal spinal cord axial slice, and (b) sagittal slice; (c) Injured spinal cord axial slice, and (d) sagittal slice; (e) Rat brain coronal slice, and (f) axial slice.
of the proposed smoothing scheme in comparison to the method in Parker et al. [14]. Figure 2 depicts the computed fiber tracts for the three reconstructed data sets. The intensity of the LIC texture has been modulated with the FA image to emphasize the most anisotropic region of each image. The top row shows the LIC fiber map in two perpendicular planes for the normal rat spinal cord. These fiber tracts are supposed to run along the length of the spinal cord in the white
matter, which is exactly what the LIC reveals. For the axial image, the fibers are perpendicular to the image plane, so LIC in this case generates uncorrelated noise. In the sagittal view, however, the fibers lie in the image plane, and are visible as an oriented texture. In the middle row, LIC is applied to the injured spinal cord data. As evident in the sagittal slice, the injury has caused a large cavity down the length of the spine, and there are no fibers in this region. The bottom row shows the results for the brain. As expected, fiber tracts are clearly visible in the region of the corpus callosum. In all three cases, the visual quality of the fiber tracts is satisfactory. In the results presented above, it should be noted that we have demonstrated a proof of concept for the proposed data smoothing and fiber tract mapping algorithms on the normal and injured rat spinal cords, and the normal brain, respectively. The quality of the results obtained is reasonably satisfactory for visual inspection purposes, but quantitative validation needs to be performed and will be the focus of our future efforts.
4
Conclusions
In this paper, we presented a new weighted TV-norm minimization formulation for smoothing vector-valued data, specifically tuned to the computation of smooth diffusion tensor MR images. The smoothed vector-valued data were then used to compute a diffusion tensor image using a standard least squares technique. Fiber tracts were estimated using the dominant eigen vector field obtained from the diffusion tensor image. Finally, results of fiber tract mapping of a normal and an injured rat spinal cord, and a rat brain, were depicted using LIC. The fiber tracts are quite accurate when inspected visually. However, quantitative validation of the computed fiber tracts is essential and will be the focus of our future efforts.
Acknowledgement This research was funded in part by the NIH grant RO1-NS42075.
References

1. P. J. Basser and C. Pierpaoli, "Microstructural and Physiological Features of Tissue Elucidated by Quantitative-Diffusion-Tensor MRI," J. Magn. Reson. B 110, 209-219 (1996).
2. P. Blomgren and T. F. Chan, "Color TV: Total Variation Methods for Restoration of Vector-Valued Images," IEEE Transactions on Image Processing, Vol. 7, no. 3, pp. 304-309, March 1998.
3. B. Cabral and L. Leedom, "Imaging Vector Fields Using Line Integral Convolution," Proc. of SIGGRAPH '93, pp. 263-272, 1993.
4. V. Caselles, J. M. Morel, G. Sapiro and A. Tannenbaum, IEEE Trans. on IP, special issue on PDEs and geometry-driven diffusion in image processing and analysis, Vol. 7, No. 3, 1998.
5. T. Chan and P. Mulet, "On the Convergence of the Lagged Diffusivity Fixed Point Method in Total Variation Image Restoration," September 1997, CAM TR-97-46.
6. T. Chan and J. Shen, "Variational restoration of non-flat image features: model and algorithm," Technical Report, CAM-TR 99-02, UCLA, 1999.
7. P.-J. Chiang, B. Davis and E. Hsu, "Line-Integral Convolution Reconstruction of Tissue Fiber Architecture Obtained by MR Diffusion Tensor Imaging," BMES Annual Meeting, 2000.
8. A. Chorin, Computational Fluid Mechanics, Selected Papers, Academic Press, 1989.
9. T. E. Conturo, et al., "Tracking neuronal fiber pathways in the living human brain," Proc. Natl. Acad. Sci. USA 96, 10422-10427 (1999).
10. D. K. Jones, A. Simmons, S. C. R. Williams and M. A. Horsfield, "Non-invasive assessment of axonal fiber connectivity in the human brain via diffusion tensor MRI," Magn. Reson. Med., 42, 37-41 (1999).
11. R. Kimmel, N. Sochen, and R. Malladi, "Images as embedding maps and minimal surfaces: movies, color and volumetric medical images," in Proc. of the IEEE Conf. on CVPR, June 1997, pp. 350-355.
12. N. Makris, et al., "Morphometry of in vivo human white matter association pathways with diffusion-weighted magnetic resonance imaging," Ann. Neurol., 42, 951-962 (1999).
13. S. Mori, B. J. Crain, V. P. Chacko and P. C. M. van Zijl, "Three-dimensional tracking of axonal projections in the brain by magnetic resonance imaging," Ann. Neurol., 45, 265-269 (1999).
14. G. J. M. Parker, J. A. Schnabel, M. R. Symms, D. J. Werring and G. J. Barker, "Nonlinear smoothing for reduction of systematic and random errors in diffusion tensor imaging," Magn. Reson. Imag., 11, 702-710, 2000.
15. P. Perona and J. Malik, "Scale-space and edge detection using anisotropic diffusion," IEEE TPAMI, vol. 12, no. 7, pp. 629-639, 1990.
16. L. Lapidus and G. F. Pinder, Numerical Solution of Partial Differential Equations in Science and Engineering, John Wiley and Sons, 1982.
17. C. Poupon, C. A. Clark et al., "Regularization of diffusion-based direction maps for the tracking of brain white matter fascicles," NeuroImage, 12, 184-195, 2000.
18. W. H. Press, B. P. Flannery, S. A. Teukolsky and W. T. Vetterling, Numerical Recipes in C: The Art of Scientific Computing, Cambridge University Press, 1992.
19. G. Sapiro and D. L. Ringach, "Anisotropic diffusion of multivalued images with applications to color filtering," IEEE TIP, vol. 5, pp. 1582-1586, 1996.
20. D. Stalling and H. C. Hege, "Fast and Resolution Independent Line Integral Convolution," Proc. of SIGGRAPH '95, pp. 249-256, 1995.
21. J. Weickert, "A review of nonlinear diffusion filtering," in Scale-Space Theory in Computer Vision, B. ter Haar Romeny et al. (Eds.), 1997, vol. 1252, pp. 3-28, Springer-Verlag.
22. C.-F. Westin, S. E. Maier, B. Khidhir, P. Everett, F. A. Jolesz and R. Kikinis, "Image Processing for Diffusion Tensor Magnetic Resonance Imaging," in Proceedings of MICCAI '99, pp. 441-452, 1999.
23. R. Whitaker and G. Gerig, "Vector-valued diffusions," in Geometry-Driven Diffusion in Computer Vision, B. Romeny et al. (Eds.), Kluwer, 1994.
On the Accuracy of Isosurfaces in Tomographic Volume Visualization

Andreas Pommert, Ulf Tiede, and Karl Heinz Höhne

Institute of Mathematics and Computer Science in Medicine (IMDM), University Hospital Hamburg-Eppendorf, 20251 Hamburg, Germany
{pommert,tiede,hoehne}@uke.uni-hamburg.de
Abstract. Results of tomographic volume visualization depend on a large number of acquisition and processing parameters. In this study, localization accuracy of isosurfaces is investigated, using simulated and phantom data. Guidelines for choosing parameters are given, and a pictorial index of typical effects resulting from various parameter settings is presented. Provided that a suitable threshold value is used, it is shown that the accuracy is almost one order of magnitude better than the scanner resolution.
1
Introduction
Visualization of volume data obtained in computed tomography (CT) or magnetic resonance imaging (MRI) is an important aid for diagnosis, treatment planning, surgery rehearsal, education, and research [5]. For clinical applications, it is of course important to ensure that the 3D images really show the true anatomical situation, or at least to know about their limitations. Unfortunately, the resulting images depend on a large number of parameters, including pixel size, convolution kernel of the scanner, slice distance and thickness, interpolation method, sampling distance, and threshold value (or other segmentation parameters). Variation of these parameters may result in very different images. While image fidelity has been identified as a major research topic in volume visualization, not many investigations on this subject are available to date [4]. In this paper, we investigate the accuracy of the visualization of isointensity surfaces (or isosurfaces, for short).
2
Materials and Methods
2.1
Volume Visualization

Volume visualization is performed using a ray casting approach. For interpolation of the intensities at the sampling points, linear interpolation, quadratic interpolation [2], or cubic splines (B-spline, Catmull-Rom spline) are applied [3]. Between the sampling points, an isosurface is located with a bisection algorithm, sketched below. Details of the visualization algorithms may be found in [7].
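The bisection step can be sketched as follows, given a function f(t) that interpolates the intensity along the ray; names and the iteration count are illustrative. Each viewing ray first samples at the chosen sampling distance and, once two consecutive samples bracket the threshold value, refines the surface position by bisection.

```python
def bisect_isosurface(f, t0, t1, threshold, iterations=20):
    """Locate the isosurface between ray parameters t0 and t1, assuming
    f(t0) and f(t1) bracket the threshold value."""
    lo, hi = (t0, t1) if f(t0) <= threshold else (t1, t0)
    for _ in range(iterations):
        mid = 0.5 * (lo + hi)
        if f(mid) <= threshold:
            lo = mid          # surface lies between mid and hi
        else:
            hi = mid          # surface lies between lo and mid
    return 0.5 * (lo + hi)
```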
High contrast objects such as bone are often segmented using intensity thresholds. These are mostly set interactively, considering e.g. the smoothness of the resulting surface or the appearance of artifacts, or choosing the 50% value between approximate background and object intensities. To get somewhat more reproducible results, a threshold may be selected such that the integrated gradient magnitude over all surface points is maximized, as proposed in the contour spectrum approach [1] and sketched below. Results of segmentation are represented by object membership labels assigned to every voxel.
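A rough sketch of such a threshold selection; the surface integral is approximated here by summing the gradient magnitude at threshold crossings between neighboring voxels along one axis, which is a simplification in the spirit of the contour spectrum [1], not its actual algorithm.

```python
import numpy as np

def select_threshold(vol, candidates):
    """Pick the candidate threshold maximizing the gradient magnitude
    accumulated over (approximate) isosurface crossings."""
    gx, gy, gz = np.gradient(vol.astype(float))
    mag = np.sqrt(gx**2 + gy**2 + gz**2)
    best_t, best_score = None, -np.inf
    for t in candidates:
        # voxels where the isosurface passes between neighbors along x
        crossings = (vol[:-1] < t) != (vol[1:] < t)
        score = mag[:-1][crossings].sum()
        if score > best_score:
            best_t, best_score = t, score
    return best_t
```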
2.2
Evaluation
Simulated Data. Using simulated data for the evaluation of medical volume visualization was first proposed in [6]. In this study, we use a numerically described test scene, from which tomographic volume data are generated with an oversampling and postfiltering approach [8]. This way, the partial volume effect can be modeled realistically. Neutral, smoothing and edge enhancing kernels are simulated, using a Catmull-Rom spline, a Gaussian, and a BC-spline with B = 0, C = 1 [3], respectively. The test scene of 32 × 32 × 32 (virtual) mm³ is a three-dimensional extension of the Siemens star, as used in photography (fig. 1, left). The star has a radius of 10 mm, and the 12 cones have a maximum radius of 1.3 mm each, giving equal size of cones and spaces in between. It is slightly tilted to avoid alignment with the voxel grid. This scene is especially suitable for simulating small structures. Object and background intensities are set to 2000 and 1000, respectively. Furthermore, (rather strong) non-correlated gaussian noise (σ = 100) may be added. The white ring shown on the images indicates positions with a cone diameter of 1.0 mm. In order to measure the accuracy of an isosurface, the localization error, i.e. the difference between the actual and the ideal surface position (obtained from the numerical description), is calculated along each viewing ray. Since it may be positive or negative, the localization error for a whole image is computed as the mean absolute localization error, as sketched below.
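The error metric itself is straightforward; a minimal sketch, with array conventions assumed and NaN marking rays that miss the surface:

```python
import numpy as np

def localization_error(actual, ideal):
    """Per-ray localization error: actual minus ideal isosurface position
    along each viewing ray. Returns the mean absolute error and its
    standard deviation, as reported in tab. 1."""
    d = np.asarray(actual, dtype=float) - np.asarray(ideal, dtype=float)
    d = d[~np.isnan(d)]            # ignore rays without a surface hit
    return np.abs(d).mean(), np.abs(d).std()
```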
Fig. 1. The test object rendered from its mathematical definition (left), and from simulated volume data, using the basic configuration of acquisition and visualization parameters (right).
Phantom Data. For verification of the results on real tomographic data, a teflon cone (height 30.0 mm, diameter 10.0 mm) is used which was scanned on a Siemens Somatom Plus 4 CT scanner (voltage 120 kV, exposure 180 mAs, 87 slices, pixel size 0.098 mm, convolution kernel SP90, slice distance 0.5 mm, slice thickness 1.0 mm). A Catmull-Rom spline interpolation was performed to obtain isotropic voxels. A numerical description of the cone was created and interactively matched to the data to allow measurements of localization accuracy.
3
Results
An exploration of the whole parameter space is hardly feasible. Therefore, a reasonable basic configuration of parameters is selected and later modified. Measurements of the localization error are listed in tab. 1. Since the best threshold value is usually not known, errors are given for the 50% value of 1500 and the somewhat extreme settings of 1200 and 1800, respectively.

Table 1. Measurements of the localization error for the simulated data (mean absolute error ± standard deviation). The ∗ denotes the basic configuration of acquisition and processing parameters used in this study.

                                         localization error [mm] at threshold value
varied parameter     value               1200             1500             1800
gaussian noise, σ    0                   0.32 ± 0.12      0.12 ± 0.11      0.47 ± 0.12
                     100∗                0.56 ± 1.51      0.13 ± 0.13      0.47 ± 0.15
voxel size           0.25 mm             0.85 ± 2.34      0.11 ± 0.12      0.44 ± 0.16
                     0.5 mm∗             0.56 ± 1.51      0.13 ± 0.13      0.47 ± 0.15
                     1.0 mm              0.45 ± 0.67      0.23 ± 0.18      0.66 ± 0.19
resolution           0.25 mm             0.44 ± 1.49      0.08 ± 0.10      0.28 ± 0.17
                     0.5 mm              0.46 ± 1.50      0.08 ± 0.09      0.32 ± 0.14
                     1.0 mm∗             0.56 ± 1.51      0.13 ± 0.13      0.47 ± 0.15
convolution kernel   smoothing           0.57 ± 1.53      0.22 ± 0.17      0.66 ± 0.18
                     neutral∗            0.56 ± 1.51      0.13 ± 0.13      0.47 ± 0.15
                     edge enhancing      0.57 ± 1.51      0.09 ± 0.13      0.38 ± 0.15
anisotropic data     SL 1 mm             0.56 ± 1.51      0.13 ± 0.13      0.48 ± 0.15
                     SL 2 mm             0.54 ± 1.54      0.20 ± 0.16      0.58 ± 0.17
                     SL 1 mm, PI 2 mm    0.55 ± 1.52      0.18 ± 0.15      0.55 ± 0.16
sampling distance    0.25 voxels         0.67 ± 1.80      0.13 ± 0.14      0.47 ± 0.16
                     0.5 voxels∗         0.56 ± 1.51      0.13 ± 0.13      0.47 ± 0.15
                     1.0 voxels          0.47 ± 1.18      0.13 ± 0.13      0.45 ± 0.14
                     2.0 voxels          0.39 ± 0.76      0.12 ± 0.12      0.44 ± 0.12
                     4.0 voxels          0.38 ± 0.69      0.11 ± 0.11      0.43 ± 0.12
interpolation        linear∗             0.56 ± 1.51      0.13 ± 0.13      0.47 ± 0.15
                     quadratic, r=0.5    0.33 ± 0.13      0.14 ± 0.13      0.49 ± 0.13
                     quadratic, r=1.0    1.53 ± 3.23      0.13 ± 0.13      0.44 ± 0.17
                     B-spline            0.34 ± 0.13      0.15 ± 0.13      0.52 ± 0.13
                     Catmull-Rom         1.75 ± 3.57      0.12 ± 0.13      0.41 ± 0.16
3.1 Basic Configuration of Parameters
The basic limiting factor of tomographic image acquisition is the resolution of the scanner, which is determined (among others) by the convolution kernel. In this study, it is arbitrarily set to 1.0 mm. According to the sampling theorem, the resulting signal has to be sampled with at least two samples per mm to avoid aliasing effects. The voxel size is thus set to 0.5 mm. For resampling on the ray, a sampling distance of 1 voxel might seem sufficient; however, if the volume is rotated, intensity changes might be as close as 1/√3 voxels. Therefore, the sampling distance is set to 0.5 voxels.

Except for a very low threshold, there are only small differences between noiseless and noisy data. Therefore, all further measurements of the localization error are based on the noisy data. However, to make the effects more apparent, the images shown are rendered from the noiseless data.

3.2 Variation of Image Acquisition Parameters

Voxel Size. With a smaller voxel size of 0.25 mm (fig. 2, top left), resolution is only slightly increased, as can be seen at the tips of the cones, at the price of using much more data. The improvement of the localization error is equally small. Vice versa, with the voxel size enlarged to 1 mm, the test object is strongly deformed (fig. 2, top right).

Scanner Resolution. With the resolution of the scanner convolution kernel improved to 0.25 mm or 0.5 mm (fig. 2, middle row), the tips of the cones are extended, and the localization error is improved. However, since the voxel size is now becoming the limiting factor, some noticeable aliasing occurs, especially for the smaller resolution.

Convolution Kernel. Using an edge enhancing instead of a neutral scanner convolution kernel, the localization error decreases when a suitable threshold is used. This is not really surprising, considering our small test object. Vice versa, results get worse with a smoothing kernel.

Anisotropic Data. If a box-shaped slice sensitivity profile is used in the axial direction (top to bottom on the images), little changes for a slice thickness (SL) of 1 mm. However, for a slice thickness of 2 mm, the anisotropic resolution is clearly visible (fig. 2, bottom left). The localization error also deteriorates. Somewhat better results are obtained with a simulated spiral CT with a slice thickness of 1 mm and a pitch (PI) of 2 (fig. 2, bottom right).

3.3 Variation of Visualization Parameters

Interpolation. Compared to linear interpolation, the Catmull-Rom spline gives a slightly improved resolution, while the smoothing B-spline makes it worse (fig. 3, top row). Surprisingly, localization errors are not improved for the Catmull-Rom spline, due to its higher sensitivity to noise. Quadratic interpolation methods with q = 0.5 and q = 1.0 perform comparably to the B-spline and the Catmull-Rom spline, respectively.

Sampling Distance. An increase to 1 or 2 voxels creates barely visible or noticeable undersampling artifacts, respectively (fig. 3, middle row). Surprisingly, the localization error
Fig. 2. Variation of tomographic acquisition parameters. Top: voxel size 0.25 mm (left) and 1.0 mm (right). Middle: increased scanner resolution of 0.25 mm (left) and 0.5 mm (right). Bottom: anisotropic data with a slice thickness of 2 mm (left), and spiral acquisition with a slice thickness of 1 mm and a pitch of 2 (right).
Fig. 3. Variation of visualization and segmentation parameters. Top: interpolation using B-spline (left) and Catmull-Rom spline (right). Middle: increased sampling distance of 1 voxel (left) and 2 voxels (right). Bottom: threshold value of 1200 (left) and 1800 (right).
Fig. 4. Gradient magnitude (left) and localization error (right) for the simulated test object, plotted over the threshold value in HU.
Fig. 5. Gradient magnitude (left) and localization error (right) for the teflon cone.
(measured over the still visible parts) seems little affected. This turns out to be due to the robust bisection algorithm.

Object Labels. If only the label of the nearest voxel is considered at a sampling position, the underlying voxel grid becomes apparent. The algorithm presented in [7] instead chooses the most suitable label, based on the intensity at the sampling point. This way, these artifacts are completely eliminated.

3.4 Variation of Segmentation Parameters

Simulated Data. Regardless of the other parameter settings, a poor threshold selection causes strong artifacts (fig. 3, bottom row), which also result in high localization errors. In this case, the lowest localization error is obtained for a threshold of about 1450, slightly below the 50% value (fig. 4), probably due to the small size of our test object. The gradient magnitude reaches its maximum at about 1550, yielding an error of less than 0.2 mm. For a poor threshold, the error may be larger than the voxel size.

Phantom Data. In this case, background and object intensities are in the approximate ranges of -1000...-950 and 900...950 Hounsfield units (HU), respectively, giving a 50% threshold of about -25 HU and a localization error of about 0.1 mm (fig. 5). The gradient magnitude reaches its maximum at about the same value. The localization error is thus in the same range as for the simulated data.
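The bisection step credited above with keeping the localization error stable under coarse ray sampling can be sketched as follows; this is a hedged outline, not the authors' implementation, and `interp` stands for whatever gray-level interpolation function is in use:

    def refine_crossing(interp, p_out, p_in, threshold, iters=10):
        """Bisection along a viewing ray: p_out is a sampling position with a
        value below the threshold, p_in the next position at or above it; the
        returned point approximates the isosurface crossing between them."""
        for _ in range(iters):
            mid = 0.5 * (p_out + p_in)
            if interp(mid) >= threshold:
                p_in = mid    # midpoint is inside the object
            else:
                p_out = mid   # midpoint is outside
        return 0.5 * (p_out + p_in)

Once a threshold crossing is bracketed by two samples, such a refinement recovers the surface position to sub-voxel precision, regardless of how coarse the bracketing samples are.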
4 Conclusions
In this paper, we investigated the accuracy of isosurfaces in volume visualization, using simulated data and a phantom. We showed that, with a suitable threshold selection, the accuracy is almost one order of magnitude better than the scanner resolution, and thus in the sub-mm range. However, acquisition and processing parameters have to be chosen carefully. Determination of a suitable threshold value remains the most critical step. As shown, finding the maximum of the gradient magnitude seems a suitable method to get at least near a good threshold value. Whether this also holds for clinical data remains to be shown.
Acknowledgement The CT dataset of the teflon cone is courtesy of Kornelius Kupczik, Evolutionary Anatomy Unit, University College London.
References

1. Bajaj, C. L., Pascucci, V., Schikore, D. R.: The contour spectrum. In Yagel, R., Hagen, H. (Eds.): Proc. IEEE Visualization '97. Phoenix, AZ, 1997, 167–173. (ISBN 0-8186-8262-0).
2. Dodgson, N. A.: Quadratic interpolation for image resampling. IEEE Trans. Image Process. 6, 9 (1997), 1322–1326.
3. Mitchell, D. P., Netravali, A. N.: Reconstruction filters in computer graphics. Comput. Graphics 22, 4 (1988), 221–228.
4. Pommert, A., Höhne, K. H.: Evaluation of Image Quality in Medical Volume Visualization: The State of the Art. In Dohi, T., et al. (Eds.): Medical Image Computing and Computer-Assisted Intervention, Proc. MICCAI 2002, Lecture Notes in Computer Science, Springer-Verlag, Berlin, 2002. (this volume).
5. Pommert, A., Tiede, U., Höhne, K. H.: Volume Visualization. In Toga, A. W., Mazziotta, J. C. (Eds.): Brain Mapping: The Methods, Academic Press, San Diego, CA, 2002, ch. 26, 707–723.
6. Tiede, U., Höhne, K. H., Bomans, M., Pommert, A., Riemer, M., Wiebecke, G.: Investigation of medical 3D-rendering algorithms. IEEE Comput. Graphics Appl. 10, 2 (1990), 41–53.
7. Tiede, U., Schiemann, T., Höhne, K. H.: High quality rendering of attributed volume data. In Ebert, D. et al. (Eds.): Proc. IEEE Visualization '98. Research Triangle Park, NC, 1998, 255–262. (ISBN 0-8186-9176-X).
8. Watt, A.: 3D Computer Graphics. 3rd ed. Addison-Wesley, Reading, MA, 2000.
A Method for Detecting Undisplayed Regions in Virtual Colonoscopy and Its Application to Quantitative Evaluation of Fly-Through Methods

Yuichiro Hayashi¹, Kensaku Mori¹,², Jun-ichi Hasegawa³, Yasuhito Suenaga¹, and Jun-ichiro Toriwaki¹

¹ Graduate School of Engineering, Nagoya University, Furo-cho, Chikusa-ku, Nagoya 464-8603, Aichi, Japan, {yhayashi,mori,suenaga,toriwaki}@toriwaki.nuie.nagoya-u.ac.jp
² Image Guidance Laboratories, Dept. of Neurosurgery, Stanford University, 300 Pasteur Drive, Stanford, CA 94305-5327, USA, [email protected]
³ School of Computer and Cognitive Sciences, Chukyo University, 101 Tokodachi, Kaizu-cho, Toyota 470-0393, Aichi, Japan, [email protected]
Abstract. This paper presents a method for detecting regions that the system did not display during the fly-through of a virtual endoscope system (VES), together with a quantitative evaluation of undisplayed regions in virtual colonoscopy. When the VES is used as a diagnostic tool, especially as a tool for detecting colon polyps, a user often performs automated fly-through based on automatically generated paths. In the case of automated fly-through in the colon, there are some blind areas at the backs of folds. The aim of this study is to detect undisplayed regions during fly-through and to perform a quantitative evaluation. After the fly-through, the system shows undisplayed regions and notifies the user of the existence of unobserved regions together with the rate of undisplayed regions. We evaluate various kinds of automated fly-through paths generated from medial axes of the colon, as well as flattened views of the colon, from the viewpoint of the rate of undisplayed regions. The experimental results show that about 30% of colon regions are classified as undisplayed by the conventional automated fly-through along the medial axis, and that the flattened view results in very few undisplayed regions.
1 Introduction
Virtual endoscope systems (VESs) [1,2,3] are now widely used for observing the inside of a human body. A VES visualizes anatomical structures by generating virtual endoscopic (VE) images based on 3-D medical images such as CT or MR images. The user of a VES can observe the inside of an organ from any viewpoint and view direction. Virtual endoscopy can be used for various purposes such as diagnosis, pre-operative surgical planning, endoscopic navigation, and
informed consent. Many research groups have reported applications of VE as screening tools for colorectal cancers (called virtual colonoscopy (VC)), since VC does not cause patients pain during examination procedures. Colorectal cancer is the second leading cause of death in the U.S.A., and several companies are releasing software for VC.

In colorectal cancer screening using VC, a medical doctor diagnoses the inside of the colon by observing VE views. When the inside of the colon is observed by using a VES for the detection of polyps, fly-through techniques are usually used. Automated fly-through techniques are most frequently employed, since manual fly-through requires the doctor to change the viewpoint and the view direction of the virtual camera of the VES many times. There are many studies on automated or semi-automated fly-through in VESs [2,3,4,5]. In these methods, fly-through paths are generated from the extracted medial axes of colon regions. Then fly-through is performed along these paths. We have also developed a method for generating automated fly-through paths based upon medial axes generated by a thinning algorithm that uses the Euclidean distance transformation [5]. However, there are fold patterns called haustra in the colon. These fold patterns may cause some blind regions and the overlooking of polyps. To reduce the overlooking of colorectal polyps, the system should warn the user about regions that are not completely displayed on the screen (undisplayed regions). It is important to perform automated fly-through that does not result in undisplayed regions. Quantitative evaluation of undisplayed regions is also an important problem to solve. However, there is no report which describes a quantitative evaluation of undisplayed regions of automated fly-through paths.

In this paper, we describe a method to detect undisplayed regions during the fly-through of VC. It is based on novel simple techniques and is useful for improving diagnosis using VC. The concept of detecting unobserved and observed regions was first proposed by the authors' group in 2000 [5]. A detailed description of the method is presented in reference [6]. Our concept and method have been referred to by another research group working on VC [7]. The current paper presents a quantitative evaluation of automated fly-through paths from the viewpoint of undisplayed regions by using the proposed method.

In Section 2, we describe the detection process of undisplayed regions during fly-through. Brief descriptions of several methods for creating automated fly-through paths in VC are given in the same section. In Section 3, we compare various automated fly-through methods according to the number of undisplayed regions detected. A discussion is also presented in the same section. Section 4 contains our conclusion.
2 Methods

2.1 Detection of Undisplayed Regions
Outline. This paper defines displayed regions as regions which are displayed at least once on the screen during the fly-through, and undisplayed regions as regions which are not displayed on the screen at all. Before the fly-through, the entire region of the colon is labeled as undisplayed. The basic idea of this method is to record
displayed regions for every frame and to compute undisplayed regions from the record. During the fly-through, we record the areas shown on the display for each frame of the VE views. This marking process is executed for all frames. At the end of the fly-through, regions not having displayed marks are counted as undisplayed regions. Surface rendering (SurR) and volume rendering (VolR) methods are often utilized for generating VE views. Since the minimum rendering units of SurR and VolR are a triangle and a voxel, respectively, displayed regions are marked by using these units.

Detection in SurR. In the SurR methods, the shape (surface) of the colon is represented as a set of triangles. These triangles are usually generated by the Marching Cubes method. VC images are obtained by rendering the triangles using suitable shading and projection techniques. Therefore, we also classify the triangles into displayed and undisplayed ones during the fly-through. The method detects triangles displayed on the screen by the following steps. Prior to the fly-through, undisplayed marks are assigned to all triangles. For each frame of the VE views, we render two types of images: (a) a shaded view and (b) a labeled view. First, a normal VE view is generated with a proper shading procedure and displayed on the screen (shaded view). Then, we assign a different color to each triangle as the unique code of that triangle. Since these patches are rendered without shading, the color of each pixel of the screen shows the code of the displayed triangle (labeled view). This special rendering is performed on the back-buffer (the frame buffer that is not currently displayed on the screen), with which conventional graphics hardware is equipped. We mark all of the triangles detected in this process as displayed. At the end of the fly-through, triangles not having displayed marks are classified as undisplayed triangles. A set of connected undisplayed triangles is considered an undisplayed region.

Detection in VolR. In the VolR method, we extract undisplayed regions as a set of voxels which are not projected onto the screen. Basically, VolR is performed by casting a ray that starts at the viewpoint and passes through a pixel on the projection screen. Prior to the rendering process, we assign color and opacity values to each voxel of an input 3-D image. Weighted accumulation of the shaded color of each voxel is performed along the cast ray, and each pixel value of the rendered image is computed by these weighted-accumulation operations. The voxel having the highest accumulated-opacity value dominates the color of the screen pixel for each ray. Therefore, we consider the voxels existing between the voxel with the maximum accumulated-opacity value and the viewpoint as displayed voxels. The detection process is implemented in the following manner. First, all voxels of an input image are marked as undisplayed. During the rendering process for generating virtual endoscopic views, we simultaneously perform a checking process of displayed voxels by using the above method. This process is executed for all frames of the fly-through.
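A sketch of the SurR marking pass under these definitions, assuming the labeled view of the current frame has been read back from the back-buffer as an integer array of per-pixel triangle codes (the function and buffer names are hypothetical):

    import numpy as np

    def mark_displayed(displayed, id_buffer, background_code=0):
        """Mark every triangle whose unique color code appears in this frame's
        labeled view; `displayed` is a boolean array indexed by triangle code."""
        codes = np.unique(id_buffer)
        displayed[codes[codes != background_code]] = True
        return displayed

    # After the last frame, undisplayed triangles are those never marked:
    # undisplayed_triangles = np.flatnonzero(~displayed)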
2.2 Automated Fly-Through
Automated Fly-through Along the Medial Axis (Method A). This method continuously changes the viewpoint and the view direction of the virtual camera of the VES along the medial axes of colon regions [5]. Many other research groups have reported on the automated generation of fly-through paths based on medial axes of the colon [3,4]. Basically, these methods obtain medial axes of colon regions by applying thinning methods. In this paper, we extract medial axes of the colon by applying a thinning method that uses the result of the Euclidean distance transformation as in [5]. The fly-through path is computed by specifying two points (start and end points) on the thinned result. Viewpoints are allocated on the obtained path at a predefined interval. The tangent direction of the path at each viewpoint is used as the view direction at that point.

Modified Fly-through by Using Haustra Information (Method B). Since there are many folds in the colon, it is impossible to observe the colonic-wall regions existing between the folds. This method modifies the path obtained in Method A so that the virtual camera rotates 360 degrees to visualize the colonic walls between fold patterns when it comes among the folds. Fold regions are extracted by applying morphological and connected component operations to the extracted colon region.

Flattened View. For comparison, we also generate flattened views of the colon along the path obtained in Method A. The flattened view is generated by modifying the ray-casting paths of the VolR. At each viewpoint along the fly-through path, we cast rays to cover 360 degrees in the direction perpendicular to the path and obtain one line of the rendered image (called a rendered line). We execute this casting process at each point of the path and stack the rendered lines to obtain the final image.
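The flattened-view construction can be sketched as follows, assuming a `cast_ray(viewpoint, direction)` routine that returns one rendered gray value; all names here are illustrative, not the authors' code:

    import numpy as np

    def perpendicular_rays(tangent, n_angles=360):
        """Ray directions covering 360 degrees in the plane perpendicular
        to the path tangent at the current viewpoint."""
        t = tangent / np.linalg.norm(tangent)
        helper = np.array([1.0, 0.0, 0.0]) if abs(t[0]) < 0.9 else np.array([0.0, 1.0, 0.0])
        n1 = np.cross(t, helper)
        n1 /= np.linalg.norm(n1)
        n2 = np.cross(t, n1)
        angles = np.linspace(0.0, 2.0 * np.pi, n_angles, endpoint=False)
        return np.outer(np.cos(angles), n1) + np.outer(np.sin(angles), n2)

    def flattened_view(path_points, path_tangents, cast_ray):
        """Stack one rendered line per viewpoint: rows follow the medial axis,
        columns the angle around it."""
        return np.vstack([[cast_ray(p, d) for d in perpendicular_rays(t)]
                          for p, t in zip(path_points, path_tangents)])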
3 Experiments and Results

3.1 Materials
We implemented the above-mentioned methods on a conventional PC (CPU: Intel Pentium 4, 1.70 GHz; OS: Windows 2000; memory: 1 GByte) and applied them to three colon CT data sets. Colon regions were extracted from a 3-D abdominal X-ray CT image by using a region growing method with a position-variant thresholding operation. Acquisition parameters of the CT images were: pixel size 0.625–0.684 mm, 512 × 512 pixels, 112–349 slices, slice thickness 5.0–2.5 mm, and reconstruction pitch 2.0–1.25 mm. We generated many paths for automated fly-through and calculated the rates of undisplayed regions over the whole regions for these paths. A flattened view of the colon was also generated for comparison.
Fig. 1. Automated fly-through path along the medial axis of the colon and detected undisplayed regions. (a) Fly-through path overlaid on the outside view of the colon as a black solid line; (b) and (c) undisplayed regions. Detected regions are marked with dark color.

Table 1. Experimental results of the computation of RUD (the rate of undisplayed regions to the whole regions, in %).

        Movement   x-direction      y-direction      z-direction
        (voxels)   SurR    VolR     SurR    VolR     SurR    VolR
Case 1   0         37.6    29.6     37.6    29.6     37.6    29.6
        +5         37.5    30.0     37.9    30.2     36.1    28.3
        -5         37.6    29.8     37.5    29.3     39.1    31.3
Case 2   0         27.4    21.0     27.4    21.0     27.4    21.0
        +5         27.2    21.0     27.0    20.5     26.5    20.1
        -5         27.4    20.9     27.6    21.2     28.0    21.7
Case 3   0         29.4    21.1     29.4    21.1     29.4    21.1
        +5         27.5    20.8     29.0    21.0     28.4    20.2
        -5         29.4    21.5     29.6    21.4     29.8    22.0
We define the rate of undisplayed regions (RUD) in the following way. For SurR, RUD is computed as (number of undisplayed patches) / (total number of triangle patches). The RUD for VolR is defined as (number of undisplayed voxels on the colonic wall) / (total number of voxels on the colonic wall). We define voxels on the colonic wall in the following way. First, we apply a fusion operation (shrink and expand type) to the colonic regions extracted by the region growing method. Voxels on the colonic wall are then determined as voxels which are connected to background voxels in six-neighborhood connectivity. A sketch of this computation is given below.
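A minimal sketch of the VolR variant of this computation, assuming `colon` is the binary colon segmentation and `displayed` the boolean mark volume recorded during fly-through (names are illustrative):

    import numpy as np
    from scipy.ndimage import binary_dilation

    def rud_volr(colon, displayed):
        """Wall voxels are colon voxels 6-connected to the background;
        RUD = undisplayed wall voxels / all wall voxels."""
        six = np.zeros((3, 3, 3), bool)
        six[1, 1, :] = six[1, :, 1] = six[:, 1, 1] = True   # 6-neighborhood
        wall = colon & binary_dilation(~colon, structure=six)
        return np.count_nonzero(wall & ~displayed) / np.count_nonzero(wall)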
3.2 Results and Discussion
We evaluated the ratios of undisplayed regions for automated fly-through paths based on the medial axes of the colon (Method A). We also translated the paths parallel to the three axes of the CT image coordinate system within ±5 voxels and computed the RUDs. This is because we consider that the paths generated by other research groups' methods are located within ±5 voxels of our paths. Figure 1 shows examples of the fly-through path and the undisplayed regions detected by the
Table 2. RUD (in %) for different intervals between the viewpoints in automated fly-through.

Interval (voxels)   Case 1           Case 2           Case 3
                    SurR    VolR     SurR    VolR     SurR    VolR
1                   37.6    29.6     27.4    21.0     29.4    21.1
0.5                 37.2    29.5     27.1    20.8     29.3    20.9
0.1                 36.8    29.3     26.7    20.7     28.6    20.7
Table 3. RUD (in %) for different FOV angles.

FOV (degrees)   Case 1           Case 2           Case 3
                SurR    VolR     SurR    VolR     SurR    VolR
70              42.2    34.5     37.1    25.0     33.7    25.2
80              37.2    29.6     27.4    21.0     29.4    21.1
90              33.4    25.4     23.3    17.1     25.4    17.6
100             29.6    21.7     19.6    13.8     21.8    14.5
110             26.1    18.6     16.1    10.7     18.4    11.6
proposed method, in the case of zero translation of the path. In Fig. 1 (b) and (c), undisplayed regions are marked with dark color. There exist many undisplayed regions between the folds. Table 1 shows the RUD for paths translated along each axis. For all paths, about 30% of the regions were detected as undisplayed. In the experiments on parallel translation of the paths, the RUDs were almost the same, and there is no significant difference between the SurR and VolR methods. This means that the automated fly-through methods proposed by many research groups, including the authors' group [5], may result in many undisplayed regions.

We also changed the interval of viewpoints in the generation process of the automated fly-through path and computed RUDs for each interval. In the experiment, we set the interval of the viewpoints to 1, 1/2, and 1/10 of the voxel size. The results are shown in Table 2. If we shorten the interval of the viewpoints, the RUD is slightly decreased. However, the decrease in RUD is within 1%. This means that shortening the viewpoint interval is not an appropriate solution for reducing the rate of undisplayed regions.

In VC, the appropriate choice of the FOV (field of view) angle is one of the most important problems to solve. We investigated the relation between RUD and FOV. The FOV angle was changed from 70 degrees to 110 degrees in 10-degree intervals. The RUD was calculated for each FOV angle by using the automated fly-through path of Method A. The experimental results are shown in Table 3. As expected, a wider FOV angle clearly reduces the RUD. However, although a wider FOV angle reduces the RUD, it causes VE views with high perspective distortion, and highly distorted images make the diagnosis process difficult. This is a trade-off for the reduction of RUD.

Round-trip paths are an effective method for reducing RUD in VC. In a round-trip path, the virtual camera of the VES moves from the start point to the end
Table 4. RUD (in %) for the one-way path and the round-trip path.

                  Case 1           Case 2           Case 3
                  SurR    VolR     SurR    VolR     SurR    VolR
One-way path      37.6    29.6     27.4    21.0     29.4    21.1
Round-trip path   17.1    12.1      7.6     4.7      9.0     6.0
Fig. 2. Example of flattened views and detection results of undisplayed regions. (a) Flattened view, (b) and (c) undisplayed regions. Detected regions are marked with dark color.
point along the automated fly-through path. Then, at the end point, the virtual camera changes its viewing direction by 180 degrees and follows the path back toward the start point. We quantitatively evaluated the difference between the one-way path and the round-trip path. Table 4 shows the experimental results. These results show that the round-trip path is a very effective way to reduce the rate of undisplayed regions if doubling the examination time is acceptable.

Method B described in Section 2.2 forces the virtual camera to face the colonic wall during fly-through to prevent regions between folds from becoming undisplayed. In the experimental result for Case 1, the RUDs of Method B were 13.5% for SurR and 9.0% for VolR, while the conventional path (Method A) resulted in about 30% RUD for both SurR and VolR. Therefore, the use of fold pattern information in the process of creating fly-through paths is quite effective.

Several research groups have proposed methods for generating flattened views of the colon. Our interest here is the extent of the reduction of RUD that can be achieved by introducing the flattened view. The proposed method can quantitatively evaluate and compare undisplayed regions for both visualization techniques. Examples of the flattened view and the detection results are shown in Fig. 2. In the flattened views of Case 1, the RUD value is 2.7% for VolR. This shows that flattened views of the colon are quite effective for observing the status of the colonic wall while keeping RUD low. Although the user can easily review the colon with only one or a few stretched views, the rendered views contain heavy distortion. This causes the medical doctor some difficulty when diagnosing from these flattened views.
4 Conclusion
This paper proposed methods for detecting undisplayed regions in VC. We also compared automated fly-through methods from the viewpoint of the ratio of undisplayed regions. Hitherto, there has been no published article about the detection and quantitative evaluation of undisplayed regions for VC. Basically, undisplayed regions are detected by marking displayed triangles for SurR or displayed voxels for VolR. We consider the triangles or voxels not having displayed marks as undisplayed triangles or voxels. We also quantitatively evaluated several automated fly-through techniques by using the proposed method. The flattened view was also employed for comparison. The experimental results showed that the RUD of the conventional automated fly-through is about 30%. The RUD of the automated fly-through is much decreased by using haustra information. The flattened view shows quite low RUD values, and this visualization technique is therefore useful for diagnosis of the colon.
Acknowledgments The authors would like to thank our colleagues for useful suggestions and discussions. K. Mori thanks Dr. Ramin Shahidi and Dr. Calvin Maurer, Jr., for providing him with the opportunity to write this paper. Parts of this research were supported by the Grant-In-Aid for Scientific Research from the Japan Society for the Promotion of Science, and the Grant-In-Aid for Cancer Research from the Ministry of Health, Labour and Welfare of the Japanese Government.
References

1. P. Rogalla, J. Terwisscha van Scheltinga, B. Hamm, eds., "Virtual endoscopy and related 3D techniques," Springer, Berlin, 2001.
2. K. Mori, A. Urano, J. Hasegawa, et al., "Virtualized endoscope system - an application of virtual reality technology to diagnostic aid -," IEICE Trans. Inf. & Syst., vol. E79-D, no. 6, pp. 809-819, 1996.
3. L. Hong, S. Muraki, A. Kaufman, et al., "Virtual voyage: interactive navigation in the human colon," Computer Graphics (Proc. of Siggraph'97), pp. 27-34, 1997.
4. D.S. Paik, C.F. Beaulieu, R.B. Jeffrey, et al., "Automated path planning for virtual endoscopy," Medical Physics, vol. 25, no. 5, pp. 629-637, 1998.
5. Y. Hayashi, K. Mori, T. Saito, et al., "Advanced navigation diagnosis with automated fly-through path generation and presentation of unobserved regions in the Virtualized Endoscope System," Proc. of MIRU2000, pp. 331-336, 2000 (in Japanese).
6. K. Mori, Y. Hayashi, Y. Suenaga, et al., "A method for detecting unobserved regions in virtual endoscopy system," Proc. of SPIE, Medical Imaging, vol. 4321, pp. 134-145, 2001.
7. K. Kreeger, F. Dachille, M.R. Wax, et al., "Covering all clinically significant areas of the colon surface in virtual colonoscopy," Proc. of SPIE, Medical Imaging, vol. 4683, pp. 198-206, 2002.
3D Respiratory Motion Compensation by Template Propagation

Peter Rösch¹, Thomas Netsch¹, Marcel Quist², and Jürgen Weese¹

¹ Philips Research Laboratories, Sector Technical Systems, Röntgenstraße 24, D-22335 Hamburg, Germany
² MIMIT Advanced Development, Philips Medical Systems Nederland B.V., Veenpluis 4–6, NL-5680 DA Best, The Netherlands
Abstract. A new approach for the automatic estimation of dense 3D deformation fields is proposed. In the first step, template propagation (an advanced block matching strategy) produces not only a large set of point correspondences, but also a quantitative measure of the “registration quality” for each point pair. Subsequently, the deformation field is obtained by a method based on Wendland radial base functions. This method has been adapted to incorporate “registration quality” into regularization, where Morozov’s discrepancy principle has been applied to give intuitive meaning to the regularization parameter. The main advantage of the presented algorithm is the ability to perform an elastic registration in the presence of large deformations with minimum user interaction. Applying the method, complicated respiratory motion patterns in 3D MR images of the thorax have been successfully determined. The complete procedure takes less than one hour on a standard PC for MR image pairs (256×256×75 voxels) showing a 40 mm displacement of the diaphragm.
1 Introduction
Respiratory motion often complicates the comparison of images of the thorax, e.g. for the combination of PET and CT images [1] or for cardiac applications [2]. The interpretation of these data sets can be supported by image registration. Registration algorithms aim at finding the transformation that relates the position and orientation of anatomical structures in one image to the pose of the same structures in other images. In order to register images affected by respiratory motion, global rigid or affine transformations are not sufficient, as rigid structures, e.g. ribs, move relative to each other and soft tissue is deformed, so that non-rigid registration is required. One class of non-rigid registration algorithms incorporates physical tissue properties [3] and often requires a model generation step which includes image segmentation and knowledge about tissue elasticity and viscosity. Another class of methods is based on image intensities alone. The majority of current gray-value based non-rigid registration algorithms use interpolating base functions to parameterize the non-rigid transformation. An iterative procedure is applied to determine the control point configuration corresponding to optimum similarity between the reference image and the elastically
transformed target image [1,4,5]. This requires a large number of parameters to be varied for each optimization step. In contrast to these methods, the algorithm described here follows a two-step strategy. First, a local rigid optimization scheme called template propagation [6] that has already been applied in the context of 3D respiratory motion [2] is used. Starting from a single correspondence indicated by the user, this method automatically establishes a large number of correspondences and a quantitative measure of the “registration quality” for each correspondence [8]. In the second step, this information serves as input for an interpolation scheme using radial base functions with compact support presented in [9]. This scheme has been adapted so that the “registration quality” can be incorporated efficiently by regularization. In order to keep the number of parameters small, and to give intuitive meaning to the remaining parameters, automated procedures to adapt the parameters to the properties of the image data have been introduced. The algorithm is described in the next section. Section 3 is concerned with the application of the algorithm to 3D images of the thorax, and with the discussion of the results. Finally, conclusions are drawn in section 4.
2 Algorithm for 3D Deformation Field Estimation

2.1 Establishing Correspondences by 3D Template Propagation
Template propagation can be classified as an advanced block matching strategy. In contrast to other block matching methods [10], individual blocks (templates) are “aware” of each other. In particular, local rigid transformation parameters obtained for one template serve as starting estimates for the registration of its neighbors. Furthermore, the order in which the individual templates are treated is determined dynamically during registration such that the most “promising” candidate is registered next while outliers are rejected immediately. Template propagation requires a similarity measure like local correlation (LC) [11] that is applicable to small volumes and a method to quantify the “success” of each local registration result. It has been shown that a “quality measure” can be deduced from the properties of the LC optimum and that this measure is closely related to the local registration accuracy [8]. The procedure typically produces several thousand pairs of corresponding points together with the quality measure q for each pair. In order to select a homogeneously distributed subset of correspondences originating from the most successful template registration steps, all templates are discarded for which q < t · qmax holds where qmax is the largest q value found for all template pairs and 0 < t < 1 is a relative threshold specified by the user. To the remaining templates, a “thinning” procedure is applied such that a set of high quality correspondences with minimum distance dc between control points in the reference image is obtained where dc is chosen by the user. The resulting set serves as input to the procedure described in the following section.
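A structural sketch of such a propagation loop, assuming a `register_template(center, init_params)` routine that returns locally optimized rigid parameters together with the quality measure q (a hedged outline, not the authors' implementation):

    import heapq
    import itertools

    def propagate(seed_center, seed_params, register_template, neighbors, q_min):
        """Best-first template propagation: the most promising candidate
        (highest seed quality q) is registered next; outliers are rejected."""
        results, counter = {}, itertools.count()
        heap = [(0.0, next(counter), seed_center, seed_params)]
        while heap:
            neg_q, _, center, init = heapq.heappop(heap)
            if center in results:
                continue
            params, q = register_template(center, init)   # local rigid registration
            if q < q_min:
                continue                                  # reject outlier immediately
            results[center] = (params, q)
            for nb in neighbors(center):                  # a registered result seeds
                if nb not in results:                     # its untreated neighbors
                    heapq.heappush(heap, (-q, next(counter), nb, params))
        return results

The priority queue realizes the dynamically determined processing order: templates whose starting estimates come from high-quality neighbors are treated first.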
2.2 Turning Correspondences into a Dense Deformation Field
The elastic registration scheme used here has been proposed by Fornefett et al. [9]. It is based on the Ψ-functions of Wendland [12]. In contrast to thin plate splines [13], the Wendland functions have compact support, which makes sure that a change to the position of a landmark influences the deformation field only in the vicinity of this landmark. For the application presented here, the C⁴ continuous Ψ_{3,2} Wendland function has been chosen:

Ψ(r) = Ψ_{3,2}(r) = (1 − r)^6 (35r² + 18r + 3)  if 0 ≤ r < 1,  and  0  if r ≥ 1   (1)

Given a position x in the reference image and a parameter matrix α, the transformed position in the target image, y(x), is

y(x) = Σ_{i=1}^{n} α(p, u, a, λ)_i Ψ(|x − p_i|/a)   (2)

where p_i and u_i are the positions of the n corresponding landmarks in the reference and target image, respectively. The support length a could be set by an expert user who can estimate the degree of deformation in the images under consideration, and thus select a value for a that is large enough to avoid folding but small enough to describe local deformations. Fortunately, the minimum value of a that guarantees preservation of topology for isolated landmarks can be determined analytically [9] as a = 4.33∆ in 3D, where ∆ is the landmark displacement. This allows for an automatic calculation of a by determining the maximum of ∆ over all control point pairs. The parameter matrix α in (2) is obtained by solving

(K + λW^{−1}) α = u,   K_{i,j} = Ψ(|p_i − p_j|/a),   W = diag{w_1 ... w_n}   (3)

For Wendland functions and non-coplanar sets of 3D landmarks, it has been proven that (3) has a unique solution [12]. Setting the regularization parameter λ to zero corresponds to interpolation, i.e. y(p_i) = u_i, whereas α parameterizes a global affine transformation for λ → ∞. If the correspondences have been obtained by template propagation, the weights of the individual landmarks can be set to w_i = q_i, so that correspondences originating from accurate template matching steps contribute most to the deformation field.

It is well known from the theory of ill-posed problems that λ often lacks intuitive meaning. An approach to relate λ to the localization errors σ_i of landmarks is Morozov's discrepancy principle [14]. This principle can be applied here by determining the value of λ that results in a certain average deviation σ of the landmark positions u_i in the target image from the positions obtained by transforming the corresponding source landmarks, y(p_i), where σ should be of the same order as the landmark localization error. As σ increases monotonically with λ, the resulting equation (4) can be efficiently solved for λ with the Newton method:

(Σ_{i=1}^{n} q_i)^{−1} Σ_{i=1}^{n} q_i ‖u_i − Σ_{k=1}^{n} α(p, u, a, λ)_k Ψ(|p_i − p_k|/a)‖² = σ²   (4)
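A numerical sketch of steps (1)-(3) under these definitions, assuming landmark arrays p, u and quality weights q are given, and using the regularized system in the form reconstructed above (a sketch, not the authors' implementation):

    import numpy as np

    def psi(r):
        """Wendland Psi_{3,2}: compactly supported, C^4 continuous, eq. (1)."""
        return np.where(r < 1.0, (1.0 - r)**6 * (35.0 * r**2 + 18.0 * r + 3.0), 0.0)

    def fit_deformation(p, u, q, lam=0.0):
        """Solve (K + lambda W^{-1}) alpha = u per coordinate, eq. (3).
        p, u: (n, 3) landmark positions; q: (n,) quality weights w_i = q_i."""
        a = 4.33 * np.max(np.linalg.norm(u - p, axis=1))        # automatic support length
        K = psi(np.linalg.norm(p[:, None] - p[None, :], axis=2) / a)
        alpha = np.linalg.solve(K + lam * np.diag(1.0 / q), u)  # (n, 3) coefficients

        def y(x):
            """Transformed position for a single point x, eq. (2)."""
            return psi(np.linalg.norm(x - p, axis=1) / a) @ alpha

        return y

With lam = 0 this interpolates the landmarks exactly; lam > 0 yields the approximating variant, where λ would be determined from (4), e.g. by Newton iteration.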
Fig. 1. Axial (top), coronal (bottom) slices of 3D MR images of the human thorax. Images have been taken in three respiratory states: “exhale” (left), “intermediate” (center) and “inhale” (right). The image sizes are 256×256×75 voxels with a voxel size of 1.76×1.76×4 mm3 .
3 Experiments and Results

3.1 Images Used for the Experiments
Three 3D MR images of a volunteer showing the thorax and parts of the abdomen at different diaphragm positions have been acquired using a multi-slice FFE sequence. The sequence allowed 75 axial slices to be acquired during a single breath-hold. Orthogonal slices of the images are shown in fig. 1. A diaphragm shift of about 40 mm between the images corresponding to "inhale" and "exhale" has been observed. As the focus of this investigation was the motion of the chest and the pulmonary vessels, image acquisition has not been ECG-triggered, so that motion blurring occurred in the vicinity of the heart.

3.2 Experimental Setup and Parameters
The algorithm requires approximate local registration parameters at one or more image locations for the initialization of template propagation. This information has been obtained by interactively indicating the location of one anatomical structure in all images to be registered. In this case, the position where the pars scapularis joins the m. latissimus dorsi is clearly visible in all images (see fig. 2). An estimate of the local translation parameters at this position is given directly by the coordinate differences. Since the orientation of the chosen structure is almost the same in all three images, and the algorithm only requires approximate starting estimates, rotation angles of zero could be used as initial estimates. As the slice thickness in the present case is relatively large, and in order to make sure that templates contain sufficient structure in image regions with little contrast, the template size has been set to 20 mm, which is in the upper range of
Fig. 2. Single correspondence used for the initialization of template propagation. As it was visible in all data sets, the position where the pars scapularis joins m. latissimus dorsi indicated by the white cross hair has been chosen manually.
values used previously [6,8]. An overlap between neighboring templates has been allowed, since this results in a higher robustness compared to non-overlapping templates in the presence of large deformations [6]. In order to include templates in image areas with a poor signal-to-noise ratio (e.g. the lung), the relative "quality threshold" (see section 2.1) has been set to t = 0.2. The minimum distance between control points in the reference image has been set to dc = 20 mm. Two values for σ (see section 2.2), σ = 0 and σ = 1 mm, have been used. The first setting results in an interpolating deformation field, and the second value corresponds to the typical localization error for successfully registered templates found experimentally [8].

3.3 Discussion of Experimental Results
Using the parameter settings given above, the first experiment was concerned with estimating the deformation field between the “inhale” and “intermediate” image. The even more challenging task of elastically matching the “inhale” to the “exhale” image using interpolating and approximating deformation fields has been addressed in the second and third experiment. In order to characterize the results, the deformation fields have been used to reformat one of the images. Afterwards, gray value differences between the reference and reformatted target images are assessed visually. As all images have been acquired within a couple of minutes using the same protocol, contrast variations between images are small and the gray values of the difference image are closely related to the local registration accuracy. The plausibility of the results has been illustrated by applying the deformation field to a regular grid which is then overlaid on the target image. Results are shown in figures 3 and 4. In all experiments, about 15000 templates have been selected and registered from which about 1100 correspondences have been used for the estimation of the deformation field. Support lengths of a = 79 mm and a = 192 mm have been determined automatically for the first and second/third experiment, respectively. The smaller value for the first experiment reflects the smaller degree of deformation between the “intermediate” and “inhale” image. For all experiments, the differences of the unregistered images show a significant amount of residual structures resulting from chest expansion. Furthermore,
Fig. 3. Difference images between the "inhale" and "intermediate" states before (left) and after (right) elastic registration with σ = 0.
motion of the pulmonary vessels and the diaphragm affects the difference images. After registration using interpolating deformation fields in the first and second experiment, the presence of residual structures is reduced considerably, particularly in the chest area including the pulmonary vessels (see right column of fig. 3 and center column of fig. 4), indicating successful registration. However, inaccuracies show up close to the diaphragm and, in the second experiment, close to the ribs. The main reason for the inaccuracies close to the ribs in the second experiment is that the deformation field is based on about 1000 correspondences in all cases, although larger deformations need to be treated compared to the first experiment. Using a smaller value of dc, and thus a larger number of corresponding points, would lead to a more accurate deformation field at the cost of higher computation time. A second complication in the presence of larger deformations is that the contents of the individual templates are more strongly deformed, which results in larger local registration inaccuracies. In particular, if an interpolating deformation field is estimated, all correspondences are weighted equally, and the incorporation of "low quality" templates degrades the result. This problem can at least partially be solved by weighting the contribution of the individual template pairs by their "registration quality" in the context of regularization (see section 2.2), as could be demonstrated in the third experiment. The right column of fig. 4 indicates that residual structures close to the ribs could be reduced by using approximation rather than interpolation. The deformation grid given at the bottom of fig. 4 reflects the expansion of the chest and shows that the mediastinum and, to some extent, the heart follow the motion of the diaphragm. In all experiments, the most striking structures in the difference images are located in the abdomen close to the diaphragm, where internal structures, e.g. of the liver, are hardly visible in the images used here, so that few reliable correspondences could be established in this area.
Fig. 4. Axial and coronal slices of 3D difference images “exhale”-“inhale” before (left) and after elastic registration using σ = 0 (center) and σ = 1 mm (right). At the bottom, the deformation grid obtained for σ = 1 mm is visualized.
Calculation times for the complete procedure including template propagation, deformation field estimation and the application of the deformation field were 32, 38 and 51 minutes on a 1.7 GHz P4 PC with 1 GB of memory for the first, second and third experiment, respectively. Compared to the first experiment, the larger support length led to a larger number of non-zero contributions in (2) and thus to an increase of calculation time in the second case. In the third experiment, a Newton optimization to find λ = 0.005 according to (4) had to be performed additionally, further increasing the computational cost.
4 Conclusions
An elastic transformation based on Wendland functions [9] has been combined with the template propagation algorithm originally described in [6]. The particular advantage of this combination is that in addition to a set of corresponding points, a quantitative measure of confidence for each point pair is taken into account for the estimation of a dense deformation field. This measure which is closely related to the local registration accuracy [8] is incorporated by regularization which is controlled by a parameter with intuitive meaning. Furthermore, the only user interaction required is the identification of a single point correspondence. The algorithm is capable of estimating accurate 3D deformation fields in the presence of large deformations without prior segmentation of the images, which has been demonstrated by successfully registering 3D MR images
of the chest taken at different instants of the respiratory cycle with diaphragm displacements up to 40 mm. Particularly for large deformations, the optional regularization step further improved the accuracy of the deformation field. Calculation times for the whole procedure were below one hour on a standard PC.
Acknowledgments The authors would like to thank the image processing group at the Guy's, King's and St Thomas' School of Medicine, London, the participants of the IMAGINE project at the Fachbereich Informatik, Universität Hamburg, and Frans A. Gerritsen, MIMIT Advanced Development, for helpful discussions. We also thank our colleagues Dirk Manke and Kay Nehrke for the acquisition of the images.
References

1. D. Mattes, D. Haynor, H. Vesselle, K. Lewellen, W. Eubank: Nonrigid multimodality image registration. Proc. of SPIE 4322 (2001) 1609–1620
2. D. Manke, K. Nehrke, P. Rösch, P. Börnert, O. Dössel: Study of Respiratory Motion in Coronary MRA. Proceedings of ISMRM'01, Elsevier (2001) 1852
3. Rabbitt, R. D., Weiss, J. A., Christensen, G. E., Miller, M. I.: Mapping of hyperelastic deformable templates using the finite element method. Proc. of SPIE 2573 (1995) 252–265
4. Rueckert, D., Sonoda, L. I., Hayes, C., Hill, D. L. G., Leach, M. O., Hawkes, D. J.: Nonrigid registration using free-form deformations: Application to breast MR images. IEEE Trans. Medical Imaging 18 (1999) 712–721
5. G. K. Rohde, A. A. Aldroubi, B. M. Dawant: Adaptive free-form deformation for inter-patient medical image registration. Proc. of SPIE 4322 (2001) 1578–1587
6. P. Rösch, T. Netsch, M. Quist, G. P. Penney, D. L. G. Hill, J. Weese: Robust 3D deformation field estimation by template propagation. MICCAI 2000, Lecture Notes in Computer Science 1935 (2000) 521–530
7. Little, J. A., Hill, D. L. G., Hawkes, D. J.: Deformations incorporating rigid structures. Computer Vision and Image Understanding 66 (1997) 223–232
8. P. Rösch, T. Mohs, T. Netsch, M. Quist, G. P. Penney, D. J. Hawkes, J. Weese: Template selection and rejection for robust non-rigid 3D registration in the presence of large deformations. Proc. of SPIE 4322 (2001) 545–556
9. M. Fornefett, K. Rohr, S. Stiehl: Radial base functions with compact support for elastic registration of medical images. Image and Vision Comp. 19 (2001) 87–96
10. Kostelec, P. J., Weaver, J. B., Healy, D. M. Jr.: Multiresolution elastic image registration. Med. Phys. 25 (1998) 1593–1604
11. P. Rösch, T. Blaffert, J. Weese: Multi-modality registration using local correlation. In: H. U. Lemke, M. W. Vannier, K. Inamura, A. G. Farman (Eds.): CARS'99 Proceedings, Elsevier (1999) 228–232
12. H. Wendland: Piecewise polynomial, positive definite and compactly supported radial functions of minimal degree. Advances in Computational Mathematics 4 (1995) 389–396
13. Bookstein, F. L.: Principal warps: Thin-plate splines and the decomposition of deformations. IEEE Trans. Patt. Anal. Mach. Intell. 11 (1989) 567–585
14. V. A. Morozov: Methods for Solving Incorrectly Posed Problems. Springer (1984)
An Efficient Observer Model for Assessing Signal Detection Performance of Lossy-Compressed Images

Brian M. Schmanske and Murray H. Loew

The George Washington University, School of Engineering and Applied Science, Institute for Medical Imaging and Image Analysis, Department of Electrical and Computer Engineering, 801 22nd Street NW, Washington, DC 20052
[email protected]

Abstract. A technique for assessing the impact of lossy wavelet-based image compression on signal detection tasks is presented. A medical image's value is based on its ability to support clinical decisions, including detecting and diagnosing abnormalities. However, the image quality of compressed images is often stated in terms of mathematical metrics such as mean square error. The presented technique provides a more suitable measure of image degradation by building on the channelized Hotelling observer model, which has been shown to predict human performance of signal detection tasks in noise-limited images. The technique first decomposes an image into its constituent wavelet subbands. Channel responses for the individual subbands are computed, combined, and processed with a Hotelling observer model to provide a measure of signal detectability versus compression ratio. This allows a user to determine how much compression can be tolerated before detectability drops below a certain threshold.
1 Introduction
Lossy image compression used to reduce the size of large medical image files degrades image quality. Often, quality metrics such as mean square error or peak signal-to-noise ratio are used to assess image quality. However, these metrics do not adequately measure the true value of a medical image, namely, its ability to support clinical decisions, including detecting and diagnosing abnormalities. Human observers can be used to determine image quality for performing signal detection tasks, but the trials require many human observers assessing many images to make the results statistically relevant. These trials are thus very time consuming and expensive. Model observers can be used as a human surrogate if the models can be shown to accurately predict human performance.
2 Model Observers
Model observers are used to predict human observer performance in detecting signals in noise limited images that are representative of nuclear-medicine images [1],[2]. Model observers reduce an image to a decision variable that is compared to a decision
threshold to decide whether a signal is present or not. For a signal in Gaussian noise, the ideal observer is given by:

λ(x) = (K_x^{−1} ∆μ_x)^t x   (1)

where x is the image represented as a vector, K_x is the image covariance matrix, and ∆μ_x is a vector containing the difference between the mean "signal present" image and the mean "signal not present" image. λ(x) is the decision variable produced by this decorrelating matched filter.

If the image's probability density function is unknown, the first- and second-order statistics can still be used to generate the covariance matrix and mean difference vector of (1). Though no longer an ideal observer, the decision variable can still predict human observer performance. This model observer is known as the Hotelling observer. A difficulty with using the Hotelling observer is the generation of an invertible covariance matrix. For an image with k pixels, the covariance matrix will have k² elements and require k sample images to ensure the matrix is nonsingular [2]. Even for modest-sized images, the required number of sample images becomes quite large, and inverting the covariance matrix becomes computationally intensive.

A channelized Hotelling observer solves the problem associated with the covariance matrix. The channelized Hotelling observer operates on channel responses of the image rather than on the image itself [2]-[4]. The channel responses are generated by

η = T^t x   (2)

where the columns of T contain the channel templates associated with different spatial frequency bands. The channel templates can be chosen to mimic the human visual system or to optimize results relative to the ideal observer. The channel responses η are then processed with the Hotelling observer, using K_η^{−1} and ∆μ_η, to generate the decision variable.

In a two-alternative-forced-choice test, samples of "signal present" and "signal not present" images are used to compute observer performance. Decision variables for the samples are obtained, and the decision threshold is varied to obtain receiver operating characteristic (ROC) curves. The area under the ROC curve (AUC) is a measure of observer performance and can be used to assess image quality. The channelized Hotelling observer model has been shown to predict human observer performance in a number of tasks, including predicting degraded performance when lossy image compression is employed [5]. The compression studies looked at a limited number of compression ratios by compressing the entire image and evaluating the degraded image with the model. Since all image samples must be compressed/decompressed for each value of compression ratio, most practical problems can examine only a limited number of ratios.
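To make the procedure concrete, here is a small numerical sketch of a channelized Hotelling observer with an AUC estimate; the difference-of-Gaussians channel profiles and all numeric parameters are illustrative assumptions, not those of the paper:

    import numpy as np

    def channel_templates(size, n_channels=4):
        """Illustrative band-pass channels built as differences of Gaussians;
        columns of the returned matrix T are the channel templates."""
        yy, xx = np.indices((size, size)) - size // 2
        r2 = xx**2 + yy**2
        cols = []
        for j in range(n_channels):
            s1, s2 = 2.0 ** j, 2.0 ** (j + 1)   # one octave per channel
            dog = (np.exp(-r2 / (2 * s1**2)) / s1**2
                   - np.exp(-r2 / (2 * s2**2)) / s2**2)
            cols.append(dog.ravel())
        return np.array(cols).T

    def decision_variables(sig_imgs, noi_imgs, T):
        """Channel responses eta = T^t x (eq. (2)), then the Hotelling template."""
        eta_s = sig_imgs.reshape(len(sig_imgs), -1) @ T
        eta_n = noi_imgs.reshape(len(noi_imgs), -1) @ T
        K = 0.5 * (np.cov(eta_s, rowvar=False) + np.cov(eta_n, rowvar=False))
        w = np.linalg.solve(K, eta_s.mean(0) - eta_n.mean(0))  # K_eta^{-1} d_mu_eta
        return eta_s @ w, eta_n @ w

    def auc(lam_s, lam_n):
        """Area under the ROC curve via the Mann-Whitney statistic."""
        return (lam_s[:, None] > lam_n[None, :]).mean()

    # Toy study: 64x64 Gaussian-noise images, faint Gaussian blob as the signal.
    rng = np.random.default_rng(0)
    size, n = 64, 200
    yy, xx = np.indices((size, size)) - size // 2
    signal = 0.8 * np.exp(-(xx**2 + yy**2) / 18.0)
    noise = rng.normal(0.0, 1.0, (2 * n, size, size))
    T = channel_templates(size)
    lam_s, lam_n = decision_variables(noise[:n] + signal, noise[n:], T)
    print("AUC:", auc(lam_s, lam_n))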
3 Wavelet Compression
JPEG 2000 uses wavelets to compress images [6]. A wavelet compression algorithm first analyzes a signal, decomposing it into high-pass and low-pass subbands. The
high-pass subband coefficients have lower entropy than the original signal samples, and entropy coding these coefficients yields compression gain [7]. Following the wavelet analysis, the bitmaps of the subband coefficients are entropy coded separately. The encoded bitmaps are then arranged and transmitted to deliver the spatial (subband) resolution and SNR (bitmaps) required. As subbands and bitmaps are discarded from the compressed file, the compression ratio increases and image quality decreases.

In one dimension, wavelet analysis is implemented with linear subband filters: h_0[n] represents the low-pass subband filter and h_1[n] the high-pass subband filter:

X[2k] = (h_0 ∗ x)[n] |_{n=2k} = Σ_{l∈Z} h_0[2k − l] x[l]   (3)

and

X[2k + 1] = (h_1 ∗ x)[n] |_{n=2k} = Σ_{l∈Z} h_1[2k − l] x[l]   (4)

where x[n] are the original signal samples, X[n] are the wavelet coefficients, and Z is the set of all integers. Evaluating a convolution at even indices is equivalent to a filter followed by downsampling by 2 [7]. Therefore, X[2k] and X[2k + 1] can be obtained from a two-channel analysis filter bank with filters H_0 and H_1 followed by downsampling by 2. The channel filter responses are denoted by v[k] and u[k], where

v[k] = X[2k + 1]   (5)

u[k] = X[2k]   (6)
v[k] and u[k] are the high-pass and low-pass coefficients, respectively. The original signal is synthesized, or reconstructed, from the subband coefficients using filters g_0[n] and g_1[n]:

x[n] = Σ_{k∈Z} u[k] g_0[n − 2k] + Σ_{k∈Z} v[k] g_1[n − 2k]   (7)
where g_i[n] = h_i[−n], i = 0, 1. For two levels of analysis, where the low-pass coefficients are analyzed a second time, the coefficients are found as follows:

v_1[k] = Σ_{l∈Z} h_1[2k − l] x[l]   (8)

u_1[k] = Σ_{l∈Z} h_0[2k − l] x[l]   (9)

v_2[k] = Σ_{l∈Z} h_1[2k − l] u_1[l]   (10)

u_2[k] = Σ_{l∈Z} h_0[2k − l] u_1[l]   (11)
The subscripts on the subband coefficient vectors indicate the number of analysis steps the signal has experienced. v_1, v_2, and u_2 would then be coded and transmitted in a compressed file.

Following the multiple analysis steps, the original signal is traditionally reconstructed by synthesizing the coefficients in the reverse order that they were analyzed. For the above example, the low-pass coefficients for the first subband are synthesized from the high-pass and low-pass coefficients of the second subband:

    u_1[n] = Σ_{k∈Z} u_2[k] g_0[n−2k] + Σ_{k∈Z} v_2[k] g_1[n−2k]     (12)
The original signal can then be recovered using the high-pass and low-pass coefficients of the first subband:

    x[n] = Σ_{k∈Z} u_1[k] g_0[n−2k] + Σ_{k∈Z} v_1[k] g_1[n−2k]       (13)
The convolution is a linear operation and follows the distributive property. That is:

    f[n] ∗ (a[n] + b[n])|_{n=2k} = Σ_{l∈Z} f[2k−l](a[l] + b[l])      (14)
                                 = Σ_{l∈Z} f[2k−l] a[l] + Σ_{l∈Z} f[2k−l] b[l]   (15)
The low-pass coefficients of the first subband can then be computed as follows:

    u_1[n] = Σ_{k∈Z} u_2[k] g_0[n−2k] + Σ_{k∈Z} v_2[k] g_1[n−2k]     (16)
           = u_{21}[n] + v_{21}[n].                                  (17)
u_{jk}[n] and v_{jk}[n] represent the low-pass and high-pass coefficients for the j-th subband, synthesized to the k-th subband level. When k is zero, the coefficients represent the subband contributions to the original signal. Substituting (17) into (13), and taking advantage of the distributive property of the convolution, the original signal can be reconstructed as follows:

    x[n] = Σ_{k∈Z} u_1[k] g_0[n−2k] + Σ_{k∈Z} v_1[k] g_1[n−2k]                             (18)
         = Σ_{k∈Z} (u_{21}[k] + v_{21}[k]) g_0[n−2k] + Σ_{k∈Z} v_1[k] g_1[n−2k]            (19)
         = Σ_{k∈Z} u_{21}[k] g_0[n−2k] + Σ_{k∈Z} v_{21}[k] g_0[n−2k] + Σ_{k∈Z} v_1[k] g_1[n−2k]   (20)
         = u_{20}[n] + v_{20}[n] + v_{10}[n].                                              (21)
Hence, the original signal can be reconstructed by summing the individual subband contributions to the original signal. Images are two-dimensional data sets; they can be analyzed with 1-D wavelets using a two-step method. First, the rows of the image are analyzed with filters H0 and H1. The resulting coefficients are then analyzed across the columns. This analysis produces four subbands for each level of analysis. One subband contains the low-
pass coefficients (LL). Others contain high-pass coefficients in the horizontal direction (HL), the vertical direction (LH), and the diagonal direction (HH). Subsequent analysis can be performed on the low-pass coefficients [7]. The analysis of an image using two levels of decomposition is shown in Fig. 1. Taking advantage of the distributive property of the convolution, the subband coefficients that result from image analysis can be reconstructed so that the high-pass and low-pass coefficients from each subband are synthesized separately. Fig. 2 illustrates how the high-pass coefficients from each subband can be synthesized so that their contribution to the original image can be obtained. The individual subband contributions can then be summed to recover the original image: x = u_{20} + v_{20} + v_{10}. Since the subband coefficients are themselves linear combinations of bitmaps, a synthesis could also be constructed to provide bitmap contributions rather than subband contributions.
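To illustrate the separable 2-D analysis and the subband-wise synthesis just described, here is a minimal sketch (ours, not the paper's code), assuming the Haar filter used later in Section 5 and an image with even dimensions; the final assertion checks that the subband contributions sum back to the original image.

```python
# One level of separable 2-D Haar analysis and per-subband synthesis.
import numpy as np

def haar_split(x, axis):
    # One level of Haar analysis along `axis`: low-pass u, high-pass v.
    x = np.moveaxis(x, axis, 0)
    u = (x[0::2] + x[1::2]) / np.sqrt(2.0)
    v = (x[0::2] - x[1::2]) / np.sqrt(2.0)
    return np.moveaxis(u, 0, axis), np.moveaxis(v, 0, axis)

def haar_merge(u, v, axis):
    # Inverse of haar_split along `axis`.
    u, v = np.moveaxis(u, axis, 0), np.moveaxis(v, axis, 0)
    x = np.empty((2 * u.shape[0],) + u.shape[1:])
    x[0::2] = (u + v) / np.sqrt(2.0)
    x[1::2] = (u - v) / np.sqrt(2.0)
    return np.moveaxis(x, 0, axis)

def analyze_2d(img):
    # Rows first, then columns: subbands LL, HL, LH, HH.
    lo, hi = haar_split(img, axis=1)
    ll, lh = haar_split(lo, axis=0)
    hl, hh = haar_split(hi, axis=0)
    return ll, hl, lh, hh

def subband_contribution(ll, hl, lh, hh, keep):
    # Synthesize with every subband but `keep` zeroed, i.e. that subband's
    # contribution to the original image (the u_k0 / v_k0 terms above).
    bands = {'ll': ll, 'hl': hl, 'lh': lh, 'hh': hh}
    z = {k: (b if k == keep else np.zeros_like(b)) for k, b in bands.items()}
    lo = haar_merge(z['ll'], z['lh'], axis=0)
    hi = haar_merge(z['hl'], z['hh'], axis=0)
    return haar_merge(lo, hi, axis=1)

img = np.random.default_rng(0).normal(size=(64, 64))
parts = [subband_contribution(*analyze_2d(img), keep=k)
         for k in ('ll', 'hl', 'lh', 'hh')]
assert np.allclose(sum(parts), img)  # contributions sum to the original image
```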
4  Subband-Channelized Hotelling Observer
The standard channelized Hotelling observer, as mentioned in Section 2, is implemented by first passing a signal through a set of channels to obtain channel responses. The channel responses are then analyzed with a Hotelling observer to determine whether a signal is present. Consider an image x. The channel responses are given by

    η = T^t x                                                        (22)

where the column vectors of T each represent channel templates with different spectral profiles. The channel responses, η, are then processed by a Hotelling observer to obtain a decision variable, λ(η):

    λ(η) = (K_η^{-1} Δμ_η)^t η                                       (23)

K_η is the covariance matrix of the channel responses η. Δμ_η is the difference between the channel responses of the two classes of images: "signal present" and "signal not present".

Section 3 showed an image could be analyzed and synthesized to obtain the individual subband or bitmap contributions. For the case that considers only subband contributions, the image is represented as a sum of subband contributions:

    x = u_{40} + v_{40} + v_{30} + v_{20} + v_{10}                   (24)

for four levels of wavelet analysis. Substituting (24) into (22),

    η = T^t (u_{40} + v_{40} + v_{30} + v_{20} + v_{10})                     (25)
      = T^t u_{40} + T^t v_{40} + T^t v_{30} + T^t v_{20} + T^t v_{10}       (26)
      = η_{u40} + η_{v40} + η_{v30} + η_{v20} + η_{v10}                      (27)
Equation (27) shows the channel response of an image can be represented as a sum of channel responses of the image’s subband components. Effects of lossy compression can be measured by removing channel responses of component subbands. For
example, if the first subband high-pass coefficients are not included in the compressed image file, use the following to compute K_η^{-1}, Δμ_η, and λ(η):

    η = η_{u40} + η_{v40} + η_{v30} + η_{v20}.                       (28)
Fig. 1. Analysis of 2-D images (block diagram: the rows of X are filtered by H1 and H0 and downsampled by 2, then the columns, yielding the subbands vHH1, vHL1, vLH1 and uLL1; a second analysis level applied to uLL1 yields vHH2, vHL2, vLH2 and uLL2).

Fig. 2. Subband synthesis of 2-D image (block diagram: each subband is upsampled by 2 and filtered by G1 and G0; the second-level subbands synthesize u21 and v21, and the separately synthesized contributions u20, v20 and v10 sum to recover X).
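A hedged sketch (ours) of the bookkeeping implied by (25)-(28): once the channel responses of each subband contribution are computed, evaluating the observer for any set of retained subbands costs only an addition, not a re-compression of the image samples. Names and array layouts are our assumptions.

```python
# Subband-channelized channel-response arithmetic of (25)-(28).
import numpy as np

def channel_responses(x, T):
    # (22): eta = T^t x, with x a vectorized image and T a (k, c) template matrix.
    return T.T @ x

def response_of_retained(subband_images, T, retained):
    # subband_images: dict name -> vectorized subband contribution
    # (e.g. 'u40', 'v40', 'v30', 'v20', 'v10'); `retained` lists the subbands
    # kept in the compressed file, as in (28).
    return sum(channel_responses(subband_images[name], T) for name in retained)
```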
5  Simulation and Results
To demonstrate the utility of the subband-channelized Hotelling observer model, the AUCs of sample images are computed for various compression ratios. Four hundred simulated images were used with a correlated Gaussian noise background. Half of the images contained a signal, represented as a disk with a Gaussian intensity profile.
Five channel profiles were used, defined by Laguerre-Gauss functions as described in [8]. A simple Haar wavelet was used to generate the subbands; four levels of analysis were performed. It was assumed that spatial resolution progression was used to arrange the bitmaps, so compression is achieved by sequentially removing the four high-pass subbands. Compression ratios were computed using the bitmap entropies of each subband. Fig. 3 shows the AUC vs. compression ratio for the images. The figure clearly shows the degradation in image quality relative to the signal detection task as spatial resolution decreases and compression ratios increase, allowing the user to choose the tradeoff between compression ratio and signal detection performance.
Fig. 3. Model observer performance (area under the ROC curve vs. compression ratio, plotted on a logarithmic compression-ratio axis from 10^0 to 10^3).
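The text states that compression ratios were computed from the bitmap entropies of each subband without giving the details; the sketch below is one plausible reading, under our own assumption that the coded size of a subband is approximated by its first-order entropy.

```python
# Hedged sketch of entropy-based compression-ratio bookkeeping: the ratio
# grows as high-pass subbands are dropped from the compressed file.
import numpy as np

def entropy_bits(coeffs, n_bins=256):
    counts, _ = np.histogram(coeffs, bins=n_bins)
    p = counts[counts > 0] / coeffs.size
    return coeffs.size * -(p * np.log2(p)).sum()     # total bits for the band

def compression_ratio(all_bands, retained_bands):
    total = sum(entropy_bits(b) for b in all_bands)
    kept = sum(entropy_bits(b) for b in retained_bands)
    return total / kept
```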
6  Conclusions
A technique for efficiently calculating observer performance of nuclear medicine images degraded by lossy compression has been presented. The technique is based on the channelized Hotelling observer model, which has been shown to predict human observer performance for detection tasks with noise-limited images. The new technique uses channel responses of the subbands to compute observer performance rather than the channel responses of the image itself. Plotting the AUC versus compression ratio for compression that decreases spatial resolution shows the utility of this technique. The plot allows one to determine an acceptable compression ratio for an image based on the required signal detectability. While results were shown for a subband-channelized Hotelling observer, the technique is extendable to a bitmap-channelized model since the subband coefficients themselves are linear combinations of bitmaps. The bitmap-channelized model will provide greater flexibility and control over the image quality assessment. It will permit the assessment of image compression based on either SNR or mixed progression offered by JPEG 2000, in addition to the compression based on spatial resolution progression shown in Section 5. Should the standard channelized Hotelling observer be used for this assessment, each compression ratio value would require a separate compression/decompression of the image samples. But by having to synthesize the wavelet coefficients only once,
this subband technique, and its bitmap extension, offers a computationally efficient means of assessing image quality for the optimization of lossy compression in signal detection tasks.
References

1. Jacob Beutel, Harold L. Kundel, and Richard L. Van Metter, eds., Handbook of Medical Imaging, Vol. 1: Physics and Psychophysics, SPIE, Bellingham, Washington, 2000.
2. Harrison H. Barrett, Jie Yao, Jannick P. Roland, and Kyle J. Myers, "Model observers for assessment of image quality," Proc. Natl. Acad. Sci. USA, Vol. 90, November 1993.
3. Kyle J. Myers and Harrison H. Barrett, "Addition of a channel mechanism to the ideal-observer model," J. Opt. Soc. Am. A, Vol. 4, No. 12, December 1987.
4. Craig K. Abbey and Harrison H. Barrett, "Human- and model-observer performance in ramp-spectrum noise: effects of regularization and object variability," J. Opt. Soc. Am. A, Vol. 18, No. 3, March 2001.
5. Miguel Eckstein, Craig K. Abbey, Francois O. Bochud, Jay L. Bartoff, and James S. Whiting, "The effect of image compression in model and human performance," SPIE Conference on Image Perception and Performance, SPIE Vol. 3663, Feb. 1999.
6. Athanassios Skodras, Charilaos Christopoulos, and Touradj Ebrahimi, "The JPEG 2000 still image compression standard," IEEE Signal Processing Magazine, Vol. 18, No. 5, Sept. 2001.
7. Martin Vetterli and Jelena Kovacevic, Wavelets and Subband Coding, Prentice Hall PTR, Upper Saddle River, New Jersey, 1995.
8. Brandon D. Gallas, "Signal Detection in Lumpy Backgrounds," Ph.D. Dissertation, Program in Applied Mathematics, University of Arizona, 2001.
Statistical Modeling of Pairs of Sulci in the Context of Neuroimaging Probabilistic Atlas

Isabelle Corouge and Christian Barillot

Projet Vista, IRISA/INRIA-CNRS, Rennes, France
{icorouge,cbarillo}@irisa.fr
http://www.irisa.fr/vista
Abstract. In the context of neuroimaging probabilistic atlases, we propose a statistical framework to model the inter-individual variability of pairs of sulci with respect to their relative position and orientation. The approach extends previous work [3], and relies on the statistical analysis of a training set. We first define an appropriate data representation, through an observation vector, in order to build a consistent training population, on which we then apply a normed principal components analysis (normed-PCA). Experiments have been performed on pairs of major sulci extracted from 18 MR images. Keywords: Neuroimaging, probabilistic atlases, cortical sulci, statistical modeling, normed-PCA.
1  Introduction
This paper comes within the context of digital cerebral probabilistic atlases. We are particularly interested in the study of inter-individual variability of cortical structures (sulci/gyri), which are of major interest from both an anatomical and a functional point of view. This paper pursues previous work [3] where we proposed a statistical modeling of cortical sulci shapes and of their variations, as well as a consistent way to use it for inter-individual registration of functional data. We now aim at modeling relationships between major sulci in terms of their position and orientation. Thus, the final model will present a hierarchical aspect by discriminating different types of variations: on the one hand shape variations, and on the other hand position and orientation variations. To grasp the high inter-individual variability exhibited by the studied data, we use a deformable model [10] of the "active shape models" type [2], [4]. Thus, we learn the variability of the considered class of objects on a training population, and can then deduce occurrence probabilities of the studied structures. Some authors have also used point distribution models (PDMs) to model sets of cortical sulci (e.g. [1] and [9]). Our approach differs first by its matching scheme, which is a very simple one: it consists of working in a local frame in which the instances to learn are naturally matched. Then, we rely on a parametric representation of the sulci which describes not only their external traces but also their buried part. This leads to a more complete model of the inter-individual cortical variability, all the more so as the buried part represents at least two thirds of the cortex. In this sense, our approach relates to [7], but still differs since
Fig. 1. Relation graph between major sulci for one hemisphere. Six major sulci are considered: superior frontal sulcus (SF), precentral sulcus (PreC), central sulcus (C), postcentral sulcus (PostC), lateral sulcus or sylvian fissure (L) and superior temporal sulcus (ST).
the authors perform a prior global registration towards a stereotactic system and do not analyze sulci interactions. Finally, our approach is based on a graph structure describing the involved sulci as nodes, and their relationships as arcs (see Fig. 1). This aspect can be related to works [6] and [8], where the authors build a graph devoted to automatic (or semi-automatic) labeling of cortical sulci. As this work is still in progress, we present here the statistical modeling of a reduced graph, and presently consider a pair of sulci and its variations in terms of position and orientation. After a brief review of some necessary preprocessing stages in Sect. 2, we present the statistical modeling of a pair of sulci in Sect. 3. First, we define the observations (or individuals) constituting the training population, i.e. we define an appropriate data representation to build the training set. Second, we analyze these observations by means of a statistical analysis, the normed principal components analysis (normed-PCA). In Sect. 4, we present experiments, and we discuss them as well as the approach in Sect. 5, before concluding in Sect. 6.
2  Preprocessing
The representation of sulci used in this paper, as well as the representation of the position and orientation of one sulcus within a brain, results from previous work [5], [3], which we briefly recall here. Sulci are defined by their median surface and extracted from MRI volumes by a method known as the "active ribbon" method [5]. It leads to a parametric representation of the sulci by cubic B-spline surfaces. The spline, parameterized by u and v, is described by nbc = nbc_u × nbc_v control points, where nbc_u (resp. nbc_v) is the number of control points in the direction associated with parameter u (resp. v). The parametric direction u represents the length of the sulcus and the direction v its depth. The position and orientation of one sulcus within a brain is then represented by a coordinate system local to this sulcus [3]. The origin of this local coordinate system, specifying the position of the sulcus, is defined as the center of mass of the sulcal surface. The three axes, specifying the orientation of the sulcus, are defined as its axes of inertia. The first axis follows the length of the sulcus and is oriented from foot to head. The second one follows the depth of the sulcus and is oriented toward the outside of the brain. The third axis, orthogonal to the first two, is then the normal to the regression plane of the sulcus, and follows the antero-posterior direction in the case of left sulci (see Fig. 2), and the postero-anterior direction in the case of right sulci.
3  Statistical Modeling of Pairs of Sulci
3.1  Training and Data Representation

Learning the variability of a class of objects over a set of observations requires first of all defining what an observation vector is for this class of objects. Since the orientation and position of one sulcus within the brain is represented by its local coordinate system, we encode the relative position and orientation of two sulci by the pair of coordinate systems local to these sulci. However, the local system depends on the coordinate system in which the sulcus is initially expressed, and this coordinate system differs from one subject to another. To avoid this dependence and thereby be able to match the observations as required by the statistical analysis, we define a coordinate system local to a pair of sulci, which we call the "local median coordinate system", with respect to which we define the observation vector. Thus, by working in a local frame, the observations are consistent over the training population, as needed for the statistical analysis.

Local Median Coordinate System. We consider the pair of sulci (A, B). Let R_A(O_a, x_a, y_a, z_a), resp. R_B(O_b, x_b, y_b, z_b), be the coordinate system local to sulcus A, resp. B. Let R_M(O_m, x_m, y_m, z_m) be the local median system. First, the origin O_m is defined as the middle of [O_a O_b]. Next, the axis z_m is defined as:

    z_m = (z_a + z_b) / ||z_a + z_b||

and is then the normal of a plane π. Then x_m and y_m are deduced from the projections, x_p and y_p, of (x_a + x_b) and (y_a + y_b) on π. Let α be the angle between x_p and y_p in the plane π oriented by z_m, and let β = (π − α)/2. Then:

    x_m = R_{z_m, −β}(x_p)
    y_m = R_{z_m, β}(y_p)

where R_{z_m, β} is the matrix of the rotation defined by the axis z_m and the angle β. Such a rotation matrix is easily obtained thanks to the Rodrigues formula. Thus, for a given axis n and a given angle ω:

    R_{n, ω} = I + sin(ω) Γ(n) + (1 − cos(ω)) Γ²(n)                  (1)

where I is the identity matrix and Γ(n) is the vector product matrix, i.e.:

    Γ(n) = [  0    −n_z   n_y ]
           [  n_z   0    −n_x ]
           [ −n_y   n_x   0   ]

Definition of the Observation Vector. Rather than directly using the expression of R_A and R_B in R_M as the observation vector, we have preferred to describe the relative orientation and position of (A, B) by the parameters of the transformation bringing (R_A, R_B) towards R_M. This is in fact more suitable for performing a statistical analysis later
on. We then define such a transformation, enabling us to locate an orthonormal basis with respect to another one. Let B_f(x_f, y_f, z_f) and B(x, y, z) be two orthonormal bases. Then, considering the basis B_f as fixed, applying a composition of three rotations defined by the axes of B_f and appropriate angles ψ, φ and θ transforms B towards B_f:

    B_f = R_{z_f, θ} R_{y_f, φ} R_{x_f, ψ}(B)                        (2)

The angles ψ, φ and θ, which are in fact Euler angles, are defined as follows through their cosines and sines:

    cos(ψ) = (z_⊥ · z_f)/||z_⊥||,  sin(ψ) = ((z_⊥ ∧ z_f) · x_f)/||z_⊥||,  where z_⊥ = (z · y_f) y_f + (z · z_f) z_f
    cos(φ) = z_1 · z_f,            sin(φ) = (z_1 ∧ z_f) · y_f,            where z_1 = R_{x_f, ψ}(z)                (3)
    cos(θ) = x_2 · x_f,            sin(θ) = (x_2 ∧ x_f) · z_f,            where x_2 = R_{y_f, φ} R_{x_f, ψ}(x)

where "·" denotes the scalar product and "∧" the vector product. Thus, computing the six Euler angles ψ_{R_A}, φ_{R_A}, θ_{R_A} and ψ_{R_B}, φ_{R_B}, θ_{R_B} defined such that:

    (x_m, y_m, z_m) = R_{z_m, θ_{R_A}} R_{y_m, φ_{R_A}} R_{x_m, ψ_{R_A}}(x_a, y_a, z_a)
    (x_m, y_m, z_m) = R_{z_m, θ_{R_B}} R_{y_m, φ_{R_B}} R_{x_m, ψ_{R_B}}(x_b, y_b, z_b)      (4)

enables us to completely define the orientation of sulci A and B with respect to R_M. As far as the position is concerned, we also use Euler angles to determine it. Let ψ_{O_a}, φ_{O_a} and θ_{O_a} be the Euler angles defining the direction of the vector O_mO_a. They are defined such that:

    z_m = R_{z_m, θ_{O_a}} R_{y_m, φ_{O_a}} R_{x_m, ψ_{O_a}}( O_mO_a / ||O_mO_a|| )

Note that, in this case, θ_{O_a} = 0. Accordingly, only ψ_{O_a} and φ_{O_a} are used to characterize the direction of the vector O_mO_a. Let d be the distance between O_m and the origins of the local systems, d = ||O_mO_a|| = ||O_mO_b||. Then knowing d, ψ_{O_a}, φ_{O_a} enables us to completely define the position of sulcus A with respect to R_M. The position of sulcus B with respect to R_M is similarly computed and characterized by d, ψ_{O_b} and φ_{O_b}. Eventually, the observation vector encoding the relative orientation and position between two neighboring sulci is:

    e = (d, ψ_{O_a}, φ_{O_a}, ψ_{O_b}, φ_{O_b}, ψ_{R_A}, φ_{R_A}, θ_{R_A}, ψ_{R_B}, φ_{R_B}, θ_{R_B})      (5)

Note that for a given e, the origin and the axes of R_A and R_B can be completely recovered since (2) is reversible:

    B = R_{x_f, −ψ} R_{y_f, −φ} R_{z_f, −θ}(B_f)                     (6)
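The constructions above amount to a few lines of linear algebra; the following is a minimal sketch (ours, not the authors' code), assuming orthonormal input frames given as numpy vectors.

```python
# Rodrigues rotation (1) and the local median coordinate system of Sect. 3.1.
import numpy as np

def rodrigues(n, w):
    # (1): R = I + sin(w) G(n) + (1 - cos(w)) G(n)^2, with n a unit axis.
    G = np.array([[0.0, -n[2], n[1]],
                  [n[2], 0.0, -n[0]],
                  [-n[1], n[0], 0.0]])
    return np.eye(3) + np.sin(w) * G + (1.0 - np.cos(w)) * (G @ G)

def median_frame(Oa, xa, ya, za, Ob, xb, yb, zb):
    Om = 0.5 * (Oa + Ob)                        # middle of [Oa Ob]
    zm = (za + zb) / np.linalg.norm(za + zb)
    # Project the summed x and y axes onto the plane normal to zm.
    xp = (xa + xb) - ((xa + xb) @ zm) * zm
    yp = (ya + yb) - ((ya + yb) @ zm) * zm
    xp, yp = xp / np.linalg.norm(xp), yp / np.linalg.norm(yp)
    # Angle alpha from xp to yp in the plane oriented by zm; beta = (pi-alpha)/2.
    alpha = np.arctan2(np.cross(xp, yp) @ zm, xp @ yp)
    beta = (np.pi - alpha) / 2.0
    xm = rodrigues(zm, -beta) @ xp
    ym = rodrigues(zm, +beta) @ yp
    return Om, xm, ym, zm
```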
3.2  The Normed-PCA

To analyze the variations in orientation and position between two sulci over a population, and by this means model the inter-individual variability at the level of the considered pair, we use a principal component analysis (PCA). Such an analysis reveals the main modes of variation relative to a mean observation. It enables data to be represented in a new basis, also orthogonal, which suppresses the redundancy of information of the original data in the sense that variables in the new basis are not correlated. Since the vector e characterizing the relative position and orientation of two sulci is not homogeneous (i.e. its first element is a distance, whereas the other ones are angles), we use a normed-PCA. It is an appropriate analysis technique when variables do not have the same unit. As a matter of fact, the normed-PCA consists of diagonalizing the centered and normed data covariance matrix, i.e. the data correlation matrix, rather than the original data covariance matrix as PCA does. Thus, the distance between two individuals does not depend on the variables' units, and balance between variables is restored since they then all have unit variance.

Analysis. Let P be the training population made up of n observations e_i, defined by p variables. In our case, an observation e_i is defined by (5) and p = 11. Let E = (e_ij) be the n × p matrix of the observations, and let ē be the mean observation vector, ē = (1/n) Σ_{i=1}^n e_i. Let D_{1/σ} be the p × p diagonal matrix of the inverse standard deviations σ_j of the original variables, σ_j = ( (1/n) Σ_{i=1}^n (e_ij − ē_j)² )^{1/2}. Then, the matrix X of centered-normed data is:

    X = (E − 1ē) D_{1/σ}

where 1 is an n × 1 vector having all its elements equal to 1. Diagonalizing the covariance matrix C of the centered-normed data provides the new basis U:

    C = (1/n) X^t X = U Λ U^t,  where Λ = diag(λ_1, ..., λ_p) with λ_1 ≥ λ_2 ≥ ... ≥ λ_p

In this new basis, the observations are expressed as F = XU, which leads to the following reconstruction formula:

    e_i = ē + f_i U^t D_{1/σ}^{-1},  i = 1, ..., n                   (7)

Synthesis. Relation (7) can be used to generate new instances of the studied class of objects. First, since the eigenvalue λ_j represents the variance along the j-th mode, a modal approximation can be achieved by retaining only the first t (t ≤ p) eigenvectors of the modal basis U. The quality of the approximation can be measured by the proportion, τ, of the whole variance, λ_T, explained by the retained modes:

    τ = (Σ_{j=1}^t λ_j) / λ_T,  where λ_T = Σ_{j=1}^p λ_j
Second, under the assumption that the distribution of the centered-normed observations x_i is Gaussian, i.e. x_i ∼ N(0, C), it follows that f_i ∼ N(0, Λ) and f_ij ∼ N(0, λ_j). Accordingly, new instances consistent with the learnt observations can be synthesized by varying f_ij, j = 1, ..., t, in a suitable range, which is typically such that:

    −3 √λ_j ≤ f_ij ≤ +3 √λ_j                                         (8)

In fact, if f_ij ∼ N(0, λ_j), then P(|f_ij| ≤ 3 √λ_j) = 99.7%, and thus (8) can be considered as a condition of representativity of the class of objects of interest.
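Since (7) and (8) fully specify the analysis and synthesis, the normed-PCA can be sketched compactly; the code below is our own illustration, assuming an n × p observation matrix E built from the vectors e of (5).

```python
# Normed-PCA of Sect. 3.2: analysis on centered-normed data and synthesis (7).
import numpy as np

def normed_pca(E):
    e_bar = E.mean(axis=0)
    sigma = E.std(axis=0)                  # per-variable standard deviation
    X = (E - e_bar) / sigma                # centered-normed data
    C = (X.T @ X) / E.shape[0]             # correlation matrix of the data
    lam, U = np.linalg.eigh(C)
    order = np.argsort(lam)[::-1]          # modes sorted by decreasing variance
    return e_bar, sigma, lam[order], U[:, order]

def synthesize(e_bar, sigma, U, f):
    # (7): e = e_bar + (f U^t) D_{1/sigma}^{-1}; new instances are obtained by
    # sampling f_j in [-3 sqrt(lam_j), +3 sqrt(lam_j)] as in (8).
    return e_bar + (f @ U.T) * sigma
```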
4  Experiments
Our database is made up of 18 subjects (35 ± 10 year-old healthy males, all right-handed) who underwent a T1-MR SPGR 3D study. Six major sulci have been extracted for each hemisphere: the superior frontal sulcus (SF), precentral sulcus (PreC), central sulcus (C), postcentral sulcus (PostC), lateral sulcus or sylvian fissure (L) and superior temporal sulcus (ST). For each of them, local coordinate systems have been computed (see Fig. 2). Statistical modeling experiments have been performed on the pairs (C, PreC), (C, PostC) and (L, ST) of the left hemisphere. We can see in Fig. 2 the 3 local median coordinate systems corresponding to these pairs of sulci. In Table 1, we exhibit the percentage of cumulative variance explained along the modes of variation. The variations due to the first mode, as well as those due to the third mode, are illustrated in Fig. 3 for the three considered pairs of sulci.
Fig. 2. For both figures, the 6 sulci are extracted from one subject and are, from left to right and then top to bottom: SF, PreC, C, PostC, L, ST, of the left hemisphere. Left: a view of the 6 considered major sulci with their local coordinate systems superimposed. Right: a view of the 6 considered major sulci with the local median system of each pair superimposed.
5  Discussion
First, we remark in Table 1 that for each experimented pair, the whole variance is explained by 7 modes, whereas the original data are expressed by 11 variables. As a matter
Table 1. Percentage, p, of cumulative variance according to the number of modes retained, t, for pairs (C, PreC), (C, PostC) and (L, ST); p = (Σ_{j=1}^t λ_j / λ_T) × 100.

    t   (C, PreC)   (C, PostC)   (L, ST)
    1   48.844      42.5621      36.7084
    2   79.105      68.697       61.9032
    3   90.5307     87.5777      84.3016
    4   97.1499     95.1251      94.5972
    5   99.0582     98.3103      98.3764
    6   99.9343     99.6695      99.9023
    7   100         100          100
Fig. 3. Top row: variations of the first mode around the mean observation: −2√λ_1 ≤ f_1 ≤ +2√λ_1. Bottom row: variations of the third mode around the mean observation: −2√λ_3 ≤ f_3 ≤ +2√λ_3. For all figures: white: local median system; dark gray: synthesized local coordinate systems of sulcus A around its mean coordinate system; light gray: synthesized local coordinate systems of sulcus B around its mean coordinate system. a, d: pair (PreC, C); b, e: pair (C, PostC); c, f: pair (L, ST).
of fact, the observation vector e contains some redundant information, since 4 angles out of the 10 used in this vector can be expressed, by construction, as a linear combination of the others (i.e. ψ_{R_B} = −ψ_{R_A}, φ_{R_B} = −φ_{R_A}, ψ_{O_b} = ψ_{O_a} − π and φ_{O_b} = −φ_{O_a}). Second, modeling the variability of the pairs of sulci we considered in the experiments is relevant by itself, since these quasi-parallel sulci define gyri. For
example, the central and postcentral sulci bound the postcentral gyrus, and modeling their interaction is a way of modeling the gyrus variability. As far as the other arcs are concerned, since they represent some "plis de passage", their modeling will rather find its interest in the study of the whole graph.
6  Conclusion
We have proposed a statistical framework to model the inter-individual variability of pairs of sulci with respect to their relative position and orientation. The modeling is performed by a modal analysis (normed-PCA) on a consistent training population. Work in progress aims first at extending the inter-individual fusion scheme proposed in [3]. That scheme consists of registering functional activations under the constraint of anatomical landmarks: the sulci. Until now, the constraint was limited to one sulcus, whereas the functional activations are located in one gyrus, and in this sense are under the influence of the two sulci bounding this gyrus. So we will use the statistical modeling presented here to introduce a constraint extended to a pair of sulci. Second, we intend to extend the proposed modeling framework to the whole graph.
References

1. Caunce, A., Taylor, C.J.: Building 3D sulcal models using local geometry. Medical Image Analysis, 5:69–80 (2001)
2. Cootes, T.F., Taylor, C.J., Cooper, D.H., Graham, J.: Active shape models – their training and application. CVIU, 61(1):38–59 (1995)
3. Corouge, I., Barillot, C., Hellier, P., Toulouse, P., Gibaud, B.: Non-linear local registration of functional data. MICCAI, LNCS, 2208:948–956 (2001)
4. Kervrann, C., Heitz, F.: A hierarchical statistical framework for the segmentation of deformable objects in image sequences. IEEE CVPR, 724–728 (1994)
5. Le Goualher, G., Barillot, C., Bizais, Y.: Modeling cortical sulci with active ribbons. IJPRAI, 8(11):1295–1315 (1997)
6. Le Goualher, G., Procyk, E., Collins, L., Venegopal, R., Barillot, C., Evans, A.: Automated extraction and variability analysis of sulcal neuroanatomy. IEEE Trans. on Medical Imaging, 18(3):206–217 (1999)
7. Le Goualher, G., Argenti, A.-M., Duyme, M., Baare, W.F.C., Hulshoff Pol, H.E., Barillot, C., Evans, A.C.: Statistical sulcal shape comparisons: application to the detection of genetic encoding of the central sulcus shape. NeuroImage, 11(5):564–574 (2000)
8. Rivière, D., Mangin, J.-F., Papadopoulos, D., Martinez, J.-M., Frouin, V., Régis, J.: Automatic recognition of cortical sulci using a congregation of neural networks. MICCAI, LNCS, 1935:40–49 (2000)
9. Tao, X., Han, X., Rettmann, M.E., Prince, J.L., Davatzikos, C.: Statistical study on cortical sulci of human brains. IPMI, LNCS, 2082:475–487 (2001)
10. McInerney, T., Terzopoulos, D.: Deformable models in medical image analysis: a survey. Medical Image Analysis, 1(2):91–108 (1996)
Two-Stage Alignment of fMRI Time Series Using the Experiment Profile to Discard Activation-Related Bias

L. Freire^{1,2,3} and J.-F. Mangin^1

1 Service Hospitalier Frédéric Joliot, CEA, 91401 Orsay, France
2 Instituto de Biofísica e Engenharia Biomédica, FCUL, 1749-016 Lisboa, Portugal
3 Instituto de Medicina Nuclear, FML, 1649-028 Lisboa, Portugal
Abstract. In this paper, we show that the standard point of view of the neuroimaging community about fMRI time series alignment should be revisited to overcome the bias induced by activations. We propose to perform a two-stage alignment. The first motion estimation is used to infer a mask of activated areas. The second motion estimation discards these areas during the similarity measure estimations. Simulated and actual time series are used to show that this dedicated approach is more efficient than standard robust similarity measures.
1  Introduction
In functional MRI (fMRI) activation studies, motion correction is a required pre-processing step, in order to accurately compensate for subject motion during data acquisition. However, it is known that standard registration is often not sufficient to correct for all signal changes due to subject motion. Serious confounds may appear due to the "spin history" effect, or due to interaction between motion and susceptibility artifacts. We have also recently shown that the presence of activated regions may introduce a systematic bias in the motion correction parameters when using L2-metric-based similarity measures, even in the absence of subject motion [1]. According to its amplitude, this motion-independent artifact, which stems from the fact that activated areas behave like biasing outliers, may create spurious activations after the time series alignment, especially along high-contrast edges. A second study has shown that robust similarity measures could greatly reduce the amplitude of this task-correlated motion correction artifact [2]. While this amplitude reduction was sufficient to get rid of spurious activations in the studied cases, the estimated motion parameters were still correlated with the cognitive task timing. This bias prevents the use of the motion parameters as regressors of non-interest during the subsequent activation inference. Moreover, this observation means that the problem is not fully overcome by the robust similarity measures. This weakness is due to the fact that the signal variations occurring in activated areas are often at the noise level. Hence, the simple mechanisms
underlying robust similarity measures are not sufficient to discard the influence of the whole activated area. In this paper, we propose to go one step further and devise an fMRI alignment method completely robust to the presence of activations. This approach is perhaps the most straightforward one at first glance: discard the voxels included in the activated area detected by a first rapid statistical inference. This approach is more cumbersome than the standard ones because it requires a first realignment and some knowledge about the experiment timing. The computation time remains reasonable, however, because the first realignment/inference sequence can be performed with very simple methods. Indeed, the resulting activated area used to modify the similarity measure does not need to be perfect. This approach is illustrated first using a motion-free simulated time series including artificial activation-like signal changes based on a periodic paradigm. The improvement is finally highlighted with three actual time series obtained from a 3T magnet. All the experiments are performed using five different standard similarity measures.
2  Materials and Methods

2.1  fMRI Acquisitions
The three fMRI time series used in the paper were acquired on a Bruker scanner operating at 3T. Volume geometry consists of 18 contiguous slices (slice array 64 × 80), with in-plane resolution of 3.75 mm and slice thickness of 5.00 mm. The experimental paradigm was based on an on/off design consisting of two alternating visual stimuli, with a period of 18 frames (2 s/frame). The time series include 10 periods plus 12 initial frames used to reach the scanner steady state.

2.2  Similarity Measures
In this work, five very common similarity measures were chosen to evaluate the benefit introduced by the proposed approach. Each similarity measure belongs to a different family of the taxonomy described by Roche [3], namely: intensity conservation (difference of squares, denoted by LS [4]); intensity conservation with outliers (Geman-McClure M-estimator, denoted by GM) [5]; affine relationship (ratio of image uniformity, denoted by RIU) [6]; functional relationship (correlation ratio, denoted by CR) [7]; and statistical relationship (mutual information, denoted by MI) [8,9]. All the registration methods share the same computational framework (implemented in the C language), which includes a cubic-spline-based interpolation method [10,11] and a Powell-like optimization method. For GM, the cut-off value, C, was set to 0.5% of the mean brain value. The smoothing Gaussian kernels applied to the data before registration to increase robustness have 8 mm width for LS, and 4 mm width for GM and RIU. No smoothing was applied for CR and MI, according to the experiments described in [2].
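Of these measures, LS and GM are simple enough to sketch compactly. The following illustration is ours (the paper's C framework is not reproduced); the GM cut-off follows the 0.5% setting stated above, and the voxel-wise robust function is the usual Geman-McClure ρ(r) = r²/(r² + C²).

```python
# Sketch of two of the similarity measures of Sect. 2.2: LS and GM.
import numpy as np

def ls_cost(ref, img):
    # Sum of squared intensity differences (lower is better).
    return np.sum((ref - img) ** 2)

def gm_cost(ref, img, mean_brain):
    # Geman-McClure rho(r) = r^2 / (r^2 + C^2) bounds each voxel's influence,
    # so activated voxels act less like high-leverage outliers than in LS.
    C = 0.005 * mean_brain
    r2 = (ref - img) ** 2
    return np.sum(r2 / (r2 + C ** 2))
```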
2.3  Discarding Activated Regions
The proposed new approach is based on the identification of a mask supposed to include all activated areas. All the voxels of this mask are discarded from the similarity measure's evaluation during the final motion estimation. It should be noted that this mask may include some spurious activations stemming from a biased initial motion correction, which is not a problem. Whatever the time series, the sketch of the performed experiment is the following (step 4 is illustrated by the code sketch after this list):

1. Apply the standard motion estimation procedures using each of the 5 standard similarity measures. These estimations are the references to which the results of the new approach will be compared;
2. Use the initial motion estimation given by LS to resample the time series (here we have chosen to perform only one initial resampling according to the estimation given by the fastest but also the least robust registration method; hence the same mask was used in each case; of course, in actual applications, the same robust registration method could be used for the two stages);
3. Perform statistical inference from the resampled time series, in order to obtain a coarse estimation of activated regions (A1 activation pattern); in this paper, SPM99 was used but simpler methods should be sufficient;
4. Dilate the activation pattern A1; for the experiments described in this paper, the 26-neighborhood of A1 was added to the mask. This dilation is used first to reduce the number of false negatives, and second to help the registration methods that perform a smoothing before the measure estimation. This smoothing, indeed, corrupts a lot of voxels with the activation profile;
5. Perform a second motion estimation from the initial time series using each of the 5 adapted similarity measures. This adaptation consists of discarding the voxels of the dilated activation pattern;
6. For each registration method, compute the cross-correlation of the 6 estimated motion parameters with the cognitive task profile.
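A minimal sketch (ours) of the dilation in step 4, assuming a boolean activation mask A1; a 3 × 3 × 3 structuring element adds exactly the 26-neighborhood.

```python
# Step 4: dilate the activation pattern by its 26-neighborhood.
import numpy as np
from scipy.ndimage import binary_dilation

def dilate_activation(mask_a1):
    # 3x3x3 structuring element = each voxel plus its 26 neighbors.
    return binary_dilation(mask_a1, structure=np.ones((3, 3, 3), bool))
```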
3  Experiments

3.1  Simulated Activations
The benefit of the proposed approach is first evaluated with an artificial time series designed to simulate an activation area in the absence of subject motion. This was done by duplicating the reference image of the first actual time series 40 times and adding an activation-like signal change in order to mimic a cognitive activation process. The activation pattern was obtained from statistical inference of one actual time series related to a visual experiment. In order to simulate the effects of thermal noise, Rician noise, obtained with two Gaussian distributions in the real and imaginary axes with SD corresponding to 2% of the mean brain value, was added. The results indicate a clear reduction in correlation values for the activation-detection-based method, whatever the similarity measure (see Fig. 1 and Table 1). The number of activated voxels discarded from the similarity measure estimation, given as a percentage of the brain, was 14%.
Fig. 1. Registration parameters ty and rx for simulated time series. From top to bottom: activation profile, LS, GM, RIU, CR and MI. For each similarity measure, (1) stands for conventional registration method and (2) for the proposed approach.
Table 1. Correlation values for the simulated time series. For each similarity measure, (1) refers to the conventional registration method and (2) to the proposed approach.

    param.  LS 1  LS 2  GM 1  GM 2  RIU 1  RIU 2  CR 1  CR 2  MI 1  MI 2
    tx      0.09  0.01  0.06  0.07  0.07   0.14   0.20  0.05  0.04  0.33
    ty      0.84  0.38  0.45  0.12  0.82   0.32   0.34  0.03  0.40  0.05
    tz      0.84  0.16  0.49  0.12  0.79   0.04   0.18  0.40  0.25  0.09
    rx      0.79  0.17  0.03  0.01  0.79   0.24   0.07  0.03  0.23  0.28
    ry      0.04  0.17  0.28  0.21  0.19   0.05   0.05  0.13  0.03  0.16
    rz      0.25  0.13  0.13  0.20  0.22   0.29   0.10  0.06  0.07  0.01

3.2  Experiments with Actual Time Series
The same experiment was performed with three different actual time series. For these data, the activation profile used to compute the cross-correlation was obtained by the convolution of the task timing with the SPM99 hemodynamic model. A moving average was removed from the estimated parameters before computing the correlation in order to discard slow motion trends. The number of activated voxels in the mask is respectively 19%, 22% and 18% of the brain size. Fig. 2 presents the results for ty and rx (pitch) for one of the time series. Clear reductions in the correlation with the activation paradigm are visible for the different methods, particularly for LS and RIU, the most biased ones. Table 2 summarizes the correlation coefficients for the three data sets.
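The bias diagnostic itself is easy to reproduce; below is a sketch (ours) of the detrended correlation between a motion parameter and the activation profile. The moving-average window width is our assumption, since the text does not state it.

```python
# Detrended correlation of an estimated motion parameter with the task profile.
import numpy as np

def detrended_correlation(param, profile, window=9):
    kernel = np.ones(window) / window
    trend = np.convolve(param, kernel, mode='same')   # slow motion trend
    residual = param - trend
    return abs(np.corrcoef(residual, profile)[0, 1])
```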
4  Discussion
The work presented in this paper was triggered by recurrent difficulties observed in our institution relative to the alignment by SPM of time series acquired with our 3T magnet [1]. The observation of estimated motion parameters perfectly correlated with the cognitive experiment led us to suspect a bias of the registration method. This bias, however, was not systematic. For the experiment used in this paper, a high-amplitude bias was clearly observed for only 3 different subjects among 14. Hence, while some simulations had shown that activated areas could give rise to a similar bias, the existence of actual motion for these 3 subjects was still possible. The results obtained from the experiments performed in this paper definitively rule out this explanation. Indeed, discarding about 20% of the voxels almost removes the correlation with the task. Since most of the current fMRI realignments are performed using LS-based methods (SPM and AIR), our result is rather alarming. Fortunately, the bias is highly related to the field strength, and we did not observe such a problem with our 1.5T magnet. Our results, however, call for a refinement of the current packages, which should not be difficult. A few parameters also have to be tuned to make our approach more robust. For instance, the dilation of the activated area seems too ad hoc and should be replaced by an activation detection taking into account the smoothing performed during motion estimation. The mask dilation has an
Fig. 2. Registration parameters ty and rx for the first actual time series. From top to bottom: activation profile, LS, GM, RIU, CR and MI. For each similarity measure, (1) stands for conventional registration method and (2) for the proposed approach.
Table 2. Correlation values for the three actual time series. For each similarity measure, (1) stands for the conventional registration method and (2) for the proposed approach.

    SET1    LS 1  LS 2  GM 1  GM 2  RIU 1  RIU 2  CR 1  CR 2  MI 1  MI 2
    tx      0.27  0.27  0.14  0.13  0.29   0.23   0.04  0.05  0.02  0.14
    ty      0.65  0.17  0.32  0.12  0.64   0.12   0.27  0.01  0.41  0.06
    tz      0.46  0.14  0.16  0.34  0.17   0.25   0.12  0.27  0.19  0.42
    rx      0.72  0.10  0.35  0.03  0.74   0.13   0.36  0.01  0.58  0.06
    ry      0.02  0.05  0.14  0.09  0.07   0.07   0.06  0.03  0.14  0.07
    rz      0.01  0.13  0.18  0.15  0.13   0.18   0.07  0.13  0.18  0.01

    SET2    LS 1  LS 2  GM 1  GM 2  RIU 1  RIU 2  CR 1  CR 2  MI 1  MI 2
    tx      0.17  0.24  0.03  0.02  0.20   0.30   0.13  0.01  0.01  0.02
    ty      0.57  0.27  0.45  0.18  0.58   0.18   0.25  0.09  0.34  0.04
    tz      0.63  0.29  0.27  0.02  0.45   0.19   0.11  0.15  0.17  0.05
    rx      0.72  0.37  0.53  0.26  0.73   0.33   0.41  0.02  0.57  0.15
    ry      0.05  0.16  0.12  0.04  0.21   0.20   0.01  0.02  0.10  0.20
    rz      0.20  0.03  0.03  0.02  0.03   0.07   0.01  0.18  0.08  0.06

    SET3    LS 1  LS 2  GM 1  GM 2  RIU 1  RIU 2  CR 1  CR 2  MI 1  MI 2
    tx      0.36  0.40  0.03  0.15  0.36   0.41   0.03  0.09  0.29  0.21
    ty      0.67  0.29  0.51  0.17  0.62   0.22   0.17  0.08  0.32  0.07
    tz      0.64  0.11  0.33  0.03  0.46   0.04   0.33  0.06  0.23  0.10
    rx      0.69  0.05  0.44  0.01  0.68   0.14   0.23  0.08  0.40  0.13
    ry      0.01  0.04  0.06  0.03  0.10   0.02   0.04  0.15  0.07  0.01
    rz      0.38  0.17  0.13  0.09  0.29   0.22   0.12  0.19  0.08  0.06
important positive effect. Using a non-dilated mask, for instance, the correlation coefficients for LS and RIU are much higher (0.26, 0.52, 0.07, 0.41, 0.09 and 0.20) and (0.26, 0.52, 0.10, 0.49, 0.06 and 0.27), respectively, for the first data set. In the standard registration packages, in fact, a large smoothing is applied to reduce the occurrence of local minima during the estimation of the motion parameter. It has been shown, however, that this smoothing was largely increasing the bias amplitude [1,2]. A careful implementation of the method presented in this paper should allow to discard these smoothing related problems. Another important point is related to the issue of motion estimation variability. Increasing the number of voxels to be discarded is bound to decrease the method accuracy, because the remaining voxels have a non-uniform localization in the brain. The rigid body hypothesis, indeed, does not take into account the distortions induced by susceptibility artifacts, which depend on the head localization in the magnet. Hence, removing some parts of the brain could penalize them with regard to the remaining ones. Some experiments may also lead to very large activations (and consequently, masks), which would result in more local minima due to the reduction in the number of samples. Finally, the approach described in this paper could raise some more problems with complex experiments where different brain areas are activated according to different activation profiles. Such experiments call for some improvements of our framework. Anyway, using some robust estimators or mutual information
could be a simpler choice in such cases. In return, for standard simple clinical brain mapping experiments used in the context of surgery planning, our approach would be very easy to apply and would decrease the risk of false positives disturbing the surgeon's thinking.
5  Conclusion
In this paper, we have shown that the standard point of view of the neuroimaging community about fMRI alignment should be revisited. A lot of teams, indeed, are in the process of upgrading the field strength of their scanners. The refinements of the registration methods proposed in this paper are relatively simple and should discard any bias related to activations. While a lot of generic robust similarity measures have been proposed for registration, our work has shown that dedicated approaches have to be designed for each problem. Using a priori knowledge about the cognitive experiment, indeed, is the simplest way to get rid of experiment related outliers.
References

1. Freire, L., Mangin, J.-F.: Motion correction algorithms may create spurious brain activations in the absence of subject motion. NeuroImage 14 (2001) 709–722
2. Freire, L., Roche, A., Mangin, J.-F.: What is the best similarity measure for motion correction of fMRI time series? IEEE Trans. Med. Imag. 21 (2002) 470–484
3. Roche, A.: Recalage d'images médicales par inférence statistique. PhD Thesis, Université de Nice-Sophia Antipolis, Projet Epidaure, INRIA (2001)
4. Friston, K.J., Ashburner, J., Frith, C.D., Poline, J.-B., Heather, J.D., Frackowiak, R.S.J.: Spatial registration and normalization of images. Hum. Brain Mapp. 2 (1995) 165–189
5. Nikou, C., Heitz, F., Armspach, J.-P., Namer, I.-J., Grucker, D.: Registration of MR/MR and MR/SPECT brain images by fast stochastic optimization of robust voxel similarity measures. NeuroImage 8 (1998) 30–43
6. Woods, R.P., Cherry, S.R., Mazziotta, J.C.: Rapid automated algorithm for aligning and reslicing PET images. J. Comput. Assist. Tomogr. 16 (1992) 620–633
7. Roche, A., Malandain, G., Pennec, X., Ayache, N.: The correlation ratio as a new similarity measure for multimodal image registration. In: Proc. MICCAI'98, Cambridge, USA, LNCS 1496, Springer-Verlag (1998) 1115–1124
8. Wells, W.M., Viola, P., Atsumi, H., Nakajima, S., Kikinis, R.: Multi-modal volume registration by maximization of mutual information. Med. Image Anal. 1 (1996) 35–51
9. Maes, F., Collignon, A., Vandermeulen, D., Marchal, G., Suetens, P.: Multimodality image registration by maximization of mutual information. IEEE Trans. Med. Imag. 16 (1997) 187–198
10. Unser, M., Aldroubi, A., Eden, M.: B-spline signal processing: Part I – theory. IEEE Trans. Signal Process. 41 (1993) 821–833
11. Unser, M., Aldroubi, A., Eden, M.: B-spline signal processing: Part II – efficient design and applications. IEEE Trans. Signal Process. 41 (1993) 834–848
Real-Time DRR Generation Using Cylindrical Harmonics

Fei Wang, Thomas E. Davis, and Baba C. Vemuri

Department of Computer Science and Engineering, University of Florida, Gainesville, FL 32611
{fewang,tedavis,vemuri}@cise.ufl.edu
Abstract. In this paper, we present a very fast algorithm for generating Digitally Reconstructed Radiographs (DRRs) using cylindrical harmonics. Real-time generation of DRRs is crucial in intra-operative applications requiring matching of pre-operative 3D data to 2D X-ray images acquired intra-operatively. Our algorithm involves representing the pre-operative 3D data set in a cylindrical harmonic representation and then projecting each of these harmonics from the chosen projection point to construct a set of 2D projections whose superposition is the DRR of the data set in its reference orientation. The key advantage of our algorithm over existing algorithms, such as the ray-casting, voxel-projection, or hybrid schemes, is that in our method, once the projection set is generated from an arbitrarily chosen point of projection, DRRs of the underlying object at arbitrary rotations are simply obtained via a complete exponentially weighted superposition of the set. This leads to tremendous computational savings over and above the basic computational advantages of the algorithm involving the use of a truncated cylindrical harmonic representation of the data. We present examples of DRR synthesis with fan-beam projection geometry for synthetic and real data. As an indicator of the speed of computation, one DRR from an arbitrary projection point requires only 2-3 CPU seconds on a Dell Precision 420 using MATLAB as the program development environment.
1  Introduction
The registration of pre-operative volumetric datasets (CT data) to intra-operative two-dimensional projections (x-rays) of the represented object is a common problem in a variety of medical applications, including image guided surgery, medical image analysis, etc. With 2D/3D registration methods, the position and orientation of a 3D image can be determined with respect to the projection geometry used to acquire the 2D x-ray image. One of the key challenges in solving the 2D/3D registration problem is the need for an appropriate way to compare input images that are of different dimensions. Because similarity information is difficult to extract, it is very hard to attack the registration problem directly based on the 2D and 3D images. The most common approach is to simulate the 2-D images given the 3-D volume dataset and estimate their relative spatial
relationship, so that the images can be compared within the same dimension. Thus, simulating x-ray images of the volumetric dataset is essential to the registration process between 3D CT data and 2D x-rays. Simulated projection images, which represent the x-ray projection acquisitions from the volumetric CT data, are called Digitally Reconstructed Radiographs (DRRs). DRRs and the methods used to generate them are critically important in the 2D/3D registration problem, because they largely determine both the end-result registration accuracy and the computational effort needed to achieve it. In the following section, we briefly review some of the methods typically used to generate DRRs.

1.1  Existing DRR Computation Algorithms
The most popular method used to generate DRRs is the ray-casting algorithm, which simulates ideal radiographic image formation by modeling the attenuation that x-rays experience as they pass through a dense object. Rays are constructed between points of the imaging plane and the imaging source; thus each ray corresponds to an individual image plane point, and each intensity value in the image plane is computed by integrating (summing) the attenuation coefficient along the corresponding ray. Because the projection rays usually do not coincide with the 3D data set coordinate system, interpolation is required to implement the projection, and the chosen interpolation scheme determines the resulting accuracy of the generated projection. If a sufficiently large sampling rate along the rays is used, this algorithm tends to give accurate gray-value DRRs, but at an enormous computational cost. This method is frequently also called volume rendering. As it must visit every voxel in the 3D dataset when computing the projection image, it tends to be extremely computationally intensive, and the interpolation method is crucial to the resolution of ray-casting DRRs. Because several DRRs from different projection geometries must be generated in the typical 2D/3D registration problem, the computational complexity of the DRR calculation algorithm is critical. Several research studies have focused on searching for more practical ways to improve on the ray-casting algorithm. One effective computation reduction technique is the hierarchical approach, wherein the image is downsampled and smoothed before the registration [1][2]. In [3][4], the region of interest in the pre-operative CT image is segmented out prior to generating the DRR, so that the alignment algorithm is applied only to that sub-image, thus reducing the computational burden. Larose et al. [11] presented a novel intermediate data representation called the Transgraph. The basic idea is to precompute some DRRs for sampled viewpoints; the DRRs for arbitrary viewpoints can then be generated by interpolating the existing DRRs. This method can greatly speed up the computation of DRRs by selecting only coarse samples. However, the accuracy is limited by the density of samples, and there is a tradeoff between the speedup and accuracy. The voxel-projection DRR algorithm generates the DRR by accumulating the image plane projections for each voxel in the volume data set. The voxels are processed in storage order, and each is projected onto the image plane according to the projection geometry and its position in the volume data set coordinate
system. Processing the voxels in storage order (without interpolating the volume data set into a new coordinate system) allows faster traversal of the data set. The reconstruction filter controls the distribution of the voxel's projected data value, or its corresponding visual parameter values, to the pixels of the projection image, thereby taking advantage of the coherence if a voxel projects onto more than one pixel [6][7]. In the volume rendering literature, this approach is called an object-space method because the DRRs are generated by traversing the 3D image voxel by voxel in storage order. These methods are faster than the ray-casting methods because each voxel is visited sequentially in memory order and no interpolation process is required. Therefore, the resolution of the voxel-projection algorithm is better than that of the ray-casting method. Recently, a novel hybrid DRR generation method has been presented in [8]. The method introduces shear-warp factorization, a method proposed by Lacroute et al. [9] in volume rendering, into the DRR generation methodology. It decomposes the viewing transformation into a shear matrix and a warp matrix, the viewing transformation matrix being their composition. The shear matrix transforms the volume data into sheared 3D volume voxels so that all viewing rays are parallel to each other. A 2D intermediate image, which is dependent on the viewing direction, is generated by summing the sheared volume voxels along the projection axis. The DRR is then obtained by warping the intermediate image to the projection plane. This algorithm exploits the merit of scanline-order volume rendering in both the pre-operative CT volume data and the projection image, which is more efficient than ray-casting algorithms because the latter must perform analytic geometry calculations before sampling the data along each ray. The accuracy of the shear-warp algorithm is limited by the number of times an interpolation technique is applied, as well as by the accuracy of the interpolation process, which in turn affects the resolution of the resulting DRR. Finally, all the existing methods for generating DRRs suffer from the fact that whenever the underlying object pose is changed to obtain a new DRR, the entire process of DRR construction must be restarted, which makes these methods unsuitable for real-time applications. In contrast, the algorithm we present makes use of the cylindrical harmonic projections (DRR set) generated from any given point of projection in generating DRRs from any other points of projection. This key and unique feature of our algorithm leads to tremendous computational advantages over existing algorithms in the literature.
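For reference, here is a compact sketch (ours) of the ray-casting baseline under a parallel-beam simplification; the fan-beam geometry used later in this paper differs only in how the rays are constructed, and the linear interpolation choice is our assumption.

```python
# Ray-casting DRR for a parallel-beam geometry: rotate a grid of sampling
# rays about the z-axis and sum interpolated attenuation values along each ray.
import numpy as np
from scipy.ndimage import map_coordinates

def raycast_drr_parallel(volume, angle, n_steps=None):
    nz, ny, nx = volume.shape
    n_steps = n_steps or ny
    zs, ts, ss = np.meshgrid(np.arange(nz), np.arange(n_steps),
                             np.arange(nx), indexing='ij')
    cx, cy = (nx - 1) / 2.0, (ny - 1) / 2.0
    c, s = np.cos(angle), np.sin(angle)
    x = c * (ss - cx) - s * (ts - cy) + cx       # rotate the ray sample grid
    y = s * (ss - cx) + c * (ts - cy) + cy
    samples = map_coordinates(volume, [zs, y, x], order=1, cval=0.0)
    return samples.sum(axis=1)                   # line integral per (z, x) pixel
```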
2  Generating DRRs Using Cylindrical Harmonics
Let f(x, y, z) be a CT scan volume density function and let

    f(x, y, z) = Σ_n a_n φ_n(x, y, z)                                (1)

be its expansion in cylindrical harmonics about the Z-axis. Thus, {φ_n(x, y, z)} are the cylindrical harmonic basis functions, resampled back into the Cartesian coordinate system, and we assume that the expansion coefficients {a_n} are
selected so that < φm , φn >= δm,n , i.e. {φn } are orthonormal. Then, if fθ (x, y, z) denotes f (x, y, z) rotated through angle θ about the expansion axis, it follows that fθ (x, y, z) = e−inθ (an φn (x, y, z)) (2) n
i.e., that fθ (x, y, z) is the superposition of linearly phase shifted versions of the given cylindrical harmonics. If θ is expressed on a discrete rotation grid with K uniformly distributed points in the interval [0, 2π), then Equation (2) becomes fk (x, y, z) = e−ink2π/K (an φn (x, y, z)) (3) n
Now, let $g(r, s)$ be the projection of $f(x, y, z)$ for some specified projection geometry and let $P$ denote the projection operator. Then, since $P$ is linear,

$g(r, s) = P f(x, y, z) = P \sum_n a_n \phi_n(x, y, z) = \sum_n a_n \psi_n(r, s)$ (4)
That is, the projection $g(r, s)$ is simply the superposition of the projections of the cylindrical harmonics of $f(x, y, z)$, i.e. $P \phi_n(x, y, z) = \psi_n(r, s)$. The significance of this result is that the projection of $f_k(x, y, z)$ can be expressed as the superposition of linearly phase-shifted versions of the cylindrical harmonic projections, as follows:

$g_k(r, s) = P f_k(x, y, z) = \sum_n e^{-ink2\pi/K} a_n \psi_n(r, s)$ (5)
Thus, the algorithm for computing DRRs via cylindrical harmonics can be summarized as follows: (1) compute the cylindrical harmonics of the given 3D CT scan data set; (2) compute the projections (DRRs) of the cylindrical harmonics via the desired arbitrary projection geometry; (3) compute DRRs as weighted (phase-shifted) superpositions of the harmonic DRRs. Note that one need not compute all the harmonics when using the harmonic expansion; instead, the expansion may be truncated in accordance with the desired accuracy of the DRR, as discussed later.
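Step (3) is where the computational savings arise: once the harmonic projections $\psi_n$ from step (2) and the coefficients $a_n$ are stored, each new DRR is a single weighted sum, as in Equation (5). A minimal sketch, assuming the precomputed quantities are held in arrays as named below:

```python
import numpy as np

def synthesize_drr(basis_drrs, coeffs, k, K):
    """Step (3): g_k = sum_n exp(-i n k 2*pi/K) * a_n * psi_n, Equation (5).

    basis_drrs[n] holds the precomputed projection psi_n of the n-th
    cylindrical harmonic for the desired projection geometry, and
    coeffs[n] the coefficient a_n; this storage layout is our assumption.
    """
    n = np.arange(len(coeffs))
    weights = coeffs * np.exp(-1j * n * k * 2.0 * np.pi / K)
    # Weighted superposition of the harmonic DRRs; for a real volume the
    # imaginary parts cancel, so the real part is the DRR.
    return np.tensordot(weights, basis_drrs, axes=(0, 0)).real
```

Changing the pose thus costs one complex-weighted accumulation over the stored basis projections rather than a full re-projection of the volume.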
3
Experiments and Discussion
In this section, we present two sets of experiments. In the first set, we use the well-known 3D Shepp-Logan model to quantify DRR fidelity. The second set of experiments is carried out with a 3D CT data set. Reconstructions are compared with those obtained using the ray-casting algorithm [5][10]. Finally, we comment on some additional features of our algorithm that can be exploited
Fig. 1. Experiments using the 3D Shepp-Logan model. (a) Analytically generated projection at a rotation angle of 45°; (b) DRR generated via cylindrical harmonics using nearest-neighbor interpolation; (c) using linear interpolation; and (d) using cubic spline interpolation, all at a 45° rotation.
to further reduce the computation time per DRR, making it an ideal candidate for real-time applications.

First, we present results on the accuracy of our DRR computation algorithm with the aid of a synthetic data example. A rotated and translated version of the 3D Shepp-Logan phantom was used in our experiments; the slice residing in the xy-plane at z = 0 is identical to the familiar 2D Shepp-Logan phantom. We use the Shepp-Logan model because its projections are easy to compute analytically and can therefore serve as ground truth for comparison with the DRRs obtained using our algorithm. In Figure 1, we show the analytically generated DRR and the DRR generated by our algorithm using cylindrical harmonics. In this example, the 3D Shepp-Logan model is of size 128 × 128 × 128, and we choose N = 1024 cylindrical harmonics when computing DRRs with our algorithm. We employ two measures to quantify the error: (1) the normalized cross-correlation between the analytically generated projection and the DRR obtained using our algorithm, and (2) the variance of the difference image between the analytic projection and our DRR. The normalized cross-correlation is a standard measure, given by

$N_{cc} = \frac{\sum_{(i,j)\in\Omega}(I_{ana}(i,j) - \bar{I}_{ana})(I_{DRR}(i,j) - \bar{I}_{DRR})}{\sqrt{\sum_{(i,j)\in\Omega}(I_{ana}(i,j) - \bar{I}_{ana})^2 \sum_{(i,j)\in\Omega}(I_{DRR}(i,j) - \bar{I}_{DRR})^2}}$ (6)

where $\Omega$ is the image domain and $\bar{I}_{ana}$ and $\bar{I}_{DRR}$ are the mean values of the analytic DRR and our DRR images, respectively; the normalized cross-correlation equals 1 when the two images are identical. In addition, we quantify the accuracy of the DRR generated by cylindrical harmonics via the variance and the maximum absolute value of the difference between the two images in the region of interest. From Table 1, it is evident that better interpolation methods yield a lower variance of the difference image and hence higher accuracy; the comparatively large maximum absolute values of the difference image are attributable to the size of the sampling interval of the interpolation scheme used.
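For reference, both error measures can be computed in a few lines; the sketch below (illustrative names, whole image domain as $\Omega$) implements Equation (6) together with the variance and maximum absolute value of the difference image:

```python
import numpy as np

def drr_similarity(i_ana, i_drr):
    """Normalized cross-correlation (Equation 6), plus the variance and
    maximum absolute value of the difference image."""
    a = i_ana - i_ana.mean()
    d = i_drr - i_drr.mean()
    ncc = (a * d).sum() / np.sqrt((a ** 2).sum() * (d ** 2).sum())
    diff = i_ana - i_drr
    return ncc, diff.var(), np.abs(diff).max()
```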
Table 1. 3D Shepp-Logan similarity measures. Ncc denotes the normalized cross-correlation of the analytic DRR and the DRR generated using our method. The maximum absolute value and the variance are computed over a specific region of the difference image.

Interpolation Method     Ncc     Maximum Absolute Value  Variance
Nearest Neighbor         0.9996  0.3349                  6.0092e-4
Linear                   0.9997  0.1802                  2.0530e-4
Piecewise Cubic Spline   0.9998  0.1576                  1.8220e-4
When registering 3D pre-operative CT with 2D intra-operative fluoro images, real-time generation of DRRs from the given CT volume is crucial. The second set of experiments therefore uses a 3D CT data set of temporal bones obtained from the Virtual Medical Laboratory at University College London [13]. The size of the CT data set from which we created the simulated projection images is 330 × 330 × 83. To show that DRRs generated using cylindrical harmonics achieve a resolution comparable to that of DRRs generated via ray casting with the same interpolation scheme, we take the DRRs generated using ray casting as the ground truth. Figure 2 compares DRRs generated using the different methods. Fig. 2(a) is a DRR generated using the ray-casting algorithm, viewed along the sagittal axis; Fig. 2(b) is the DRR produced by our cylindrical harmonics method from the same viewpoint; Fig. 2(c) is the difference between Figs. 2(a) and 2(b). Figs. 2(d), (e), and (f) are another set of DRRs generated from a different viewpoint. A bicubic interpolation scheme was used in generating all these DRRs. The ratio of the maximal absolute value of the difference image to that of the DRRs generated by ray casting is 0.0045; consequently, the difference images shown in Figs. 2(c) and (f) are almost everywhere of very low intensity magnitude and had to be rescaled by a factor of 10 for display purposes. This indicates the accuracy of our DRR generation algorithm in comparison to the ray-casting method. The typical running time to generate one DRR with our algorithm for a CT image of size 330 × 330 × 83 is 2.21 seconds on a Dell Precision 420, while the ray-casting algorithm takes about 96.6 seconds (using the same linear interpolation in both techniques). Generating a new DRR with our method is accomplished simply by accumulating complex exponentially weighted basis projections; there is no need to recompute the new DRR from scratch as in other existing methods. In addition to the computational advantages already mentioned, we can exploit the following features to accrue further savings (a combined sketch is given after this list):
– Symmetry of the rays in the fan-beam projection with respect to the y axis. The rays on the right half of the plane and those on the left half are symmetric; exploiting this property reduces the interpolation time by a factor of 2.
– Conjugate symmetry of the cylindrical harmonics:
$a_k = a^*_{N-k}$ for $k = 1, \dots, 511$ (7)
where N = 1024 is the total number of cylindrical harmonics.
– Spectrum truncation: as the order of the cylindrical harmonics increases, the magnitude of the harmonic coefficients decreases sharply. We can use a truncated set of cylindrical harmonics instead of the whole set,
Fig. 2. Comparison of DRRs generated via different methods and viewpoints from the temporal bone CT data. (a) DRR generated using the ray-casting algorithm, viewed along the sagittal axis; (b) using cylindrical harmonics, viewed from the same direction as in (a); (c) difference between (a) and (b); (d) DRR generated using ray casting, viewed along a sagittal axis rotated by 135°; (e) using cylindrical harmonics, viewed from the same viewpoint as in (d); (f) difference between (d) and (e). For display purposes, the difference images (c) and (f) have been rescaled by a factor of 10.
thus further cutting the processing time for generating DRRs. Fig. 3 depicts the spectrum truncation error incurred when using a truncated set of cylindrical harmonics. From this figure, it is evident that 500 harmonic coefficients suffice to ensure a high-fidelity DRR. Note that the figure shows only half the total number of harmonics, owing to the conjugate symmetry property.
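The sketch below combines the last two savings, under the additional assumption (ours) that the rotation grid size equals the number of harmonics (K = N), so that orders n and N − n contribute exact complex-conjugate terms; the arrays are as in the earlier synthesis sketch:

```python
import numpy as np

def synthesize_drr_truncated(basis_drrs, coeffs, k, K, n_keep=500):
    """Truncated, conjugate-symmetric variant of the DRR synthesis.

    Assumes K == N == len(coeffs) so that, by Equation (7), order N - n
    contributes the complex conjugate of order n; only orders up to
    n_keep are accumulated (spectrum truncation).
    """
    drr = (coeffs[0] * basis_drrs[0]).real  # order 0 is real on its own
    for n in range(1, n_keep):
        term = coeffs[n] * np.exp(-1j * n * k * 2.0 * np.pi / K) * basis_drrs[n]
        # The paired order N - n doubles the real part of this term.
        drr += 2.0 * term.real
    return drr
```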
4
Conclusions
In this paper, we presented a very fast algorithm for synthesizing DRRs from a given CT scan. The algorithm is based on constructing projections of each harmonic in a cylindrical harmonic expansion of the given data. A key advantage over existing algorithms is that, after computing a DRR for a single projection point, DRRs for other projection points can be computed very efficiently without restarting the algorithm from scratch: all that is needed is a weighted superposition of the basis projections. Examples involving DRR generation from synthetic and real data were shown, and as is evident, the accuracy achieved in the synthesis is quite high. Future work will involve matching the synthesized DRR with given X-ray data.

Acknowledgement. This research was funded in part by NIH RO1-NS42075. The CT data were obtained from http://www.vml.ucl.ac.uk/ at the University College London Virtual Medical Laboratory.
Fig. 3. Spectrum truncation error as a function of the number of cylindrical harmonics chosen to compute the DRR.
References

1. P. Viola and W. Wells: Alignment by maximization of mutual information. International Journal of Computer Vision 24:137-154, 1997
2. J.P.W. Pluim, J.B.A. Maintz, and M.A. Viergever: Image registration by maximization of combined mutual information and gradient information. Proceedings of MICCAI 2000, pp. 452-461
3. J. Weese, T.M. Buzug, C. Lorenz, and C. Fassnacht: An approach to 2D/3D registration of a vertebra in 2D X-ray fluoroscopies with 3D CT images. Proc. CVRMed/MRCAS, 1997, pp. 119-128
4. G.P. Penney, J. Weese, J.A. Little, P. Desmedt, D.L.G. Hill, and D.J. Hawkes: A comparison of similarity measures for use in 2D-3D medical image registration. IEEE Transactions on Medical Imaging 17(4):586-595, 1998
5. L. Zollei, E. Grimson, A. Norbash, and W. Wells: 2D-3D rigid registration of X-ray fluoroscopy and CT images using mutual information and sparsely sampled histogram estimators. IEEE CVPR, 2001
6. E. Cosman Jr.: Rigid registration of MR and biplanar fluoroscopy. Master's Thesis, Massachusetts Institute of Technology, 2000
7. L. Westover: Footprint evaluation for volume rendering. Proceedings of SIGGRAPH '90, Computer Graphics 24, August 1990, pp. 367-376
8. J. Weese, R. Goecke, G.P. Penney, P. Desmedt, T.M. Buzug, and H. Schumann: Fast voxel-based 2D/3D registration algorithm using a volume rendering method based on the shear-warp factorization. Medical Imaging 1999: Image Processing, K.M. Hanson, ed., Proceedings of SPIE Vol. 3661, 1999, pp. 802-810
9. P. Lacroute and M. Levoy: Fast volume rendering using a shear-warp factorization of the viewing transformation. Proceedings of SIGGRAPH '94, Orlando, July 1994
10. L.M.G. Brown: Registration of planar film radiographs with computed tomography. IEEE Proceedings of MMBIA, 1996
11. D. LaRose: Iterative X-ray/CT registration using accelerated volume rendering. Ph.D. Thesis, Carnegie Mellon University, May 2001
12. A.C. Kak and M. Slaney: Principles of Computerized Tomographic Imaging. Society for Industrial and Applied Mathematics, 2001
13. http://www.vml.ucl.ac.uk/links/wwwroot/vmlweb/databank, Virtual Medical Laboratory at University College London
Strengthening the Potential of Magnetic Resonance Cholangiopancreatography (MRCP) by a Combination of High-Resolution Data Acquisition and Omni-directional Stereoscopic Viewing

Tetsuya Yamagishi1, Karl Heinz Höhne1, Taku Saito2, Kimihiko Abe2, Jiro Ishida3, Ryuko Nishimura3, and Tadashi Kudo3

1 Institute of Mathematics and Computer Science in Medicine (IMDM), University Hospital Hamburg-Eppendorf, House S14, Martinistraße 52, 20251 Hamburg, Germany
{yamagishi,hoehne}@uke.uni-hamburg.de, http://www.uke.uni-hamburg.de/institute/imdm/idv
2 Department of Radiology, Tokyo Medical University Hospital, Nishi-Shinjuku 6-7-1, Shinjuku-ku, 160-0023 Tokyo, Japan
3 Department of Radiology, Saiseikai-Kawaguchi Hospital, Nishi-Kawaguchi 5-12-1, Kawaguchi-shi, 332-8558 Saitama, Japan
Abstract. In this paper, we present a combination of high-resolution (HR) data acquisition from magnetic resonance cholangiopancreatography (MRCP) and omni-directional stereoscopic maximum-intensity-projection (MIP) viewing. Since the presented method takes advantage of the property of magnetic resonance hydrography (MR-hydrography) as one of the most selective signal acquisitions, the segmentation process is very simple and the data-processing load is drastically reduced. The source data acquisition was performed with a slice thickness of 0.8 mm. Images created by this method are precise and realistic enough to enhance the diagnostic capabilities of MRCP. Stereoscopic viewing and omni-directional surveying facilitate an accurate grasp of the three-dimensional extent of the entire pancreaticobiliary system, even in regions that cannot otherwise be assessed owing to signal overlay from neighboring organs or between pancreaticobiliary ducts. We are convinced that the presented post-processing is less time-consuming and has the potential to make optimal use of HR-MRCP data sets.
1
Introduction
For numerous clinical tasks, for instance screening, diagnosis, and surgical planning, it is essential to understand the complex and often malformed structure of the pancreaticobiliary system. Endoscopic retrograde cholangiopancreatography (ERCP) formerly held an unchallenged position as the only approach for obtaining a precise delineation of the pancreaticobiliary ducts. ERCP, as an invasive modality, is accompanied by various
risks, such as endoscopic technical failures [1], iatrogenic pancreatitis, and adverse effects of contrast media, as well as X-ray exposure. Diagnostic imaging of the pancreaticobiliary system has seen explosive development during the last decades. Although magnetic resonance cholangiopancreatography (MRCP) has become a standard tool and has come into the limelight as a non-invasive modality for obtaining information about the pancreaticobiliary system, a precise description of the morphological structure remains difficult with conventional scanning [1, 2]. Because of false positive and/or false negative findings of MRCP [1, 2], ERCP is, even now, relied upon particularly as a modality for the final diagnosis. The inferior image quality of MRCP also stems from its low spatial resolution [1], especially along the longitudinal body axis, due to source image slice thicknesses of up to 25 mm [1-4]. With the advent of modern magnetic resonance imaging (MRI) scanners, thin-slice data acquisition integrating high-speed sequences and a breath-triggered technique [4] with a precision sensor has become available. Thus the spatial resolution of MRI has drastically improved, with slice thicknesses below 1 mm, which is called high-resolution (HR) data acquisition. Even for abdominal sections, HR images with fewer unexpected position discrepancies between slices and fewer motion artifacts have become possible in routine scanning. The imaging technique of MRCP, or 'hydrography' as it is commonly called, is one of the most selective data acquisitions of MRI; it follows the principle of detecting the intensified free-water MR signals, such as those from the pancreaticobiliary system, without contrast medium. Therefore, three-dimensional (3D) volume models derived from HR-MRCP have the potential to play an increasing role not only in research analyses but also in clinical application. This paper shows that the combination of HR data acquisition and omni-directional stereoscopic MIP viewing provides a decisive increase in the diagnostic value of MRCP.
2
Methods
2.1
MR-Hydrography Data Acquisition
Image data sets were obtained from five healthy male volunteers (40-53 years old) with fully informed consent. They had no disease history of the pancreaticobiliary system. Contraindications to iron particles, anti-cholinergic agents, or secretin were excluded. A pre-medication protocol was carried out 10-15 minutes before scanning: an oral administration of ferric ammonium citrate (600 mg in a 0.2% water solution, to reduce the T2-weighted signal from the gastrointestinal tract [1]), an intravenous infusion of 20 mg of scopolamine butylbromide to avoid motion artifacts due to increased peristalsis, and an intravenous infusion of 50 units of secretin to induce pancreatic juice production for adequate visualization of the pancreatic duct system. Non-contrast MR images were obtained with a heavily T2-weighted fast spin-echo technique (TR 1639, TE 900) in coronal sections (250-270 slices in total) with a fat-signal-suppression technique by chemical-shift saturation. Each cross-sectional
source image was acquired with a slice thickness of 0.8 mm and with no gap between adjacent slices. The image matrix was 256×256. A Philips Gyroscan Intera scanner (1.5 T) was used.

2.2
Processing
Image data sets exported from the scanner were stored on CD-ROM in compressed DICOM format via the intra-hospital network. All data sets were then imported into the visualization system 'VOXEL-MAN', developed at the Institute of Mathematics and Computer Science in Medicine (IMDM), Hamburg, Germany [5-8]. Interactive segmentation was performed to define three objects: the pancreaticobiliary structure, and the duodenum and jejunum, which serve as important landmarks [1] among the adjoining alimentary tracts. The segmented alimentary tracts were then removed to avoid an overlay with the pancreaticobiliary structure [5] and thereby facilitate inspection. Owing to remaining partial-volume voxels [8], the duodenum and jejunum are still visible, as if transparent. Red/green stereoscopic MIP renderings and sequential rotation movies projected from every degree were created. The reconstructed images were evaluated around at least two axes: the longitudinal body axis (Y axis) and the lateral body axis (X axis). All processing can be completed within the 'VOXEL-MAN' system. Animated sequences are displayed with commercial movie software (QuickTime), and can also be shown via the slide-show mode of a commercial image viewer (ACDSee).
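The red/green stereoscopic MIP idea can be sketched as follows; this is our illustration of the principle, not the VOXEL-MAN code, and the volume layout, the stereo base, and the function names are assumptions:

```python
import numpy as np
from scipy.ndimage import rotate

def stereo_mip(volume, angle_deg, stereo_base_deg=4.0):
    """Red/green anaglyph of two maximum-intensity projections taken a
    few degrees apart; axis 0 is assumed to be the longitudinal body axis."""
    def mip(theta):
        # Rotate about the longitudinal axis (in the plane of axes 1 and 2),
        # then take the maximum along the viewing direction (axis 1).
        v = rotate(volume, theta, axes=(1, 2), reshape=False, order=1)
        return v.max(axis=1)
    left = mip(angle_deg - stereo_base_deg / 2.0)
    right = mip(angle_deg + stereo_base_deg / 2.0)
    vmax = float(volume.max())
    rgb = np.zeros(left.shape + (3,))
    rgb[..., 0] = left / vmax   # red channel: left-eye projection
    rgb[..., 1] = right / vmax  # green channel: right-eye projection
    return rgb                  # to be viewed with red/green glasses
```

Rendering such a pair for every degree of rotation yields the omni-directional movie described above.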
2.3
Evaluation of Performance
The created movies, displayed on a PC, were compared with conventional MIP on plain films by 3 radiologists and 2 board-certified endoscopists. Image quality and ease of recognition of the following structures were assessed:
1. the pancreaticobiliary junction
2. the cystic duct
3. the distal segment of the main pancreatic duct (MPD)
Usefulness for film reading was scored in every aspect according to the following scale: 1: excellent (persuasive), 2: good, 3: fair, 4: poor. Ease of image computing and value for clinical application were also estimated.
3
Results
Visualization of the entire morphology, with detailed contours of the pancreaticobiliary ducts, was substantially improved. Images created by this method are also realistic (Figs. 1-4), and through omni-directional observation, accurate recognition of the duct structure is facilitated even in regions that cannot otherwise be assessed due to signal overlay with neighboring organs or between pancreaticobiliary ducts (Figs. 1-4).
The actual scanning time is around 15 minutes, depending on the examinee's respiration. The actual data-processing time is within one hour.
Fig. 1. Frontal view of a 40-year-old male. In this viewing direction, the separate orifices of the common bile duct (CBD) and the MPD can easily be seen, but the distal segment of the MPD can hardly be recognized. The small cystic lesion located in the tail of the pancreas can be suspected, but its precise location cannot be identified.
The very thin slice data acquisition enables a gapless, non-cleaved MIP reconstruction even in a projection perpendicular to the original scanning direction of the source data acquisition, which is impossible under conventional scanning conditions with thick slices and gaps between slices (Fig. 2).
Fig. 2. Lateral view of the same case as in Fig. 1. By comparison with Fig. 1, we can firmly establish the relative location of the distal segment of the MPD and the small cyst, which cannot be seen in the frontal view owing to interference from the overlaid jejunum. The outlines of all objects are depicted without any gaps or step edges because of the very thin slices of the source data acquisition.
Panoramic viewpoints yield exact and continuous observation of the entire field of the pancreaticobiliary system (Fig. 3). Even a small rotation aids intuitive understanding; in a static MIP image one would not know which structure is in front and which is behind. Furthermore, the visual continuity throughout the pancreaticobiliary system reduces the radiologists' burden of film reading as compared with conventional single-shot MRCP images. All the radiologists and endoscopists who participated in this study rated the created movies as excellent in all aspects, except for one endoscopist, who rated recognition of the cystic duct as fair in the case shown in Figs. 3 and 4.
Fig. 3. Sequential stereoscopic view of a 46-year-old male. The left column shows rotation around the vertical axis and the right around the transverse axis. The relative locations of important landmarks of the pancreaticobiliary system, for example the cystic duct and the CBD, can be recognized decisively. In the stereoscopic view the perception is substantially improved.
(These images must be viewed with RED/GREEN glasses).
Fig. 4. Oblique view of the same case as in Fig. 3. The separate channels of the pancreaticobiliary junction are easily appreciated, and the minor duodenal papilla is located distally, as usual.
4
Discussion
The quality of the final 3D images, in any rendering method, greatly depends on the excellence of the original data (scanner strength and spatial resolution). In previous work on 3D MRCP, source data slice thicknesses of 2-5 mm did not allow the rendering of precise and realistic images [1-4]. In spite of the loss of image quality with decreasing slice thickness in MRI, the signal-to-noise ratio of hydrography is high enough to offer the potential of creating clear 3D images. The HR data itself comprises a large number of images, so that, if simply output on plain films, film reading becomes a tedious task for diagnostic radiologists. We are convinced that the presented post-processing is an adequate tool for routine use of HR-MRCP that is able to reduce the burden of film reading.
We also want to point out that the speed of post-processing is very high, which has a direct influence on availability for routine clinical use. Due to the high signal-to-noise ratio of the pancreaticobiliary system, that is to say the selectivity of the source data acquisition, we can expedite the segmentation, the first step of post-processing [5], because there is no need for the otherwise complicated procedure of defining target objects by specifying the lowest cut-off value on the gray scale. This data processing requires neither a complicated procedure nor special tools. Ease of handling may contribute to cost-effectiveness and may boost the number of examinations achievable with minimal manpower. With the omni-directional stereoscopic view, we are able to avoid false negative and/or false positive findings due to overlaid high signal intensities. The improved image quality based on HR data acquisition opens possibilities for use in various clinical situations, such as preoperative information and the early detection of neoplasms in the biliary and pancreatic region. Since thin-slice data acquisition requires a long scanning time, an excellent breath-triggering system on the MR scanner is mandatory for HR data acquisition [4]. Developments in modern scanners are solving these problems. We are convinced that omni-directional stereoscopic HR-MRCP has the possibility of being considered as a useful option prior to ERCP and of reducing the number of ERCP cases.
References

1. Arslan A. et al.: Pancreaticobiliary diseases. Comparison of 2D single-shot turbo spin-echo MR cholangiopancreatography with endoscopic retrograde cholangiopancreatography. Acta Radiologica 41:621-626, 2000
2. Watanabe Y. et al.: High-resolution MR cholangiopancreatography. Crit Rev Diagn Imaging 39:115-258, 1998
3. Holzknecht N. et al.: Breath-hold MR cholangiography with snapshot techniques: prospective comparison with endoscopic retrograde cholangiography. Radiology 206:657-664, 1998
4. Papanikolaou N. et al.: Magnetic resonance cholangiopancreatography: comparison between respiratory-triggered turbo spin echo and breath hold single-shot turbo spin echo sequences. Magn Reson Imaging 17:1255-1260, 1999
5. Schiemann T. et al.: Interactive 3D-segmentation. Visualization in Biomedical Computing 1808:376-383, 1992
6. Höhne KH. et al.: A realistic model of human structure from the visible human data. Method Inform Med 40:83-89, 2001
7. Pommert A. et al.: Creating a high-resolution spatial/symbolic model of the inner organs based on the Visible Human. Medical Image Analysis 5:221-228, 2001
8. Höhne KH. et al.: 3D visualization of tomographic volume data using the generalized voxel model. The Visual Computer 6:28-36, 1990
Author Index
Abe, K., II,679 Ablitt, N., I,612 Acquaroli, F., II,308 Adhami, L., I,272 Aharon, S., II,211 Alberola, C., II,397 Albert, M.S., I,785 Alderliesten, T., II,245 Aldridge, R., I,596 Althuser, M., I,138 Amrith, S., II,339 Anderson, J.H., II,339 Ang, C.-H., II,606 Arnold, D.L., I,363 Asami, K., I,248 Asano, T., II,133 Ashburner, J., II,590 Atalar, E., I,91 Aubert-Broche, B., I,500 Avants, B., II,381 Axel, L., I,620; I,706; I,753 Ayache, N., I,323; I,363; I,467; I,548; I,714; II,235 Ayoubi, J.-M., I,17; I,138 Bach Cuadra, M., I,290; I,380 Baert, S.A.M., I,404; II,101 Bajka, M., II,202 Banovac, F., I,200 Bansal, R., I,659 Bardinet, É., I,323; I,548 Barillot, C., II,590; II,655 Bauer, M., I,396; II,549 Baumgartner, W.A., I,155 Bello, F., II,219 Bemmel, C.M. van, II,36 Benali, H., I,500; II,509 Bergmann, H., II,44 Bergtholdt, M., I,572 Berkelman, P.J., I,17 Bethea, B.T., I,155 Bihan, D. Le, I,475 Birkfellner, W., II,44 Björnemo, M., I,435 Bloch, I., I,427
Borschneck, D.P., I,232 Boutault, F., II,315; II,323 Breeuwer, M., I,651 Bruin, P.W. de, II,348 Brun, A., I,435 Bruyns, C., I,282 Buckner, R.L., I,508 Bullitt, E., I,372 Burcher, M., I,208 Butz, T., I,290 Buvat, I., I,500 Byrne, J.V., II,501; II,517 Cachia, A., I,427 Cai, Y., II,339 Camp, J., I,264 Cardenas, V., I,492 Carroll, J.D., I,604 Cash, D.M., II,533 Castellano-Smith, A.D., I,307; II,565 Castellanos, A.E., I,66 Cathier, P., I,323 Chabanas, M., II,315 Chandrashekara, R., I,722 Chaney, E.L., I,347 Chen, J.Z., I,347 Chen, S.J., I,604 Chen, Y., II,615 Chihoub, A., II,193 Chinzei, K., I,114; I,216 Chiyoda, S., I,75 Chui, C.-K., II,339; II,606 Chung, A.C.S., II,525 Cinquin, P., I,17; I,138 Ciuciu, P., II,509 Claridge, E., I,730 Cleary, K., I,200 Cobb, J., I,256 Cocosco, C.A., I,516 Cointepas, Y., I,475 Colchester, A.C.F., I,524 Collie, D., I,524 Collins, D.L., I,363 Comaniciu, D., I,572 Corouge, I., II,590; II,655
Cosman Jr, E., II,573 Coste-Manière, È., I,25; I,272 Cotin, S., I,35 Cotton, S., I,730 Courrèges, F., I,138 Cox, T.C., II,501; II,517 D’Agostino, E., II,541 Daifu, S., I,173 Dale, A., I,508 Dario, P., II,170 Darzi, A., II,219 Davatzikos, C., II,452 Davies, B.L., I,256 Davis, B., I,264 Davis, T.E., II,671 Dawant, B.M., I,380 Dawson, S., I,35 Degenhard, A., I,307 Deguet, A., II,77 Delingette, H., I,714; II,235 Desai, J.P., I,66 Desbat, L., II,364 Devita, C., I,770 DiMaio, S.P., II,253 Dohi, T., I,83; I,99; I,107; I,122; I,192; I,224; II,85; II,164; II,227 Doignon, C., I,9 Dormont, D., I,323; I,548 Drangova, M., II,268 Dubb, A., II,381 Duncan, J.S., I,588; I,682 Egorova, S., I,785 Ehman, R.L., II,293 Eiho, S., II,477 Elkington, A., I,612 Ellis, R.E., I,232; I,331 Ersbøll, B., II,373 Ertl, T., I,411 Evans, A.C., I,516 Ewers, R., II,44 Fahlbusch, R., I,396 Fan, L., I,746 Fan, S., II,389 Fichtinger, G., I,91; II,77 Figl, M., II,44 Finnis, K.W., II,69 Fischl, B., I,508
Fisher, E., I,483 Flamm, S., I,690 Flandin, G., I,467; II,509 Fleute, M., II,364 Forest, C., I,714; II,235 Francis, S.J., I,363 Freire, L., II,663 Friston, K.J., II,590 Fujie, M.G., I,83; II,125 Fujimoto, K., II,133 Fukui, Y., I,44 Fukumoto, I., I,674 Funka-Lea, G., I,659 Furukawa, T., I,52 Furushiro, N., II,109 Galloway, R.L., II,533 Gangloff, J., I,9 Gao, J., I,612 Gee, J.C., I,698; I,762; I,770; II,381 Geiger, B., II,331 Gerig, G., I,372 Gering, D.T., I,388 Gerovichev, O., I,147 Gerritsen, F.A., I,651 Gibaud, B., I,500 Gilmore, R.L., II,436 Glossop, N., I,200 Gobbi, D.G., II,156 Goh, P.-S., II,339 Golland, P., I,355; I,508 Gomes, P., I,256 Gomez, J., I,380 Gott, V.L., I,155 Graves, E., I,739 Greiner, G., II,549 Grimson, W.E.L., I,355; I,388; I,508; I,564; II,525; II,573 Grossman, M., I,770 Grova, C., I,500 Guerraz, A., I,138 Guimond, A., I,564; I,785 Gur, R., II,381 Guttmann, C.R.G., I,785 Habets, D., II,268 Hagio, K., I,241; I,339; II,261 Hagmann, P., I,380 Hajnal, J.V., I,532 Haker, S., I,459; II,573
Hall, P., I,730 Hall, W.A., II,565 Han, L., I,208 Hanafusa, A., I,224; II,227 Hanel, R., II,44 Haque, H.A., II,52 Harders, M., II,20 Harris, S.J., I,256 Hartkens, T., I,532; II,565 Hasegawa, I., I,762 Hasegawa, J., II,631 Hasegawa, T., I,173 Hashimoto, R., I,44 Hashizume, M., II,148 Hasizume, M., I,122 Hastreiter, P., I,396; I,411; II,549 Hata, N., I,122; I,192; II,52; II,85; II,164 Hata, Y., II,28 Hatabu, H., I,762 Hato, T., I,52 Hattori, A., I,60; I,241; II,261; II,356 Hawkes, D.J., I,307; I,651; II,501; II,517; II,565 Hayashi, Y., II,631 Hayashibe, M., II,356 Hayes, C., I,307 Hellier, P., II,590 Higashikawa, H., I,173 Hilger, K.B., II,428; II,444 Hill, D.L.G., I,307; I,532; I,651; II,565 Hipwell, J.H., II,501; II,517 Hoffman, E.A., II,1; II,12 Höhne, K.H., II,186; II,598; II,623; II,679 Hojjat, A., I,524 Holdsworth, D.W., II,268 Hong, J.-S., I,122 Hori, M., II,493 Hose, D.R., I,307 Hu, T., I,66 Hu, Z., I,706 Hummel, J., II,44 Hutton, C., I,443 Ichikawa, H., I,182 Ida, N., II,477 Ikuta, K., I,163; I,173; I,182 Imamura, H., II,477 Imawaki, S., II,28 Inada, H., I,107; II,133 Inoue, K., II,477
Inubushi, T., II,52 Iseki, H., I,107; I,114 Iserhardt-Bauer, S., I,411 Ishida, J., II,679 Ishihara, K., I,130 Ishikawa, M., II,28 Isomura, T., I,224; II,227 Iwahara, M., II,85 Iyun, O., I,232 Jackson, A.D., II,557 Jakopec, M., I,256 Janke, A., I,363 Jannin, P., I,500 Jaume, S., I,451 Ji, H., II,573 Jinno, M., I,52 Johansen, P., II,557 Johkoh, T., II,493 Jolesz, F.A., I,315 Joshi, S., I,347 Joskowicz, L., II,60 Kakadiaris, I.A., I,690 Kakutani, H., I,1 Kan, K., I,83 Kanwisher, N., I,508 Kasai, K., I,564 Kataoka, H., I,216 Kaus, M.R., I,315 Kawakami, H., II,261 Kawata, Y., I,793 Kemkers, R., I,604 Khamene, A., II,116 Kherif, F., I,467; II,509 Kikinis, R., I,315; I,355; I,388; I,435; I,508; I,564; II,405; II,573; Killiany, R.J., I,785 Kim, D., I,99; I,107 Kimura, A., I,264 Kimura, E., I,130 Kishi, K., I,83 Kitajima, M., I,52; I,155 Kitamura, T., I,248 Kitaoka, H., II,1 Kitney, R.I., II,219 Kobashi, S., II,28 Kobatake, H., I,556 Kobayashi, E., I,99; I,107; II,133 Kobayashi, Y., I,75
Koikkalainen, J., I,540 Koizumi, N., I,60 Komeda, M., I,248 Komori, M., II,178 Kondo, K., II,28 Konings, M.K., II,245 Konishi, K., I,122; II,148 Koseki, Y., I,114 Kosugi, Y., II,93 Köstner, N., I,411 Koyama, T., I,248; II,125 Kraats, E.B. van der, I,404 Krieger, A., I,91 Krupa, A., I,9 Kudo, T., II,679 Kukuk, M., II,331 Kuroda, T., II,178 Kurumi, Y., II,52 Kusuma, I., II,339 Lake, D.S., II,293 Lamecker, H., II,421 Lange, T., II,421 Lapeer, R.J., I,596 Larsen, R., II,373; II,428; II,444 Laugesen, S., II,373 Lautrup, B., II,557 Lavallée, S., II,364 Leach, M.O., I,307 Leemput, K. van, I,372 Lehmann, G., II,268 Lemij, H., I,777 Lenglet, C., II,211 Leonard, C.M., II,436 Leroy, J., I,9 Létoublon, C., I,17 Levy, E., I,200 Li, Z., II,339 Liao, H., II,85 Likar, B., II,461 Lin, M.P., I,770 Lin, N., I,682 Lindisch, D., I,200 Liu, C., II,339 Liu, H., I,634; II,565 Livyatan, H., II,60 Loew, M.H., II,647 Lorenzo-Valdés, M., I,642; I,722 Lötjönen, J., I,540 Luboz, V., II,323
Ma, B., I,331 Macq, B., I,451 Maes, F., II,541 Malandain, G., I,467; I,548 Manduca, A., II,293 Mangin, J.-F., I,427; I,475; II,663 Marayong, P., I,147 Marcacci, M., II,170 Marecaux, C., II,315 Mareci, T., II,615 Marescaux, J., I,9 Markham, R., II,413 Martín, M., II,397 Martelli, S., II,276; II,308 Martin, A.J., II,565 Masamune, K., I,107; II,77 Masuda, K., I,130 Masumoto, J., II,493 Masutani, Y., II,109; II,300 Mathelin, M. de, I,9 Matsuda, T., II,178 Matsuhira, N., I,52 Matsuka, D., II,77 Matsumura, T., I,192 Maudsley, A., I,492 Maurer Jr., C.R., II,469; II,565 Mazzoni, M., II,170 McGraw, T., II,615 McLaughlin, R.A., I,419; II,517 McLeish, K., I,532 McLennan, G., II,1 Meeteren, P.M. van, II,348 Megali, G., II,170 Meier, D., I,483 Metaxas, D.N., I,620; I,706; I,753 Miga, M.I., II,533 Mirmehdi, M., II,413 Miyagawa, T., I,52 Miyakawa, K., I,556 Miyamoto, M., II,148 Miyata, N., I,107 Miyazaki, F., I,1 Mizuhara, K., I,216 Mochimaru, M., I,44 Mohamed, A., II,452 Mohiaddin, R., I,642; I,722 Momoi, Y., II,125 Moncrieff, M., I,730 Monden, M., I,1
Montillo, A., I,620 Moon, N., I,372 Morel, G., I,9 Mori, K., II,631 Morikawa, O., I,44 Morikawa, S., II,52; II,164 Morikawa, Y., I,52 Moriyama, N., I,793 Mourgues, F., I,25 Murakami, T., II,493 Musha, T., II,93 Muthupillai, R., I,690 Nagaoka, T., II,261 Nain, D., II,573 Naka, S., II,52 Nakajima, S., II,85 Nakajima, Y., II,125; II,148; II,485 Nakamoto, M., II,148 Nakamura, H., II,493 Nakamura, Y., I,75; II,356 Nakao, M., II,178 Nakazawa, K., I,52 Naraghi, R., I,396 Negoro, D., I,1 Nemoto, Y., I,83 Netsch, T., II,639 Neumann, P., I,35 Nielsen, C., II,373 Nielsen, M., II,557 Niessen, W.J., I,404; II,36; II,101; II,245 Niki, N., I,793 Nimsky, C., II,549 Nishii, T., I,339 Nishikawa, A., I,1 Nishimura, R., II,679 Nishizawa, K., I,83 Nissen, U., I,411 Noble, J.A., I,208; I,419; I,580; II,517 Noble, N.M.I., I,651 Noe, A., I,698 Norbash, A., II,525 Nowinski, W.L., II,339; II,606 Ntziachristos, V., I,739 Ochi, T., I,241; I,339; II,125; II,261; II,485 O’Donnell, L., I,459 Ohashi, K., I,192 Ohmatsu, H., I,793 Okada, M., I,75
Okamura, A.M., I,147; I,155; I,216 Oota, S., I,248 Osareh, A., II,413 Otake, Y., I,241 Ottensmeyer, M., I,35; I,282 Ourselin, S., I,548; II,140 Oyama, H., II,178 Ozawa, S., I,52 Palágyi, K., II,12 Paloc, C., II,219 Papadopoulos-Orfanos, D., I,427 Parain, K., I,548 Park, K., I,753 Park, Y., II,1 Parrent, A.G., II,69 Paulsen, R., II,373 Payan, Y., II,315; II,323 Pednekar, A., I,690 Pedrono, A., II,323 Pellisier, F., I,138 Pennec, X., I,467; I,714; II,140 Penney, G.P., II,501 Pernuš, F., II,461 Peters, T.M., II,69; II,156; II,268 Pinskerova, V., II,308 Pizer, S.M., I,347 Platel, B., I,290 Pohl, K.M., I,564 Poisson, G., I,138 Poline, J.-B., I,467; II,509 Pollo, C., I,380 Pommert, A., II,598; II,623 Post, F.H., II,348 Poupon, C., I,475 Prima, S., I,363 Qian, J.Z., I,746 Quist, M., II,639
Rangarajan, A., II,436 Rao, A., I,722 Rao, M., II,615 Rattner, D., I,35 Razavi, R., I,651 Reinhardt, J.M., II,1; II,12 Régis, J., I,427 Ripoll, J., I,739 Rivière, D., I,427 Robb, R., I,264
Roche, A., I,323 Rodriguez y Baena, F., I,256 Rohlfing, T., II,469 Roper, S.N., II,436 Rösch, P., II,639 Rovaris, E., II,581 Rudan, J.F., I,331 Rueckert, D., I,532; I,642; I,722 Ruiz-Alzola, J., II,581 Sadikot, A.F., II,69 Saito, T., II,109; II,679 Sakaguchi, G., I,248 Sakuma, I., I,99; I,107; I,192; II,85; II,109; II,125; II,133 Salcudean, S.E., II,253 Sanchez-Ortiz, G.I., I,642; I,722 Sasaki, K., I,163 Sasama, T., II,125 Sato, K., II,52 Sato, Y., II,125; II,148; II,485; II,493 Sauer, F., II,116 Schempershofe, M., I,411 Schmanske, B.M., II,647 Schnabel, J.A., I,307; I,651 Seebass, M., II,421 Sekiguchi, Y., I,224; II,227 Sekimoto, M., I,1 Sermesant, M., I,714 Seshan, V., II,52 Shenton, M.E., I,355; I,508; I,564 Shi, P., I,634 Shimada, M., II,148 Shimada, T., I,163 Shimizu, A., I,556 Sierra, R., II,202 Simon, O., II,509 Simone, C., I,216 Sing, N.W., II,389 Sinha, T.K., II,533 Siow, Y.-S., II,339 Smith, S.M., I,532 Solanas, E., I,290 Soler, L., I,9 Sonka, M., II,1; II,12 Soza, G., II,549 Spettol, A., II,308 Spiridon, M., I,508 Spreeuwers, L.J., II,36 Staib, L.H., I,588
Starreveld, Y.P., II,69 Stefanescu, R., II,140 Stegmann, M.B., II,444 Steiner, P., II,186 Sternberg-Gospos, N. von, II,186 Studholme, C., I,492 Sturm, B., I,483 Stylopoulos, N., I,35 Suárez, E., II,581 Sudo, K., I,83 Suenaga, Y., II,631 Suetens, P., II,541 Sugano, N., I,241; I,339; II,125; II,261; II,485 Sugimoto, N., II,477 Sumiyama, K., I,60 Sunaoshi, T., I,52 Sundaram, T., I,762 Susil, R.C., I,91 Suzuki, K., I,182 Suzuki, N., I,60; I,241; II,261; II,356 Suzuki, Y., I,130 Swider, P., II,323 Székely, G., II,20; II,202; II,284 Szczerba, D., II,284 Tajima, F., I,83 Tajiri, H., I,60 Takahashi, H., I,224; II,227 Takahashi, T., II,178 Takai, Y., II,133 Takakura, K., I,107 Takamoto, S., I,83 Takashina, M., I,339 Take, N., II,93 Takeda, H., I,83 Takeuchi, H., I,83 Takiguchi, S., I,1 Tamura, S., II,125; II,148; II,485; II,493 Tamura, Y., II,485 Tan, A.C., I,596 Tanacs, A., I,91 Tanaka, D., I,200 Tandé, D., I,548 Tani, T., II,52 Tannenbaum, A., I,667 Tanner, C., I,307 Tashiro, T., II,485 Tateishi, N., I,130 Taylor, R., II,77; II,452
Tek, H., I,572 Teo, J., II,339 Thiran, J.-Ph., I,290; I,380 Tholey, G., I,66 Thomas, B., II,413 Thorel, P., I,138 Tiede, U., II,186; II,623 Timoner, S.J., I,355 Tokuda, J., II,164 Tokuyasu, T., I,248 Tomandl, B., I,396; I,411 Tomaževič, D., II,461 Tondu, B., I,138 Tonet, O., II,170 Toriwaki, J., II,631 Troccaz, J., I,17; I,138 Truwit, C.L., II,565 Tsagaan, B., I,556 Tschirren, J., II,1; II,12 Tsuji, T., I,107 Turner, R., I,443 Uchiyama, A., I,60 Uematsu, H., I,762 Ueno, K., II,477 Umemura, S., I,83 Uno, H., I,44 Urayama, S., II,477 Vandermeulen, D., II,541 Vascellari, A., II,170 Vemuri, B.C., II,436; II,615; II,671 Vermeer, K., I,777 Viergever, M.A., II,36 Vilchis, A., I,138 Villemure, J.-G., I,380 Visani, A., II,276; II,308 Vita, E. De, I,443 Vogt, S., II,116 Vohra, N., II,436 Voon, L.K., II,389 Vos, F., I,777; II,348 Vossepoel, A.M., I,177; II,348 Wang, F., II,671
Wang, Z., II,606; II,615 Wanschitz, F., II,44 Warfield, S.K., I,298; I,315; I,451; I,564 Washio, T., I,114; I,216 Watabe, K., I,75 Watanabe, M., II,405 Watzinger, F., II,44 Weese, J., II,639 Wei, G.-Q., I,746 Weil, R.J., II,533 Weiner, M., I,492 Weishaupt, D., II,20 Weissleder, R., I,739 Wells III, W.M., I,298; I,315; I,355; I,564; II,525; II,573 Westin, C.-F., I,435; I,459; II,405; II,573; II,581 Whitcomb, L.L., I,91 Wie, Y., I,130 Wildermuth, S., II,20 Williams, J., I,572 Wink, O., I,604 Wrobel, M.C., II,428 Wyatt, P.P., I,580 Yahagi, N., I,107; I,192 Yamagishi, T., II,679 Yamamoto, K., I,163 Yamashita, J., I,44 Yamauchi, Y., I,44 Yang, G.-Z., I,612 Yang, J., I,588 Yaniv, Z., II,60 Yasuba, C., II,28 Yasui, M., I,1 Yelnik, J., I,548 Yezzi, A., I,667 Yokoyama, K., I,44 Yonenobu, K., I,241; II,125; II,261; II,485 Yoshikawa, H., I,339; II,261; II,485 Yu, W., I,682 Zavaletta, V., I,690 Zijdenbos, A.P., I,516 Zou, K.H., I,298; I,315