Geometric Computing
Eduardo Bayro-Corrochano
Geometric Computing For Wavelet Transforms, Robot Vision, Learning, Control and Action
Eduardo Bayro-Corrochano
CINVESTAV Unidad Guadalajara
Dept. Electrical Eng. & Computer Science
Av. Científica 1145, Colonia El Bajío
45010 Zapopan, JAL, México
[email protected] http://www.gdl.cinvestav.mx/edb
ISBN 978-1-84882-928-2
e-ISBN 978-1-84882-929-9
DOI 10.1007/978-1-84882-929-9
Springer London Dordrecht Heidelberg New York

British Library Cataloguing in Publication Data
A catalogue record for this book is available from the British Library

Library of Congress Control Number: 2010921295

© Springer-Verlag London Limited 2010

Apart from any fair dealing for the purposes of research or private study, or criticism or review, as permitted under the Copyright, Designs and Patents Act 1988, this publication may only be reproduced, stored or transmitted, in any form or by any means, with the prior permission in writing of the publishers, or in the case of reprographic reproduction in accordance with the terms of licenses issued by the Copyright Licensing Agency. Enquiries concerning reproduction outside those terms should be sent to the publishers.

The use of registered names, trademarks, etc., in this publication does not imply, even in the absence of a specific statement, that such names are exempt from the relevant laws and regulations and therefore free for general use. The publisher makes no representation, express or implied, with regard to the accuracy of the information contained in this book and cannot accept any legal responsibility or liability for any errors or omissions that may be made.

Cover design: KuenkelLopka GmbH
Printed on acid-free paper
Springer is part of Springer Science+Business Media (www.springer.com)
My Three Dedication Strophes I. To the social fighters Nelson Mandela who helped to eliminate the African apartheid and Evo Morales who worked to eliminate the Andean Indian apartheid. Ama Sua, Ama Qhella, Ama Llulla, Ama Llunk’a. II. To all scientists who don’t work for the development of weapons and technology destined to occupy and dominate countries; for all who work for education, health, water, alternative energy, preservation of the environment and the welfare of the poor people. III. To my beloved wife Joanna Jablonska and two sets of my adored children: Esteban, Fabio, Vinzenz and Silvana; and Nikolai, Claudio and Gladys.
Foreword
Geometric algebra (GA) is a powerful new mathematical system for computational geometry. Although its origins can be traced back to Hermann Grassmann (1844), its development as a language for space–time geometry with applications to all of physics did not begin until 1966. Suddenly, in the year 2000 it was recognized that a specialized version called conformal geometric algebra (CGA) was ideally suited for computational Euclidean geometry. CGA has the great advantage that geometric primitives (point, line, plane, circle, sphere) can be directly represented, compared, and manipulated without coordinates, so there is an immediate correspondence between algebraic objects and geometric figures. Moreover, CGA enhances and smoothly integrates the classical methods of projective, affine, and metric geometry with the more specialized methods of quaternions, screw theory, and rigid body mechanics. Applications to computer science and engineering have accumulated rapidly in the last few years. This book assembles diverse aspects of geometric algebra in one place to serve as a general reference for applications to robotics. Then, it demonstrates the power and efficiency of the system with specific applications to a host of problems ranging from computer vision to mechanical control. Perceptive readers will recognize many places where the treatment can be extended or improved. Thus, this book is a work in progress, and its higher purpose will be served if it stimulates further research and development.

Physics & Astronomy Department
Arizona State University
September 2009
David Hestenes
Preface
This book presents the theory and applications of an advanced mathematical language called geometric algebra that greatly helps to express ideas and concepts, and to develop algorithms, in the broad domain of robot physics. In the history of science, without essential mathematical concepts, theories would not have been developed at all. We can observe that in various periods of the history of mathematics and physics a certain stagnation occurred; from time to time, thanks to new mathematical developments, astonishing progress took place. In addition, we see that knowledge became unavoidably fragmented as researchers attempted to combine different mathematical systems. Each mathematical system captures some part of geometry; together, however, these systems constitute a highly redundant whole due to an unnecessary multiplicity of representations for geometric concepts. The author hopes that, thanks to persistent efforts to bring geometric algebra to the community as a meta-language for geometric reasoning, tremendous progress in robotics will take place in the near future. What is geometric algebra? Why are its applications so promising? Why should researchers, practitioners, and students make the effort to understand geometric algebra and use it? We want to answer all these questions and convince the reader that becoming acquainted with geometric algebra for applications is a worthy undertaking. The history of geometric algebra is unusual and quite surprising. In the 1870s, William Kingdon Clifford introduced his geometric algebra, building on the earlier works of Sir William Rowan Hamilton and Hermann Günther Grassmann. In Clifford's work, we perceive that he intended to describe the geometric properties of vectors, planes, and higher-dimensional objects. Most physicists encounter the algebra in the guise of the Pauli and Dirac matrix algebras of quantum theory.
Many roboticists and computer graphics engineers use quaternions for 3D rotation estimation and interpolation, because formulating homogeneous transformations of higher-order geometric entities with a pointwise approach is too difficult. They often resort to tensor calculus for multivariable calculus. Since robotics and engineering make use of the developments of mathematical physics, many beliefs are automatically inherited; for instance, some physicists come away from a study of Dirac theory with the view that Clifford's algebra is inherently quantum-mechanical. The goal of this book is to eliminate these kinds of beliefs by giving a clear introduction to geometric algebra and by showing how this new and promising mathematical framework extends
to multivectors and geometric multiplication in higher dimensions. In this new geometric language, most of the standard material taught to roboticists and computer science engineers can be advantageously reformulated without redundancies and in a highly condensed fashion. Geometric algebra allows us to generalize and transfer concepts and techniques to a wide range of domains with little extra conceptual work. Leibniz dreamed of a geometric calculus system that deals directly with geometric objects rather than with sequences of numbers. As the dimension of the geometric space increases and the transformation group is generalized, keeping operations invariant with respect to a reference frame becomes more and more difficult. Leibniz's invariance dream is fulfilled for the nD classical geometries using the coordinate-free framework of geometric algebra. The aim of this book is precise and well planned. It is not merely an exposé of mathematical theory; rather, the author introduces the theory and new techniques of geometric algebra by showing their applications in diverse domains ranging from neural computing and robotics to medical image processing.
Acknowledgments

Eduardo José Bayro Corrochano would like to thank the Center for Research and Advanced Studies (CINVESTAV, Guadalajara, Mexico) and the Consejo Nacional de Ciencia y Tecnología (SEP-CONACYT, Mexico) for their support of this book. I am also very grateful to my former Ph.D. students Julio Zamora-Esquivel, Nancy Arana-Daniel, Jorge Rivera Rovelo, Leo Reyes Hendrick, Luis Eduardo Falcón, Carlos López-Franco, and Rubén Machucho Cadena for fruitful discussions and technical cooperation. Their creative suggestions, criticism, and patient research work were decisive for the completion of this book. In the geometric algebra community, I am first of all indebted to David Hestenes for all his amazing work in developing the modern subject of geometric algebra and for his constant encouragement to tackle problems in robot physics. I am also very thankful to Garret Sobczyk, Eckhard Hitzer, Dietmar Hildenbrand, and Joan Lasenby for their support and constructive suggestions. Finally, I am very thankful to the people of Mexico, who pay my salary and thus made it possible for me to accomplish this contribution to scientific knowledge.

CINVESTAV, Guadalajara, Mexico
27 August 2009
Eduardo Bayro Corrochano
How to Use This Book

This section begins by briefly describing the organization and content of the chapters and their interdependence. Then it explains how readers can use the book for self-study or for delivering a regular graduate course.
Chapter Organization

– Part I: Fundamentals of Geometric Algebra. Chapter 1 gives an outline of geometric algebra. After preliminary definitions, we discuss how to handle linear algebra and simplexes, and multivector calculus is briefly illustrated with the Maxwell and Dirac equations. In Chap. 2, we explain the computational advantages of geometric algebra for modeling and solving problems in robotics, computer vision, artificial intelligence, neural computing, and medical image processing.
– Part II: Euclidean, Pseudo-Euclidean, Lie and Incidence Algebras and Conformal Geometries. Chapter 3 begins by explaining the geometric algebra models in 2D, 3D, and 4D. Chapter 4 presents the kinematics of points, lines, and planes using 3D geometric algebra and motor algebra. Chapter 5 examines Lie group theory, Lie algebra, and the algebra of incidence using the universal geometric algebra generated by reciprocal null cones. Chapter 6 is devoted to conformal geometric algebra, explaining the representation of geometric objects and versors. Different geometric configurations are studied, including simplexes, flats, plunges, and carriers. We briefly discuss the promising use of ruled surfaces. Chapter 7 discusses the main issues in implementing computer programs for geometric algebra.
– Part III: Geometric Computing for Image Processing, Computer Vision, and Neurocomputing. Chapter 8 presents a complete study of the standard and new Clifford wavelet and Fourier transforms. Chapter 9 uses geometric algebra techniques to formulate the n-view geometry of computer vision and the formation of 3D projective invariants for both points and lines in multiple images. We extend these concepts to omnidirectional vision using stereographic mapping onto the unit sphere. In Chap. 10, we present the geometric multilayer perceptrons and Clifford support vector machines for classification, regression, and recurrence.
– Part IV: Geometric Computing of Robot Kinematics and Dynamics.
Chapter 11 presents a study of the kinematics of robot mechanisms using a language based on points, lines, planes, and spheres. In Chap. 12, the dynamics of robot manipulators is treated, simplifying the representation of the tensors of the Euler–Lagrange equations. The power of geometric algebra over matrix algebra and tensor calculus is confirmed with these works.
– Part V: Applications I: Image Processing, Computer Vision, and Neurocomputing. Chapter 13 shows applications of Lie operators for key point detection, the quaternion Fourier transform for speech recognition, and the quaternion wavelet transform for optical flow estimation. In Chap. 14, we use projective invariants for 3D shape and motion reconstruction, and for robot navigation using n-view cameras and omnidirectional vision. Chapter 15 uses tensor voting and geometric algebra to estimate nonrigid motion. Chapter 16 presents experiments using real data for robot object recognition, interpolation, and the implementation of a Clifford SVM recurrent system. Chapter 17 shows the use of a geometric self-organizing neural net to segment 2D contours and 3D shapes.
– Part VI: Applications II: Robotics and Medical Robotics. Chapter 18 is devoted to line motion estimation using SVD and extended Kalman filter techniques. Chapter 19 presents a tracker endoscope calibration and the calibration of sensors with respect to a robot frame. Here, we use purely a language of lines and motors. Chapter 20 illustrates visual-guided grasping tasks using representations and geometric constraints developed and found using conformal geometric algebra. Chapter 21 describes 3D map reconstruction and relocalization using conformal geometric entities exploiting the Hough space. Chapter 22 presents the application of marching spheres for 3D medical shape representation and registration.
– Part VII: Appendix. Chapter 23 includes an outline of Clifford algebra. The reader can find concepts and definitions related to classic Clifford algebra and related algebras: Gibbs' vector algebra, exterior algebras, and Grassmann–Cayley algebras.

Fig. 0.1 Chapter interdependence: fundamentals → theory of applications → applications → appendix
Interdependence of the Book Chapters

The interdependence of the book chapters is shown in Fig. 0.1. Essentially, there are four groups of chapters:
– Fundamentals of geometric algebra: Chaps. 1, 3, and 4. Chapter 6 is optional.
– Theory of the application areas using the geometric algebra framework:
  – Chapter 8: Clifford–Fourier and wavelet transforms
  – Chapter 9: Computer vision
  – Chapter 10: Geometric neural computing
  – Chapters 11 and 12: Kinematics and dynamics
– Applications using real data from cameras, lasers, omnidirectional vision, robot manipulators, and robots:
  – Chapter 13: Applications of Clifford–Fourier and wavelet transforms
  – Chapters 14 and 15: Applications in computer vision
  – Chapters 18, 19, and 21: Robot vision
– Complementary material:
  – Chapter 2: Modeling robot physics
  – Chapter 7: Programming issues
  – Appendix (Chap. 23): Clifford algebras and related algebras: Gibbs' vector algebra, exterior algebras, and Grassmann–Cayley algebras

The reader should start with the fundamentals of geometric algebra and then select a domain of application, read its theory, and gain more insight by studying the related applications. For the more mathematically inclined reader, Chaps. 5 and 23 are recommended. A reader who wants to know more about some definitions and concepts of classical Clifford algebra can consult Chap. 23 as well. Chapters 2 and 7 provide very useful material for gaining an overall view of the advantages of the geometric algebra framework and for learning about the key programming issues and our suggestions.
Audience

Let us start with a famous saying of Clifford:

… for geometry, you know, is the gate of science, and the gate is so low and small that one can only enter it as a little child.
William Kingdon Clifford (1845–1879)
The book is aimed at the graduate level; thus, a basic knowledge of linear algebra and vector calculus is needed. A mind free of biases or prejudices caused by old-fashioned mathematics like matrix algebra and vector calculus is an even better prerequisite for reading this book. Ideas and concepts in robot physics can be advantageously expressed using an appropriate language of mathematics. Of course, this language is not unique. There are many different algebraic systems, each with its own advantages and disadvantages. Furthermore, treating a problem by combining too many different mathematical systems fragments the knowledge. In this book, we present geometric algebra, which we believe is the most powerful mathematical system available to date. It is a unifying language for treating robot physics; as a result, the knowledge is not fragmented, and we can produce compact, less redundant mathematical expressions that lend themselves to optimization for real-time applications.
One should simply start to formulate problems using multivectors and versors and forget the formalism of matrices. However, bear in mind that when developing algorithms for particular problems, it is often a matter of integrating techniques. For example, one will model the problem with geometric algebra to find geometric constraints and optimize the resulting code for real-time execution; but one may also resort to computations from other algorithms that were not necessarily developed in the geometric algebra framework, for example, Cholesky decomposition or image preprocessing using a morphological filter. No prior knowledge of computer vision, robotics, wavelet theory, neural computing, or graphics engineering is required. If you are acquainted with the topics of these fields, even better! The book is well suited for a taught course or for self-study at the postgraduate level. For the more mathematically inclined reader, Chaps. 5 and 23 are complementary. Readers wanting to know more about some definitions and concepts of classical Clifford algebra should consult Chap. 23.
Exercises

The book offers 158 exercises, so that the reader can gradually learn to formulate equations and compute by hand or using a software package. The exercises were formulated to teach the fundamentals of geometric algebra and also to stimulate the development of readers' skills in creative and efficient geometric computing. Chapter 1 includes 38 exercises on basic geometric algebra computations, to become familiar with the generalized inner product and the wedge product of multivectors, along with some exercises in differential multivector calculus. Chapter 3 offers 29 exercises on 2D, 3D, and 4D geometric algebras, particularly to get acquainted with Lie and bivector algebras, power series involving the exterior exponent, and rotor and motor operations for practical use. Chapter 4 has 20 exercises related to the 2D and 3D kinematics of points, lines, and planes using 3D and 4D geometric algebras. It includes exercises on bivector relations, the motion of geometric entities, velocity using motors, and flags for describing configurations of geometric entities attached to an object frame. Chapter 5 presents 13 exercises related to Lie group theory, Lie algebra, and the algebra of incidence using the reciprocal null cones of the universal geometric algebra, directed distances, and the application of flags in the affine plane. Chapter 6 provides 42 exercises, consisting of basic computations with versor techniques and geometric entities such as points, lines, planes, circles, spheres, flats, plunges, and carriers, as well as computations in the dual space. There are also exercises to prove classical theorems in conformal geometric algebra. Chapter 9 has 14 exercises related to topics in projective geometry, the algebra of incidence, projective invariants, and the proof of classical theorems, as well as computations involving the tensors of n uncalibrated cameras.
The computations have to be done using the geometric algebra G3,1 with the Minkowski metric and the 3D Euclidean geometric algebra G3 for the projective space and the projective plane, respectively.
Use of Computer Geometric Algebra Programs

In recent years, we have been using different software packages to check equations, prove theorems, and develop algorithms for applications. We recommend using CLICAL for checking equations and proving theorems. CLICAL was developed by Pertti Lounesto's group and is available at http://users.tkk.fi/ppuska/mirror/Lounesto/CLICAL.htm. For proving equations using symbolic programming, we recommend the Maple-based package CLIFFORD by Rafal Ablamowicz, which supports N ≤ 9 and is available at http://math.tntech.edu/rafal. For practicing and learning geometric algebra computing, and to try a variety of problems in computer science and graphics, we suggest using the C++-based CLUCalc by Christian Perwass, available at http://www.perwass.de/cbup/clu.html, and GAIGEN2 from Leo Dorst's group, which generates fast C++ or Java sources for low-dimensional geometric algebras and is available at http://www.science.uva.nl/ga/gaigen/. A very powerful multivector software package for applications in computer science and physics is the C++ MV 1.3.0 to 1.6 sources of Ian Bell, supporting N ≤ 63, available at http://www.iancgbell.clara.net/maths/index.htm. The reader can also download our C++ and MATLAB programs, which are routinely used for applications in robotics, image processing, wavelet transforms, computer vision, neural computing, and medical robotics. These are available at http://www.gdl.cinvestav.mx/edb/GAprogramming.
Use as a Textbook

In the last decade, the author has taught a postgraduate course on geometric algebra and its applications at various universities: at the Institut für Informatik und Angewandte Mathematik der Christian-Albrechts-Universität (WS 1998), Kiel, Germany; at the CINVESTAV Department of Electrical Engineering and Computer Science (annually since 2001) in Guadalajara, Mexico; and at the Informatik Institut der Karlsruhe Universität (SS 2008, 2 SWS, LV.-No. 24631), in Karlsruhe, Germany. The author has been involved with his collaborators in developing concepts and computer algorithms for a variety of application areas. At the same time, he has established a course on geometric algebra and its applications that has proven to be accessible and very useful for beginners. In general, the author follows a sequence of topics based on the material in this book. From the very beginning, students use software packages to try what is taught in the classroom.
– Chapter 1 gives an introduction to geometric algebra.
– Chapter 3 begins by explaining the geometric algebra models for 2D, 3D, and 4D. This allows the student to learn to represent linear transformations as rotors; to represent points, lines, and planes; and to apply rotors and translators to these geometric entities. The duality principle is explained.
– Chapter 4 presents the kinematics of points, lines, and planes using 3D geometric algebra and motor algebra. The big step here is the use of homogeneous transformations to linearize the rigid motion transformation. Once the student has understood motor algebra, he or she becomes proficient in modeling and applies the language of points, lines, and planes to tackle various problems in robotics, computer vision, and mechanical engineering.
– After these experiences, the student should move on to treating projective geometry in the 4D geometric algebra G3,1 (the Minkowski metric). The role of the projective split is explained. If the student has attended a lecture on computer vision where n-view geometry is taught [84], it is a huge advantage for better understanding the use of geometric algebra for projective geometry.
– Chapter 6 is devoted to conformal geometric algebra, where the experiences gained with motor algebra and with G3,1 for projective geometry are key to understanding homogeneous representations, null vectors, and the use of the null cone in the conformal geometric algebra framework. The role of the conformal additive and multiplicative splits is explained. The representation of geometric objects and versors is also shown. Different geometric configurations are studied, including simplexes, flats, plunges, and carriers. We briefly discuss the promising use of ruled surfaces and the role of conformal mapping in computer vision.

In each theme, the lecturer illustrates the theory with real applications in the domains of computer vision, robotics, neural computing, medical robotics, and graphics engineering. For this we use the chapters in Part III (complementary theory for applications) and those of Parts IV and V for illustrations of real applications. The analysis of the application examples is good practice for students, as they will go through well-designed training illustrations.
In this way, they will gain more insight into the potential of geometric algebra, learn how it has to be applied, and finally devise algorithms that are directly applicable in their research and engineering projects. Lecturers around the world who want to hold a course on geometric algebra and its applications can use this book; a PowerPoint presentation for lecturers is available for free use at http://www.gdl.cinvestav.mx/edb/GAlecture.
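As a first hands-on exercise with the rotor material of Chap. 3, students can verify the sandwich product numerically before switching to one of the geometric algebra packages listed above. The sketch below is illustrative only and is not part of the book's own software: it applies a 3D rotor through its quaternion representation, and the function names and conventions (angle in radians, right-hand rule about the given axis) are our assumptions.

```python
import math

def rotor_from_axis_angle(axis, angle):
    """Rotor R = cos(angle/2) - sin(angle/2) B, stored as a quaternion
    (w, x, y, z); the unit bivector B is dual to the rotation axis."""
    ax, ay, az = axis
    n = math.sqrt(ax * ax + ay * ay + az * az)  # normalize the axis
    s = math.sin(angle / 2.0) / n
    return (math.cos(angle / 2.0), ax * s, ay * s, az * s)

def apply_rotor(r, v):
    """Rotate vector v with the sandwich product R v R~, expanded here
    as the quaternion product r * (0, v) * conj(r)."""
    w, x, y, z = r
    vx, vy, vz = v
    # q = r * (0, v)
    qw = -x * vx - y * vy - z * vz
    qx = w * vx + y * vz - z * vy
    qy = w * vy + z * vx - x * vz
    qz = w * vz + x * vy - y * vx
    # vector part of q * conj(r); the scalar part vanishes
    return (-qw * x + w * qx - qy * z + qz * y,
            -qw * y + w * qy - qz * x + qx * z,
            -qw * z + w * qz - qx * y + qy * x)

if __name__ == "__main__":
    # Rotating e1 by 90 degrees about e3 gives e2 (up to rounding).
    r = rotor_from_axis_angle((0.0, 0.0, 1.0), math.pi / 2)
    print(apply_rotor(r, (1.0, 0.0, 0.0)))
```

Working through such a tiny example by hand, and then checking it in CLUCalc or any of the other packages, makes the coordinate-free rotor formalism of Chap. 3 concrete before the student moves on to motors and conformal entities.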
Contents
Foreword . . . vii
Preface . . . ix

Part I  Fundamentals of Geometric Algebra

1 Introduction to Geometric Algebra . . . 3
  1.1 History of Geometric Algebra . . . 3
  1.2 What Is Geometric Algebra? . . . 5
    1.2.1 Basic Definitions . . . 5
    1.2.2 Nonorthonormal Frames and Reciprocal Frames . . . 7
    1.2.3 Some Useful Formulas . . . 8
    1.2.4 Multivector Products . . . 8
    1.2.5 Further Properties of the Geometric Product . . . 11
    1.2.6 Dual Blades and Duality in the Geometric Product . . . 19
    1.2.7 Multivector Operations . . . 20
  1.3 Linear Algebra . . . 22
  1.4 Simplexes . . . 23
  1.5 Geometric Calculus . . . 25
    1.5.1 Multivector-Valued Functions and the Inner Product . . . 25
    1.5.2 The Multivector Integral . . . 26
    1.5.3 The Vector Derivative . . . 27
    1.5.4 Grad, Div, and Curl . . . 27
    1.5.5 Multivector Fields . . . 28
    1.5.6 Convolution and Correlation of Scalar Fields . . . 29
    1.5.7 Clifford Convolution and Correlation . . . 29
    1.5.8 Linear Algebra Derivations . . . 30
    1.5.9 Reciprocal Frames with Curvilinear Coordinates . . . 31
    1.5.10 Geometric Calculus in 2D . . . 31
    1.5.11 Electromagnetism: The Maxwell Equations . . . 32
    1.5.12 Spinors, Schrödinger–Pauli, and Dirac Equations . . . 35
    1.5.13 Spinor Operators . . . 37
  1.6 Exercises . . . 39
2 Geometric Algebra for Modeling in Robot Physics . . . 45
  2.1 The Roots of Geometry and Algebra . . . 45
  2.2 Geometric Algebra: A Unified Mathematical Language . . . 47
  2.3 What Does Geometric Algebra Offer for Geometric Computing? . . . 48
    2.3.1 Coordinate-Free Mathematical System . . . 48
    2.3.2 Models for Euclidean and Pseudo-Euclidean Geometry . . . 49
    2.3.3 Subspaces as Computing Elements . . . 50
    2.3.4 Representation of Orthogonal Transformations . . . 51
    2.3.5 Objects and Operators . . . 51
    2.3.6 Extension of Linear Transformations . . . 52
    2.3.7 Signals and Wavelets in the Geometric Algebra Framework . . . 53
    2.3.8 Kinematics and Dynamics . . . 53
  2.4 Solving Problems in Perception and Action Systems . . . 54
Part II  Euclidean, Pseudo-Euclidean, Lie and Incidence Algebras, and Conformal Geometries

3 2D, 3D, and 4D Geometric Algebras . . . 63
  3.1 Complex, Double, and Dual Numbers . . . 63
  3.2 2D Geometric Algebras of the Plane . . . 64
  3.3 3D Geometric Algebra for the Euclidean 3D Space . . . 66
    3.3.1 The Algebra of Rotors . . . 67
    3.3.2 Orthogonal Rotors . . . 70
    3.3.3 Recovering a Rotor . . . 71
  3.4 Quaternion Algebra . . . 72
  3.5 Lie Algebras and Bivector Algebras . . . 74
    3.5.1 Lie Group of Rotors . . . 75
    3.5.2 Bivector Lie Algebra . . . 76
    3.5.3 Complex Structures and Unitary Groups . . . 77
    3.5.4 Hermitian Inner Product and Unitary Groups . . . 78
  3.6 4D Geometric Algebra for 3D Kinematics . . . 80
    3.6.1 Motor Algebra . . . 80
    3.6.2 Motors, Rotors, and Translators in G+3,0,1 . . . 82
    3.6.3 Properties of Motors . . . 85
  3.7 4D Geometric Algebra for Projective 3D Space . . . 87
  3.8 Conclusion . . . 88
  3.9 Exercises . . . 88
4
Kinematics of the 2D and 3D Spaces . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . 93 4.1 Introduction . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . 93 4.2 Representation of Points, Lines, and Planes Using 3D Geometric Algebra . . . . . . . . . . . . . . . . . . . . . . . . . . . . .. . . . . . . . . . . 93
Contents
xix
   4.3  Representation of Points, Lines, and Planes Using Motor Algebra ..... 95
   4.4  Representation of Points, Lines, and Planes Using 4D Geometric Algebra ..... 96
   4.5  Motion of Points, Lines, and Planes in 3D Geometric Algebra ..... 97
   4.6  Motion of Points, Lines, and Planes Using Motor Algebra ..... 98
   4.7  Motion of Points, Lines, and Planes Using 4D Geometric Algebra ..... 100
   4.8  Spatial Velocity of Points, Lines, and Planes ..... 101
        4.8.1  Rigid-Body Spatial Velocity Using Matrices ..... 101
        4.8.2  Angular Velocity Using Rotors ..... 105
        4.8.3  Rigid-Body Spatial Velocity Using Motor Algebra ..... 108
        4.8.4  Point, Line, and Plane Spatial Velocities Using Motor Algebra ..... 109
   4.9  Incidence Relations Between Points, Lines, and Planes ..... 110
        4.9.1  Flags of Points, Lines, and Planes ..... 111
   4.10 Conclusion ..... 112
   4.11 Exercises ..... 112

5  Lie Algebras and the Algebra of Incidence Using the Null Cone and Affine Plane ..... 117
   5.1  Introduction ..... 117
   5.2  Geometric Algebra of Reciprocal Null Cones ..... 118
        5.2.1  Reciprocal Null Cones ..... 118
        5.2.2  The Universal Geometric Algebra Gn,n ..... 119
        5.2.3  The Lie Algebra of Null Spaces ..... 120
        5.2.4  The Standard Bases of Gn,n ..... 122
        5.2.5  Representations and Operations Using Bivector Matrices ..... 123
        5.2.6  Bivector Representation of Linear Operators ..... 124
   5.3  Horosphere and n-Dimensional Affine Plane ..... 125
   5.4  The General Linear Group ..... 127
        5.4.1  The General Linear Algebra gl(N) of the General Linear Lie Group GL(N) ..... 129
        5.4.2  The Orthogonal Groups ..... 130
   5.5  Computing Rigid Motion in the Affine Plane ..... 133
   5.6  The Lie Algebra of the Affine Plane ..... 134
   5.7  The Algebra of Incidence ..... 138
        5.7.1  Incidence Relations in the Affine n-Plane ..... 140
        5.7.2  Directed Distances ..... 141
        5.7.3  Incidence Relations in the Affine 3-Plane ..... 142
        5.7.4  Geometric Constraints as Flags ..... 144
   5.8  Conclusion ..... 144
   5.9  Exercises ..... 145
6  Conformal Geometric Algebra ..... 149
   6.1  Introduction ..... 149
        6.1.1  Conformal Split ..... 150
        6.1.2  Conformal Splits for Points and Simplexes ..... 151
        6.1.3  Euclidean and Conformal Spaces ..... 152
        6.1.4  Stereographic Projection ..... 155
        6.1.5  Inner- and Outer-Product Null Spaces ..... 157
        6.1.6  Spheres and Planes ..... 158
        6.1.7  Geometric Identities, Meet and Join Operations, Duals, and Flats ..... 160
        6.1.8  Meet, Pair of Points, and Plunge ..... 166
        6.1.9  Simplexes and Spheres ..... 168
   6.2  The 3D Affine Plane ..... 169
        6.2.1  Lines and Planes ..... 170
        6.2.2  Directed Distance ..... 171
   6.3  The Lie Algebra ..... 172
   6.4  Conformal Transformations ..... 172
        6.4.1  Inversion ..... 173
        6.4.2  Reflection ..... 175
        6.4.3  Translation ..... 176
        6.4.4  Transversion ..... 176
        6.4.5  Rotation ..... 177
        6.4.6  Rigid Motion Using Flags ..... 177
        6.4.7  Dilation ..... 179
        6.4.8  Involution ..... 179
        6.4.9  Conformal Transformation ..... 180
   6.5  Ruled Surfaces ..... 180
        6.5.1  Cone and Conics ..... 180
        6.5.2  Cycloidal Curves ..... 181
        6.5.3  Helicoid ..... 182
        6.5.4  Sphere and Cone ..... 182
        6.5.5  Hyperboloid, Ellipsoids, and Conoid ..... 183
   6.6  Exercises ..... 183
7  Programming Issues ..... 189
   7.1  Main Issues for an Efficient Implementation ..... 189
        7.1.1  Specific Aspects for the Implementation ..... 190
   7.2  Implementation Practicalities ..... 191
        7.2.1  Specification of the Geometric Algebra, Gp,q ..... 191
        7.2.2  The General Multivector Class ..... 191
        7.2.3  Optimization of Multivector Functions ..... 192
        7.2.4  Factorization ..... 193
        7.2.5  Speeding Up Geometric Algebra Expressions ..... 194
        7.2.6  Multivector Software Packets ..... 195
Part III  Geometric Computing for Image Processing, Computer Vision, and Neurocomputing

8  Clifford–Fourier and Wavelet Transforms ..... 201
   8.1  Introduction ..... 201
   8.2  Image Analysis in the Frequency Domain ..... 201
        8.2.1  The One-Dimensional Fourier Transform ..... 202
        8.2.2  The Two-Dimensional Fourier Transform ..... 203
        8.2.3  Quaternionic Fourier Transform ..... 203
        8.2.4  2D Analytic Signals ..... 205
        8.2.5  Properties of the QFT ..... 208
        8.2.6  Discrete QFT ..... 211
   8.3  Image Analysis Using the Phase Concept ..... 213
        8.3.1  2D Gabor Filters ..... 213
        8.3.2  The Phase Concept ..... 214
   8.4  Clifford–Fourier Transforms ..... 214
        8.4.1  Tri-Dimensional Clifford–Fourier Transform ..... 217
        8.4.2  Space and Time Geometric Algebra Fourier Transform ..... 218
        8.4.3  n-Dimensional Clifford–Fourier Transform ..... 219
   8.5  From Real to Clifford Wavelet Transforms for Multiresolution Analysis ..... 219
        8.5.1  Real Wavelet Transform ..... 220
        8.5.2  Discrete Wavelets ..... 220
        8.5.3  Wavelet Pyramid ..... 223
        8.5.4  Complex Wavelet Transform ..... 223
        8.5.5  Quaternion Wavelet Transform ..... 225
        8.5.6  Quaternionic Wavelet Pyramid ..... 229
        8.5.7  The Tridimensional Clifford Wavelet Transform ..... 232
        8.5.8  The Continuous Conformal Geometric Algebra Wavelet Transform ..... 234
        8.5.9  The n-Dimensional Clifford Wavelet Transform ..... 235
   8.6  Conclusion ..... 236
9  Geometric Algebra of Computer Vision ..... 237
   9.1  Introduction ..... 237
   9.2  The Geometric Algebras of 3D and 4D Spaces ..... 237
        9.2.1  3D Space and the 2D Image Plane ..... 238
        9.2.2  The Geometric Algebra of 3D Euclidean Space ..... 240
        9.2.3  A 4D Geometric Algebra for Projective Space ..... 240
        9.2.4  Projective Transformations ..... 241
        9.2.5  The Projective Split ..... 242
   9.3  The Algebra of Incidence ..... 244
        9.3.1  The Bracket ..... 245
        9.3.2  The Duality Principle and Meet and Join Operations ..... 246
   9.4  Algebra in Projective Space ..... 247
        9.4.1  Intersection of a Line and a Plane ..... 248
        9.4.2  Intersection of Two Planes ..... 249
        9.4.3  Intersection of Two Lines ..... 250
        9.4.4  Implementation of the Algebra ..... 250
   9.5  Projective Invariants ..... 251
        9.5.1  The 1D Cross-Ratio ..... 251
        9.5.2  2D Generalization of the Cross-Ratio ..... 253
        9.5.3  3D Generalization of the Cross-Ratio ..... 254
   9.6  Visual Geometry of n-Uncalibrated Cameras ..... 255
        9.6.1  Geometry of One View ..... 255
        9.6.2  Geometry of Two Views ..... 259
        9.6.3  Geometry of Three Views ..... 261
        9.6.4  Geometry of n-Views ..... 263
   9.7  Omnidirectional Vision ..... 264
        9.7.1  Omnidirectional Vision and Geometric Algebra ..... 265
        9.7.2  Point Projection ..... 266
        9.7.3  Inverse Point Projection ..... 267
   9.8  Invariants in the Conformal Space ..... 268
        9.8.1  Invariants and Omnidirectional Vision ..... 269
        9.8.2  Projective and Permutation p²-Invariants ..... 271
   9.9  Conclusion ..... 273
   9.10 Exercises ..... 273
10  Geometric Neuralcomputing ..... 277
    10.1  Introduction ..... 277
    10.2  Real-Valued Neural Networks ..... 278
    10.3  Complex MLP and Quaternionic MLP ..... 279
    10.4  Geometric Algebra Neural Networks ..... 280
          10.4.1  The Activation Function ..... 280
          10.4.2  The Geometric Neuron ..... 281
          10.4.3  Feedforward Geometric Neural Networks ..... 283
          10.4.4  Generalized Geometric Neural Networks ..... 284
          10.4.5  The Learning Rule ..... 285
          10.4.6  Multidimensional Back-Propagation Training Rule ..... 285
          10.4.7  Simplification of the Learning Rule Using the Density Theorem ..... 286
          10.4.8  Learning Using the Appropriate Geometric Algebras ..... 287
    10.5  Support Vector Machines in Geometric Algebra ..... 288
    10.6  Linear Clifford Support Vector Machines for Classification ..... 288
    10.7  Nonlinear Clifford Support Vector Machines for Classification ..... 292
    10.8  Clifford SVM for Regression ..... 294
    10.9  Conclusion ..... 296
Part IV  Geometric Computing of Robot Kinematics and Dynamics
11  Kinematics ..... 299
    11.1  Introduction ..... 299
    11.2  Elementary Transformations of Robot Manipulators ..... 299
          11.2.1  The Denavit–Hartenberg Parameterization ..... 300
          11.2.2  Representations of Prismatic and Revolute Transformations ..... 301
          11.2.3  Grasping by Using Constraint Equations ..... 304
    11.3  Direct Kinematics of Robot Manipulators ..... 306
          11.3.1  MAPLE Program for Motor Algebra Computations ..... 307
    11.4  Inverse Kinematics of Robot Manipulators Using Motor Algebra ..... 308
          11.4.1  The Rendezvous Method ..... 309
          11.4.2  Computing θ1, θ2, and d3 Using a Point ..... 309
          11.4.3  Computing θ4 and θ5 Using a Line ..... 312
          11.4.4  Computing θ6 Using a Plane Representation ..... 314
    11.5  Inverse Kinematics Using the 3D Affine Plane ..... 315
    11.6  Inverse Kinematics Using Conformal Geometric Algebra ..... 318
    11.7  Conclusion ..... 322

12  Dynamics ..... 325
    12.1  Introduction ..... 325
    12.2  Differential Kinematics ..... 325
    12.3  Dynamics ..... 328
          12.3.1  Kinetic Energy ..... 328
          12.3.2  Potential Energy ..... 335
          12.3.3  Lagrange's Equations ..... 335
    12.4  Complexity Analysis ..... 343
          12.4.1  Computing M ..... 343
          12.4.2  Computing G ..... 343
    12.5  Conclusion ..... 344

Part V  Applications I: Image Processing, Computer Vision, and Neurocomputing

13  Applications of Lie Filters, and Quaternion Fourier and Wavelet Transforms ..... 347
    13.1  Lie Filters in the Affine Plane ..... 347
          13.1.1  The Design of an Image Filter ..... 348
          13.1.2  Recognition of Hand Gestures ..... 349
    13.2  Representation of Speech as 2D Signals ..... 350
    13.3  Preprocessing of Speech 2D Representations Using the QFT and Quaternionic Gabor Filter ..... 352
          13.3.1  Method 1 ..... 352
          13.3.2  Method 2 ..... 354
    13.4  Recognition of French Phonemes Using Neurocomputing ..... 355
    13.5  Application of QWT ..... 357
          13.5.1  Estimation of the Quaternionic Phase ..... 358
          13.5.2  Confidence Interval ..... 359
          13.5.3  Discussion on Similarity Distance and the Phase Concept ..... 360
          13.5.4  Optical Flow Estimation ..... 361
    13.6  Conclusion ..... 365

14  Invariants Theory in Computer Vision and Omnidirectional Vision ..... 367
    14.1  Introduction ..... 367
    14.2  Conics and Pascal's Theorem ..... 368
    14.3  Computing Intrinsic Camera Parameters ..... 371
    14.4  Projective Invariants ..... 372
          14.4.1  The 1D Cross-Ratio ..... 373
          14.4.2  2D Generalization of the Cross-Ratio ..... 374
          14.4.3  3D Generalization of the Cross-Ratio ..... 376
          14.4.4  Generation of 3D Projective Invariants ..... 377
    14.5  3D Projective Invariants from Multiple Views ..... 381
          14.5.1  Projective Invariants Using Two Views ..... 381
          14.5.2  Projective Invariant of Points Using Three Uncalibrated Cameras ..... 383
          14.5.3  Comparison of the Projective Invariants ..... 385
    14.6  Visually Guided Grasping ..... 387
          14.6.1  Parallel Orienting ..... 387
          14.6.2  Centering ..... 389
          14.6.3  Grasping ..... 389
          14.6.4  Holding the Object ..... 390
    14.7  Camera Self-Localization ..... 390
    14.8  Projective Depth ..... 391
    14.9  Shape and Motion ..... 393
          14.9.1  The Join-Image ..... 394
          14.9.2  The SVD Method ..... 395
          14.9.3  Completion of the 3D Shape Using Invariants ..... 396
    14.10 Omnidirectional Vision Landmark Identification Using Projective Invariants ..... 398
          14.10.1  Learning Phase ..... 398
          14.10.2  Recognition Phase ..... 399
          14.10.3  Omnidirectional Vision and Invariants for Robot Navigation ..... 400
          14.10.4  Learning Phase ..... 401
          14.10.5  Recognition Phase ..... 401
          14.10.6  Quantitative Results ..... 402
    14.11 Conclusions ..... 403
15  Registration of 3D Points Using GA and Tensor Voting ..... 405
    15.1  Problem Formulation ..... 405
          15.1.1  The Geometric Constraint ..... 406
    15.2  Tensor Voting ..... 409
          15.2.1  Tensor Representation in 3D ..... 409
          15.2.2  Voting Fields in 3D ..... 410
          15.2.3  Detection of 3D Surfaces ..... 414
          15.2.4  Estimation of 3D Correspondences ..... 415
    15.3  Experimental Analysis ..... 417
          15.3.1  Correspondences Between 3D Points by Rigid Motion ..... 417
          15.3.2  Multiple Overlapping Motions and Nonrigid Motion ..... 419
          15.3.3  Extension to Nonrigid Motion ..... 420
    15.4  Conclusions ..... 422

16  Applications in Neuralcomputing ..... 425
    16.1  Experiments Using Geometric Feedforward Neural Networks ..... 425
          16.1.1  Learning a High Nonlinear Mapping ..... 425
          16.1.2  Encoder–Decoder Problem ..... 426
          16.1.3  Prediction ..... 428
    16.2  Experiments Using Clifford Support Vector Machines ..... 429
          16.2.1  3D Spiral: Nonlinear Classification Problem ..... 430
          16.2.2  Object Recognition ..... 432
          16.2.3  Multi-Case Interpolation ..... 440
    16.3  Conclusion ..... 442

17  Neural Computing for 2D Contour and 3D Surface Reconstruction ..... 443
    17.1  Determining the Shape of an Object ..... 443
          17.1.1  Automatic Sample Selection Using GGVF ..... 444
          17.1.2  Learning the Shape Using Versors ..... 446
    17.2  Experiments ..... 448
    17.3  Conclusion ..... 455

Part VI  Applications II: Robotics and Medical Robotics

18  Rigid Motion Estimation Using Line Observations ..... 459
    18.1  Introduction ..... 459
    18.2  Batch Estimation Using SVD Techniques ..... 459
          18.2.1  Solving AX = XB Using Motor Algebra ..... 461
          18.2.2  Estimation of the Hand–Eye Motor Using SVD ..... 464
    18.3  Experimental Results ..... 466
    18.4  Discussion ..... 470
    18.5  Recursive Estimation Using Kalman Filter Techniques ..... 470
          18.5.1  The Kalman Filter ..... 470
          18.5.2  The Extended Kalman Filter ..... 472
          18.5.3  The Rotor-Extended Kalman Filter ..... 474
    18.6  The Motor-Extended Kalman Filter ..... 477
          18.6.1  Representation of the Line Motion Model in Linear Algebra ..... 478
          18.6.2  Linearization of the Measurement Model ..... 480
          18.6.3  Enforcing a Geometric Constraint ..... 481
          18.6.4  Operation of the MEKF Algorithm ..... 483
          18.6.5  Estimation of the Relative Positioning of a Robot End-Effector ..... 486
    18.7  Conclusion ..... 490

19  Tracker Endoscope Calibration and Body-Sensors' Calibration ..... 491
    19.1  Camera Device Calibration ..... 491
          19.1.1  Rigid Body Motion in CGA ..... 491
          19.1.2  Hand–Eye Calibration in CGA ..... 493
          19.1.3  Tracker Endoscope Calibration ..... 494
    19.2  Body-Sensor Calibration ..... 497
          19.2.1  Body–Eye Calibration ..... 498
          19.2.2  Algorithm Simplification ..... 501
    19.3  Conclusions ..... 503

20  Tracking, Grasping, and Object Manipulation ..... 505
    20.1  Tracking ..... 505
          20.1.1  Exact Linearization via Feedback ..... 506
          20.1.2  Visual Jacobian ..... 508
          20.1.3  Exact Linearization via Feedback ..... 509
          20.1.4  Experimental Results ..... 510
    20.2  Barrett Hand Direct Kinematics ..... 512
    20.3  Pose Estimation ..... 514
          20.3.1  Segmentation ..... 515
          20.3.2  Object Projection ..... 516
    20.4  Grasping Objects ..... 518
          20.4.1  First Style of Grasping ..... 519
          20.4.2  Second Style of Grasping ..... 521
          20.4.3  Third Style of Grasping ..... 521
    20.5  Target Pose ..... 522
          20.5.1  Object Pose ..... 524
    20.6  Visually Guided Grasping ..... 524
          20.6.1  Results ..... 525
    20.7  Fuzzy Logic and Conformal Geometric Algebra for Grasping ..... 525
          20.7.1  Mamdani Fuzzy System ..... 526
          20.7.2  Direct Kinematics of the Barrett Hand ..... 527
          20.7.3  Fuzzy Grasping of Objects ..... 528
    20.8  Conclusion ..... 531
21 3D Maps, Navigation, and Relocalization ... 533
  21.1 Map Building ... 533
    21.1.1 Matching Laser Readings ... 533
    21.1.2 Map Building ... 536
    21.1.3 Line Map ... 536
    21.1.4 3D Map Building ... 538
  21.2 Navigation ... 540
    21.2.1 Localization ... 540
    21.2.2 Adding Objects to the 3D Map ... 540
    21.2.3 Path Following ... 541
  21.3 3D Map Building Using Laser and Stereo Vision ... 545
    21.3.1 Laser Rangefinder ... 548
    21.3.2 Stereo Camera System with Pan-Tilt Unit ... 550
  21.4 Relocation Using Lines and the Hough Transform ... 551
  21.5 Experiments ... 554
  21.6 Conclusions ... 555
22 Modeling and Registration of Medical Data ... 557
  22.1 Background ... 557
    22.1.1 Union of Spheres ... 557
    22.1.2 The Marching Cubes Algorithm ... 558
  22.2 Segmentation ... 559
  22.3 Marching Spheres ... 563
    22.3.1 Experimental Results for Modeling ... 564
  22.4 Registration of Two Models ... 567
    22.4.1 Sphere Matching ... 567
    22.4.2 Experimental Results for Registration ... 570
  22.5 Conclusions ... 572
Part VII Appendix
23 Clifford Algebras and Related Algebras ... 575
  23.1 Clifford Algebras ... 575
    23.1.1 Basic Properties ... 575
    23.1.2 Definitions and Existence ... 576
    23.1.3 Real and Complex Clifford Algebras ... 577
    23.1.4 Involutions ... 579
    23.1.5 Structure and Classification of Clifford Algebras ... 579
    23.1.6 Clifford Groups, Pin and Spin Groups, and Spinors ... 581
  23.2 Related Algebras ... 584
    23.2.1 Gibbs' Vector Algebra ... 584
    23.2.2 Exterior Algebras ... 586
    23.2.3 Grassmann–Cayley Algebras ... 590
24 Notation ... 595
25 Useful Formulas for Geometric Algebra ... 597
References ... 603
Index ... 613
Part I
Fundamentals of Geometric Algebra
Chapter 1
Introduction to Geometric Algebra
This chapter gives a detailed outline of geometric algebra and explains the related traditional algebras in common use by mathematicians, physicists, computer scientists, and engineers.
1.1 History of Geometric Algebra Historically, Clifford algebra in its geometric interpretation has constituted the general framework for the embedding and development of ideas of multilinear algebra, multivariable analysis, and the representation theory of Lie groups and Lie algebras. This trend toward geometric algebra started in 300 B.C. with the synthetic geometry of Euclid and has continued to evolve into the present. The analytic geometry of Descartes (1637), the complex algebra of Wessel and Gauss (1798), Hamilton algebra (1843), matrix algebra (Cayley 1854), exterior algebra (Grassmann 1844), Clifford algebra (1878), the tensor algebra of Ricci (1890), the differential forms of Cartan (1923), and the spin algebra of Pauli and Dirac (1928) have all contributed to a maturing geometric algebra framework. Geometric algebra offers a multivector concept for representation and a geometric product for multivector computation, which allow for a versatile higher-order representation and computation in domains of different dimensions and metric. Complex numbers, quaternions, and dual quaternions can all be represented in both rotor and motor bivector algebras. Moreover, double, or hyperbolic, numbers can also be found in geometric algebras of positive signature. Local analysis at tangent spaces, which requires differential operations to enhance the geometric symmetries of invariants, has been done successfully using Lie algebra and Lie theory. Since the Lie algebras are isomorphic with bivector algebras, such differential operations can be advantageously implemented for complex computations of differential geometry, as in the recognition of higher-order symmetries. Projective geometry and multilinear algebra, too, are elegantly reconciled in Clifford algebra, providing the resulting algebra of incidence with the duality principle, inner product, and outer morphisms. In the middle of the nineteenth century, J. 
Liouville proved, for the three-dimensional case, that any conformal mapping on the whole of R^n can be expressed
as a composition of inversions in spheres and reflections in hyperplanes [142]. In particular, the rotation, translation, dilation, and inversion mappings can all be obtained by composing these two mappings. In conformal geometric algebra, these concepts are simplified because, due to the isomorphism between the conformal group on R^n and the Lorentz group on R^{n+1,1}, we can easily express a nonlinear conformal transformation as a linear Lorentz transformation and then use the versor representation to reduce the composition of transformations to multiplication of vectors [119]. Thus, with conformal geometric algebra, it is computationally more efficient and simpler to interpret the geometry of the conformal mappings than with matrix algebra. Conformal geometric algebra is the fusion of Clifford geometric algebra and non-Euclidean hyperbolic geometry. One of the results of non-Euclidean geometry, demonstrated by Nikolai Lobachevsky in the nineteenth century, is that in spaces with hyperbolic structure we can find subsets that are isomorphic to a Euclidean space. In order to do this, Lobachevsky introduced two constraints on what is now called the conformal point, x_c ∈ R^{n+1,1}. The first constraint is the homogeneous representation, normalizing the vector x_c so that x_c · e_∞ = −1, and the second constraint is that the vector must be a null vector, that is, x_c² = 0. Conformal geometric algebra offers enormous advantages by computing in the null-cone space. The space R^n is extended with two null vectors (the origin and the point at infinity), which are used to represent the basic computational unit, the sphere. In conformal geometric algebra, one represents points, lines, planes, and volumes using unit spheres. Conformal transformations are represented using combinations of versors that are generated via reflections with respect to spheres. Their homogeneous representation allows projective computations; via the inner product one recovers the Euclidean metric.
As a result, conformal geometric algebra constitutes a suitable computational framework for dealing with a variety of conformal and projective problems involving time dependence, as in causality and space and time analysis. One of the criticisms of conformal geometric algebra is the wrong idea that this system can manipulate only basic entities (points, lines, planes, and spheres), and therefore it will not be useful to model general two- and three-dimensional objects, curves, surfaces, or any other nonlinear entity required to solve a problem of a perception action system in robotics and computer vision. Surprisingly, ruled surfaces can be treated in conformal geometric algebra for promising applications like the motion guidance of very nonlinear curves, reaching tasks and 3D object manipulation on very nonlinear surfaces. Our initial attempts to use geometric algebra in robot vision have been successful, reinforcing our opinion that there is no need to abandon this framework in order to carry out different kinds of computations. For all of these reasons, we believe that the single unifying language of geometric algebra offers the strongest potential for building perception and action systems; it allows us to better understand links between different fields, incorporate techniques from one field into another, reformulate old procedures, and find extensions by widening their sphere of applicability. Finally, geometric algebra helps to reduce the complexity of algebraic expressions, and, as a result, improves algorithms both in speed and accuracy.
1.2 What Is Geometric Algebra?

The algebras of Clifford and Grassmann are well known to pure mathematicians, but from the beginning they were abandoned by physicists in favor of the vector algebra of Gibbs, the algebra commonly used today in most areas of physics. Geometric algebra is a coordinate-free approach to geometry based on the algebras of Grassmann [74] and Clifford [42]. The geometric approach to Clifford algebra, adopted in this book, was pioneered in the 1960s by David Hestenes [86], who has since worked on developing his version of Clifford algebra – which will be referred to as geometric algebra in this volume – into a unifying language for mathematics and physics [87, 94]. Hestenes also presented a study of projective geometry using Clifford algebra [95] and, recently, the essential concepts of conformal geometric algebra [119]. The introductory sections of this chapter present the basic definitions of geometric algebra. In Appendix A (Chap. 23), the reader can find an outline of classical Clifford algebras, with explanations of their related subjects and many complementary mathematical definitions, which will help the more mathematically inclined reader gain a broad view of the roles of geometric algebra and Clifford algebra. A reader with little mathematical background can also consult this appendix to learn more about some concepts and related definitions. In this mathematical system, we denote scalars with lowercase letters and matrices with uppercase letters, and we use bold lowercase both for vectors in three dimensions and for the bivector parts of spinors. Spinors and dual quaternions in four dimensions are denoted by bold uppercase letters.
1.2.1 Basic Definitions

Let V^n be a vector space of dimension n. We are going to define and generate an algebra G_n, called a geometric algebra. Let {e_1, e_2, …, e_n} be a set of basis vectors of V^n. Scalar multiplication and addition in G_n are defined in the usual way of a vector space. The product, or geometric product, of elements of the basis of G_n will simply be denoted by juxtaposition. In this way, from any two basis vectors, e_j and e_k, a new element of the algebra is obtained, denoted e_j e_k ≡ e_{jk}. The product of basis vectors is anticommutative:

  e_j e_k = −e_k e_j ,  ∀ j ≠ k.  (1.1)

The basis vectors must square to +1, −1, or 0; this means that there are nonnegative integers p, q, and r such that n = p + q + r and

  e_i e_i = e_i² = +1  for i = 1, …, p,
  e_i e_i = e_i² = −1  for i = p+1, …, p+q,
  e_i e_i = e_i² = 0   for i = p+q+1, …, n.  (1.2)
This product will be called the geometric product of G_n. With these operations, G_n is an associative linear algebra with identity and is called the geometric algebra or Clifford algebra of dimension n = p + q + r, generated by the vector space V^n. It is usual to write G_n for the geometric algebra spanned by the real vector space V^n. The elements of this geometric algebra are called multivectors, because they are entities generated by sums of elements of mixed grade of the basis set of G_n, such as

  A = ⟨A⟩_0 + ⟨A⟩_1 + ⋯ + ⟨A⟩_n ,  (1.3)

where the multivector A ∈ G_n is expressed as the sum of its 0-vector part (scalar) ⟨A⟩_0, its 1-vector part (vector) ⟨A⟩_1, its 2-vector part (bivector) ⟨A⟩_2, its 3-vector part (trivector) ⟨A⟩_3, …, and its n-vector part ⟨A⟩_n. We call the geometric product of r linearly independent vectors an r-blade, or a blade of grade r. Because the addition of k-vectors (homogeneous vectors of grade k) is closed and the multiplication of a k-vector by a scalar is another k-vector, the set of all k-vectors is a vector space, denoted ⋀^k V^n. Each of these spaces is spanned by (n choose k) k-vectors, where (n choose k) = n!/((n−k)! k!). Thus, our geometric algebra G_n, which is spanned by Σ_{k=0}^{n} (n choose k) = 2^n elements, is the direct sum of its homogeneous subspaces of grades 0, 1, 2, …, n, that is,

  G_n = ⋀^0 V^n ⊕ ⋀^1 V^n ⊕ ⋀^2 V^n ⊕ ⋯ ⊕ ⋀^k V^n ⊕ ⋯ ⊕ ⋀^n V^n ,  (1.4)

where ⋀^0 V^n = R is the set of real numbers and ⋀^1 V^n = V^n. Thus, any multivector of G_n can be expressed in terms of the bases of these subspaces. For instance, in the geometric algebra G_3 with multivector basis

  {1, e_1, e_2, e_3, e_12, e_31, e_23, I_3 = e_123},  (1.5)

a typical multivector u will be of the form

  u = α_0 + α_1 e_1 + α_2 e_2 + α_3 e_3 + α_4 e_12 + α_5 e_31 + α_6 e_23 + α_7 I_3 ,  (1.6)

where the α_i's are real numbers. When a subalgebra is generated only by multivectors of even grade, it is called the even subalgebra of G_n and is denoted G_n^+. Note, however, that the set of odd multivectors is not a subalgebra of G_n. In G_{p,q,r}, we have the pseudoscalar I_n ≡ e_{12⋯n}, but we can also define a pseudoscalar for each subset of vectors that square to a positive number, a negative number, and zero, that is, I_p ≡ e_{12⋯p}, I_q ≡ e_{(p+1)(p+2)⋯(p+q)}, and I_r ≡ e_{(p+q+1)(p+q+2)⋯(p+q+r)}, where n = p + q + r. We say that G_{p,q,r} is degenerate if r ≠ 0.
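Because the whole algebra is generated by the anticommutation rule (1.1) and the signature (1.2), a toy implementation is easy to write. The following Python sketch (an illustration, not part of the text) represents a multivector of G_3 as a dictionary mapping basis blades — sorted index tuples such as (1, 2) for e_12 — to real coefficients, and computes the geometric product by reordering indices with sign bookkeeping, assuming the Euclidean signature e_i² = +1:

```python
from itertools import product

def blade_mul(a, b):
    """Multiply two basis blades of G_3, given as tuples of indices.
    Returns (sign, blade), using e_j e_k = -e_k e_j for j != k and e_i^2 = +1."""
    idx = list(a) + list(b)
    sign = 1
    # bubble sort; each transposition of distinct indices flips the sign
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign
    # equal adjacent indices cancel: e_i e_i = +1 (Euclidean signature)
    out = []
    for k in idx:
        if out and out[-1] == k:
            out.pop()
        else:
            out.append(k)
    return sign, tuple(out)

def gp(A, B):
    """Geometric product of multivectors given as {blade: coefficient} dicts."""
    C = {}
    for (ba, ca), (bb, cb) in product(A.items(), B.items()):
        s, blade = blade_mul(ba, bb)
        C[blade] = C.get(blade, 0) + s * ca * cb
    return {k: v for k, v in C.items() if v != 0}
```

For instance, gp({(1,): 1}, {(2,): 1}) yields {(1, 2): 1}, reflecting e_1 e_2 = e_12, and gp({(1, 2): 1}, {(1, 2): 1}) yields {(): -1}, reflecting e_12² = −1.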
1.2.2 Nonorthonormal Frames and Reciprocal Frames

Suppose we have a set of n linearly independent, nonorthonormal vectors {e_k}. Then any vector a can be expressed uniquely in terms of this frame using standard tensor notation:

  a = a^k e_k .  (1.7)

How do we find the components a^k in the {e_k} frame? To answer this question, we need a second set of vectors {e^k}, called the reciprocal frame, which is related to the initial set as follows:

  e^i · e_j = δ^i_j ,  (1.8)

where the upper and lower indices provide a useful way to relate both frames in terms of their components. By using the reciprocal vectors, we can straightforwardly compute the components of a as follows:

  e^k · a = e^k · (a^j e_j) = (e^k · e_j) a^j = δ^k_j a^j = a^k .  (1.9)

In order to construct the reciprocal frame, we note that e^1 must be orthogonal to each of {e_2, …, e_n}; thus, e^1 lies completely outside the hyperplane e_2 ∧ e_3 ∧ ⋯ ∧ e_n. The vector perpendicular to this hyperplane is found simply by dualizing the hyperplane via the pseudoscalar:

  e^1 = α e_2 ∧ e_3 ∧ ⋯ ∧ e_n I ,  (1.10)

where α is some constant, which can be fixed by applying the inner product with e_1 to this expression from the left:

  1 = e_1 · e^1 = α e_1 · (e_2 ∧ e_3 ∧ ⋯ ∧ e_n I).  (1.11)

If we define the volume element E_n = e_1 ∧ e_2 ∧ e_3 ∧ ⋯ ∧ e_n, then α E_n I = 1, so that α = I^{−1} E_n^{−1}; thus, the last equation simplifies to

  e^1 = e_2 ∧ e_3 ∧ ⋯ ∧ e_n E_n^{−1} .  (1.12)

This result can be extended to the following useful formula:

  e^k = (−1)^{k+1} e_1 ∧ ⋯ ∧ ě_k ∧ ⋯ ∧ e_n E_n^{−1} ,  (1.13)

where the check denotes that the e_k term is missing from the product.
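Numerically, the reciprocal frame can be obtained without blades at all: condition (1.8) says that the matrix whose rows are the reciprocal vectors e^k is the inverse transpose of the matrix whose rows are the frame vectors e_k. A short NumPy sketch (an illustration rather than the book's construction; the frame below is an arbitrary choice):

```python
import numpy as np

# a nonorthonormal frame {e_k} of R^3, stored as the rows of E
E = np.array([[1.0, 0.0, 0.0],
              [1.0, 1.0, 0.0],
              [1.0, 1.0, 1.0]])

# the reciprocal frame {e^k} must satisfy e^i . e_j = delta_ij (Eq. 1.8),
# i.e., R @ E.T = I in matrix form, so R = inv(E.T)
R = np.linalg.inv(E.T)

a = np.array([2.0, -1.0, 3.0])
coeffs = R @ a          # components a^k = e^k . a, as in Eq. (1.9)
recon = coeffs @ E      # a = a^k e_k, as in Eq. (1.7)
```

Here `recon` reproduces `a` exactly, confirming the expansion (1.7) in this frame.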
1.2.3 Some Useful Formulas

Using the identity

  x = x^k e_k = (x · e^k) e_k = (x · e_k) e^k ,  (1.14)

or

  e_k e^k · x = 1 x ,  (1.15)

one can derive a series of useful formulas. Let us first consider

  e_k e^k · (x ∧ y) = e_k (e^k · x y − e^k · y x) = x y − y x = 2 x ∧ y.  (1.16)

This result can then be extended inductively for an m-grade multivector X_m to yield

  e_k e^k · X_m = m X_m .  (1.17)

Consider now the following:

  e_k e^k = (e_k · e_j e^j) e^k = e_j · e_k e^j e^k ,  (1.18)

but e_j · e_k is symmetric in j, k; thus, one can pick up only the symmetric component of e^j e^k. Since this is also a scalar, one gets only a scalar contribution to the sum:

  e_k e^k = e_k · e^k = n ,  (1.19)

where n is the dimension of the involved space. Let us now consider

  e_k (e^k ∧ X_m) = e_k (e^k X_m − e^k · X_m) = (n − m) X_m .  (1.20)

Combining these results, we get an expression for the geometric product:

  e_k X_m e^k = (−1)^m e_k (e^k ∧ X_m − e^k · X_m)
             = (−1)^m ((n − m) X_m − m X_m)
             = (−1)^m (n − 2m) X_m .  (1.21)
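These identities are easy to check mechanically. The Python sketch below (illustrative only; it contains its own toy blade product for G_3, with all e_i² = +1) verifies Eq. 1.21 for X_m = e_12, where m = 2 and n = 3, and where the orthonormal basis is its own reciprocal frame, e^k = e_k:

```python
# toy geometric product on basis blades of G_3 (all e_i^2 = +1);
# blades are sorted index tuples, multivectors are {blade: coeff} dicts
def blade_mul(a, b):
    idx, sign = list(a) + list(b), 1
    for i in range(len(idx)):
        for j in range(len(idx) - 1 - i):
            if idx[j] > idx[j + 1]:
                idx[j], idx[j + 1] = idx[j + 1], idx[j]
                sign = -sign            # each transposition flips the sign
    out = []
    for k in idx:                       # e_i e_i = +1 cancels in pairs
        if out and out[-1] == k:
            out.pop()
        else:
            out.append(k)
    return sign, tuple(out)

def gp(A, B):
    C = {}
    for ba, ca in A.items():
        for bb, cb in B.items():
            s, blade = blade_mul(ba, bb)
            C[blade] = C.get(blade, 0) + s * ca * cb
    return {k: v for k, v in C.items() if v}

def madd(A, B):
    C = dict(A)
    for k, v in B.items():
        C[k] = C.get(k, 0) + v
    return {k: v for k, v in C.items() if v}

# Eq. (1.21): sum_k e_k X_m e^k = (-1)^m (n - 2m) X_m = (3 - 4) e_12 = -e_12
total = {}
for k in (1, 2, 3):
    ek = {(k,): 1}
    total = madd(total, gp(gp(ek, {(1, 2): 1}), ek))
```

Running this leaves `total` equal to the blade −e_12, matching (−1)²(3 − 2·2) e_12.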
1.2.4 Multivector Products

It will be convenient to define other products between the elements of this algebra, which will allow us to set up several geometric relations (unions, intersections, projections, etc.) between different geometric entities (points, lines, planes, spheres, etc.) in a very simple way.
First, we define the inner product, a · b, and the exterior or wedge product, a ∧ b, of any two 1-vectors a, b ∈ G_n, as the symmetric and antisymmetric parts of the geometric product ab, respectively. That is, from the expression

  ab = ½(ab + ba) + ½(ab − ba),  (1.22)

whose first term is the symmetric part and whose second term is the antisymmetric part, we define the inner product

  a · b ≡ ½(ab + ba)  (1.23)

and the outer or wedge product

  a ∧ b ≡ ½(ab − ba).  (1.24)

Thus, we can now express the geometric product of two vectors in terms of these two new operations as

  ab = a · b + a ∧ b.  (1.25)

From (1.23) and (1.24), a · b = b · a and a ∧ b = −b ∧ a. The inner product of the vectors a and b corresponds to the standard scalar or dot product and produces a scalar. In Sect. 1.2.5 below, we will discuss in detail other complementary definitions of the inner product, called contractions, introduced by Lounesto [125]. The outer or wedge product of these vectors is a new quantity, which we call a bivector. We can think of a bivector as an oriented area in the plane containing a and b, formed by sweeping a along b; see Fig. 1.1a. Thus, b ∧ a will have the opposite orientation, because the wedge product is anticommutative. The outer product is immediately generalizable to higher dimensions: for example, (a ∧ b) ∧ c, a trivector, is interpreted as the oriented volume formed by sweeping the area a ∧ b along the vector c; see Fig. 1.1b. Note that the outer product does not determine the spatial forms of the bivector and trivector: it computes only their
Fig. 1.1 (a) The directed area, or bivector, B = a ∧ b; (b) the oriented volume, or trivector, T = a ∧ b ∧ c; (c) the multivector basis of the Euclidean 3D space
orientations and the values of their involved determinants, corresponding to the area and volume, respectively. Figure 1.1a, b depicts such forms simply to aid geometric understanding. Figure 1.1c depicts the multivector basis of the Euclidean 3D space. The symmetric part of the geometric product of two vectors a and b is called the anticommutator product, and the antisymmetric part the commutator product. These expressions are denoted as follows:

  a ×̄ b = ½(ab + ba)  anticommutator product,  (1.26)
  a × b = ½(ab − ba)  commutator product.  (1.27)

In general, the geometric product of two multivectors A, B ∈ G_n can be written as the sum of their anticommutator and commutator:

  AB = ½(AB + BA) + ½(AB − BA) = A ×̄ B + A × B.  (1.28)

In the literature, the commutator of two multivectors A, B ∈ G_n is usually written [A, B] and the anticommutator {A, B}, although we prefer to emphasize their role within the geometric product by using the operator symbols × and ×̄. The geometric product of a vector and a bivector reads

  a(b ∧ c) = ½ a(bc − cb)
           = (a · b)c − (a · c)b − ½(bac − cab)
           = 2(a · b)c − 2(a · c)b + ½(bc − cb)a
           = 2(a · b)c − 2(a · c)b + (b ∧ c)a.  (1.29)

From this result, one gets an extension of the definition of the inner product:

  a · (b ∧ c) = ½[a(b ∧ c) − (b ∧ c)a] = (a · b)c − (a · c)b.  (1.30)

The remaining part of the product corresponds to a trivector:

  a ∧ (b ∧ c) = ½[a(b ∧ c) + (b ∧ c)a] = a ∧ b ∧ c.  (1.31)

So the geometric product of a vector and a bivector is

  a(b ∧ c) = a · (b ∧ c) + a ∧ (b ∧ c) = (a · b)c − (a · c)b + a ∧ b ∧ c.  (1.32)
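Identities such as (1.25) and (1.30) are easy to sanity-check numerically. A convenient device (used here purely as an illustration) is the 2×2 matrix representation of G_3, in which each basis vector e_k is sent to the Pauli matrix σ_k, so that the geometric product becomes ordinary matrix multiplication; NumPy is assumed, and the three test vectors are arbitrary:

```python
import numpy as np

# Pauli-matrix representation of G_3: e_k -> sigma_k,
# geometric product -> matrix product
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
sigma = [s1, s2, s3]

def rep(v):
    """Matrix image of the 1-vector v in R^3."""
    return sum(c * m for c, m in zip(v, sigma))

a = np.array([1.0, -2.0, 0.5])
b = np.array([3.0, 1.0, -1.0])
c = np.array([0.0, 2.0, 4.0])
A, B, C = rep(a), rep(b), rep(c)

# Eq. (1.23)/(1.25): the symmetric part of AB is (a . b) times the identity
sym = 0.5 * (A @ B + B @ A)

# Eq. (1.24): the bivector b ^ c as the antisymmetric part of BC
W = 0.5 * (B @ C - C @ B)

# Eq. (1.30): a . (b ^ c) = (a . b)c - (a . c)b
lhs = 0.5 * (A @ W - W @ A)
rhs = rep(np.dot(a, b) * c - np.dot(a, c) * b)
```

Here `sym` equals (a · b) times the 2×2 identity, and `lhs` and `rhs` agree, confirming (1.30) for this choice of vectors.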
The reader can see that in geometric algebra one has to handle expressions with a large number of brackets; thus, we follow an operator ordering convention: in the absence of brackets, inner and outer products take precedence over geometric products, so we can write

  (a · b)c = a · b c ,  (1.33)

where the right-hand side is not to be confused with a · (bc).
1.2.5 Further Properties of the Geometric Product

The geometric product of a vector and a bivector extends simply to that of a vector a and a grade-r multivector A_r (composed of r orthogonal vectors a_k). To carry out the geometric product, first decompose A_r into blades:

  a A_r = (a a_1) a_2 ⋯ a_r
        = (a · a_1) a_2 ⋯ a_r + (a ∧ a_1) a_2 ⋯ a_r
        = 2(a · a_1) a_2 ⋯ a_r − a_1 a a_2 ⋯ a_r
        = 2(a · a_1) a_2 ⋯ a_r − 2(a · a_2) a_1 a_3 ⋯ a_r + a_1 a_2 a a_3 ⋯ a_r
        = 2 Σ_{k=1}^{r} (−1)^{k+1} (a · a_k) a_1 a_2 ⋯ ǎ_k ⋯ a_r + (−1)^r a_1 a_2 ⋯ a_r a ,  (1.34)

where the check indicates that this term is missing from the series. Each term in the sum has grade r − 1, so one can define

  a · A_r = ⟨a A_r⟩_{r−1} = ½(a A_r − (−1)^r A_r a),  (1.35)

and the remaining term is the antisymmetric part. One can therefore write

  a ∧ A_r = ⟨a A_r⟩_{r+1} = ½(a A_r + (−1)^r A_r a).  (1.36)

Thus, the geometric product of a vector and a grade-r multivector A_r can be written as

  a A_r = a · A_r + a ∧ A_r = ⟨a A_r⟩_{r−1} + ⟨a A_r⟩_{r+1} .  (1.37)

Geometric multiplication by a vector thus lowers and raises the grade of a multivector by 1. In Eq. 1.34, we assumed that the vectors are orthogonal; we can extend this decomposition to the case of nonorthogonal vectors as follows:

  a · (a_1 ∧ a_2 ∧ ⋯ ∧ a_r) = ½[a ⟨a_1 a_2 ⋯ a_r⟩_r − (−1)^r ⟨a_1 a_2 ⋯ a_r⟩_r a]
                            = ½ ⟨a a_1 a_2 ⋯ a_r − (−1)^r a_1 a_2 ⋯ a_r a⟩_{r−1} .  (1.38)
In this equation, since we are expecting to lower the grade r by 1, we can insert a within the grade brackets. Now, by reutilizing Eq. 1.34, we can write

  a · (a_1 ∧ a_2 ∧ ⋯ ∧ a_r) = ⟨ Σ_{k=1}^{r} (−1)^{k+1} (a · a_k) a_1 a_2 ⋯ ǎ_k ⋯ a_r ⟩_{r−1}
                            = Σ_{k=1}^{r} (−1)^{k+1} (a · a_k) a_1 ∧ a_2 ∧ ⋯ ∧ ǎ_k ∧ ⋯ ∧ a_r .  (1.39)

This result is very useful in practice; see the following examples:

  a · (a_1 ∧ a_2) = (a · a_1) a_2 − (a · a_2) a_1 ,
  a · (a_1 ∧ a_2 ∧ a_3) = (a · a_1) a_2 ∧ a_3 − (a · a_2) a_1 ∧ a_3 + (a · a_3) a_1 ∧ a_2 .  (1.40)
The generalization of Eq. 1.39 will be discussed in detail in a later section.

Projections and Rejections Let A_r be a nondegenerate r-blade or hyperplane in G_n; then any vector v ∈ R^n has a projection (parallel component) onto the space of A_r defined by

  P^∥_{A_r}(v) = (v · A_r) A_r^{−1} ,  (1.41)

and a rejection (perpendicular component) from the space of A_r defined by

  P^⊥_{A_r}(v) = (v ∧ A_r) A_r^{−1} .  (1.42)

Therefore, v can be split into a projection and a rejection as follows:

  v = v^∥ + v^⊥ = P^∥_{A_r}(v) + P^⊥_{A_r}(v),  (1.43)

which is an orthogonal decomposition of v with respect to A_r. Figure 1.2 shows the projection, rejection, and inner product of the vector v with respect to the space (plane) A_2 = x ∧ y. Note that the vector v · (x ∧ y) is perpendicular to the vector P^∥_{A_2}(v).
Fig. 1.2 Projection, rejection, and inner product with respect to A_2 = x ∧ y
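The split (1.41)–(1.43) can be cross-checked with ordinary linear algebra: projecting v onto the plane spanned by x and y (computed below via least squares rather than the blade formulas; the three vectors are arbitrary illustrative choices, NumPy assumed):

```python
import numpy as np

# plane A_2 = x ^ y, spanned by two (not necessarily orthogonal) vectors
x = np.array([1.0, 0.0, 0.0])
y = np.array([1.0, 1.0, 0.0])
v = np.array([2.0, 3.0, 4.0])

P = np.stack([x, y], axis=1)                    # plane basis as matrix columns
coeffs, *_ = np.linalg.lstsq(P, v, rcond=None)  # best fit of v within the plane
v_par = P @ coeffs        # projection, the analogue of (v . A_2) A_2^{-1}
v_perp = v - v_par        # rejection,  the analogue of (v ^ A_2) A_2^{-1}
```

As (1.43) demands, `v_par + v_perp` recovers v, and the rejection is orthogonal to both spanning vectors of the plane.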
Projective Split The idea of the projective split was introduced by Hestenes [88] in order to connect projective geometry and metric geometry. This is done by associating the even subalgebra of G_{n+1} with the geometric algebra of the next lower dimension, G_n. The multivector bases of G_{n+1} and G_n are given by γ_1, γ_2, …, γ_{n+1} and e_1, e_2, …, e_n, respectively. One can define a mapping between the spaces by choosing a preferred direction in G_{n+1}, γ_{n+1}. Then, by taking the geometric product of a vector X ∈ G_{n+1} and γ_{n+1},

  X γ_{n+1} = X · γ_{n+1} + X ∧ γ_{n+1} = X · γ_{n+1} ( 1 + (X ∧ γ_{n+1}) / (X · γ_{n+1}) ),  (1.44)

the vector x ∈ G_n can be associated with the bivector (X ∧ γ_{n+1}) / (X · γ_{n+1}) ∈ G_{n+1}. In projective geometry, (1.44) can be interpreted as the pencil of all lines passing through the point γ_{n+1}. In physics, the projective split is called the space–time split, and it relates a space–time system G_4 with a Minkowski metric to an observable system G_3 with a Euclidean metric.
Generalized Inner Product Multivector computations involving inner products are easier if one employs the following equality for the generalized inner product of two blades A_r = a_1 ∧ a_2 ∧ ⋯ ∧ a_r and B_s = b_1 ∧ b_2 ∧ ⋯ ∧ b_s:

  A_r · B_s = ((a_1 ∧ a_2 ∧ ⋯ ∧ a_r) · b_1) · (b_2 ∧ b_3 ∧ ⋯ ∧ b_s)            if r ≥ s,
  A_r · B_s = (a_1 ∧ a_2 ∧ ⋯ ∧ a_{r−1}) · (a_r · (b_1 ∧ b_2 ∧ ⋯ ∧ b_s))        if r < s.  (1.45)

For the right contraction, use the following equation:

  (a_1 ∧ a_2 ∧ ⋯ ∧ a_r) · b_1 = Σ_{i=1}^{r} (−1)^{r−i} a_1 ∧ ⋯ ∧ a_{i−1} ∧ (a_i · b_1) ∧ a_{i+1} ∧ ⋯ ∧ a_r ,  (1.46)

and for the left contraction,

  a_r · (b_1 ∧ b_2 ∧ ⋯ ∧ b_s) = Σ_{i=1}^{s} (−1)^{i−1} b_1 ∧ ⋯ ∧ b_{i−1} ∧ (a_r · b_i) ∧ b_{i+1} ∧ ⋯ ∧ b_s .  (1.47)

From these equations, we can see that the inner product is not commutative for general multivectors. Indeed, if, in particular, a, b, and c are 1-vectors, then from (1.47),

  a · (b ∧ c) = (a · b)c − (a · c)b ,  (1.48)

and from Eq. 1.46,

  (b ∧ c) · a = b(c · a) − (b · a)c = −(a · b)c + (a · c)b .  (1.49)
Thus, the inner product of a vector a and a bivector B = b ∧ c fulfills

  a · B = −B · a .  (1.50)

Let us now consider the inner product for the case r < s, given the bivector A_2 = a_1 ∧ a_2 and the trivector B_3 = b_1 ∧ b_2 ∧ b_3. Then, applying Eq. 1.47 from the left, we should get a vector b:

  A_2 · B_3 = ⟨A_2 B_3⟩_{|2−3|} = ⟨A_2 B_3⟩_1 = (a_1 ∧ a_2) · (b_1 ∧ b_2 ∧ b_3)
    = a_1 · [a_2 · (b_1 ∧ b_2 ∧ b_3)]
    = a_1 · [(a_2 · b_1) b_2 ∧ b_3 − (a_2 · b_2) b_1 ∧ b_3 + (a_2 · b_3) b_1 ∧ b_2]
    = (a_2 · b_1)[(a_1 · b_2) b_3 − (a_1 · b_3) b_2] − (a_2 · b_2)[(a_1 · b_1) b_3 − (a_1 · b_3) b_1]
      + (a_2 · b_3)[(a_1 · b_1) b_2 − (a_1 · b_2) b_1]
    = [(a_1 · b_3)(a_2 · b_2) − (a_1 · b_2)(a_2 · b_3)] b_1
      + [(a_1 · b_1)(a_2 · b_3) − (a_1 · b_3)(a_2 · b_1)] b_2
      + [(a_1 · b_2)(a_2 · b_1) − (a_1 · b_1)(a_2 · b_2)] b_3
    = β_1 b_1 + β_2 b_2 + β_3 b_3 = b .  (1.51)

The projective geometric interpretation of this result tells us that the inner product of a line A_2 with the plane B_3 should be a vector b, or point, lying on the plane. Note that the point b is spanned as a linear combination of the vector basis that spans the plane B_3. Later we will see that in an algebra of incidence of projective geometry, the meet operation (inner product with a dual multivector) computes the incidence point of a line through a plane. Let us consider two other examples in the conformal geometric algebra G_{4,1}. Given the Euclidean pseudoscalar, I_E = e_1 ∧ e_2 ∧ e_3, and the Minkowski plane, E = e_4 ∧ e_5, let us compute the square of the pseudoscalar, I = I_E ∧ E = e_1 ∧ e_2 ∧ e_3 ∧ e_4 ∧ e_5:

  I² = (I_E ∧ E) · (I_E ∧ E) = [I_E ∧ (E · I_E)] · E + [(I_E · I_E) ∧ E] · E
     = (I_E² E) · E = ((−1) E) · E = −E² = −1,  (1.52)

since E and I_E are orthogonal, E · I_E = 0. Let us now consider Eq. 2.64 for G_{n+1,1} given in [119], which claims that the component s̄ of the sphere s in R^n is given by

  s̄ = P^⊥_E(s) = (s ∧ E)E = s + (s · e_0) e_∞ + (s · e_∞) e_0 .  (1.53)

Since the Minkowski plane can also be expressed as E = e_∞ ∧ e_0, where e_∞ = e_{n+1} + e_{n+2} and e_0 = ½(e_{n+2} − e_{n+1}), with E² = 1, we proceed by computing the inner product from the right using Eq. 1.46 and the equality B · (a ∧ b) = (B · a) · b, namely,
  s = s · 1 = (s E)E = (s · E)E + (s ∧ E)E = P^∥_E(s) + P^⊥_E(s),
  P^⊥_E(s) = (s ∧ E)E = (s ∧ e_∞ ∧ e_0) · (e_∞ ∧ e_0)
           = [(s ∧ e_∞ ∧ e_0) · e_∞] · e_0
           = [(s · e_∞) e_∞ ∧ e_0 − s ∧ e_∞] · e_0
           = (s · e_∞) e_0 − [s (e_∞ · e_0) − (s · e_0) e_∞]
           = s + (s · e_0) e_∞ + (s · e_∞) e_0 .  (1.54)
Let us see how we can compute the meet ∩ (incidence-algebra intersection) using the generalized inner product. Consider the meet of the k-vectors C_q, A_s ∈ G_n:

  C_q ∩ A_s = C*_{n−q} · A_s = (C_q I_n^{−1}) · A_s = B_r · A_s ,  (1.55)

where the meet operation is expressed as the inner product of the dual of C_q, that is, C*_{n−q} = C_q I_n^{−1} = B_r, with A_s; the upper symbol * stands for the dual. Thereafter, this inner product can easily be computed using the generalized inner product of Eq. 1.45.

Geometric Product of Multivectors In general, any two multivectors can be multiplied using the geometric product. Consider two homogeneous multivectors A_r and B_s of grades r and s, respectively. The geometric product of A_r and B_s can be written as

  A_r B_s = ⟨AB⟩_{r+s} + ⟨AB⟩_{r+s−2} + ⋯ + ⟨AB⟩_{|r−s|} ,  (1.56)

where ⟨AB⟩_t denotes the t-grade part of the multivector A_r B_s. Note that

  A_r · B_s ≡ ⟨A_r B_s⟩_{|r−s|}  (1.57)

is a generalized contraction or inner product, and

  A_r ∧ B_s ≡ ⟨A_r B_s⟩_{r+s}  (1.58)

is the outer or wedge product, of the highest grade. The inner product of lowest grade, ⟨AB⟩_0, corresponds to a full contraction, or the standard inner product. Since the elements of A_r B_s are of different grades, A_r B_s is an inhomogeneous multivector. As an illustration, consider ab = ⟨ab⟩_0 + ⟨ab⟩_2 = a · b + a ∧ b and the geometric product of A = 5e_3 + 3e_1 e_2 and b = 9e_2 + 7e_3:

  Ab = ⟨Ab⟩_0 + ⟨Ab⟩_1 + ⟨Ab⟩_2 + ⟨Ab⟩_3
     = 35(e_3)² + 27 e_1 (e_2)² + 45 e_3 e_2 + 21 e_1 e_2 e_3
     = 35 + 27 e_1 − 45 e_2 e_3 + 21 I .  (1.59)
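The arithmetic of (1.59) can be double-checked in the Pauli-matrix representation of G_3 (a standard device, used here only as a verification, not as the book's method): e_k maps to σ_k, the pseudoscalar e_123 maps to i times the 2×2 identity, and the geometric product is matrix multiplication. NumPy is assumed:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Id = np.eye(2, dtype=complex)

A = 5 * s3 + 3 * (s1 @ s2)        # A = 5 e_3 + 3 e_12
b = 9 * s2 + 7 * s3               # b = 9 e_2 + 7 e_3

lhs = A @ b
# claimed result of Eq. (1.59): 35 + 27 e_1 - 45 e_23 + 21 I,
# with I = e_123 represented by i * Id
rhs = 35 * Id + 27 * s1 - 45 * (s2 @ s3) + 21 * (s1 @ s2 @ s3)
```

The two matrices coincide, confirming the grade-by-grade expansion of (1.59).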
In the following sections, expressions of grade 0 will be written omitting their subindex, that is, ⟨ab⟩_0 = ⟨ab⟩. We can see from Eq. 1.57 that the inner product of a scalar α with a homogeneous multivector A_r receives a special treatment: A_r · α = α · A_r = 0. However, this is not the case for the outer product: A_r ∧ α = α ∧ A_r = ⟨αA_r⟩_r = αA_r. We make the observation that software like GABLE or CLICAL does not implement this exceptional case for the inner product. Now, the inner and outer products of any two general multivectors are obtained by applying the left and right distributive laws over their homogeneous parts and then using Eqs. 1.57 and 1.58. Note that the definition in (1.57) is not in contradiction with (1.23); analogously, (1.58) and (1.24) are consistent. From (1.57), it can be said that the inner product A_r · B_s lowers the grade of A_r by s units when r ≥ s > 0, and from Eq. 1.58, that the outer product A_r ∧ B_s raises the grade of A_r by s units for every r, s ≥ 0.

Contractions and the Derivation Let us consider a vector and a bivector, v, A ∈ G_3. The vector v is tilted by an angle θ out of the plane of the bivector A; see Fig. 1.3. Let x be the orthogonal projection of v onto the plane, so that |x| = |v| cos θ. In this setting, Lounesto [125] introduced the following notion: the right contraction of the bivector A by the vector v is a vector y = A ⌊ v that is orthogonal to x and lies in the plane of the bivector A such that

  |y| = |A||x|,  y ⊥ x,  and  x ∧ y has the same orientation as A.  (1.60)

According to Eq. 1.50,

  v ⌋ A = −A ⌊ v ,  (1.61)

that is, the left and right contractions have opposite signs. In Fig. 1.3, we see that the area of the rectangle can be computed in terms of the inverse of x, namely,

Fig. 1.3 Projection, rejection, and inner product with respect to A_2 = a_1 ∧ a_2
$|A| = |x^{-1}||y|$, where $x^{-1} = x/x^2$. If we write $v_\parallel = x$ and $v_\perp = v - v_\parallel$, the vector $v$ can be expressed in terms of its parallel and perpendicular components:
$$v = v_\parallel + v_\perp = (v\,\lrcorner\,A)A^{-1} + (v\wedge A)A^{-1}, \qquad (1.62)$$
where $A^{-1} = A/A^2$ and $A^2 = -|A|^2$. According to Eqs. 1.35 and 1.36, the geometric product of $v$ and the bivector $A$ can also be rewritten as
$$vA = v\,\lrcorner\,A + v\wedge A = \tfrac12(vA - Av) + \tfrac12(vA + Av). \qquad (1.63)$$
The left contraction can also be computed using the notion of duality. Given the vectors $a, b$ and the pseudoscalar $I\in\mathcal G_3$,
$$a\,\lrcorner\,b = [a\wedge(bI)]I^{-1}. \qquad (1.64)$$
This equation means that the left contraction is dual to the wedge product. Later, we will discuss at length the concept of duality and the related operations of meet and join of the incidence algebra. Now consider in our next formula the vectors $v, b$ and a $k$-vector $A_k\in\mathcal G_3$:
$$v\,\lrcorner\,(A_kb) = (v\,\lrcorner\,A_k)b + (-1)^kA_k(v\,\lrcorner\,b). \qquad (1.65)$$
We see that the left contraction by a vector is a derivation in the geometric algebra. In general, the product of a vector $v$ and an arbitrary $r$-vector $A_r$ can be decomposed into a sum of the left contraction and the wedge product. Thus, using Eqs. 1.35 and 1.36,
$$vA_r = v\,\lrcorner\,A_r + v\wedge A_r = \tfrac12\big(vA_r - (-1)^rA_rv\big) + \tfrac12\big(vA_r + (-1)^rA_rv\big). \qquad (1.66)$$
As we have seen in this section, the right ($\llcorner$) and left ($\lrcorner$) contractions can be computed simply using Eqs. 1.46 and 1.47 of the generalized inner product (1.45). Therefore, in order to avoid an excessive number of operator symbols, in this book we will not use the operators $\lrcorner$ and $\llcorner$ when carrying out contractions.

Hodge Dual

The Hodge star operator establishes a correspondence between the space $\bigwedge^k V^n$ of $k$-vectors and the space $\bigwedge^{n-k}V^n$ of $(n-k)$-vectors. The dimension of the former space is $\binom{n}{k}$ and that of the latter is $\binom{n}{n-k}$. Due to the symmetry of the binomial coefficients, the dimensions of both spaces are equal. The mapping between spaces of the same dimension is an isomorphism which, in the case of Hodge duality, exploits the inner product and orientation of the vector space. The image of a $k$-vector under this isomorphism is known as its Hodge dual.
The Hodge star operator on an oriented inner product space $V^n$ is a linear operator on the exterior algebra of $V^n$. This operator interchanges the subspaces of $k$-vectors and $(n-k)$-vectors, as we show next. Given an oriented orthonormal basis $e_1, e_2, \ldots, e_n$, the Hodge dual of a $k$-vector is computed as follows:
$$\star(e_1\wedge e_2\wedge\cdots\wedge e_k) = e_{k+1}\wedge e_{k+2}\wedge\cdots\wedge e_n. \qquad (1.67)$$
As an illustration, let us apply the Hodge dual to the vector $y\in\mathbb R^3$:
$$Y = \star y = \star(y_1e_1 + y_2e_2 + y_3e_3) = y_1\,e_2\wedge e_3 + y_2\,e_3\wedge e_1 + y_3\,e_1\wedge e_2 \in \textstyle\bigwedge^2\mathbb R^3, \qquad (1.68)$$
and now compute the wedge of a vector $x\in\mathbb R^3$ with the bivector $Y = \star y\in\bigwedge^2\mathbb R^3$:
$$x\wedge Y = x\wedge\star y = (x\cdot y)\,e_1\wedge e_2\wedge e_3, \qquad (1.69)$$
or the wedge product of a bivector $X$ with the Hodge dual of a bivector $Y$:
$$X\wedge\star Y = \langle X, Y\rangle\, e_1\wedge e_2\wedge e_3. \qquad (1.70)$$
Using the Hodge duality, we can establish in $\mathbb R^3$ the relationship between the cross product and the wedge product as follows:
$$x\wedge y = \star(x\times y), \qquad x\times y = \star(x\wedge y). \qquad (1.71)$$
In the geometric algebra $\mathcal G_3$, the Hodge dual is computed using the reversion of the multivector:
$$\star A = \tilde AI_3, \qquad (1.72)$$
where $I_3 = e_1\wedge e_2\wedge e_3$ is the pseudoscalar. The reversion is defined in Eq. 1.83. For example, the Hodge duals of a vector $x$ and a bivector $X$ are computed as
$$\star x = \tilde xI_3 = xI_3 = X, \qquad \star X = \tilde XI_3 = -XI_3 = x. \qquad (1.73)$$
Finally, the relationship between the cross product and the wedge product is written as
$$x\wedge y = \star(x\times y) = (x\times y)I_3, \qquad x\times y = \star(x\wedge y) = -(x\wedge y)I_3. \qquad (1.74)$$
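Relation (1.71) is easy to verify numerically. The helper below (ours, for illustration) lists the bivector coefficients of $x\wedge y$ on the basis $(e_2\wedge e_3,\, e_3\wedge e_1,\, e_1\wedge e_2)$, which Hodge duality maps to $(e_1, e_2, e_3)$:

```python
import numpy as np

def wedge_bivector(x, y):
    """Coefficients of x ∧ y in the basis (e2∧e3, e3∧e1, e1∧e2)."""
    return np.array([x[1]*y[2] - x[2]*y[1],
                     x[2]*y[0] - x[0]*y[2],
                     x[0]*y[1] - x[1]*y[0]])

x, y = np.array([1., 2., 3.]), np.array([4., 5., 6.])
# Under the duality e2∧e3 → e1, e3∧e1 → e2, e1∧e2 → e3, the bivector
# coefficients of x∧y are exactly the components of the cross product:
assert np.allclose(wedge_bivector(x, y), np.cross(x, y))
```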
1.2.6 Dual Blades and Duality in the Geometric Product

A fundamental concept algebraically related to the unit pseudoscalar $I$ is that of duality. In a geometric algebra $\mathcal G_n$, we find dual multivectors and dual operations. The dual of a multivector $A\in\mathcal G_n$ is defined as follows:
$$A^* = AI_n^{-1}, \qquad (1.75)$$
where $I_n^{-1}$ differs from $I_n$ at most by a sign. Note that, in general, $I^{-1}$ might not necessarily commute with $A$. The multivector basis of a geometric algebra $\mathcal G_n$ has $2^n$ basis elements, and it can be shown that the second half is the dual of the first half. For example, in $\mathcal G_3$ the dual of the scalar is the pseudoscalar, and the dual of a vector is a bivector, e.g., $e_{23} = Ie_1$. In general, the dual of an $r$-blade is an $(n-r)$-blade. The operations related directly to the Clifford product are the inner and outer products, which are dual to one another. This can be written as follows:
$$(x\cdot A)I_n = x\wedge(AI_n), \qquad (1.76)$$
$$(x\wedge A)I_n = x\cdot(AI_n), \qquad (1.77)$$
where $x$ is any vector and $A$ any multivector. By using the ideas of duality, we are then able to relate the inner product to incidence operators in the following manner. In an $n$-dimensional space, suppose we have an $r$-vector $A$ and an $s$-vector $B$, where the dual of $B$ is given by $B^* = BI^{-1} = B\cdot I^{-1}$. Since $BI^{-1} = B\cdot I^{-1} + B\wedge I^{-1}$, we can replace the geometric product by the inner product alone (in this case the outer-product term equals zero, since there can be no $(n+1)$-dimensional vector). Now, using the identity
$$A_r\cdot(B_s\cdot C_t) = (A_r\wedge B_s)\cdot C_t \quad\text{for}\quad r + s \leq t, \qquad (1.78)$$
we can write
$$A\cdot(BI^{-1}) = A\cdot(B\cdot I^{-1}) = (A\wedge B)\cdot I^{-1} = (A\wedge B)I^{-1}. \qquad (1.79)$$
This expression can be rewritten using the definition of the dual as follows:
$$A\cdot B^* = (A\wedge B)^*. \qquad (1.80)$$
The equation shows the relationship between the inner and outer products in terms of the duality operator. Now, if $r + s = n$, then $A\wedge B$ is of grade $n$ and is, therefore, a pseudoscalar. Using Eq. 1.79, we can employ the involved pseudoscalar in order to get an expression in terms of a bracket:
$$A\cdot B^* = (A\wedge B)^* = (A\wedge B)I^{-1} = ([A\wedge B]I)I^{-1} = [A\wedge B]. \qquad (1.81)$$
We see, therefore, that the bracket relates the inner and outer products to nonmetric quantities.
1.2.7 Multivector Operations

In this section, we will define a number of operations that act on multivectors and return multivectors.

Involution, Reversion, and Conjugation Operations

For an $r$-grade multivector $A_r = \sum_{i=0}^r\langle A_r\rangle_i$, the following operations are defined:
$$\text{Grade involution:}\quad \hat A_r = \sum_{i=0}^r(-1)^i\langle A_r\rangle_i, \qquad (1.82)$$
$$\text{Reversion:}\quad \tilde A_r = \sum_{i=0}^r(-1)^{\frac{i(i-1)}{2}}\langle A_r\rangle_i, \qquad (1.83)$$
$$\text{Conjugate:}\quad \langle A_r\rangle_k^\sim = (a_1\wedge a_2\wedge\cdots\wedge a_k)^\sim = a_k\wedge a_{k-1}\wedge\cdots\wedge a_1, \qquad (1.84)$$
$$\text{Clifford conjugation:}\quad \bar A_r = \tilde{\hat A}_r = \sum_{i=0}^r(-1)^{\frac{i(i+1)}{2}}\langle A_r\rangle_i. \qquad (1.85)$$
The grade involution simply negates the odd-grade blades of a multivector. The reversion can also be obtained by reversing the order of the basis vectors making up the blades in a multivector and then rearranging them in their original order using the anticommutativity of the Clifford product. Let us explain the case of conjugation. Let $A\subseteq\{1, \ldots, p+q\}$ index the basis blades $e_A\in\mathcal G_{p,q}$; the conjugate of $e_A$, denoted by $\bar e_A$, is defined as
$$\bar e_A = (-1)^r\tilde e_A, \quad r := \text{grade of } e_A. \qquad (1.86)$$
Thus, $\tilde a_i = a_i$, but $\bar a_i$ is not necessarily equal to $a_i$; recall that for $e_i\in\mathcal G_{p,q}$, $e_i^2 = -1$ if $p < i \leq p+q$. The computation of the magnitude or modulus of a blade can be done using the reversion operation, as follows:
$$||A|| = \langle A\tilde A\rangle_0^{1/2} = ||\langle A\rangle_k||. \qquad (1.87)$$
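The sign factors in Eqs. 1.82–1.85 depend only on the grade and can be tabulated; a quick check with our helper functions confirms that Clifford conjugation is the composition of grade involution and reversion:

```python
def involution_sign(k):  return (-1) ** k                    # Eq. 1.82
def reversion_sign(k):   return (-1) ** (k * (k - 1) // 2)   # Eq. 1.83
def conjugation_sign(k): return (-1) ** (k * (k + 1) // 2)   # Eq. 1.85

# Clifford conjugation = grade involution followed by reversion
for k in range(8):
    assert conjugation_sign(k) == involution_sign(k) * reversion_sign(k)
print([reversion_sign(k) for k in range(8)])  # [1, 1, -1, -1, 1, 1, -1, -1]
```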
Accordingly, the magnitude of a multivector $M$ reads
$$||M|| = \langle M\tilde M\rangle_0^{1/2} = \big(||\langle M\rangle_0||^2 + ||\langle M\rangle_1||^2 + ||\langle M\rangle_2||^2 + \cdots + ||\langle M\rangle_n||^2\big)^{1/2} = \sqrt{\sum_{r=0}^n||\langle M\rangle_r||^2}. \qquad (1.88)$$
In particular, for an $r$-vector $A_r$ of the form $A_r = a_1\wedge a_2\wedge\cdots\wedge a_r$ (with the $a_i$ mutually orthogonal), the reversion is $\tilde A_r = (a_1\cdots a_{r-1}a_r)^\sim = a_ra_{r-1}\cdots a_1$, and thus $\tilde A_rA_r = a_1^2a_2^2\cdots a_r^2$; so we will say that such an $r$-vector is null if and only if it has a null vector for a factor. If in such a factorization of $A_r$ there are $p$, $q$, and $s$ factors that square to a positive number, a negative number, and zero, respectively, we will say that $A_r$ is an $r$-vector with signature $(p, q, s)$. In particular, if $s = 0$, such a nonsingular $r$-vector has a multiplicative inverse that can also be written using the reversion:
$$A^{-1} = (-1)^q\frac{\tilde A}{\tilde AA} = (-1)^q\frac{\tilde A}{|A|^2} = \frac{A}{A^2}. \qquad (1.89)$$
In general, the inverse $A^{-1}$ of a multivector $A$, if it exists, is defined by the equation $A^{-1}A = 1$.

Join and Meet Operations

When we work with lines, planes, and spheres, it will clearly be necessary to employ operations that compute the meets (intersections) or joins (expansions) of geometric objects. For this, we will need a geometric means of performing the set-theoretic operations of intersection, $\cap$, and union, $\cup$. Herewith, $\cup$ and $\cap$ will stand for the incidence-algebra operations of join and meet, respectively. If, in an $n$-dimensional geometric algebra, the $r$-vector $A$ and the $s$-vector $B$ do not have a common subspace (null intersection), one can define the join of both as
$$C = A\cup B = A\wedge B, \qquad (1.90)$$
so that the join is simply the outer product (an $(r+s)$-vector) of the two. However, if $A$ and $B$ have common blades, the join is not simply given by the wedge but by the subspace the two vectors span. The operation join, $\cup$, can be interpreted as a common dividend of lowest grade and is defined up to a scale factor. The join gives the pseudoscalar if $(r+s)\geq n$. We will use $\cup$ to represent the join only when the blades $A$ and $B$ have a common subspace; otherwise, we will use the ordinary exterior product, $\wedge$, to represent the join.
If there exists a $k$-vector $D$ such that for $A$ and $B$ we can write $A = A'D$ and $B = B'D$ for some $A'$ and $B'$, then we can define the intersection or meet using the duality principle as follows:
$$(A\cap B)^* = A^*\cup B^*. \qquad (1.91)$$
This is a beautiful result, telling us that the dual of the meet is given by the join of the duals. Since the dual of $A\cap B$ will be taken with respect to the join of $A$ and $B$, we must be careful to specify which space we will use for the dual in Eq. 1.91. However, in most cases of practical interest, this join will indeed cover the entire space, and therefore we will be able to obtain a more useful expression for the meet using Eq. 1.80. Thus,
$$A\cap B = \big((A\cap B)^*\big)I = (A^*\cup B^*)I = (A^*\wedge B^*)(I^{-1}I)I = A^*\cdot B. \qquad (1.92)$$
The above concepts are discussed further in [95]. In the theory of Grassmann–Cayley algebra, the meet operation is computed using the so-called shuffle formula. This formula will be explained in Sect. 23.2.3 of the appendix.
1.3 Linear Algebra

This section presents the geometric algebra approach to the basic concepts of linear algebra and is included here for completeness. Although it will not be discussed in this chapter, the treatment of invariants [117] uses linear algebra and projective geometry to create geometric entities that are invariant under projective transformations. A linear function $f$ maps vectors to vectors in the same space. The extension of $f$ to act linearly on multivectors is possible via the so-called outermorphism $\underline f$, which defines the action of $f$ on $r$-blades thus:
$$\underline f(a_1\wedge a_2\wedge\cdots\wedge a_r) = f(a_1)\wedge f(a_2)\wedge\cdots\wedge f(a_r). \qquad (1.93)$$
The function $\underline f$ is called an outermorphism because it preserves the grade of any $r$-vector it acts upon. The action of $\underline f$ on general multivectors is then defined through linearity. The function $\underline f$ must therefore satisfy the following conditions:
$$\underline f(a_1\wedge a_2) = f(a_1)\wedge f(a_2), \quad \underline f(A_r) = \langle\underline f(A_r)\rangle_r, \quad \underline f(\alpha_1a_1 + \alpha_2a_2) = \alpha_1f(a_1) + \alpha_2f(a_2). \qquad (1.94)$$
Accordingly, the outermorphism of a product of two linear functions is the product of the outermorphisms; that is, if $f(a) = f_2(f_1(a))$, we write $\underline f = \underline f_2\,\underline f_1$. The
adjoint $\bar f$ of a linear function $f$ acting on the vectors $a$ and $b$ can be defined by the property
$$f(a)\cdot b = a\cdot\bar f(b). \qquad (1.95)$$
If $f = \bar f$, the function is self-adjoint and can be represented by a symmetric matrix $F$ ($F = F^T$). Since the outermorphism preserves grade, the unit pseudoscalar must be mapped onto some multiple of itself; this multiple is called the determinant of $f$. Thus,
$$\underline f(I) = \det(f)\,I. \qquad (1.96)$$
This is a particularly simple definition of the determinant, from which many properties of the determinants follow straightforwardly.
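A quick numerical illustration of Eq. 1.96 in 3D (the matrix is our arbitrary example): the coefficient of $f(e_1)\wedge f(e_2)\wedge f(e_3)$ on the pseudoscalar $e_1\wedge e_2\wedge e_3$ is the scalar triple product of the columns of $F$, which equals $\det F$:

```python
import numpy as np

# Outermorphism check: f maps I = e1∧e2∧e3 to det(f) I. The coefficient
# of f(e1)∧f(e2)∧f(e3) on e1∧e2∧e3 is the scalar triple product.
F = np.array([[2., 1., 0.],
              [0., 3., 1.],
              [1., 0., 1.]])
f1, f2, f3 = F[:, 0], F[:, 1], F[:, 2]
triple = np.dot(np.cross(f1, f2), f3)      # coefficient of e1∧e2∧e3
assert np.isclose(triple, np.linalg.det(F))
```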
1.4 Simplexes

An $r$-dimensional simplex ($r$-simplex) in $\mathbb R^n$ corresponds to the convex hull of $r+1$ points, of which at least $r$ are linearly independent. A set of points $\{x_0, x_1, x_2, \ldots, x_r\}$ that defines an $r$-simplex is seen as the frame for the $r$-simplex. One can select $x_0$ as the base point or place of the simplex. Using geometric algebra, the related notations for the $r$-simplex are as follows:
$$X_r \equiv x_0\wedge x_1\wedge x_2\wedge\cdots\wedge x_r = x_0\wedge\bar X_r, \qquad (1.97)$$
$$\bar X_r \equiv (x_1 - x_0)\wedge(x_2 - x_0)\wedge\cdots\wedge(x_r - x_0). \qquad (1.98)$$
Fig. 1.4 (a) Tangent of the simplex; (b) volume of a 3-hedron
$\bar X_r$ is called the tangent of the simplex, where $\bar x_i = (x_i - x_0)$ for $i = 1, \ldots, r$, because it is tangent to the $r$-plane in which the simplex lies; see Fig. 1.4a. The tangent determines the directed measure $\bar X_r/r!$ of the simplex and assigns a definite orientation to it. On the other hand, the volume of the simplex is given by $(r!)^{-1}|\bar X_r|$, which corresponds to the area of the triangle formed by $\bar x_1$ and $\bar x_2$ in Fig. 1.4b. In general, this is the volume of an $(r+1)$-hedron. Note that $\bar X_r$ is independent of the choice of the origin, but $X_r$ is not. Now, an $r$-dimensional plane parallel to the subspace of $\bar X_r$ and through the point $x$ is the solution set of
$$(y - x)\wedge\bar X_r = 0, \qquad (1.99)$$
for $y\in\mathbb R^n$. According to Eq. 1.99, the equation for the plane of the simplex is given by
$$y\wedge\bar X_r = x_0\wedge\bar X_r = X_r. \qquad (1.100)$$
The term $X_r$ corresponds to the moment of the simplex, because it is the wedge product of a point touching the simplex with the orientation entity; think of the moment of a line. Since Eq. 1.98 represents the pseudoscalar for the simplex frame $\{\bar x_i\}$, it also determines a dual frame $\{\bar x^i\}$ given by Eq. 1.8:
$$\bar x^i\cdot\bar x_j = \delta^i_j. \qquad (1.101)$$
The face opposite $x_i$ in the simplex $X_r$ is described by its moment
$$F_iX_r \equiv X_r^i \equiv x^i\cdot X_r = (-1)^{i+1}x_0\wedge x_1\wedge\cdots\wedge\check x_i\wedge\cdots\wedge x_r, \qquad (1.102)$$
where the check mark indicates the omitted factor. This equation defines a face operator $F_i$; see Fig. 1.4b. Note that the face $X_r^i$ is an $(r-1)$-dimensional simplex and
$$X_r = x_i\wedge X_r^i, \qquad (1.103)$$
for any $0\leq i\leq r$. The boundary of a simplex $X_r$ is given by the multivector sum
$$\Omega_bX_r = \sum_{i=0}^r X_r^i = \sum_{i=0}^r F_iX_r. \qquad (1.104)$$
This boundary operator fulfills
$$X_r = x_i\wedge\Omega_bX_r, \qquad (1.105)$$
for any $0\leq i\leq r$. Taking into account the identity
$$(x_1 - x_0)\wedge(x_2 - x_0)\wedge\cdots\wedge(x_r - x_0) = \sum_{i=0}^r(-1)^i\,x_0\wedge\cdots\wedge\check x_i\wedge\cdots\wedge x_r \qquad (1.106)$$
and Eqs. 1.98 and 1.102, we derive
$$\bar X_r = \Omega_bX_r. \qquad (1.107)$$
Finally, the following relations hold for the operators $F_i$ and $\Omega_b$:
$$F_iF_i = 0, \qquad (1.108)$$
$$F_iF_j = -F_jF_i, \qquad (1.109)$$
$$F_i\Omega_b = -\Omega_bF_i, \qquad (1.110)$$
$$\Omega_b\Omega_b = 0. \qquad (1.111)$$
Note that these operators’ relations are strongly analogous to relations one finds in algebraic topology.
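The volume formula $(r!)^{-1}|\bar X_r|$ can be tested numerically for a full-dimensional simplex, where the magnitude of the tangent blade reduces to a determinant (a sketch with our own helper; the tetrahedron is an arbitrary example):

```python
import math
import numpy as np

def simplex_volume(points):
    """Volume of an r-simplex in R^r via Eq. 1.98: |x̄1∧...∧x̄r|/r!.
    For r = n the magnitude of the tangent blade is |det[x̄1 ... x̄r]|."""
    x0 = points[0]
    bars = np.array([x - x0 for x in points[1:]])  # tangent factors x̄_i
    return abs(np.linalg.det(bars)) / math.factorial(len(bars))

# Unit tetrahedron with vertices 0, e1, e2, e3
pts = [np.zeros(3), np.eye(3)[0], np.eye(3)[1], np.eye(3)[2]]
print(simplex_volume(pts))   # 1/6 ≈ 0.1666...
```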
1.5 Geometric Calculus

We are used to the grad, div, and curl operators, which in geometric calculus are all formulated using a single vector derivative. The vector derivative is also essential in complex analysis, and it enables the extension of complex analysis to higher dimensions. The synthesis of vector differentiation and geometric algebra is called geometric calculus [94]. Next, we will describe these operators in the geometric algebra framework.
1.5.1 Multivector-Valued Functions and the Inner Product

A multivector-valued function $f: \mathbb R^{p,q}\to\mathcal G_{p,q}$, where $n = p+q$, has $2^n$ blade components:
$$f(x) = \sum_M f_M(x)\,e_M = \sum_{i=0}^n\langle f(x)\rangle_i, \qquad (1.112)$$
where $M\in\{0, 1, 2, \ldots, 12, 23, \ldots, 123, 124, \ldots, 123\cdots(n-1)n\}$ runs over all $2^n$ blade subindexes and $f_M(x)\in\mathbb R$ corresponds to the scalar accompanying each multivector basis element. Let us consider the conjugation in $\mathcal G_3$:
$$\tilde f(x) = \sum_M f_M(x)\,\tilde e_M = f_0(x)e_0 + f_1(x)e_1 + f_2(x)e_2 + f_3(x)e_3 - f_{23}(x)e_{23} - f_{31}(x)e_{31} - f_{12}(x)e_{12} - f_{123}(x)e_{123} = \langle f(x)\rangle_0 + \langle f(x)\rangle_1 - \langle f(x)\rangle_2 - \langle f(x)\rangle_3. \qquad (1.113)$$
Next, we define the inner product of $\mathbb R^n\to\mathcal G_n$ functions $f, g$ by
$$(f, g) = \int_{\mathbb R^n} f(x)\widetilde{g(x)}\,d^nx. \qquad (1.114)$$
Let us consider the inner product of two functions in $\mathcal G_3$:
$$(f, g)_{L^2(\mathbb R^3;\mathcal G_3)} = \int_{\mathbb R^3} f(x)\widetilde{g(x)}\,d^3x = \sum_{M,N}e_M\tilde e_N\int_{\mathbb R^3}f_M(x)g_N(x)\,d^3x, \qquad (1.115)$$
where $d^3x = dx_1{\wedge}dx_2{\wedge}dx_3\,I_3^{-1}$. The scalar part of this equation corresponds to the $L^2$ norm
$$||f||^2_{L^2(\mathbb R^3;\mathcal G_3)} = \langle(f, f)_{L^2(\mathbb R^3;\mathcal G_3)}\rangle_0 = \int_{\mathbb R^3}\langle f(x)\widetilde{f(x)}\rangle_0\,d^3x = \int_{\mathbb R^3}\sum_M f_M^2(x)\,d^3x.$$
The $L^2(\mathbb R^n;\mathcal G_{n,0})$ norm is given by
$$||f||^2 = \langle(f, f)\rangle_0 = \int_{\mathbb R^n}|f(x)|^2\,d^nx. \qquad (1.116)$$
For the case of square-integrable functions on the sphere, $L^2(S^{n-1})$, the inner product and the norm are
$$(f, g)_{L^2} = \int_{S^{n-1}} f(x)\widetilde{g(x)}\,dS(x), \qquad (1.117)$$
$$||f||^2 = \int_{S^{n-1}}\langle f(x)\widetilde{f(x)}\rangle_0\,dS(x), \qquad (1.118)$$
where $dS(x)$ is the normalized Spin($n$)-invariant measure on $S^{n-1}$.
1.5.2 The Multivector Integral

Let $F(x)$ be a multivector-valued function (field) of a vector variable $x$ defined in a certain region of the Euclidean space $E^n$. If the function is only scalar- or vector-valued, it will be called a scalar or vector field, respectively. The Riemann integral of a multivector-valued function $F(x)$ is defined as follows:
$$\int_{E^n} F(x)\,|dx| = \lim_{\substack{|\Delta x_j|\to 0\\ n\to\infty}}\;\sum_{j=1}^n F(x_j)\,|\Delta x_j|, \qquad (1.119)$$
where the quantity in brackets, jdxj, is used to make the integral grade preserving, because dx is a vector in a geometric algebra Gn . Thus, the integral can be discretized using the sum of quadrature expressions.
1.5.3 The Vector Derivative

In the study of fields in geometric algebra, we will represent a position by a vector $x\in\mathbb R^n$. The vector derivative $\nabla$ is the derivative with respect to the vector position $x$, and it can be written with respect to a fixed frame $\{e_k\}$ with coordinates $x^k = e^k\cdot x$ as follows:
$$\nabla = \sum_k e^k\frac{\partial}{\partial x^k}. \qquad (1.120)$$
Since the vectors $\{e^k\}$ belong to $\mathcal G_n$, they inherit the full geometric product. If we compute the inner product of $\nabla$ with the vector $v$, we get the directional derivative in the $v$ direction:
$$v\cdot\nabla F(x) = \lim_{\epsilon\to 0}\frac{F(x + \epsilon v) - F(x)}{\epsilon}, \qquad (1.121)$$
where F .x/ can be any multivector-valued function of position, or more generally a position-dependent linear function. In the next subsections, we will explain operators that are a result of different actions of the vector derivative r on vector fields.
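Definition (1.121) is easy to check with finite differences for a scalar field; the field $F(x) = x\cdot x$ below is our own example, for which $v\cdot\nabla F = 2\,v\cdot x$:

```python
import numpy as np

def directional_derivative(F, x, v, eps=1e-6):
    """Numerical v·∇F at x (Eq. 1.121), via central differences."""
    return (F(x + eps * v) - F(x - eps * v)) / (2 * eps)

F = lambda x: x @ x                               # F(x) = |x|^2
x, v = np.array([1., 2., 3.]), np.array([0., 1., 0.])
print(directional_derivative(F, x, v))            # ≈ 2 v·x = 4.0
```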
1.5.4 Grad, Div, and Curl

The vector derivative $\nabla$ acting on a scalar field $f(x)$ produces the gradient $\nabla f(x)$, which is a vector whose components in the frame $\{e^k\}$ are the partial derivatives with respect to the $x^k$-coordinates. In Euclidean space, the vector $\nabla f(x)$ points in the direction of steepest increase of the function $f(x)$. However, in spaces of mixed signature, such as the space of Minkowski signature used for projective geometry, one cannot easily interpret the direction of the vector $\nabla f(x)$. The geometric product between $\nabla$ and a vector field $F(x)$ is
$$\nabla F = \nabla\cdot F + \nabla\wedge F. \qquad (1.122)$$
By computing the inner product of $\nabla$ with a vector field $F(x)$, we obtain its divergence, the symmetric part:
$$\nabla\cdot F = e^k\cdot\frac{\partial F}{\partial x^k} = \partial_kF^k, \qquad (1.123)$$
where we utilize the useful abbreviation $\partial_i = \frac{\partial}{\partial x^i}$. If we compute the wedge product, we get a bivector, the antisymmetric part:
$$\nabla\wedge F = e^i\wedge(\partial_iF) = e^i\wedge e^j\,\partial_iF_j. \qquad (1.124)$$
The antisymmetric part, or bivector, is related in 3D to the curl via the cross product as follows:
$$\nabla\wedge F = I_3\,\nabla\times F. \qquad (1.125)$$
Note that $\nabla\wedge F$ is not an axial vector but something different, a bivector, which represents an oriented plane. One can also see that, in the true spirit of geometric calculus, the curl thereby generalizes to arbitrary dimensions.
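For a sampled vector field, both parts of Eq. 1.122 can be read off the Jacobian: the divergence is its trace, and the bivector coefficients of $\nabla\wedge F$ are its antisymmetric differences, which by Eq. 1.125 are exactly the curl components. The field below is our own example:

```python
import numpy as np

def jacobian(F, x, eps=1e-6):
    """Numerical Jacobian J[i, j] = ∂F_i/∂x_j via central differences."""
    n = len(x)
    J = np.zeros((n, n))
    for j in range(n):
        d = np.zeros(n); d[j] = eps
        J[:, j] = (F(x + d) - F(x - d)) / (2 * eps)
    return J

# Rotation field F = (-y, x, 0): divergence 0, curl (0, 0, 2)
F = lambda x: np.array([-x[1], x[0], 0.0])
J = jacobian(F, np.array([1.0, 2.0, 3.0]))

div = np.trace(J)                                   # ∇·F, Eq. 1.123
# Bivector coefficients of ∇∧F on (e2∧e3, e3∧e1, e1∧e2), Eq. 1.124;
# by Eq. 1.125 these equal the components of the curl ∇×F.
wedge = np.array([J[2, 1] - J[1, 2], J[0, 2] - J[2, 0], J[1, 0] - J[0, 1]])
print(div, wedge)   # ≈ 0.0 [0. 0. 2.]
```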
1.5.5 Multivector Fields

The preceding computations can easily be extended to the case of the vector derivative acting on a multivector field $F\in\mathcal G_n$ as follows:
$$\nabla F = e^k\partial_kF. \qquad (1.126)$$
For the $m$-grade multivector field $F_m$, its inner and wedge products read
$$\nabla\cdot F_m = \langle\nabla F_m\rangle_{m-1}, \qquad \nabla\wedge F_m = \langle\nabla F_m\rangle_{m+1}, \qquad (1.127)$$
which are known as divergence and curl, respectively. Be aware that the divergence of a divergence vanishes,
$$\nabla\cdot(\nabla\cdot F) = 0, \qquad (1.128)$$
and so does the curl of a curl:
$$\nabla\wedge(\nabla\wedge F) = e^i\wedge\partial_i(e^j\wedge\partial_jF) = e^i\wedge e^j\wedge(\partial_i\partial_jF) = 0. \qquad (1.129)$$
By convention, here the inner product of a vector and a scalar is zero. We should define the following conventions: (i) in the absence of brackets, $\nabla$ acts on the object to its immediate right; (ii) when $\nabla$ is followed by an expression in brackets, the derivative acts on all elements of the expression; (iii) when $\nabla$ acts on a multivector to which it is not adjacent, one uses overdots to describe the procedure:
$$\dot\nabla F\dot G = e^kF\,\partial_kG. \qquad (1.130)$$
According to this notation, one can write
$$\nabla(FG) = \dot\nabla\dot FG + \dot\nabla F\dot G, \qquad (1.131)$$
which is a form of the Leibniz rule, and
$$\dot\nabla\dot f(x) = \nabla f(x) - e^kf(\partial_kx). \qquad (1.132)$$
1.5.6 Convolution and Correlation of Scalar Fields

The filtering of a continuous signal $f: E^n\to\mathbb C$ is computed via the convolution of the signal with a filter $h: E^n\to\mathbb C$ as follows:
$$(h * f)(x) = \int_{E^n} h(x')f(x - x')\,dx'. \qquad (1.133)$$
The spatial correlation is defined by
$$(h\star f)(x) = \int_{E^n} h(x')f(x + x')\,dx'. \qquad (1.134)$$
Note that the convolution can be seen as a correlation with a filter that has been reflected with respect to its center.
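This reflection property is easy to confirm in 1D with NumPy, using discrete signals standing in for the continuous fields (our example data):

```python
import numpy as np

f = np.array([1., 2., 3., 4., 5.])             # signal
h = np.array([1., 0., -1.])                    # filter

conv = np.convolve(f, h, mode='full')          # (h * f), Eq. 1.133
corr = np.correlate(f, h[::-1], mode='full')   # correlation with reflected h
assert np.allclose(conv, corr)                 # convolution = reflected correlation
```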
1.5.7 Clifford Convolution and Correlation

Given a multivector field $F$ and a multivector-valued filter $H$, their Clifford convolution is defined in terms of the Clifford product of multivectors:
$$(H *_l F)(x) = \int_{E^n} H(x')F(x - x')\,|dx'|,$$
$$(F *_r H)(x) = \int_{E^n} F(x - x')H(x')\,|dx'|. \qquad (1.135)$$
Since the Clifford product is not commutative, we distinguish the application of the filter $H$ from the left and from the right by the subindexes $l$ and $r$, respectively. For the case of discrete fields, the convolution has to be discretized. If the filter is applied from the left, the discretized left convolution is given by
$$(H *_l F)_{i,j,k} = \sum_{r=-d}^{d}\sum_{s=-d}^{d}\sum_{t=-d}^{d} H_{r,s,t}\,F_{i-r,\,j-s,\,k-t}, \qquad (1.136)$$
where a 3D uniform grid is used, $i, j, k, r, s, t\in\mathbb Z$, $(i, j, k)$ denotes the grid nodes, and $d^3$ is the dimension of the filter grid.
The spatial Clifford correlation can be defined in a manner similar to the Clifford convolution:
$$(H \star_l F)(x) = \int_{E^n} H(x')F(x + x')\,|dx'|,$$
$$(F \star_r H)(x) = \int_{E^n} F(x + x')H(x')\,|dx'|. \qquad (1.137)$$
This correlation formula can be seen simply as a convolution with a filter that has been reflected with respect to its center. We can also carry out the scalar Clifford convolution in the Fourier domain. Consider the vector fields $f, h: E^3\to E^3\subset\mathcal G_3$. Since the scalar Clifford convolution is
$$(h *_s f)(x) = \langle(h *_l f)\rangle_0, \qquad (h * f)_3 = 0, \qquad (1.138)$$
we obtain
$$\mathcal F\{(h *_s f)\}(u) = \langle\mathcal F\{h\}, \mathcal F\{f\}\rangle + \langle\mathcal F\{h\}, \mathcal F\{f\}\rangle_3. \qquad (1.139)$$
Since the Clifford–Fourier transform of 3D vector fields contains a vector part and a bivector part, the trivector part $\langle\mathcal F\{h\}, \mathcal F\{f\}\rangle_3$ is generally nonzero.
1.5.8 Linear Algebra Derivations

In linear algebra, a number of derivations are carried out using frame contraction; instead, we can use vector derivatives, as we show in the next equation:
$$\nabla(x\cdot v) = e^i\partial_i(x^je_j)\cdot v = e^i(e_j\cdot v)\,\delta_i^j = e^i(e_i\cdot v) = v. \qquad (1.140)$$
This shows that differentiating a function that depends linearly on $x$ is simply equivalent to carrying out contractions over frame indices. To take advantage of this, one introduces a vector variable $x$ and denotes the derivative with respect to $x$ by $\partial_x$. In this manner, we can rewrite Eqs. 1.17–1.21 as follows:
$$\partial_x\,x\cdot X_m = mX_m, \qquad \partial_x\,x\wedge X_m = (n - m)X_m, \qquad \dot\partial_x X_m\dot x = (-1)^m(n - 2m)X_m. \qquad (1.141)$$
The trace of a linear function can be expressed in terms of a vector derivative as follows:
$$Tr(f) = \partial_x\cdot f(x). \qquad (1.142)$$
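The trace identity of Eq. 1.142 can be spot-checked numerically for a linear map $f(x) = Fx$ (our example matrix), where $\partial_x\cdot f(x)$ reduces to summing the diagonal derivatives $e^i\cdot\partial_if$:

```python
import numpy as np

F = np.array([[1., 2.],
              [3., 4.]])
f = lambda x: F @ x          # linear function f(x) = F x

eps, n = 1e-6, F.shape[0]
trace = 0.0
for i in range(n):
    d = np.zeros(n); d[i] = eps
    # e^i · ∂f/∂x^i: the i-th component of the i-th partial derivative
    trace += ((f(d) - f(-d)) / (2 * eps))[i]
print(trace)                 # ≈ 5.0 = Tr(F)
```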
The use of vector derivatives enables us to write the frame-independent terms of an equation in a way that reflects this independence. In general, this kind of formulation brings out the intrinsic geometric content of an equation.
1.5.9 Reciprocal Frames with Curvilinear Coordinates

In many situations, one works in non-Cartesian coordinate systems, where a coordinate system is defined by a set of scalar functions $\{x^i(x)\}$ defined over some region. A function $F(x)$ can be expressed in terms of such coordinates as $F(x^i)$. If we apply the chain rule,
$$\nabla F = \nabla x^i\,\partial_iF = e^i\partial_iF. \qquad (1.143)$$
This defines the so-called contravariant frame vectors $\{e^i\}$ as follows:
$$e^i = \nabla x^i. \qquad (1.144)$$
In Euclidean space, these vectors are perpendicular to the surfaces of constant $x^i$. It is clear that these vectors have a vanishing curl:
$$\nabla\wedge e^i = \nabla\wedge(\nabla x^i) = 0. \qquad (1.145)$$
The reciprocal frame vectors are the coordinate vectors, which are covariant,
$$e_i = \partial_ix, \qquad (1.146)$$
and which are formed by incrementing the $x^i$ coordinate while keeping all others fixed. These two frames are reciprocal because they fulfill
$$e_i\cdot e^j = (\partial_ix)\cdot\nabla x^j = \partial_ix^j = \delta_i^j. \qquad (1.147)$$
Particularly in the case when the space signature is not Euclidean, we should refrain from using frame representations restricted to orthogonal frames and compensated with weighting factors as follows:
$$e_i = g_i\hat e_i, \qquad e^i = g_i^{-1}\hat e_i. \qquad (1.148)$$
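For a constant oblique frame, the reciprocal vectors of Eq. 1.147 are obtained by a matrix inverse; the frame below is our own example:

```python
import numpy as np

# Reciprocal frame: with the coordinate vectors e_i = ∂_i x as the columns
# of E, the contravariant vectors e^j = ∇x^j are the rows of E⁻¹, so that
# e_i · e^j = δ_i^j (Eq. 1.147).
E = np.array([[1., 1., 0.],
              [0., 1., 1.],
              [0., 0., 1.]])          # columns: e_1, e_2, e_3 (oblique frame)
R = np.linalg.inv(E)                  # rows: e^1, e^2, e^3
assert np.allclose(R @ E, np.eye(3))  # reciprocity relation
```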
1.5.10 Geometric Calculus in 2D

The vector derivative relates and combines the algebraic properties of geometric algebra with vector calculus in a natural manner. Let us first see the derivative in 2D. Vectors are written in terms of a right-handed orthonormal frame as
$$x = x^1e_1 + x^2e_2 = xe_1 + ye_2, \qquad (1.149)$$
where the scalar coordinates are represented with superscripts. The vector derivative reads
$$\nabla = e_1\partial_x + e_2\partial_y = e_1(\partial_x + I_2\partial_y), \qquad (1.150)$$
where $I_2$ is the pseudoscalar of $\mathcal G_2$. Consider the vector derivative acting on the vector $v = fe_1 - ge_2$ as follows:
$$\nabla v = (e_1\partial_x + e_2\partial_y)(fe_1 - ge_2) = \left(\frac{\partial f}{\partial x} - \frac{\partial g}{\partial y}\right) - I_2\left(\frac{\partial g}{\partial x} + \frac{\partial f}{\partial y}\right). \qquad (1.151)$$
Note that the two terms in the parentheses are the same ones that vanish in the Cauchy–Riemann equations. One recognizes the close relationship between complex analysis and the 2D vector derivative. In order to clarify this further, let us introduce the complex field $\psi$:
$$\psi = ve_1 = f + I_2g. \qquad (1.152)$$
That $\psi$ is analytic means that it satisfies the Cauchy–Riemann equations. This statement can be written using the vector derivative as follows:
$$\nabla\psi = 0. \qquad (1.153)$$
This fundamental equation can be generalized straightforwardly to higher dimensions. For example, in 3D, Eq. 1.153 with $\psi$ an arbitrary even-grade multivector defines the spin harmonics, which are fundamental for the Pauli and Dirac theories of electron orbitals (see Sect. 1.5.12 and the appendix). In 4D, for the space–time algebra, one uses
$$\nabla = \gamma^0\partial_t + \gamma^i\partial_{x^i}, \qquad i = 1, 2, 3. \qquad (1.154)$$
Using an even-grade multivector for $\psi$, the equation $\nabla\psi = 0$ represents the wavefunction of a massless fermion (e.g., a neutrino). If $\psi = F$ is a pure bivector in the space–time algebra, the equation $\nabla F = 0$ encodes all the Maxwell equations for the case of free-field electromagnetism. Finally, one can recognize that all these examples are simply special cases of the same underlying mathematics.
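A finite-difference check of Eq. 1.153 for $\psi \simeq z^2$, i.e. $f = x^2 - y^2$ and $g = 2xy$ (our example), shows that both Cauchy–Riemann combinations from Eq. 1.151 vanish:

```python
def check(x, y, eps=1e-6):
    """Return the two Cauchy-Riemann combinations of Eq. 1.151 for
    f = x^2 - y^2, g = 2xy, computed with central differences."""
    fx = ((x + eps)**2 - y**2 - ((x - eps)**2 - y**2)) / (2 * eps)
    fy = (x**2 - (y + eps)**2 - (x**2 - (y - eps)**2)) / (2 * eps)
    gx = (2 * (x + eps) * y - 2 * (x - eps) * y) / (2 * eps)
    gy = (2 * x * (y + eps) - 2 * x * (y - eps)) / (2 * eps)
    return fx - gy, gx + fy      # both vanish for an analytic ψ

print(check(1.3, -0.7))          # ≈ (0.0, 0.0), so ∇ψ = 0
```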
1.5.11 Electromagnetism: The Maxwell Equations By using the space–time vector derivative and the geometric product, one can unify all four of the Maxwell equations into a single equation. This result undoubtedly confirms the impressive power of the geometric algebra framework. The gradient and curl operators are not invertible, while the vector derivative of the geometric
algebra is fully invertible. This helps to simplify algebraic equation manipulation enormously, too. The space–time algebra $\mathcal G_{1,3}$ has the basis $\gamma_0, \gamma_1, \gamma_2, \gamma_3$. The multivector basis is
$$\underbrace{1}_{\text{scalar}}, \quad \underbrace{\gamma_\mu}_{\text{4 vectors}}, \quad \underbrace{\gamma_\mu\wedge\gamma_\nu}_{\text{6 bivectors}}, \quad \underbrace{I\gamma_\mu}_{\text{4 trivectors}}, \quad \underbrace{I}_{\text{pseudoscalar}}. \qquad (1.155)$$
The pseudoscalar is $I = \gamma_0\gamma_1\gamma_2\gamma_3$, with
$$I^2 = (\gamma_0\gamma_1\gamma_2\gamma_3)(\gamma_0\gamma_1\gamma_2\gamma_3) = (\gamma_2\gamma_3)(\gamma_2\gamma_3) = -1. \qquad (1.156)$$
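Equation 1.156 can be spot-checked with any matrix representation of the $\gamma_\mu$; below we use the Dirac representation as a stand-in for the abstract algebra (not part of the book's text):

```python
import numpy as np

g0 = np.diag([1, 1, -1, -1]).astype(complex)
s = [np.array([[0, 1], [1, 0]]),                # Pauli matrices σ1, σ2, σ3
     np.array([[0, -1j], [1j, 0]]),
     np.array([[1, 0], [0, -1]])]

def gamma(i):
    """Dirac-representation γ_i = [[0, σ_i], [-σ_i, 0]] for i = 1, 2, 3."""
    G = np.zeros((4, 4), dtype=complex)
    G[:2, 2:] = s[i - 1]
    G[2:, :2] = -s[i - 1]
    return G

I = g0 @ gamma(1) @ gamma(2) @ gamma(3)         # pseudoscalar γ0γ1γ2γ3
assert np.allclose(I @ I, -np.eye(4))           # I² = -1, Eq. 1.156
```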
The space–time vector derivative is
$$\nabla = \gamma^\mu\partial_\mu, \qquad \partial_\mu = \frac{\partial}{\partial x^\mu}, \qquad (1.157)$$
where, using superscripts, $x^0 = x_0 = t$ is the time coordinate and the $x^i = x_i$ are the three spatial coordinates; here $\gamma^0 = \gamma_0$ and $\gamma^i = -\gamma_i$. Let us compute the space–time split of the vector derivative,
$$\nabla\gamma_0 = (\gamma^0\partial_t + \gamma^i\partial_i)\gamma_0 = \partial_t - \sigma^i\partial_i = \partial_t - \boldsymbol\nabla, \qquad (1.158)$$
where the minus sign is due to the Lorentzian metric. We start by considering the four Maxwell equations:
$$\boldsymbol\nabla\cdot\boldsymbol B = 0, \qquad \boldsymbol\nabla\times\boldsymbol E = -\partial_t\boldsymbol B,$$
$$\boldsymbol\nabla\cdot\boldsymbol E = \rho, \qquad \boldsymbol\nabla\times\boldsymbol B = \boldsymbol J + \partial_t\boldsymbol E. \qquad (1.159)$$
Let us first find the space–time equation that relates the electric and magnetic fields. The familiar nonrelativistic form of the Lorentz force law is given by
$$\frac{dp}{dt} = q(\boldsymbol E + \boldsymbol v\times\boldsymbol B), \qquad (1.160)$$
where all the vectors are expressed in the frame $\gamma_0$ and $\boldsymbol p = p\wedge\gamma_0$. By multiplying by $v\cdot\gamma_0$, one converts the derivative into one with respect to proper time. Applying this to the first term on the right-hand side, one gets
$$v\cdot\gamma_0\,\boldsymbol E = v\cdot(\boldsymbol E\wedge\gamma_0) - (v\cdot\boldsymbol E)\wedge\gamma_0 = (\boldsymbol E\cdot v)\wedge\gamma_0, \qquad (1.161)$$
where $\boldsymbol E\wedge\gamma_0$ vanishes due to the wedge between its factors $\sigma_k = \gamma_k\gamma_0$ and $\gamma_0$. The magnetic term changes to
$$v\cdot\gamma_0\;\boldsymbol v\times\boldsymbol B = v\cdot\gamma_0\;\boldsymbol v\cdot(I\boldsymbol B) = (v\wedge\gamma_0)\cdot(I\boldsymbol B) = [(I\boldsymbol B)\cdot v]\wedge\gamma_0 + [\gamma_0\cdot(I\boldsymbol B)]\wedge v = [(I\boldsymbol B)\cdot v]\wedge\gamma_0, \qquad (1.162)$$
where we used the Jacobi identity. Using these results, Eq. 1.160 can be written in the following form:
$$\frac{dp}{d\tau} = \dot p\wedge\gamma_0 = q\,[(\boldsymbol E + I\boldsymbol B)\cdot v]\wedge\gamma_0. \qquad (1.163)$$
In this equation, the space–time bivector
$$F = \boldsymbol E + I\boldsymbol B \qquad (1.164)$$
is the electromagnetic field strength, also known as the Faraday bivector. Now, in order to group the two Maxwell source equations, introduce the space–time vector $J$ with
$$\rho = J\cdot\gamma_0, \qquad \boldsymbol J = J\wedge\gamma_0. \qquad (1.165)$$
Then we form
$$J\gamma_0 = \rho + \boldsymbol J = \boldsymbol\nabla\cdot\boldsymbol E - \partial_t\boldsymbol E + \boldsymbol\nabla\times\boldsymbol B. \qquad (1.166)$$
Consider the following algebraic manipulation using the bivector $y = \boldsymbol y\wedge\gamma_0$:
$$\gamma_0\wedge(x\cdot y) = \gamma_0\wedge[x\cdot(\boldsymbol y\wedge\gamma_0)] = \gamma_0\wedge(x\cdot\gamma_0\,\boldsymbol y) = x\cdot\gamma_0\;\boldsymbol y\wedge\gamma_0 = x\cdot\gamma_0\;y.$$
Similarly, one can write
$$\partial_t\boldsymbol E = \gamma_0\cdot\nabla\,E = \gamma_0\wedge(\nabla\cdot E). \qquad (1.167)$$
The full $\boldsymbol E$-term can then be written as
$$\boldsymbol\nabla\cdot\boldsymbol E - \partial_t\boldsymbol E = (\gamma_0\wedge\nabla)\cdot E - \gamma_0\wedge(\nabla\cdot E) = (\nabla\cdot E)\cdot\gamma_0 + (\nabla\cdot E)\wedge\gamma_0 = (\nabla\cdot E)\gamma_0. \qquad (1.168)$$
Now consider the cross product of two relative vectors $\boldsymbol x = x\wedge\gamma_0$ and $\boldsymbol y = y\wedge\gamma_0$:
$$\boldsymbol x\times\boldsymbol y = (x\wedge\gamma_0)\times(y\wedge\gamma_0) = \boldsymbol x\wedge\boldsymbol y\,\gamma_0\gamma_0. \qquad (1.169)$$
Using this result, we rewrite
$$\boldsymbol\nabla\times\boldsymbol B = -I\,\boldsymbol\nabla\wedge\boldsymbol B = \nabla\cdot(I\boldsymbol B)\,\gamma_0. \qquad (1.170)$$
Combining Eqs. 1.168 and 1.170, we get
$$J\gamma_0 = \nabla\cdot(\boldsymbol E + I\boldsymbol B)\,\gamma_0. \qquad (1.171)$$
Using the definition of the Faraday bivector,
$$F = \boldsymbol E + I\boldsymbol B, \qquad (1.172)$$
we can combine the two source equations into one:
$$\nabla\cdot F = J. \qquad (1.173)$$
Let us now consider the two remaining Maxwell equations,
$$\boldsymbol\nabla\cdot\boldsymbol B = 0, \qquad \boldsymbol\nabla\times\boldsymbol E = -\partial_t\boldsymbol B. \qquad (1.174)$$
The first can be rewritten using the duality principle and $\boldsymbol E\wedge\gamma_0 = 0$:
$$0 = \boldsymbol\nabla\cdot\boldsymbol B = (\gamma_0\wedge\nabla)\wedge(I\boldsymbol B) = \nabla\wedge(I\boldsymbol B)\wedge\gamma_0 = \nabla\wedge(\boldsymbol E + I\boldsymbol B)\wedge\gamma_0 = \nabla\wedge F\wedge\gamma_0. \qquad (1.175)$$
Let us now consider the inner product, making use of Eq. 1.158:
$$(\nabla\wedge F)\cdot\gamma_0 = \nabla\wedge(F\cdot\gamma_0) + \partial_tF = \langle\nabla\boldsymbol E\gamma_0\rangle_2 + \partial_tF = \langle(\partial_t - \boldsymbol\nabla)\boldsymbol E\rangle_2 + \partial_t(\boldsymbol E + I\boldsymbol B) = I(\partial_t\boldsymbol B + \boldsymbol\nabla\times\boldsymbol E) = 0. \qquad (1.176)$$
Since both $(\nabla\wedge F)\cdot\gamma_0$ and $\nabla\wedge F\wedge\gamma_0$ equal zero, we have $\nabla\wedge F = 0$, and we can merge this result with Eq. 1.173 into one single equation using the geometric product:
$$\nabla F = J. \qquad (1.177)$$
Recall that using tensor calculus, one needs two equations:
$$\partial_\mu F^{\mu\nu} = J^\nu, \qquad \partial_\mu\,{}^*\!F^{\mu\nu} = 0, \qquad (1.178)$$
where ${}^*\!F$ denotes the dual field-strength tensor.
1.5.12 Spinors, Schrödinger–Pauli, and Dirac Equations

As an illustration of the formulation and use of spinor spaces in geometric calculus, let us now consider the spinor representation in the Schrödinger–Pauli equation. See Sect. 23.1.6 in the Appendix for further definitions and details on pin and spin groups and spinors.
In the following equation, the wave function $\psi$ sends space–time points to Pauli spinors:
$$\psi(r, t) = \begin{pmatrix}\psi_1\\ \psi_2\end{pmatrix}\in\mathbb C^2, \qquad (1.179)$$
where $\psi_1, \psi_2\in\mathbb C$. If one replaces the Pauli spinors by square matrix spinors with zero entries in the second column,
$$\psi(r, t) = \begin{pmatrix}\psi_1 & 0\\ \psi_2 & 0\end{pmatrix}, \qquad (1.180)$$
one obtains an isomorphic linear space $S$. This representation can be expressed as
$$\mathrm{Mat}(2, \mathbb C)f \simeq \mathcal G_3f, \qquad f = \begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}. \qquad (1.181)$$
Such matrices, where only the first column has nonzero entries, form a left ideal $S$ of $\mathcal G_3$, i.e.,
$$x\psi\in S, \quad\text{for all } x\in\mathcal G_3 \text{ and } \psi\in S\subset\mathcal G_3. \qquad (1.182)$$
The vector bases $\sigma_i$ of $\mathrm{Mat}(2, \mathbb C)$ and $e_i$ of $\mathcal G_3$ are related as follows: $e_1\simeq\sigma_1$, $e_2\simeq\sigma_2$, $e_3\simeq\sigma_3$. The linear space $S$ has a basis $\{f_0, f_1, f_2, f_3\}$, where
$$f_0 = \tfrac12(1 + e_3)\simeq\begin{pmatrix}1 & 0\\ 0 & 0\end{pmatrix}, \qquad f_1 = \tfrac12(e_{23} + e_2)\simeq\begin{pmatrix}0 & 0\\ i & 0\end{pmatrix},$$
$$f_2 = \tfrac12(e_{31} - e_1)\simeq\begin{pmatrix}0 & 0\\ -1 & 0\end{pmatrix}, \qquad f_3 = \tfrac12(e_{12} + e_{123})\simeq\begin{pmatrix}i & 0\\ 0 & 0\end{pmatrix},$$
where f D f0 is an idempotent, that is, f 2 D f . An element 2 S , expressed in coordinate form in the basis ff0 ; f1 ; f2 ; f3 g, multiplied by the left hand side with an arbitrary even element, b 2 G3C b D.b0 Cb1 e23 Cb2 e31 Cb3 e12 /.
0 f0
C
1 f1
C
2 f2
C
3 f3 /;
(1.183)
corresponds to the following matrix multiplication
b
0 b0 Bb1 'B @b2 b3
b1 b0 b3 b2
b2 b3 b0 b1
10 b3 B b2 C CB b1 A @ b0
0
1
C A: 2 1C
(1.184)
3
This kind of square matrices corresponding to the left multiplication by even elements constitutes a subring of Mat(4,R), which in turn is an isomorphic image of the quaternion ring H.
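The matrix claims above are easy to verify numerically. The following sketch (NumPy, with illustrative variable names not taken from the book) represents \(e_i\) by the Pauli matrices \(\sigma_i\) and checks the idempotency of \(f\), the left-ideal property, and the real \(4\times 4\) matrix of Eq. 1.184:

```python
import numpy as np

# Pauli matrices as a faithful matrix representation of G3: e_k ~ sigma_k.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
e23, e31, e12 = s2 @ s3, s3 @ s1, s1 @ s2   # bivectors
e123 = s1 @ s2 @ s3                          # pseudoscalar I = i * Id
I2 = np.eye(2, dtype=complex)

# Idempotent f and the ideal basis {f0, f1, f2, f3}.
f0 = 0.5 * (I2 + s3)
f1 = 0.5 * (e23 + s2)
f2 = 0.5 * (e31 - s1)
f3 = 0.5 * (e12 + e123)

assert np.allclose(f0 @ f0, f0)             # f is idempotent
for fk in (f0, f1, f2, f3):
    assert np.allclose(fk[:, 1], 0)         # second column vanishes: left ideal

# Left multiplication by an even element b corresponds to a 4x4 real matrix.
b0, b1, b2, b3 = 0.7, -1.2, 0.4, 2.0
b = b0 * I2 + b1 * e23 + b2 * e31 + b3 * e12
psi = np.array([0.3, -0.5, 1.1, 0.9])       # coordinates in {f0..f3}
lhs = b @ (psi[0]*f0 + psi[1]*f1 + psi[2]*f2 + psi[3]*f3)
M = np.array([[b0, -b1, -b2, -b3],
              [b1,  b0,  b3, -b2],
              [b2, -b3,  b0,  b1],
              [b3,  b2, -b1,  b0]])
w = M @ psi
rhs = w[0]*f0 + w[1]*f1 + w[2]*f2 + w[3]*f3
assert np.allclose(lhs, rhs)
```

The sign pattern of \(M\) follows from \(e_{23}e_{31} = -e_{12}\) (and cyclically), which is why the subring is isomorphic to \(\mathbb{H}\).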
1.5.13 Spinor Operators

Until now, spinors have been objects that are operated upon. One can replace such passive spinors by active spinor operators. Instead of the spinors in minimal left ideals given by Eq. 1.181, we formulate the following even elements:
\[ \Psi = 2\,\mathrm{even}(\psi) \simeq \begin{pmatrix} \psi_1 & -\bar{\psi}_2 \\ \psi_2 & \bar{\psi}_1 \end{pmatrix} \in \mathcal{G}_3^+, \tag{1.185} \]
which can also be computed as \(\Psi = \psi + \hat{\psi}\) for \(\psi \in \mathcal{G}_3 f\). In a classical way, one computes the expected values of the components of the spin in terms of the column spinor \(\psi \in \mathbb{C}^2\) as follows:
\[ s_1 = \psi^{\dagger}\sigma_1\psi, \qquad s_2 = \psi^{\dagger}\sigma_2\psi, \qquad s_3 = \psi^{\dagger}\sigma_3\psi. \tag{1.186} \]
On the other hand, in terms of \(\psi \in \mathcal{G}_3 f\) this computation can be repeated as follows:
\[ s_1 = 2\langle \psi e_1 \tilde{\psi}\rangle_0, \qquad s_2 = 2\langle \psi e_2 \tilde{\psi}\rangle_0, \qquad s_3 = 2\langle \psi e_3 \tilde{\psi}\rangle_0, \tag{1.187} \]
where \(\langle\cdot\rangle_0\) extracts the 0-blade, or scalar part, of the involved multivector. However, using the active spinor \(\Psi \in \mathcal{G}_3^+\), we can compute the expected values straightforwardly in a compact form,
\[ s = s_1 e_1 + s_2 e_2 + s_3 e_3 = \Psi e_3 \tilde{\Psi}, \tag{1.188} \]
getting the entity \(s\) as a whole. Since \(\Psi\) acts like an operator here, we will now call it a spinor operator. The matrix algebra \(\mathrm{Mat}(2,\mathbb{C})\) is an isomorphic image of the geometric algebra \(\mathcal{G}_3\) of the Euclidean space \(\mathbb{R}^3\). As a result, not only can vectors \(x \in \mathbb{R}^3\) and rotations in SO(3) be represented in \(\mathcal{G}_3\), but spinor spaces, or spinor representations of the rotation group SO(3), can also be constructed in \(\mathcal{G}_3\). Recall the Schrödinger equation
\[ i\hbar\frac{\partial\psi}{\partial t} = -\frac{\hbar^2}{2m}\nabla^2\psi + W\psi. \tag{1.189} \]
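Before moving on, the equivalence of the classical expectation values (Eq. 1.186) and the compact operator form (Eq. 1.188) can be checked numerically. In the Pauli-matrix representation, reversion of an even element corresponds to the conjugate transpose; the sample spinor values below are arbitrary:

```python
import numpy as np

s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# A sample Pauli spinor psi in C^2 and its spinor-operator form (Eq. 1.185).
p1, p2 = 0.6 + 0.3j, -0.2 + 0.7j
psi = np.array([p1, p2])
Psi = np.array([[p1, -np.conj(p2)], [p2, np.conj(p1)]])

# Classical expectation values s_k = psi^dagger sigma_k psi (Eq. 1.186).
s_classical = [np.vdot(psi, sk @ psi).real for sk in (s1, s2, s3)]

# Operator form: Psi e3 Psi~ (Eq. 1.188); reversion = conjugate transpose here.
S = Psi @ s3 @ Psi.conj().T        # equals s_1*sigma_1 + s_2*sigma_2 + s_3*sigma_3
s_operator = [0.5 * np.trace(sk @ S).real for sk in (s1, s2, s3)]

assert np.allclose(s_classical, s_operator)
```

The components are extracted with \(s_k = \frac{1}{2}\mathrm{tr}(\sigma_k S)\), since \(\mathrm{tr}(\sigma_j\sigma_k) = 2\delta_{jk}\).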
In an electromagnetic field \(E, B\) with potentials \(V\) and \(A\), this equation becomes
\[ i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left[(-i\hbar\nabla - eA)^2\right]\psi - eV\psi = \frac{1}{2m}\left[-\hbar^2\nabla^2 + e^2A^2 + i\hbar e(\nabla\cdot A + A\cdot\nabla)\right]\psi - eV\psi. \tag{1.190} \]
Note that this equation does not yet involve the electron's spin. In 1927, Pauli introduced spin into quantum mechanics by adding a new term to the Schrödinger equation. Using the Pauli spin matrices, which fulfill the relation \(\sigma_j\sigma_k + \sigma_k\sigma_j = 2\delta_{jk}I\), and the generalized momentum \(\pi = -i\hbar\nabla - eA = p - eA\), satisfying \(\pi_1\pi_2 - \pi_2\pi_1 = i\hbar e B_3\) (permutable cyclically for 1, 2, 3), one can write
\[ (\sigma\cdot\pi)^2 = \pi^2 - \hbar e\,(\sigma\cdot B), \tag{1.191} \]
where \(\pi^2 = p^2 + e^2A^2 - e(p\cdot A + A\cdot p)\). In Eq. 1.190, Pauli replaced the term \(\pi^2\) by \((\sigma\cdot\pi)^2\), obtaining this formulation:
\[ i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left[\pi^2 - \hbar e\,(\sigma\cdot B)\right]\psi - eV\psi. \tag{1.192} \]
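The algebraic core of Eq. 1.191 is the Pauli identity \((\sigma\cdot a)(\sigma\cdot b) = a\cdot b\,I + i\sigma\cdot(a\times b)\); for the noncommuting components of \(\pi\), the cross term produces the \(-\hbar e\,\sigma\cdot B\) contribution. A minimal numerical check with commuting sample vectors:

```python
import numpy as np

s = [np.array([[0, 1], [1, 0]], dtype=complex),
     np.array([[0, -1j], [1j, 0]], dtype=complex),
     np.array([[1, 0], [0, -1]], dtype=complex)]
I2 = np.eye(2)

# Anticommutation relation sigma_j sigma_k + sigma_k sigma_j = 2 delta_jk I.
for j in range(3):
    for k in range(3):
        assert np.allclose(s[j] @ s[k] + s[k] @ s[j], 2 * (j == k) * I2)

# For commuting (numerical) vectors: (sigma.a)(sigma.b) = a.b I + i sigma.(a x b),
# hence (sigma.a)^2 = |a|^2 I, the spin-free part of Eq. 1.191.
a, b = np.array([1.0, -2.0, 0.5]), np.array([0.3, 1.1, -0.7])
sa = sum(ai * si for ai, si in zip(a, s))
sb = sum(bi * si for bi, si in zip(b, s))
axb = np.cross(a, b)
assert np.allclose(sa @ sb,
                   np.dot(a, b) * I2 + 1j * sum(ci * si for ci, si in zip(axb, s)))
assert np.allclose(sa @ sa, np.dot(a, a) * I2)
```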
Note that in this Schrödinger–Pauli equation, the spin is described by the term \(\frac{\hbar e}{2m}(\sigma\cdot B)\). In geometric algebra \(\mathcal{G}_3\), the Pauli contribution can be formulated by replacing \((\sigma\cdot\pi)^2 = \pi^2 - \hbar e(\sigma\cdot B)\) with \(\pi^2 - \hbar e B\), so that the equation changes to
\[ i\hbar\frac{\partial\psi}{\partial t} = \frac{1}{2m}\left[\pi^2 - \hbar e B\right]\psi - eV\psi, \tag{1.193} \]
where \(B \in \mathcal{G}_3\) and \(\psi(\mathbf{r},t) \in S = \mathcal{G}_3 f\), \(f = \frac{1}{2}(1 + e_3)\). Note how remarkable the formulation of this equation is in geometric calculus: all the arguments and functions now take values in one algebra, which greatly facilitates numerical computations. Now let us move a step forward by using the above-explained active spinor operator. The Schrödinger–Pauli equation using the spinor operator reads
\[ i\hbar\frac{\partial\Psi}{\partial t} = \frac{1}{2m}\pi^2\Psi - \frac{\hbar e}{2m}B\Psi e_3 - eV\Psi, \tag{1.194} \]
and it explicitly shows the quantization direction \(e_3\) of the spin. Relativistic phenomena can be taken into consideration by starting the analysis from the equation \(E^2/c^2 - p^2 = m^2c^2\). By inserting into this equation the energy and momentum operators, one gets the Klein–Gordon equation
\[ \hbar^2\left(-\frac{1}{c^2}\frac{\partial^2}{\partial t^2} + \frac{\partial^2}{\partial x_1^2} + \frac{\partial^2}{\partial x_2^2} + \frac{\partial^2}{\partial x_3^2}\right)\psi = m^2c^2\psi. \tag{1.195} \]
In 1928, Dirac linearized the Klein–Gordon equation, formulating it as a first-order equation
\[ i\hbar\left(\gamma_0\frac{1}{c}\frac{\partial}{\partial t} + \gamma_1\frac{\partial}{\partial x_1} + \gamma_2\frac{\partial}{\partial x_2} + \gamma_3\frac{\partial}{\partial x_3}\right)\psi = mc\psi, \tag{1.196} \]
where the symbols \(\gamma_\mu\) satisfy \(\gamma_0^2 = I\), \(\gamma_1^2 = \gamma_2^2 = \gamma_3^2 = -I\), and \(\gamma_\mu\gamma_\nu = -\gamma_\nu\gamma_\mu\) for \(\mu \neq \nu\). Dirac found a set of \(4\times 4\) matrices that satisfy these relations. Writing \(x_0 = ct\) and substituting \(\partial_\mu = \frac{\partial}{\partial x^\mu}\), one gets a condensed form of the Dirac equation:
\[ i\hbar\gamma^\mu\partial_\mu\psi = mc\psi. \tag{1.197} \]
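The stated relations can be checked with any concrete set of \(\gamma\)-matrices; the sketch below uses the standard Dirac representation (an assumption, since the text does not fix one):

```python
import numpy as np

I2 = np.eye(2, dtype=complex)
Z2 = np.zeros((2, 2), dtype=complex)
sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

# Dirac representation: gamma_0 = diag(I, -I), gamma_k = [[0, sigma_k], [-sigma_k, 0]].
g0 = np.block([[I2, Z2], [Z2, -I2]])
gk = [np.block([[Z2, sk], [-sk, Z2]]) for sk in sig]
gammas = [g0] + gk
I4 = np.eye(4, dtype=complex)

assert np.allclose(g0 @ g0, I4)                     # gamma_0^2 = I
for k in range(1, 4):
    assert np.allclose(gammas[k] @ gammas[k], -I4)  # gamma_k^2 = -I
for mu in range(4):
    for nu in range(4):
        if mu != nu:                                # anticommutation for mu != nu
            assert np.allclose(gammas[mu] @ gammas[nu], -gammas[nu] @ gammas[mu])
```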
One can include an interaction with the electromagnetic field \(F\) via the space–time potential \((A_0, A_1, A_2, A_3) = (\frac{1}{c}V, A_x, A_y, A_z)\) of \(F\), utilizing the replacement \(i\hbar\partial_\mu \rightarrow i\hbar\partial_\mu - eA_\mu\). In these terms, one finally gets the conventional formulation of the Dirac equation:
\[ \gamma^\mu(i\hbar\partial_\mu - eA_\mu)\psi = mc\psi, \tag{1.198} \]
which takes into account relativistic phenomena and also the spin. This equation describes spin-\(\frac{1}{2}\) particles like the electron. In the equation, the wave function \(\psi\) is a column spinor
\[ \psi(x) = \begin{pmatrix} \psi_1 \\ \psi_2 \\ \psi_3 \\ \psi_4 \end{pmatrix} \in \mathbb{C}^4, \qquad \psi_\alpha \in \mathbb{C}. \tag{1.199} \]
During the years 1966–1974, David Hestenes reformulated the Dirac theory. In this contribution, the role of the column spinors \(\psi(x) \in \mathbb{C}^4\) was taken over by operators in the even subalgebra \(\mathcal{G}_{1,3}^+\) [86].
1.6 Exercises

1.1 Given \(x, y \in \mathcal{G}_{2,0,0}\), expand the bivector \(x\wedge y\) in terms of geometric products. Prove that it anticommutes with both \(x\) and \(y\), but commutes with any vector outside the plane.

1.2 Prove that the magnitude of the bivector \(x\wedge y\) is \(|x||y|\sin(\theta)\).

1.3 Prove that in \(\mathbb{R}^3\) the cross product is equivalent to the following expressions:
\[ x\times y = -I(x\wedge y) = -x\cdot(Iy) = y\cdot(Ix). \]

1.4 Interpret geometrically the equations of exercise 1.3 and establish that the following expressions are true:
\[ x\times(y\times z) = -x\cdot(y\wedge z) = -(x\cdot y\,z - x\cdot z\,y), \]
and
\[ x\cdot(y\times z) = [x, y, z] = (x\wedge y\wedge z)I^{-1}. \]

1.5 Given the 2-blade \(X = e_1\wedge(e_2 - e_3)\) that represents a plane, check whether the following vectors lie in that plane: (i) \(e_1\); (ii) \(e_1 + e_2\); (iii) \(e_1 + e_2 + e_3\); and (iv) \(e_1 - 2e_2 + e_3\).
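Exercise 1.3 can be checked numerically by representing \(\mathcal{G}_3\) with Pauli matrices, where a vector \(v\) maps to \(\sigma\cdot v\) and the pseudoscalar \(I\) becomes \(i\) times the identity (a minimal sketch with arbitrary sample vectors):

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

def vec(v):
    """Embed an R^3 vector in the Pauli-matrix representation of G3."""
    return sum(c * s for c, s in zip(v, sig))

Ips = 1j * np.eye(2)              # pseudoscalar I = e1 e2 e3 = i * Id

x, y = np.array([1.0, 2.0, -0.5]), np.array([-0.3, 0.8, 1.5])
X, Y = vec(x), vec(y)
wedge = 0.5 * (X @ Y - Y @ X)     # x ^ y as the antisymmetric (bivector) part

# Exercise 1.3: x cross y = -I (x ^ y)
assert np.allclose(vec(np.cross(x, y)), -Ips @ wedge)
```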
1.6 In 3D space, a trivector \(x\wedge y\wedge z\) can be written in terms of a determinant:
\[ x\wedge y\wedge z = \det([x\; y\; z])\, e_1\wedge e_2\wedge e_3, \tag{1.200} \]
where \([x\; y\; z]\) is a matrix with column vectors. Express the trivector not with a matrix but fully in terms of geometric algebra.

1.7 In 4D space with an associated orthonormal basis \(\{e_i\}_{i=1}^4\), project the 2-blade \(Y = (e_1 + e_2)\wedge(e_3 + e_4)\) onto the 2-blade \(X = e_1\wedge e_3\). Then compute the rejection as the difference of \(Y\) and its projection. Prove that this is not a blade.

1.8 In \(\mathbb{R}^4\), show that the 2-vector \(X = e_1\wedge e_2 + e_3\wedge e_4\) is not a 2-blade, which means that it cannot be expressed as the wedge product of two vectors. (Hint: Given \(X = x\wedge y\), express \(x\) and \(y\) in terms of the basis vectors, expand the wedge product, and try to solve the resulting scalar equations.)

1.9 In exercise 1.8, check that the 2-vector \(X = e_1\wedge e_2 + e_3\wedge e_4\) does not contain any vector other than 0. By contain, denoted \(X \subseteq Y\), we mean the case when all vectors in \(X\) are also in \(Y\).

1.10 In \(\mathcal{G}_{2,0,0}\) the multivectors \(X\) and \(Y\) are given by
\[ X = x_0 + x_1e_1 + x_2e_2 + x_3e_1e_2, \qquad Y = y_0 + y_1e_1 + y_2e_2 + y_3e_1e_2, \]
where the basis vectors \(e_1, e_2\) are orthonormal. Compute their geometric product
\[ XY = w_0 + w_1e_1 + w_2e_2 + w_3e_1e_2, \]
making \(w_0, w_1, w_2, w_3\) explicit, and establish that \(\langle XY\rangle = \langle YX\rangle\).

1.11 Expand the trivector \(x\wedge(y\wedge z)\) in terms of geometric products. Is the result antisymmetric in \(x\) and \(y\)? Justify why one can also write the following equality:
\[ x\wedge y\wedge z = \frac{1}{2}(xyz - zyx), \]
and prove that the following equation is true:
\[ x\wedge y\wedge z = \frac{1}{6}(xyz + zxy + yzx - xzy - yxz - zyx). \]
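The two identities of exercise 1.11 can be verified in the same Pauli-matrix representation of \(\mathcal{G}_3\), where the trivector \(x\wedge y\wedge z\) equals \(\det([x\;y\;z])\) times the pseudoscalar (sample vectors are arbitrary):

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

def vec(v):
    """Embed an R^3 vector in the Pauli-matrix representation of G3."""
    return sum(c * s for c, s in zip(v, sig))

x = np.array([1.0, 0.5, -2.0])
y = np.array([0.0, 1.5, 0.7])
z = np.array([-1.1, 0.2, 0.9])
X, Y, Z = vec(x), vec(y), vec(z)

# In this representation the trivector x^y^z is det([x y z]) * I, with I = i*Id.
triv = np.linalg.det(np.column_stack([x, y, z])) * 1j * np.eye(2)

# Identity of exercise 1.11: x^y^z = (xyz - zyx)/2 ...
assert np.allclose(triv, 0.5 * (X @ Y @ Z - Z @ Y @ X))
# ... and the fully antisymmetrized six-term form.
six = (X@Y@Z + Z@X@Y + Y@Z@X - X@Z@Y - Y@X@Z - Z@Y@X) / 6.0
assert np.allclose(triv, six)
```

In the two-term form, the grade-1 parts of \(xyz\) and \(zyx\) coincide and cancel, while the grade-3 parts add.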
1.12 What is the corresponding formula, using wedge and inner products, of the vector calculus formula \(x\times(y\times z) = y(x\cdot z) - z(x\cdot y)\)?

1.13 Compute the area of the parallelogram spanned by the vectors \(x = 2e_1 + e_3\) and \(y = e_1 - e_3\) relative to the area of \(e_1\wedge e_3\).
1.14 In \(\mathcal{G}_{3,0,0}\) compute the intersection of the nonhomogeneous line \(L_1\), with position vector \(e_3\) and direction \(e_2 + e_3\), and the line \(L_2\), with position vector \(e_2 - e_3\) and direction \(e_2\).

1.15 The decomposition of a vector \(v\) with respect to a plane \(A_r\) is given by
\[ v = v_{\parallel} + v_{\perp} = P^{\parallel}_{A_r}(v) + P^{\perp}_{A_r}(v); \]
justify why the vector \(v\cdot A_r\) is orthogonal to the plane formed by \(v_{\parallel}\) and \(v_{\perp}\).

1.16 With a bivector \(B\) and general multivectors \(X\), \(Y\), prove that
\[ B\times(XY) = (B\times X)Y + X(B\times Y), \]
and hence that
\[ B\times(x\wedge X_r) = (B\times x)\wedge X_r + x\wedge(B\times X_r). \]
Use these results to establish that the operation of commuting with a bivector is grade-preserving.

1.17 In 2D space, compute the determinant using Eq. 1.96 and compare with the classical determinant. Given the basis \(\{b_1, b_2\}\), not necessarily orthonormal, and a linear mapping \(f\) such that \(f(b_1) = x\) and \(f(b_2) = y\), first expand \(x\) and \(y\) in terms of this basis, namely \(x = x_1b_1 + x_2b_2\) and \(y = y_1b_1 + y_2b_2\); then, using \(I_2 = x\wedge y\), compute the determinant according to Eq. 1.96. Next, compute the matrix of \(f\) on the given basis and compute its classical determinant. These results should be equal.

1.18 Consider the linear transformation of the vectors of the plane \(x\wedge y\) given by \(f(x) = 7x - 5y\) and \(f(y) = 5x - 7y\). Use linear algebra to compute the eigenvectors and their eigenvalues. Next, use geometric algebra to compute the determinant and an eigen-2-blade with its eigenvalue. Finally, interpret the geometry of the transformation.

1.19 Eigen-2-blades: Formulate a nontrivial linear map \(f: \mathbb{R}^2 \rightarrow \mathbb{R}^2\) that has an eigenvector and an eigen-2-blade, both with eigenvalue 1.

1.20 Consider a nondegenerate metric space \(\mathbb{R}^n\) with an associated arbitrary basis \(\{u_i\}_{i=1}^n\). Show that the adjoint of a linear transformation \(g\) can be formulated as follows:
\[ \bar{g}(x) = \sum_{i=1}^{n} \big(x * g(u_i)\big)\, u^i, \tag{1.201} \]
where \(*\) stands for the scalar product.
1.21 Maxwell's equations are modified in the presence of magnetic monopoles. If \(\rho_m\) and \(J_m\) denote magnetic charges and currents, respectively, the relevant equations are
\[ \nabla\cdot D = \rho_e, \qquad \nabla\cdot B = \rho_m, \]
\[ \nabla\times E = -\frac{\partial B}{\partial t} - J_m, \qquad \nabla\times H = \frac{\partial D}{\partial t} + J_e. \tag{1.202} \]
Prove that in free space these equations can be written compactly as follows:
\[ \nabla F = J_e + J_m I, \tag{1.203} \]
where \(J_m = (\rho_m + J_m)\gamma_0\). A duality transformation of the \(E\) and \(B\) fields is defined by
\[ E' = E\cos(\alpha) + B\sin(\alpha), \qquad B' = B\cos(\alpha) - E\sin(\alpha). \tag{1.204} \]
Prove that this can be written as \(F' = Fe^{-I\alpha}\). Finally, find an equivalent transformation law for the source terms such that the equations remain invariant. Prove also that the electromagnetic energy–momentum tensor is invariant under a duality transformation.

1.22 Affine transformations: Consider the standard orthonormal basis of \(\mathbb{R}^{n,0}\) and the shear transformation \(g_s: x \rightarrow g_s(x) = x + s(x\cdot e_2)e_1\); compute the transformation matrices \([g_s]\) and \([\bar{g}_s]\), both acting on vectors. Depict your results to see the shear effect on a planar line and its normal vector.

1.23 Simplex: Given the three points \(x_0 = e_3\), \(x_1 = e_2 + e_3\), \(x_2 = e_1 + e_3\), compute, using CLICAL or CLUCAL, the simplex tangent \(\bar{X}_2\), the simplex volume \((2!)^{-1}|\bar{X}_2|\), and the simplex moment \(X_2 = x_0\wedge\bar{X}_2\). Show that if \(x_0 = 0\), then the simplex passes through the origin and its moment vanishes.

1.24 Simplex: Given the four points \(x_0 = e_1\), \(x_1 = e_2\), \(x_2 = e_1 + e_2\), and \(x_3 = e_3\), compute, using CLICAL or CLUCAL, the four faces \(F_i^3 X_3\) opposite to the points \(x_i\), \(i = 0, \ldots, 3\).

1.25 Using the points given in exercise 1.24, compute, using CLICAL or CLUCAL, the boundary \(\Omega_b X_3\) of the simplex \(X_3\).

1.26 Using the points given in exercise 1.24, check, using CLICAL or CLUCAL, whether the following relations hold for the operators \(F_i\) and \(\Omega_b\):
\[ F_iF_i = 0, \qquad F_iF_j = -F_jF_i, \qquad F_i\Omega_b = \Omega_b F_i, \qquad \Omega_b\Omega_b = 0. \]
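Because \(I\) commutes with everything in \(\mathcal{G}_3\) and squares to \(-1\), the duality rotation of exercise 1.21 behaves exactly like multiplication of the complex combination \(E + iB\) by \(e^{-i\alpha}\); a minimal numerical check (sample fields are random):

```python
import numpy as np

# Duality rotation: F = E + I B behaves like the complex combination E + iB.
rng = np.random.default_rng(0)
E = rng.standard_normal(3)
B = rng.standard_normal(3)
alpha = 0.77

F = E + 1j * B
Fp = F * np.exp(-1j * alpha)           # F' = F e^{-I alpha}
Ep, Bp = Fp.real, Fp.imag

assert np.allclose(Ep, E * np.cos(alpha) + B * np.sin(alpha))
assert np.allclose(Bp, B * np.cos(alpha) - E * np.sin(alpha))
```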
1.27 Using CLICAL or CLUCAL, compute the join \(X\cup Y\) and meet \(X\cap Y\) for the following blades: (i) \(X = e_2\) and \(Y = 3e_1\); (ii) \(X = e_1\) and \(Y = 4e_1\); (iii) \(X = e_1\) and \(Y = e_2\); (iv) \(X = e_1\wedge e_2\) and \(Y = 2e_1\); (v) \(X = (e_1 + e_2)/\sqrt{2}\) and \(Y = e_1\); (vi) \(X = e_1\wedge e_2\) and \(Y = 0.00001e_1 + e_2\); and (vii) \(X = e_1\wedge e_2\) and \(Y = \cos(\theta)e_1 + \sin(\theta)e_2\). See Sects. 1.2.7 and 9.3 for more details on the computation of join and meet.
Chapter 2
Geometric Algebra for Modeling in Robot Physics
In this chapter, we discuss the advantages for geometric computing that geometric algebra offers for solving problems and developing algorithms in the fields of artificial intelligence, robotics, and intelligent machines acting within the perception and action cycle. We begin with a short tour of the history of mathematics to find the roots of the fundamental concepts of geometry and algebra.
2.1 The Roots of Geometry and Algebra

The lengthy and intricate road along the history of mathematics shows that the evolution through time of the two domains, algebra (from the Arabic "al-jabr") and geometry (from the ancient Greek γεωμετρία: geo = earth, metria = measure), started to intermingle early, depending upon certain trends imposed by the different groups and schools of mathematical thought. It was only at the end of the nineteenth century that they became a sort of clear, integrated mathematical system. Broadly speaking, on the one hand, algebra is a branch of mathematics concerning the study of structure, relation, and quantity. In addition, algebra is not restricted to working with numbers; it also covers work involving symbols, variables, and set elements. Addition and multiplication are considered general operations, which, in a more general view, lead to mathematical structures such as groups, rings, and fields. On the other hand, geometry is concerned with essential questions of size, shape, and relative position of figures, and with the properties of space. Geometry is one of the oldest sciences, initially devoted to practical knowledge concerned with lengths, areas, and volumes. In the third century BC, Euclid put geometry in an axiomatic form, and Euclidean geometry was born. During the first half of the seventeenth century, René Descartes introduced coordinates, and the concurrent development of algebra moved geometry into a new stage, because figures such as plane curves could now be represented analytically with functions and equations. In the 1660s, Gottfried Leibniz and Isaac Newton, both inventors of infinitesimal calculus, pursued a geometric calculus for dealing with geometric objects rather than with sequences of numbers. The analysis of the intrinsic structure of geometric objects in the works of Euler and Gauss further enriched the topic of geometry and led to
the creation of topology and differential geometry. Since the nineteenth-century discovery of non-Euclidean geometry, the traditional concept of space has undergone a profound and radical transformation. Contemporary geometry postulates the concept of manifolds, spaces that are far more abstract than classical Euclidean space and that look approximately alike at small scales. These spaces are endowed with additional structure: a differentiable structure allows one to do calculus; a Riemannian metric allows us to measure angles and distances; symplectic manifolds serve as the phase spaces in the Hamiltonian formalism of classical mechanics; and 4D Lorentzian manifolds serve as a space–time model in general relativity. Algebraic geometry is the branch of mathematics that combines techniques of abstract algebra with the language and problems of geometry. It plays a central role in modern mathematics and has multiple connections with a variety of fields: complex analysis, topology, and number theory. Algebraic geometry is fundamentally concerned with the study of algebraic varieties, which are geometric manifestations of the solutions of systems of polynomial equations. One looks for the Gröbner basis of an ideal in a polynomial ring over a field. The most studied classes of algebraic varieties are the plane algebraic curves, such as lines, parabolas, lemniscates, and Cassini ovals. Most of the developments of algebraic geometry in the twentieth century took place within an abstract algebraic framework, studying the intrinsic properties of algebraic varieties independent of any particular way of embedding the variety in a coordinatized space, as in topology and complex geometry.

A key distinction between projective geometry and algebraic geometry is that the former is more concerned with the more geometric notion of the point, whereas the latter puts major emphasis on the more analytical concepts of a regular function and a regular map, and draws extensively on sheaf theory. In the field of mathematical physics, geometric algebra is a multilinear algebra, described more technically as a Clifford algebra, that includes the geometric product. As a result, the theory, axioms, and properties can be built up in a more intuitive and geometric way. Geometric algebra is a coordinate-free approach to geometry based on the algebras of Grassmann [74] and Clifford [42]. Since the 1960s, David Hestenes [86] has contributed to developing geometric algebra as a unifying language for mathematics and physics [87, 94]. Hestenes also presented a study of projective geometry using Clifford algebra [95] and, recently, the essential concepts of conformal geometric algebra [119]. Hestenes summarized and precisely defined the role of algebra and geometry in a profound comment emphasizing the capacities of language and spatial perception of the human mind, which, in fact, is the goal of this section. We reproduce it to finalize this part, as a prelude to the next section, to motivate and justify why geometric algebra can be of great use in building the intelligent machine of which Turing dreamed.

In his famous survey of mathematical ideas, F. Klein championed the fusing of arithmetic with geometry as a major unifying principle of mathematics. Klein's seminal analysis of the structure and history of mathematics brings to light two major processes by which mathematics grows and becomes organized. They may be aptly referred to as the algebraic and the geometric. The one emphasizes algebraic structure, while the other emphasizes geometric interpretation. Klein's analysis shows one process alternatively dominating the other in the historical development of mathematics.
But there is no necessary reason that
the two processes should operate in mutual exclusion. Indeed, each process is undoubtedly grounded in one of two great capacities of the human mind: the capacity for language and the capacity for spatial perception. From the psychological point of view, then, the fusion of algebra with geometry is so fundamental that one could say, Geometry without algebra is dumb! Algebra without geometry is blind! —D. Hestenes, 1984
2.2 Geometric Algebra: A Unified Mathematical Language

First of all, let us analyze the problems the community faces when using classical mathematical systems to tackle problems in physics or robotics. In this regard, let us resort to an enlightening paper by Hestenes [92], where the author discusses the main issues in modeling physical reality. The invention of analytical geometry and calculus was essential for Newton to create classical mechanics. On the other hand, the invention of tensor analysis was essential for Einstein to create the theory of relativity. The point here is that without the essential mathematical concepts, neither theory would have been developed at all. In some periods of the history of mathematics we can observe a certain stagnation, and, from time to time, astonishing progress in diverse fields thanks to new mathematical developments. Furthermore, when researchers attempt to combine different mathematical systems, this practice unavoidably leads to a fragmentation of knowledge. Each mathematical system captures some part of geometry; however, together they constitute a highly redundant system, that is, an unnecessary multiplicity of representations for geometric concepts; see Fig. 2.2. This approach in mathematical physics and in robotics has the following pressing defects:
– Restricted access: The ideas, concepts, methods, and results are unfortunately disseminated across the mathematical systems. Being proficient in only a few of the systems, one does not have access to knowledge formulated in terms of the other mathematical systems.
– Redundancy: Less efficiency due to the repetitive representation of information in different mathematical systems.
– Deficient integration/articulation: Incoherences, incongruences, lack of generalizations, and ineffective integration and articulation of the different mathematical systems.
– Hidden common knowledge: Intrinsic properties and relations represented in different symbolic systems are very difficult to handle. – Low information density: The information of the problem in question is reduced due to distribution over different symbolic systems. According to Hestenes [92], the development of a unified mathematical language is, in fact, a problem of the design of mathematical systems based on the following major considerations: – Optimal encoding of the basic geometric intrinsic characteristics: dimension, magnitude, direction, sense, or orientation.
– Coordinate-free methods to formulate and solve problems in physics and robotics. – Optimal uniformity of concepts and methods across different domains, so that the intrinsic common structures are made as explicit as possible. – Smooth articulation between different alternative systems in order to access and transfer information frictionlessly. – Optimal computational efficiency: The computing programs using the new system should be as or more efficient than any alternative system in challenging applications. Note that geometric algebra was constructed following these considerations, and in view of the progress of scientific theory, geometric algebra helps greatly to optimize expressions of the key ideas and consequences of the theory.
2.3 What Does Geometric Algebra Offer for Geometric Computing? Next, we will describe the most remarkable features of geometric algebra from the perspective of geometric computing for perception action systems.
2.3.1 Coordinate-Free Mathematical System

In geometric algebra, one writes coordinate-free expressions to capture concepts and constraints in a sort of high-level geometric reasoning approach. It is expected that the geometric information encoded in the expressions involving geometric products and the actions of operators should not suffer from interference from the reference coordinate frame. As a matter of fact, the results obtained by algebraic computing based on coordinates are often geometrically meaningless or difficult to interpret geometrically. This is essentially because they are neither invariant nor covariant under the action of coordinate transformations. In fact, geometric algebra enables us to express fundamental robotics physics in a language that is free from coordinates or indices. The geometric algebra framework gives many equations a degree of clarity that is definitively lost in matrix algebra or tensor algebra. The introduction of coordinates by R. Descartes (1596–1650) started the algebraization of geometry. This step caused a big change in geometry, from a qualitative description to a quantitative analysis. In fact, coordinates are sequences of numbers and do not have geometric meaning themselves. G. Leibniz (1646–1716) dreamed of a geometric calculus system that deals directly with geometric objects rather than with sequences of numbers. More precisely, in a mathematical system, an element of an expression should have a clear meaning of being a geometric object or a transformation operator for algebraic manipulations such as addition, subtraction, multiplication, and division.
We can illustrate the concept of invariance under the action of coordinate transformations by analyzing, in Euclidean plane geometry, the following operation with complex numbers \(a, b \in \mathbb{C}\): \(ab := (a_1, a_2)(b_1, b_2) = (a_1b_1 - a_2b_2,\; a_1b_2 + a_2b_1)\). This product is not invariant under the Euclidean group; that is, the geometric information encoded in the result of the product cannot be separated from the interference of the reference coordinate frame. However, if we change the complex product to the following product: \(\bar{a}b := (a_1, a_2)(b_1, b_2) = (a_1b_1 + a_2b_2,\; a_1b_2 - a_2b_1)\), then under any rotation \(r: a \rightarrow ae^{i\theta}\) centered at the origin this product remains invariant: \(\overline{r(a)}\,r(b) = (\bar{a}e^{-i\theta})(be^{i\theta}) = \bar{a}b\). Consequently, if the complex numbers are equipped with scalar multiplication, addition, subtraction, and a geometric product, they turn from a field into a 2D geometric algebra of the 2D orthogonal geometry. It is clear that, as the dimension of the geometric space increases and the transformation group is generalized, the desired invariance becomes increasingly difficult to achieve. Leibniz's dream is fulfilled for the nD classical geometries using the framework of geometric algebras. In this book, we present the following coordinate-free geometric algebra frameworks: for 2D and 3D spaces with a Euclidean metric, for 4D spaces with a non-Euclidean metric, and, for \(\mathbb{R}^{n+1,1}\) spaces, the conformal geometric algebras. Assuming that we handle expressions as independently as possible of a specific coordinate system, in view of the implementation one converts these expressions into low-level, coordinate-based ones that can be directly executed by a fast processor. In general, geometric algebra can be seen as a geometric inference engine of an automated code generator, which is able to take a high-level specification of a physical problem and automatically generate an efficient and executable implementation.
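This invariance is easy to confirm numerically with Python's built-in complex numbers (the sample values and the rotation angle are arbitrary):

```python
import cmath

# The rotation-invariant product on C discussed above: conj(a) * b.
a, b = 1.2 - 0.7j, -0.4 + 2.1j
theta = 0.93
r = cmath.exp(1j * theta)              # rotation about the origin

plain = (a * r) * (b * r)              # plain complex product: NOT invariant
invariant = (a * r).conjugate() * (b * r)

assert abs(invariant - a.conjugate() * b) < 1e-12
assert abs(plain - a * b) > 1e-3       # plain product picks up a phase e^{2 i theta}
```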
2.3.2 Models for Euclidean and Pseudo-Euclidean Geometry

When we are dealing with problems in robotics or neural computing, an important question is in which metric space we should work. In this book, we are basically concerned with three well-understood space models:
(i) Models for 2D and 3D spaces with a Euclidean metric: these are well suited to handle the algebra of directions in the plane and in 3D physical space. 3D rotations are represented using rotors (isomorphic to quaternions). You can model the kinematics of points, lines, and planes using \(\mathcal{G}_{3,0,0}\). Rotors can be used for interpolation in graphics and for the estimation of rotations of rigid bodies. Chapter 3 offers three increasingly powerful models of Euclidean geometry.
(ii) Models for 4D spaces with a non-Euclidean metric: if you are interested in linearizing a rigid motion transformation, you will need a homogeneous representation. For that we should use a geometric algebra for the 4D space. Here it is more convenient to choose the motor algebra \(\mathcal{G}^{+}_{3,0,1}\) described in Chap. 3. It is the algebra of Plücker lines, which can be used to model the kinematics of points, lines, and planes better than with \(\mathcal{G}_3\). Lines belong to the nonsingular
Study 6D quadric, and the motors to the 8D Klein quadric. In \(\mathcal{G}^{+}_{3,0,1}\), you can formulate a motor-based equation of motion for constant velocity, where in the exponent you use a bivector for twists. You can also use motors for the interpolation of 3D rigid motion and estimate trajectories using EKF techniques. When you are dealing with problems of projective geometry, as in computer vision, again you need a homogeneous coordinate representation, so that the image plane becomes \(P^2\) and the visual space \(P^3\). To handle the so-called n-view geometry [84], based on tensor calculus and invariant theory, you require \(\mathcal{G}_{3,1}\) (Minkowski metric) for the visual space and \(\mathcal{G}_{3,0,0}\) for the image plane. This is described in Chap. 9. Note that the intrinsic camera parameters are modeled with an affine transformation within geometric algebra as part of the projective mapping, via a projective split between the projective space and the image plane. Incidence algebra, an algebra of oriented subspaces, can be used in \(\mathcal{G}_{3,1}\) and \(\mathcal{G}_{3,0,0}\) to treat problems involving geometric constraints and invariant theory.
(iii) Conformal models: if you consider conformal (angle-preserving) transformations, the conformal geometric algebra of Chap. 6 offers a non-Euclidean geometric algebra, \(\mathcal{G}_{n+1,1}\), that includes in its multivector basis the null vectors origin and point at infinity. As a computational framework, it utilizes the powerful horosphere (the meet between a hyperplane and the null cone). Even though the computational framework uses a nonlinear representation for the geometric entities, one can recover the Euclidean metric. The basic geometric entity is the sphere, and you can represent points, lines, planes, circles, and spheres as vectors or in dual forms, the latter being useful to reduce the complexity of algebraic expressions. As you may have noticed, the above-presented geometric algebras can be used either for kinematics in robotics or for projective geometry in computer vision.
Provided that you calibrate a digital camera, you can make use of the homogeneous models of conformal geometric algebra to handle problems of robotics and of computer vision simultaneously, without having to abandon the mathematical framework. Furthermore, the incidence algebra of points, lines, planes, circles, and spheres can be used in this framework as well. The topic of omnidirectional vision exploits the model of an image projected on the sphere, whereas problems such as rigid motion, depth, and invariant theory can also be handled using conformal geometric algebra (see Chap. 9).
2.3.3 Subspaces as Computing Elements

The wedge product of \(k\) basis vectors spans a new entity, the \(k\)-vector. The set of all \(k\)-vectors spans an oriented subspace, \(\bigwedge^k V^n\). Thus, the entire geometric algebra \(\mathcal{G}_n\) is given by
\[ \mathcal{G}_n = \bigwedge^0 V^n \oplus \bigwedge^1 V^n \oplus \bigwedge^2 V^n \oplus \cdots \oplus \bigwedge^k V^n \oplus \cdots \oplus \bigwedge^n V^n. \]
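A small sanity check of this decomposition: the \(k\)-vector subspaces have dimensions \(\binom{n}{k}\), so the whole algebra has dimension \(2^n\):

```python
from math import comb

# dim G_n = sum_k C(n,k) = 2^n for the direct-sum decomposition above.
for n in range(1, 9):
    assert sum(comb(n, k) for k in range(n + 1)) == 2 ** n

# e.g. G_3 splits as 1 scalar + 3 vectors + 3 bivectors + 1 trivector = 8.
assert [comb(3, k) for k in range(4)] == [1, 3, 3, 1]
```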
Geometric algebra uses the subspace structure of the \(k\)-vector spaces to construct extended objects ranging from lines and planes to hyperspheres. If we then represent physical objects in terms of these extended objects, we can model physical phenomena, like relativistic particle motion or conformal mappings of the visual manifold into the neocortex, using appropriate operators and blade transformations.
2.3.4 Representation of Orthogonal Transformations

Geometric algebra represents orthogonal transformations more efficiently than orthogonal matrices by reducing the number of coefficients; think of the nine entries of a 3D rotation matrix versus the four coefficients of a rotor. In geometric algebra, the versor product is defined as \(O \mapsto V O \tilde{V}\), where the versor \(V\) acts on geometric objects of different grades, on subspaces, and also on operators, quite a big difference from matrices. The versor is applied sandwiching the object, because this is the result of successive reflections of the object with respect to hyperplanes. In geometric algebra, a physical object can be described by a flag, that is, a reference using points, lines, planes, circles, and spheres crossing the gravitational center of the object. By applying versors equivalent to an orthogonal transformation to the flag, the relations between the reference geometric entities of the flag remain invariant, that is, a topology-preserving group action.
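The reflection picture of a versor can be sketched numerically: composing two plane reflections through the origin yields a rotor, and the sandwich product rotates a vector by twice the angle between the plane normals. The sketch below uses the Pauli-matrix representation of \(\mathcal{G}_3\) with arbitrary sample vectors:

```python
import numpy as np

sig = [np.array([[0, 1], [1, 0]], dtype=complex),
       np.array([[0, -1j], [1j, 0]], dtype=complex),
       np.array([[1, 0], [0, -1]], dtype=complex)]

def vec(v):
    """Embed an R^3 vector in the Pauli-matrix representation of G3."""
    return sum(c * s for c, s in zip(v, sig))

# Two unit normals of reflection planes through the origin, pi/8 apart.
m = np.array([1.0, 0.0, 0.0])
n = np.array([np.cos(np.pi / 8), np.sin(np.pi / 8), 0.0])

R = vec(n) @ vec(m)                    # rotor = product of the two reflections
Rrev = R.conj().T                      # reversion of the rotor

x = np.array([0.3, -1.2, 0.8])
Xp = R @ vec(x) @ Rrev                 # sandwich: x -> R x R~

# The composite rotation is by twice the angle between m and n (here pi/4).
ca, sa = np.cos(np.pi / 4), np.sin(np.pi / 4)
Rot = np.array([[ca, -sa, 0], [sa, ca, 0], [0, 0, 1.0]])
assert np.allclose(Xp, vec(Rot @ x))
```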
2.3.5 Objects and Operators

A geometric object, for example a point, a line, or a plane, can be represented using a multivector basis and wedge products. These geometric entities can be transformed under rigid motion, dilation, or reflection with respect to a plane or a sphere. These transformations depend on the metric of the involved space. Thus, we can model the 3D kinematics of such entities in different computational models, such as the 3D Euclidean geometric algebra \(\mathcal{G}_{3,0,0}\), the 4D motor algebra \(\mathcal{G}^{+}_{3,0,1}\), or the conformal algebra \(\mathcal{G}_{4,1}\). Note that these transformations, as versors, can be applied to an object regardless of the grade of its \(k\)-blades. For example, in Fig. 2.1, we can describe the arm with points, circles, and spheres, and with screw lines for the revolute or prismatic joints, and then, using direct/inverse kinematics, relate the geometric entities from the base through the joints up to the end effector. Here a pertinent question is whether or not operators can be located in space like geometric objects. The answer is yes; we can attach the rotors and translators to any position. This is a big difference from matrices; in geometric algebra the operators or versors, though treated as geometric objects, have a functional characteristic as well. In geometric
2 Geometric Algebra for Modeling in Robot Physics
Fig. 2.1 Description of the kinematics of a 5-DOF robot arm using geometric entities of the conformal geometric algebra
algebra, objects are specified in terms of basic elements intrinsic to the problem, whereas the operators or versors are constructed depending upon which Lie group we want to use. A versor is built by successive reflections with respect to certain hyperplanes (lines, planes, spheres). In 3D space, an example of a versor is the rotor, which is built by two successive reflections with respect to two planes through the origin. A versor is applied by sandwiching a geometric object. Since a versor represents an element of a Lie group within the general linear group, it can also be written in exponential form, where the exponent lives in the Lie algebra spanned by a bivector basis (the Lie generators). The versor and its exponential form are effectively represented using bivectors, which is a less redundant representation than the matrix one. Versor-based techniques can be applied in spaces of arbitrary signature and are particularly well suited for the formulation of Lorentz and conformal transformations. In tasks of kinematics, dynamics, and modern control theory, we can exploit the Lie algebra representation acting on the bivectorial exponents rather than working at the level of the Lie group versor representation. In Chaps. 3 and 5, we describe bivector representations of Lie groups.
2.3.6 Extension of Linear Transformations

Linear transformations act on n-D vectors in $\mathbb{R}^n$. Since in geometric algebra a subspace is spanned by vectors, the action of the linear transformation on each
2.3 What Does Geometric Algebra Offer for Geometric Computing?
individual vector directly determines the transformed subspace. Subspaces are spanned by wedge products; the outermorphism of a subspace equals the wedge product of the transformed vectors. The outermorphism preserves the grade of any k-blade it acts on; for example, the unit pseudoscalar must be mapped onto some multiple of itself, and this multiple is the determinant: $f(I) = \det(f)\,I$. We can proceed similarly when we compute the intersection of subspaces via the meet operation: we can transform the result of the meet linearly using duality and then apply an outermorphism, which gives exactly the same result as first transforming the subspaces and then computing the meet. In Chap. 9, we exploit outermorphisms for computing projective invariants in the projective space and the image plane.
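The property $f(a \wedge b) = f(a) \wedge f(b) = \det(f)\,(a \wedge b)$ for the 2D pseudoscalar can be checked numerically; the helper `wedge2` below is our own sketch, not the book's code:

```python
import numpy as np

def wedge2(a, b):
    # coefficient of e1 ^ e2 in the wedge of two 2D vectors
    return a[0] * b[1] - a[1] * b[0]

F = np.array([[2.0, 1.0],
              [0.5, 3.0]])          # an arbitrary linear map f
a = np.array([1.0, 2.0])
b = np.array([3.0, -1.0])

# outermorphism: f(a ^ b) = f(a) ^ f(b) = det(f) (a ^ b)
lhs = wedge2(F @ a, F @ b)
rhs = np.linalg.det(F) * wedge2(a, b)
```

Because every bivector in 2D is a multiple of the pseudoscalar, this one scalar comparison is the full statement $f(I) = \det(f)\,I$ for this algebra.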
2.3.7 Signals and Wavelets in the Geometric Algebra Framework

One may wonder why we should be interested in handling n-D signals and wavelets in the geometric algebra framework. The major motivation is that, since in image processing, robotics, and control engineering n-D signals or vectors are corrupted by noise, our filters and observers should smooth signals and extract features, but do so as projections onto subspaces of the geometric algebra. Thinking of quadrature filters, we immediately translate this into quaternionic filters, or we can further generalize the Dirac operator over non-Euclidean metrics for multidimensional image processing. Thus, by weighting the bivectors with appropriate kernels such as Gauss, Gabor, or wavelet kernels, we can derive powerful Clifford transforms to analyze signals using the extended phase concept, and carry out convolutions in a certain geometric algebra or on the Riemann sphere. So we can postulate that complex filters can be extended over bivector algebras for computing with Clifford–Fourier transforms and Clifford wavelet transforms, or even the space-and-time Fourier transform. In the geometric algebra framework, we gain major insight and intuition for the geometric processing of noisy n-D signals. In a geometric algebra with a specific non-Euclidean metric, one can geometrically couple intrinsically different information, for example, simultaneously processing color and thermal images with a multivector derivative or regularizing color optical flow with the generalized Laplacian. Chapter 8 is devoted to studying a variety of Clifford–Fourier and Clifford wavelet transforms.
2.3.8 Kinematics and Dynamics

In the past, some researchers computed the direct and inverse kinematics of robot arms using matrix algebra and geometric entities such as points and lines. In contrast, when working in geometric algebra, the rich repertoire of geometric entities and the efficient representation of 3D rigid transformations make the computations easy and intuitive, particularly for finding geometric constraints. In conformal geometric
algebra, we can perform kinematic computations using meets of spheres, planes, and screw axes, so that the resulting pair of points yields a realistic solution to the problem. Robot object manipulation together with potential fields can be reformulated in conformal geometry using a language of spheres for planning, grasping, and manipulation. The dynamics of a robot mechanism is normally computed using the Euler–Lagrange equation, where the inertial and Coriolis tensors depend on the degrees of freedom of the robot mechanism. Even though conformal geometry does not have versors for coding affine transformations, we can reformulate these equations so that the entries of the tensors are projections of the centers of mass of the limbs onto the screw axes of the joints; as a result, we can avoid quadratic entries and facilitate the estimation of the tensor parameters. This is the benefit of handling this kind of problem in either motor algebra or conformal algebra, making use of the algebra of subspaces and versors. Chapters 11 and 12 cover the computation of various problems of the kinematics and dynamics of robot mechanisms.
2.4 Solving Problems in Perception and Action Systems

In this section, we outline how we approach the modeling and design of algorithms for tasks of robotic systems acting within the perception–action cycle (PAC). In the previous section, we explained what geometric algebra offers as a mathematical system for geometric computing. Here we will be slightly more concrete and precise, illustrating the design and implementation of algorithms for real-time geometric computing. Figure 2.2 abstracts the attitude of many researchers and practitioners: how do they approach developing algorithms to solve problems in the domain of PAC systems? Briefly, they split the knowledge across various mathematical systems. As a consequence, as we discussed in Sect. 2.2, the ideas, concepts, methods, and results are unfortunately disseminated across various mathematical systems. Being proficient in only a few of the systems, one does not have access to knowledge formulated in terms of the other mathematical systems. There is high redundancy due to the repetitive representation of information in different mathematical systems. A deficient articulation of the different mathematical systems degrades their efficiency. The intrinsic properties and relations represented in different symbolic systems are very difficult to handle. The information density is low, because the information of the problem is diluted by its distribution over different symbolic systems. Bear in mind that geometric algebra was constructed for the optimal encoding of intrinsic geometric characteristics using coordinate-free methods. It ensures an optimal uniformity of concepts and methods across different domains, and it supports a smooth articulation between different alternative systems. The efficient representation of objects and operations guarantees computational efficiency. Since geometric
Fig. 2.2 Application of diverse mathematical systems to solve PAC problems
algebra was constructed following these considerations and in view of the progress of scientific theory, it is an adequate framework in which to optimally express the key ideas and consequences of the theory related to perception–action systems; see Fig. 2.3. Of course, it will not be possible for all PAC problems to be formulated and computed in geometric algebra. We have to ensure that the integration of techniques proceeds in a kind of top-down approach. First, we should get acquainted with the physics of the problem, aided by the contributions of other researchers and paying attention particularly to how they solve the problems. Recall the old Western interpretation of the saying Nanos gigantum humeris insidentes (dwarfs standing on the shoulders of giants): one develops future intellectual pursuits by understanding the research and works created by notable thinkers of the past. For a start, we should identify where we can make use of geometric algebra. Since geometric algebra is a powerful language for efficient representation and for finding geometric constraints, we should first, in a high-symbolic-level approach, postulate formulas (top-down reasoning), simplify them symbolically to an optimum, and execute them using cost-effective and fast hardware. To close the code-generation loop, the conflicts and contradictions caused by our algorithms in their application are fed back in a bottom-up fashion to a negotiation stage in order to ultimately improve our geometric algorithms. Let us now briefly illustrate this procedure. As a first example, we compute efficiently the inverse kinematics of a robot arm. Figure 2.4 shows the rotation planes
Fig. 2.3 The geometric algebra framework includes the essential mathematical systems for the development of PAC systems (coordinate geometry, tensors, complex variables, spinors, vector analysis, differential forms, matrix algebra, synthetic geometry, and quaternions, gathered around common geometric concepts)

Fig. 2.4 Computing the inverse kinematics of a 5-DOF robot arm (rotation planes $\pi_1, \pi_2$; spheres $s_1, s_2$; points $p_1, p_2$; axes $z_1, z_2$; angles $\theta_2, \theta_3$; links $l_1, l_2$; offsets $d_1, d_2$)
of a robot arm. While computing its inverse kinematics, one considers the circle $z = s_1 \wedge s_2$ given by the intersection of two reference spheres. We then compute the pair of points (PP) as the meet of the swivel plane and this circle, $PP = z \wedge \pi_{swivel}$, so that we can choose one of them as a realistic position of the elbow: a point that lies on the meet of two spheres. After the optimization of the whole equation of
the inverse kinematics, we get an efficient representation for computing this elbow position point and other parts of the robot arm, which can all be executed using fast hardware. Note that using a matrix formulation, it would not be possible to generate such optimal code. A second example involves the use of the meet of ruled surfaces. Imagine that a laser welding device has to follow the intersection of two highly nonlinear surfaces. You can estimate these using a vision system and model them using the concept of a ruled surface. In order to control the welding laser attached to the end-effector of the robot manipulator, we can follow the contour obtained by computing the meet of the ruled surfaces, as depicted in Fig. 2.5. A third example is to solve the sensor–body calibration problem depicted in Fig. 2.6. This problem can be simplified and linearized by exploiting the intrinsic
Fig. 2.5 Robot arm welding with laser along the meet, meet(A, B), of two ruled surfaces A and B

Fig. 2.6 Calibration of the coordinate system of a binocular head (pan, rotate, tilt) with respect to the robot coordinate system (body–eye transform)
relation of the screw lines of the sensors and the robot body. The problem is reduced to finding only the unknown motors between these lines. Using matrices yields a nonlinear problem of the kind $AX = XB$; in contrast, a line language based on motor algebra suffices to tackle such a problem. A fourth problem entails representing 3D shapes in graphics engineering or medical image processing, which traditionally involves the standard method called marching cubes. We generalized this method as marching spheres [163] using the conformal geometric algebra framework. Instead of using points on simplexes, we use spheres of $G_{4,1}$. In this way, not only do we give more expressive power to the algorithm, but we also manage to reuse the existing software by using, instead of 2D or 3D vectors, the 4D and 5D vectors that represent circles and spheres in conformal geometric algebra, respectively. Figure 2.7 shows the impressive results of carving a 3D shape, where the spheres fill the gaps better than the cubes. As a fifth and last example, let us move to geometric computing with neural networks. Traditionally, neural networks have been vector-based approaches with the burden of applying a coordinate-dependent algorithm for adjusting the weights between neuron layers. In this application, we have to render a contour of a certain noisy shape, such as a brain tumor. We use a self-organizing neural network called the neural gas [63]. Instead of adjusting the weights of the neurons to locate them along the contour or shape, we adjust the exponents of motors. In this way, we are operating in the linear space of the bivectors, the Lie algebra. The gain is twofold. On the one hand, due to the properties of the tangential Lie algebra space, the approach is linear. On the other hand, we exploit the coordinate-free advantage by working on the Lie algebra manifold using bivectors. Figure 2.8 shows the excellent results from segmenting a section of a brain tumor.
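The sphere–sphere meet used in the first example can be sketched in ordinary vector algebra (a hypothetical helper, not the CGA code used in the book): the meet of two intersecting spheres is a circle whose plane is perpendicular to the line of centers.

```python
import numpy as np

def sphere_meet(c1, r1, c2, r2):
    """Circle of intersection of two spheres: (center, radius, unit normal).

    Assumes the spheres actually intersect (|r1 - r2| < dist < r1 + r2).
    """
    d = np.asarray(c2, float) - np.asarray(c1, float)
    dist = np.linalg.norm(d)
    n = d / dist
    # signed distance from c1 to the plane containing the circle
    t = (dist**2 + r1**2 - r2**2) / (2.0 * dist)
    center = np.asarray(c1, float) + t * n
    radius = np.sqrt(r1**2 - t**2)
    return center, radius, n

# two unit spheres with centers one unit apart
center, radius, n = sphere_meet([0, 0, 0], 1.0, [1, 0, 0], 1.0)
```

In the conformal framework, the same circle falls out directly as the wedge $z = s_1 \wedge s_2$ of the two sphere representations, with no case analysis.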
Fig. 2.7 Approximation of shape of three-dimensional objects with marching spheres. (a) Approximation of brain structure extracted from CT images (synthetic data); (b) approximation of a tumor extracted from real patient data
2.4 Solving Problems in Perception and Action Systems
59
Fig. 2.8 Algorithm for a 3D object’s shape determination. (a) 3D model of the patient’s head containing a section of the tumor in the marked region; (b) final shape after training is completed with a total of 170 versors M (associated with 170 neural units)
Finally, in Parts V and VI, the reader will find plenty of illustrations using real images and robots where we have used, in a smart and creative manner, the geometric algebra language. We do hope to encourage readers to use this powerful and promising framework to design new real-time algorithms for perception and action systems.
Part II
Euclidean, Pseudo-Euclidean, Lie and Incidence Algebras, and Conformal Geometries
Chapter 3
2D, 3D, and 4D Geometric Algebras
It is believed that imaginary numbers appeared for the first time around 1540, when the mathematicians Tartaglia and Cardano represented real roots of a cubic equation in terms of conjugated complex numbers. A Norwegian surveyor, Caspar Wessel, was, in 1798, the first to represent complex numbers by points on a plane, with its vertical axis imaginary and its horizontal axis real. This diagram was later known as the Argand diagram, although Argand's true achievement was an interpretation of $i = \sqrt{-1}$ as a rotation by a right angle in the plane. Complex numbers received their name due to Gauss, and their formal definition as a pair of real numbers was introduced by Hamilton in 1835.
3.1 Complex, Double, and Dual Numbers

In a broad sense, the most general complex numbers [108, 196] on the plane can be categorized into three different systems: ordinary complex numbers, double numbers, and dual numbers. In general, a complex number can be represented as a composed number, $a = b + \omega c$, using the algebraic operator $\omega$, in which $\omega^2 = -1$ in the case of complex numbers, $\omega^2 = 1$ in the case of double numbers, and $\omega^2 = 0$ in the case of dual numbers. For dual numbers, $b$ represents the real term and $c$ the dual term. For this book, it is useful to recall the notion of a function of a dual variable, in which a differentiable real function $f: \mathbb{R} \to \mathbb{R}$ with a dual argument $\alpha + \omega\beta$, where $\alpha, \beta \in \mathbb{R}$, can be expanded using a Taylor series. Because $\omega^2 = \omega^3 = \omega^4 = \cdots = 0$, the function reads

$$f(\alpha + \omega\beta) = f(\alpha) + \omega f'(\alpha)\beta + \omega^2 f''(\alpha)\frac{\beta^2}{2!} + \cdots = f(\alpha) + \omega f'(\alpha)\beta. \qquad (3.1)$$
A useful illustration of this expansion is the exponential function of a dual number:

$$e^{\alpha + \omega\beta} = e^{\alpha} + \omega e^{\alpha}\beta = e^{\alpha}(1 + \omega\beta). \qquad (3.2)$$
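A minimal dual-number class (our own helper, not from the book) makes Eqs. (3.1) and (3.2) executable: because $\omega^2 = 0$, the dual part of $f(\alpha + \omega\beta)$ carries the exact derivative $f'(\alpha)\beta$.

```python
import math

class Dual:
    """Dual number a + w*b with w**2 == 0."""
    def __init__(self, a, b=0.0):
        self.a, self.b = a, b

    def __add__(self, o):
        return Dual(self.a + o.a, self.b + o.b)

    def __mul__(self, o):
        # (a1 + w b1)(a2 + w b2) = a1 a2 + w (a1 b2 + b1 a2), since w^2 = 0
        return Dual(self.a * o.a, self.a * o.b + self.b * o.a)

def dexp(d):
    # Eq. (3.2): exp(a + w b) = exp(a) (1 + w b)
    e = math.exp(d.a)
    return Dual(e, e * d.b)

x = Dual(2.0, 1.0)       # evaluate at alpha = 2, with dual part beta = 1
y = dexp(x)              # dual part of y is f'(2) = exp(2)
```

This truncation property is exactly what makes dual numbers useful later for screws, where the dual part carries the translational component.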
In his seminal paper "Preliminary Sketch of Biquaternions" [41], Clifford introduced the use of dual numbers, the motors or biquaternions, to represent screw motions. Later, Study [183] used dual numbers to represent the relative position of two skew lines in space, that is, $\hat{\theta} = \theta + \omega d$, where $\hat{\theta}$ represents the dual angle, $\theta$ the difference of the line orientation angles, and $d$ the distance between the two lines. The algebras of complex, double (hyperbolic), and dual numbers are isomorphic to the centers of certain geometric algebras. For these algebras, we must choose the appropriate multivector basis, so that the unit pseudoscalar squares to $1$ for the case of double numbers, to $-1$ for complex numbers, and to $0$ for dual numbers. Note that the pseudoscalar for these numbers maintains its geometric interpretation as a unit hypervolume, and that, as is the case with $\omega$, it commutes with either vectors or bivectors, depending only upon the type of geometric algebra used. In Sect. 3.2, we consider some examples of composed numbers in geometric algebra: complex numbers in the space $G_{0,1,0}$, double numbers in the space $G_{1,0,0}$, and dual numbers in the space $G_{1,0,1}$. We shall describe complex and dual numbers for 2D, 3D, and 4D spaces in some detail. The dual numbers will be used later for the modeling of points, lines, and planes, as well as for the modeling of motion.
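Study's dual angle can be computed for two lines given in point–direction form; the helper below is a plain-numpy sketch of our own, and it assumes nonparallel lines, since the common perpendicular is obtained from the cross product:

```python
import numpy as np

def dual_angle(p1, d1, p2, d2):
    """Study's dual angle (theta, d) between two nonparallel lines.

    Each line is given by a point p and a direction vector d.
    """
    d1 = np.asarray(d1, float); d1 = d1 / np.linalg.norm(d1)
    d2 = np.asarray(d2, float); d2 = d2 / np.linalg.norm(d2)
    theta = np.arccos(np.clip(np.dot(d1, d2), -1.0, 1.0))
    n = np.cross(d1, d2)                 # direction of the common perpendicular
    n = n / np.linalg.norm(n)
    dist = abs(np.dot(np.asarray(p2, float) - np.asarray(p1, float), n))
    return theta, dist

# x-axis versus the line through (0, 0, 2) along the y-axis
theta, dist = dual_angle([0, 0, 0], [1, 0, 0], [0, 0, 2], [0, 1, 0])
```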
3.2 2D Geometric Algebras of the Plane

In this section, we want to illustrate the application of different 2D geometric algebras for the modeling of group transformations on the plane. In doing so, we can also clearly see the geometric interpretation and the use of complex, double, and dual numbers for the cases of rotation, affine, and Lorentz transformations, respectively [164, 196]. We find these transformations in various tasks of image processing. For the modeling of the 2D space, we choose a geometric algebra that has $2^2 = 4$ elements, given by

$$\underbrace{1}_{\text{scalar}}, \quad \underbrace{e_1,\, e_2}_{\text{vectors}}, \quad \underbrace{e_1 e_2 \equiv I}_{\text{bivector}}. \qquad (3.3)$$
The highest-grade element for the 2D space, called the unit pseudoscalar $I \equiv e_1 e_2$, is a bivector. Depending on the vector basis used, the signature of the geometric algebra changes, yielding complex, double, or dual numbers. Each of these cases is illustrated below. In the geometric algebra $G_{2,0,0}$, where $I = e_1 e_2$ with $I^2 = -1$, we want to represent the rotation of the points $(x, y)$ of the Euclidean plane. Here, a rotation of the point $z = xe_1 + ye_2 = r(\cos(\alpha)e_1 + \sin(\alpha)e_2) \in G_{2,0,0}$ can be computed
as the geometric product of the vector and the complex number $e^{I\frac{\theta}{2}} = \cos\frac{\theta}{2} + e_1 e_2 \sin\frac{\theta}{2} = (\cos\frac{\theta}{2} + I\sin\frac{\theta}{2}) \in G^+_{2,0,0}$, or $\in$ Spin(2) (the spin group), as follows:

$$z' = e^{-I\frac{\theta}{2}}\, z\, e^{I\frac{\theta}{2}} = e^{-I\frac{\theta}{2}}\, r(\cos(\alpha)e_1 + \sin(\alpha)e_2)\, e^{I\frac{\theta}{2}} = \left(\cos\tfrac{\theta}{2} - I\sin\tfrac{\theta}{2}\right) r(\cos(\alpha)e_1 + \sin(\alpha)e_2) \left(\cos\tfrac{\theta}{2} + I\sin\tfrac{\theta}{2}\right) = r(\cos(\alpha + \theta)e_1 + \sin(\alpha + \theta)e_2). \qquad (3.4)$$

Figure 3.1b illustrates that each point of the 2D image of the die is rotated by $\theta$. Note that this particular form for representing rotation, $e^{I\frac{\theta}{2}} = (\cos\frac{\theta}{2} + I\sin\frac{\theta}{2})$, can be generalized to higher dimensions (see the algebra of rotors in 3D space in Sect. 3.3.1). Let us now represent the points as dual numbers in the geometric algebra $G_{1,0,1}$, where $I^2 = 0$. A 2D point can be represented in $G_{1,0,1}$ as $z = xe_1 + ye_2 = x(e_1 + se_2)$, where $s = \frac{y}{x}$ is the slope. The shear transformation of this point can be computed by applying a unit shear dual number $e^{I\frac{\theta}{2}} = (1 + I\frac{\theta}{2}) \in G_{1,0,1}$ as follows:

$$z' = e^{-I\frac{\theta}{2}}\, z\, e^{I\frac{\theta}{2}} = \left(1 - I\tfrac{\theta}{2}\right)\big(x(e_1 + se_2)\big)\left(1 + I\tfrac{\theta}{2}\right) = x(e_1 + (s + \theta)e_2). \qquad (3.5)$$

Note that the overall effect of this transformation is to shear the plane: the points $(x, y)$ are displaced parallel to the $e_2$-axis through a shear with shear angle $\tan^{-1}\theta$. Figure 3.1c depicts the effect of the shear transformation acting on the 2D image of the die.
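Since the even part $\{1, I\}$ of $G_{2,0,0}$ behaves exactly like $\mathbb{C}$, the sandwich in Eq. (3.4) collapses to a single complex multiplication $z\,e^{I\theta}$; a small sketch of our own:

```python
import cmath
import math

def rotate2d(z, theta):
    # z' = e^{-I theta/2} z e^{I theta/2} = z e^{I theta} for a 2D vector z
    return z * cmath.exp(1j * theta)

z = complex(1.0, 0.0)                 # the vector e1
zp = rotate2d(z, math.pi / 2)         # rotate by 90 degrees toward e2
```

Successive rotations simply add their angles, which is the 2D shadow of the rotor composition rule discussed in Sect. 3.3.1.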
Fig. 3.1 Effects of 2D transformations: (a) original cube, (b) cube after rotation, (c) after shear transformation, and (d) after Lorentz transformation
By using the representation of the double numbers in $G_{1,1,0}$, where $I^2 = 1$, we can implement the Lorentz transformation of the points. This transformation is commonly used in space–time algebra for special relativity computations, and it has been suggested for use in psychophysics as well [49, 98]. In this context, a 2D point is associated with a double number, $z = te_1 + xe_2 = \rho(\cosh(\alpha)e_1 + \sinh(\alpha)e_2) \in G_{1,1,0}$. The lines $|t| = |x|$ divide the plane into two quadrants with $|t| > |x|$ and two quadrants with $|t| < |x|$. If we apply a 2D unit displacement vector $e^{I\frac{\beta}{2}} = a + Ib = (\cosh\frac{\beta}{2} + I\sinh\frac{\beta}{2}) \in G_{1,1,0}$ from one of the quadrants, $|t| > |x|$, to an arbitrary point $z = t + Ix$, we get

$$z' = e^{-I\frac{\beta}{2}}\, z\, e^{I\frac{\beta}{2}} = e^{-I\frac{\beta}{2}}\, \rho(\cosh(\alpha)e_1 + \sinh(\alpha)e_2)\, e^{I\frac{\beta}{2}} = \left(\cosh\tfrac{\beta}{2} - I\sinh\tfrac{\beta}{2}\right) \rho(\cosh(\alpha)e_1 + \sinh(\alpha)e_2) \left(\cosh\tfrac{\beta}{2} + I\sinh\tfrac{\beta}{2}\right) = \rho(\cosh(\alpha + \beta)e_1 + \sinh(\alpha + \beta)e_2). \qquad (3.6)$$

The point is displaced along a particular hyperbolic path through the interval $\beta$ in $|t| < |x|$. Figure 3.1d illustrates the effect of the Lorentz transformation acting on the 2D image of the die.
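The double-number rotation of Eq. (3.6) is an ordinary hyperbolic rotation (a boost): the rapidities $\alpha$ and $\beta$ add, and the quadratic form $t^2 - x^2$ is preserved. A sketch of our own:

```python
import math

def boost(t, x, beta):
    # (t', x') = (t cosh(beta) + x sinh(beta), t sinh(beta) + x cosh(beta))
    ch, sh = math.cosh(beta), math.sinh(beta)
    return t * ch + x * sh, t * sh + x * ch

alpha, beta = 0.3, 0.5
t, x = math.cosh(alpha), math.sinh(alpha)   # a point on the unit hyperbola
tp, xp = boost(t, x, beta)                  # slides to rapidity alpha + beta
```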
3.3 3D Geometric Algebra for the Euclidean 3D Space

For the case of embedding the Euclidean 3D space, we choose the geometric algebra $G_{3,0,0}$, which has $2^3 = 8$ elements, given by

$$\underbrace{1}_{\text{scalar}}, \quad \underbrace{\{e_1, e_2, e_3\}}_{\text{vectors}}, \quad \underbrace{\{e_2 e_3,\, e_3 e_1,\, e_1 e_2\}}_{\text{bivectors}}, \quad \underbrace{\{e_1 e_2 e_3\} \equiv I}_{\text{trivector}}. \qquad (3.7)$$
The highest-grade algebraic element for the 3D space is a trivector called the unit pseudoscalar $I \equiv e_1 e_2 e_3$, which squares to $-1$ and which commutes with the scalars and bivectors of the 3D space. In the algebra of three-dimensional space, we can construct a trivector $a \wedge b \wedge c = \lambda I$, where the vectors $a$, $b$, and $c$ are in general position and $\lambda \in \mathbb{R}$. Note that no 4-vectors exist, since there is no possibility of sweeping the volume element $a \wedge b \wedge c$ over a fourth dimension. Multiplication of the three basis vectors $e_1$, $e_2$, and $e_3$ by $I$ results in the three basis bivectors $e_2 e_3 = Ie_1$, $e_3 e_1 = Ie_2$, and $e_1 e_2 = Ie_3$. These simple bivectors rotate vectors in their own plane by $90°$, for example, $(e_1 e_2)e_2 = e_1$, $(e_2 e_3)e_2 = -e_3$, etc. Identifying the unit vectors $i$, $j$, $k$ of quaternion algebra with $Ie_1$, $Ie_2$, $Ie_3$ allows us to write the famous Hamilton relations $i^2 = j^2 = k^2 = ijk = -1$. Since $i$, $j$, $k$ are really bivectors, it comes as no surprise that they represent $90°$ rotations in orthogonal directions and provide a system well suited for the representation of general 3D rotations (see Fig. 1.1c). Rotors are isomorphic to quaternions; the quaternion and the rotor follow the left-hand and the right-hand rotation rule, respectively.
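The Hamilton relations can be checked numerically with the quaternion (Hamilton) product; the helper `qmul` below is our own sketch:

```python
def qmul(a, b):
    # Hamilton product of quaternions given as (w, x, y, z) tuples
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

i, j, k = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

# the famous Hamilton relations i^2 = j^2 = k^2 = ijk = -1
checks = [qmul(i, i), qmul(j, j), qmul(k, k), qmul(qmul(i, j), k)]
```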
3.3.1 The Algebra of Rotors

In geometric algebra, a rotor (short for rotator), $R$, is an even-grade element of the Euclidean algebra of 3D space. If $Q = \{r_0, r_1, r_2, r_3\} \in G^+_{3,0,0}$ represents a unit quaternion, then the rotor that performs the same rotation is simply given by

$$R = \underbrace{r_0}_{\text{scalar}} + \underbrace{r_1(Ie_1) - r_2(Ie_2) + r_3(Ie_3)}_{\text{bivectors}}. \qquad (3.8)$$
The rotor algebra $G^+_{3,0,0}$ is, therefore, a subalgebra of the Euclidean geometric algebra of 3D space. Consider in $G_{3,0,0}$ two nonparallel vectors $a$ and $b$, referred to the same origin. In general, a rotation taking the vector $a$ toward the vector $b$ can be performed by two reflections, with respect to the unit vector axes $n$ and $m$ (see Fig. 3.2). The components of the first reflection are
$$a_{\parallel} = |a|\cos(\alpha)\,\frac{n}{|n|} = |a||n|\cos(\alpha)\,\frac{n}{|n|^2} = (a \cdot n)\,n^{-1}, \qquad (3.9)$$

$$a_{\perp} = a - a_{\parallel} = a - (a \cdot n)n^{-1} = (an - a \cdot n)\,n^{-1} = (a \wedge n)\,n^{-1}, \qquad (3.10)$$

so the vector $a$ after the first reflection becomes

$$a' = a_{\parallel} - a_{\perp} = (a \cdot n)n^{-1} - (a \wedge n)n^{-1} = (a \cdot n - a \wedge n)\,n^{-1} = (n \cdot a + n \wedge a)\,n^{-1} = n\,a\,n^{-1}. \qquad (3.11)$$
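Expanding $n\,a\,n^{-1}$ in Eq. (3.11) with ordinary dot products gives $a' = 2(a \cdot n)\,n/|n|^2 - a$: the component parallel to $n$ is kept and the perpendicular one is flipped. A numpy sketch of our own:

```python
import numpy as np

def reflect(a, n):
    # a' = n a n^{-1} = 2 (a.n) n / |n|^2 - a  (Eq. 3.11 expanded)
    a = np.asarray(a, float)
    n = np.asarray(n, float)
    return 2.0 * np.dot(a, n) * n / np.dot(n, n) - a

a = np.array([1.0, 2.0, 0.0])
n = np.array([1.0, 0.0, 0.0])
ap = reflect(a, n)        # keeps the e1 component, flips the rest
```

Reflections preserve lengths, which is why the composition of two of them (the rotor of Eq. 3.12) is an orthogonal transformation.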
Fig. 3.2 Rotor in the 3D space formed by a pair of reflections
The second reflection, with respect to the axis unit vector $m$, completes the rotation of the vector $a$ toward $b$, as follows:

$$b = m(a')m^{-1} = m(nan^{-1})m^{-1} = mna\,n^{-1}m^{-1} = (mn)\,a\,(mn)^{-1} = R\,a\,R^{-1} = R\,a\,\tilde{R}. \qquad (3.12)$$

The rotor $R$ composed of these two reflections performs a rotation through twice the angle between $m$ and $n$. Now, if we consider successive reflections of the vector $a$ with respect to $j$ planes, we get the following resultant transformation:

$$b = (m_j \cdots m_2 m_1)\,a\,(m_j \cdots m_2 m_1)^{-1} = (m_j \cdots R_{(j-1)(j-2)} \cdots R_{32} R_{21})\,a\,(m_j \cdots R_{(j-1)(j-2)} \cdots R_{32} R_{21})^{-1} = m_j R_{1(j-1)}\,a\,R_{(j-1)1}\,m_j \ \text{ for } j \text{ odd}, \quad = R_{j1}\,a\,R_{1j} \ \text{ for } j \text{ even}. \qquad (3.13)$$
Figure 3.3 shows the case of three reflections, $j = 3$. According to Eqs. 1.83–1.88, the reversion and magnitude of a rotor $R$ are, respectively, given by

$$\tilde{R} = r_0 - r_1 e_2 e_3 - r_2 e_3 e_1 - r_3 e_1 e_2 = r_0 - r, \qquad \|R\|^2 = R\tilde{R}. \qquad (3.14)$$

This implies that the unique multiplicative inverse of $R$ is given by

$$R^{-1} = \tilde{R}\,/\,\|R\|^2. \qquad (3.15)$$

Fig. 3.3 Trajectory of successive reflections

If a rotor $R$ satisfies the equation
$$R\tilde{R} = \|R\|^2 = r_0^2 - r\,r = 1, \qquad (3.16)$$

then we say that this rotor is a unit rotor, and its multiplicative inverse is simply $R^{-1} = \tilde{R}$, as denoted previously in Eq. 3.12. Equation 3.12 shows that the unit rotor corresponds to the geometric product of two unit vectors,

$$R = mn = m \cdot n + m \wedge n. \qquad (3.17)$$
The components of Eq. 3.17 correspond to the scalar and bivector terms of an equivalent quaternion in $G^+_{3,0,0}$, and thus $R \in G^+_{3,0,0}$. This even subalgebra corresponds to the algebra of rotors. Considering the scalar and the bivector terms of the rotor of Eq. 3.17, we can further write the Euler representation of a 3D rotation with angle $\theta$ in the left-hand sense, as follows:

$$R = r_0 + r = r_0 + r_1 e_2 e_3 + r_2 e_3 e_1 + r_3 e_1 e_2 = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\,\bar{r}_n = a_c + a_s \bar{r}_n = e^{\frac{\theta}{2}\bar{r}_n}, \qquad (3.18)$$
where $\bar{r}_n$ is the unit rotation axis spanned by the bivector basis $e_2 e_3$, $e_3 e_1$, and $e_1 e_2$, and the scalars $a_c, a_s \in \mathbb{R}$. The polar representation of a rotor given in Eq. 3.18 is possible because the rotor, as a Lie group element, can be expressed in terms of the Lie algebra of bivectors: the orbits on the Lie group manifold describe the evolution of the actions of rotors, and the bivector $\bar{r}_n$ corresponds to the Lie operator tangent to an orbit or geodesic. The transformation $p \mapsto Rp\tilde{R} = p'$ is a very general way of handling rotations, which works for multivectors of any grade and in spaces of any dimension. Rotors combine in a straightforward manner: a rotor $R_1$ followed by a rotor $R_2$ is equivalent to the total rotor

$$R = R_2 R_1. \qquad (3.19)$$

The composition of 3D rotations in different planes is described in Eqs. 3.13 and 3.19; the latter can be visualized geometrically in Fig. 3.4, where the half-angle of each rotor is depicted as a directed arc vector $\theta_i$, confined to a great circle on the unit sphere. The product of the rotors $R_1$ and $R_2$ is depicted in Fig. 3.4 by connecting the corresponding arcs at the point $z$ where the two great circles of $\theta_1$ and $\theta_2$ intersect. As a result, $x = z\theta_1$ and $y = \theta_2 z$; thus the half-angle arcs of the rotors can be written in terms of the common arc point: $\theta_1 = zx$ and $\theta_2 = yz$. Combining these results, we get $\theta_3 = \theta_2\theta_1 = (yz)(zx) = yx$. A rotor is isomorphic to a quaternion; as a result, we can embed quaternions in the more comprehensive mathematical system offered by geometric algebra.
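The composition rule $R = R_2 R_1$ of Eq. (3.19) can be checked with quaternions; applying the product rotor agrees with applying $R_1$ and then $R_2$. The helpers below are our own sketch:

```python
import math

def qmul(a, b):
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def rotor(axis, angle):
    h = angle / 2.0
    n = math.sqrt(sum(c * c for c in axis))
    return (math.cos(h),) + tuple(math.sin(h) * c / n for c in axis)

def apply(R, v):
    # sandwich v' = R v R~ with v embedded as a pure quaternion
    w, x, y, z = qmul(qmul(R, (0.0,) + tuple(v)),
                      (R[0], -R[1], -R[2], -R[3]))
    return (x, y, z)

R1 = rotor((0, 0, 1), math.pi / 2)   # 90 deg about e3: e1 -> e2
R2 = rotor((1, 0, 0), math.pi / 2)   # 90 deg about e1: e2 -> e3
R = qmul(R2, R1)                     # total rotor R = R2 R1 (R1 acts first)
v = (1.0, 0.0, 0.0)
```

Note the order: the rotor applied first appears on the right of the product, exactly as in Eq. (3.19).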
Fig. 3.4 Rotors represented on a sphere
Unlike in quaternion theory, in geometric algebra the quaternions or rotors have a clear geometric interpretation, due to the representation in space of the rotations, as described above, by reflections with respect to planes. Section 3.4 is devoted to quaternion algebra, including more details useful for applications in image processing.
3.3.2 Orthogonal Rotors

For the rotation of a vector $p$ in the right-hand sense, we simply adopt the rotor with the minus sign, to agree with the standard right-hand rule for the direction of the rotation:

$$R = e^{-\frac{\theta}{2}\bar{r}_n} = \cos(\theta/2) - \sin(\theta/2)\,\bar{r}_n. \qquad (3.20)$$

This rotation operation is depicted in Fig. 3.5. The rotated vector $p'$ is given by

$$p' = Rp\tilde{R} = \big(\cos(\theta/2) - \sin(\theta/2)\bar{r}_n\big)\, p\, \big(\cos(\theta/2) + \sin(\theta/2)\bar{r}_n\big). \qquad (3.21)$$

Since the rotation path from $p$ to $p'$ is not necessarily unique, neither is the rotor $R$. The shortest path determined by the endpoints $p$ and $p'$ lies on a great circle of a sphere with radius $\|p\|$, and the corresponding rotation is called an orthogonal rotation. The rotor itself is called an orthogonal rotor $R_{\perp}$, and it can be calculated using the unit vectors $\frac{p'+p}{\|p'+p\|}$ and $\frac{p}{\|p\|}$, as follows:

Fig. 3.5 Geometric interpretation of rotation

$$R_{\perp} = \frac{(p'+p)\,p}{\|p'+p\|\,\|p\|} = \frac{(p'+p)\cdot p + (p'+p)\wedge p}{\|p'+p\|\,\|p\|} = \frac{(p'+p)\cdot p}{\|p'+p\|\,\|p\|} - \frac{p\wedge(p'+p)}{\|p'+p\|\,\|p\|} = r_{\perp 0} + r_{\perp} = \cos(\theta_{\perp}/2) - \sin(\theta_{\perp}/2)\,r_{n,\perp}, \qquad (3.22)$$
where the rotation axis bivector $r_{\perp}$, or the unit rotation axis bivector $r_{n,\perp}$, is perpendicular to both $p$ and $p'$, and the angle $\theta_{\perp}/2$ is the angle between the vectors $p$ and $p' + p$. Rotors are isomorphic to quaternions. In signal analysis, quaternions have been used quite often in an operational sense. In contrast, rotors have a clear geometric interpretation, and they are used for geometric operations.
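Equation (3.22) says the orthogonal rotor is the geometric product of the bisector $p' + p$ and $p$, normalized. Encoding the scalar part as a dot product and the bivector part as a cross product (up to the bivector-to-imaginary sign convention) gives a quaternion sketch; the helpers are our own:

```python
import numpy as np

def qmul(a, b):
    aw, av = a[0], np.asarray(a[1:])
    bw, bv = b[0], np.asarray(b[1:])
    w = aw * bw - np.dot(av, bv)
    v = aw * bv + bw * av + np.cross(av, bv)
    return np.concatenate(([w], v))

def orthogonal_rotor(p, pp):
    # rotor along the shortest great-circle path from p to p'
    h = np.asarray(p, float) + np.asarray(pp, float)   # bisector p + p'
    q = np.concatenate(([np.dot(p, h)], np.cross(p, h)))
    return q / np.linalg.norm(q)

p = np.array([1.0, 0.0, 0.0])
pp = np.array([0.0, 1.0, 0.0])        # same length as p
R = orthogonal_rotor(p, pp)
Rrev = R * np.array([1.0, -1.0, -1.0, -1.0])           # reversion
rotated = qmul(qmul(R, np.concatenate(([0.0], p))), Rrev)[1:]
```

Sandwiching $p$ with this rotor should land exactly on $p'$, confirming that the bisector construction realizes the shortest rotation.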
3.3.3 Recovering a Rotor

Using the previous results, we will derive a procedure to recover a rotor from two sets of observer measurements. Assume that we have two sets of vectors in 3D, $\{e_k\}$ and $\{f_k\}$, that are not necessarily orthonormal and are related by a rotation:

$$f_k = R\,e_k\,\tilde{R}. \qquad (3.23)$$
Since $R$ is unknown, we should derive a simple method to recover it. A rotor can be written as

$$R = e^{-\frac{B}{2}}, \qquad \tilde{R} = e^{\frac{B}{2}} = \cos\frac{|B|}{2} + \sin\frac{|B|}{2}\,\frac{B}{|B|}. \qquad (3.24)$$
Using this equation, we find that
$$e_k \tilde{R} e^k = e_k \left(\cos\frac{|B|}{2} + \sin\frac{|B|}{2}\frac{B}{|B|}\right) e^k = 3\cos\frac{|B|}{2} - \sin\frac{|B|}{2}\frac{B}{|B|} = 4\cos\frac{|B|}{2} - \tilde{R}. \qquad (3.25)$$
Combining Eq. 3.23 and Eq. 3.25, we get

$$f_k e^k = R e_k \tilde{R} e^k = 4\cos\frac{|B|}{2}\,R - 1. \qquad (3.26)$$
We can see that the unknown rotor is a scalar multiple of $1 + f_k e^k$; thus one can establish the simple and useful formula

$$R = \frac{1 + f_k e^k}{\left|1 + f_k e^k\right|} = \frac{\psi}{\sqrt{\psi\tilde{\psi}}}, \qquad (3.27)$$

where $\psi = 1 + f_k e^k$. Using this formula, one can recover the rotor directly from the frame vectors.
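With an orthonormal frame $e_k$, Eq. (3.27) reduces to the classical rotation-matrix-to-quaternion recovery: the scalar part of $\psi = 1 + f_k e^k$ is $1 + \operatorname{tr}(M)$ and the bivector part comes from the antisymmetric part of the matrix $M$ whose columns are the $f_k$. A sketch of our own, valid away from rotation angles near $\pi$, where the scalar part vanishes:

```python
import math

def quat_to_matrix(q):
    # rotation matrix of the sandwich p -> q p q~ for a unit quaternion
    w, x, y, z = q
    return [[1 - 2*(y*y + z*z), 2*(x*y - w*z),     2*(x*z + w*y)],
            [2*(x*y + w*z),     1 - 2*(x*x + z*z), 2*(y*z - w*x)],
            [2*(x*z - w*y),     2*(y*z + w*x),     1 - 2*(x*x + y*y)]]

def recover_rotor(M):
    # psi = 1 + f_k e^k: scalar part 1 + tr(M), bivector part from the
    # antisymmetric part of M; then normalize as in Eq. (3.27)
    w = 1.0 + M[0][0] + M[1][1] + M[2][2]
    x = M[2][1] - M[1][2]
    y = M[0][2] - M[2][0]
    z = M[1][0] - M[0][1]
    n = math.sqrt(w*w + x*x + y*y + z*z)
    return (w / n, x / n, y / n, z / n)

q = (math.cos(0.4), 0.0, 0.0, math.sin(0.4))   # rotor about e3
q2 = recover_rotor(quat_to_matrix(q))          # round trip
```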
3.4 Quaternion Algebra

The quaternion algebra H was invented by W. R. Hamilton in 1843 [78, 79], after he had tried for almost 10 years to find an algebraic system that would do for the space $\mathbb{R}^3$ what complex numbers do for the space $\mathbb{R}^2$. Interestingly enough, the current formalism of vector algebra was simply extracted from the quaternion product of two vectors by Gibbs in 1901, namely $ab = -a \cdot b + a \times b$. Hamilton tried to find a multiplication rule for triplets $a = a_1 i + a_2 j + a_3 k$ and $b = b_1 i + b_2 j + b_3 k$, so that $|ab| = |a||b|$, corresponding to a multiplicative product of the vectors $a, b \in \mathbb{R}^3$. However, according to a result of Legendre (1830), such a bilinear product confined to $\mathbb{R}^3$ does not exist; that is, no integer of the form $4^a(8b + 7)$ with $a \geq 0$, $b \geq 0$ can be obtained as a sum of three squares. Hamilton searched for a generalized complex number system in three dimensions, but no such associative hypercomplex numbers exist in three dimensions. One can see this easily by considering two imaginary units $i$ and $j$ such that $i^2 = j^2 = -1$ and, furthermore, such that $1$, $i$, $j$ span $\mathbb{R}^3$. The multiplication then has to be of the form $ij = \alpha + i\beta + j\gamma$ for $\alpha, \beta, \gamma \in \mathbb{R}$. Let us check its consistency. On the one hand,

$$i(ij) = \alpha i + \beta i^2 + \gamma(ij) = -\beta + \alpha i + \gamma(\alpha + i\beta + j\gamma) \qquad (3.28)$$
$$= -\beta + \alpha\gamma + (\alpha + \beta\gamma)\,i + \gamma^2 j. \qquad (3.29)$$
On the other hand, by associativity,

$$i(ij) = (ii)j = i^2 j = -j, \qquad (3.30)$$

which contradicts the above equation, since $\gamma^2 \geq 0$ for any $\gamma \in \mathbb{R}$. The Frobenius theorem, proved by Ferdinand Georg Frobenius in 1877, characterizes the finite-dimensional associative division algebras over the real numbers: if $A$ is a finite-dimensional division algebra over the real numbers $\mathbb{R}$, then one of the following cases is true: $A = \mathbb{R}$, $A = \mathbb{C}$ (the complex numbers), or $A$ is isomorphic to the quaternion algebra H. The key idea of Hamilton's discovery was to move to a 4D space and to consider elements of the form $q = s + xi + yj + zk$ for $s, x, y, z \in \mathbb{R}$, where the orthogonal imaginary numbers $i$, $j$, and $k$ obey the following multiplicative rules:

$$i^2 = j^2 = -1, \qquad k = ij = -ji \;\Rightarrow\; k^2 = -1. \qquad (3.31)$$
Hamilton named his four-component elements quaternions. Quaternions form a division ring and are denoted by H in honor of Hamilton. The quaternion algebra is noncommutative. Arthur Cayley (1821–1895) was the first person, after Hamilton, to publish a paper on quaternions. The conjugate of a quaternion is given by
$$\bar{q} = s - xi - yj - zk. \tag{3.32}$$
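A minimal numeric model of H makes the rules (3.31) and the conjugate (3.32) concrete; `qmul` below is an assumed helper name implementing the Hamilton product on $(s, x, y, z)$ tuples. It also exhibits the multiplicative norm property $|ab| = |a||b|$ that, as argued above, no triplet algebra can have:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (s, x, y, z)."""
    s1, x1, y1, z1 = p
    s2, x2, y2, z2 = q
    return np.array([s1*s2 - x1*x2 - y1*y2 - z1*z2,
                     s1*x2 + x1*s2 + y1*z2 - z1*y2,
                     s1*y2 - x1*z2 + y1*s2 + z1*x2,
                     s1*z2 + x1*y2 - y1*x2 + z1*s2])

def qconj(q):
    """Conjugate (Eq. 3.32): s - xi - yj - zk."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

one = np.array([1.0, 0.0, 0.0, 0.0])
i, j, k = np.eye(4)[1:]   # the three imaginary units
```

Note that $q\bar{q} = s^2 + x^2 + y^2 + z^2$ is the squared norm, which is what makes H a division ring: every nonzero quaternion has the inverse $\bar{q}/(q\bar{q})$.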
Cayley in 1855 discovered the quaternionic representation of four-dimensional rotations, namely
$$\mathbb{R}^4 \to \mathbb{R}^4, \qquad q_1 \mapsto q_2\, q_1\, \bar{q}_2, \tag{3.33}$$
where $\mathbb{R}^4$ stands for H and $q_1, q_2 \in \mathbb{H}$.
For the quaternion q, we can compute its partial angles as
$$\arg_i(q) = \operatorname{atan2}(x, s), \qquad \arg_j(q) = \operatorname{atan2}(y, s), \qquad \arg_k(q) = \operatorname{atan2}(z, s), \tag{3.34}$$
and its partial moduli and its projections onto its imaginary axes as
$$\operatorname{mod}_i(q) = \sqrt{s^2 + x^2}, \qquad \operatorname{mod}_j(q) = \sqrt{s^2 + y^2}, \qquad \operatorname{mod}_k(q) = \sqrt{s^2 + z^2},$$
$$\operatorname{mod}_i(q)\exp(i\arg_i(q)) = s + xi, \quad \operatorname{mod}_j(q)\exp(j\arg_j(q)) = s + yj, \quad \operatorname{mod}_k(q)\exp(k\arg_k(q)) = s + zk. \tag{3.35}$$
In signal analysis, quaternions have been used quite often in an operational sense. In contrast, rotors were introduced for geometric operations. Next we provide some definitions for quaternions which will be useful for the analysis and processing of signals described in Chap. 8.
In a similar way as complex numbers can be expressed in a polar representation, we can also represent a quaternion in polar form. The polar representation of a quaternion, $q = r + xi + yj + zk \in G^+_{3,0,0}$, is given when the quaternion, seen as a Lie group element, is expressed in terms of the Lie algebra of bivectors:
$$q = |q|\, e^{e_2 e_3\, \phi}\, e^{e_1 e_2\, \psi}\, e^{e_1 e_3\, \theta} = |q|\, e^{i\phi}\, e^{k\psi}\, e^{j\theta}, \tag{3.36}$$
in which $(\phi, \theta, \psi) \in [-\pi, \pi[\, \times\, [-\frac{\pi}{2}, \frac{\pi}{2}[\, \times\, [-\frac{\pi}{4}, \frac{\pi}{4}]$. For a unit quaternion $q = q_0 + q_x i + q_y j + q_z k$, $|q| = 1$, its phase $\psi$ can be evaluated first by computing
$$\psi = \frac{\arcsin\left(2(q_x q_y - q_0 q_z)\right)}{2}$$
and then by checking that the triplet adheres to the following rules:
– If $\psi \in\; ]-\frac{\pi}{4}, \frac{\pi}{4}[$, then $\phi = \frac{1}{2}\arg_i\!\left(q\, T_j(\bar{q})\right)$ and $\theta = \frac{1}{2}\arg_j\!\left(T_i(\bar{q})\, q\right)$.
– If $\psi = \pm\frac{\pi}{4}$, then select either $\phi = 0$ and $\theta = \frac{1}{2}\arg_j\!\left(T_k(\bar{q})\, q\right)$, or $\theta = 0$ and $\phi = \frac{1}{2}\arg_i\!\left(q\, T_k(\bar{q})\right)$.
– If $e^{i\phi} e^{k\psi} e^{j\theta} = -q$ and $\phi \geq 0$, then $\phi \to \phi - \pi$.
– If $e^{i\phi} e^{k\psi} e^{j\theta} = -q$ and $\phi < 0$, then $\phi \to \phi + \pi$.
The reader can find the details of the development of these rules in [29]. The concept of a quaternionic Hermitian function is very useful for the computation of the inverse quaternionic Fourier transform using the quaternionic analytic signal, as we will see in Chap. 8. As an extension of a Hermitian function $f: \mathbb{R} \to \mathbb{C}$, with $f(-x) = \overline{f(x)}$ for every $x \in \mathbb{R}$, we regard $f: \mathbb{R}^2 \to \mathbb{H}$ as a quaternionic Hermitian function if it fulfills the following nontrivial involution rules [37]:
$$f(x, -y) = -j\, f(x, y)\, j = T_j(f(x, y)),$$
$$f(-x, y) = -i\, f(x, y)\, i = T_i(f(x, y)),$$
$$f(-x, -y) = -i\, f(x, -y)\, i = -i\left(-j\, f(x, y)\, j\right) i = (ij)\, f(x, y)\, (ji) = -k\, f(x, y)\, k = T_k(f(x, y)). \tag{3.37}$$
3.5 Lie Algebras and Bivector Algebras

This section begins with a brief introduction to Lie group theory. In an abstract way, using matrix algebra, a Lie group is defined as a manifold M embedded in the space $\mathbb{R}^{n \times n}$, together with a product $\phi(x, y)$. Points on the manifold are vectors of dimension $n^2$, each representing a group matrix of size $n \times n$. The product $\phi(x, y)$ encodes the group product; it takes two points as arguments and returns a third one. The following conditions on the product $\phi(x, y)$ ensure that it has the correct group properties.
(i) Closure: $\phi(x, y) \in M$, $\forall\, x, y \in M$.
(ii) Identity: There exists an element $e \in M$ such that $\phi(e, x) = \phi(x, e) = x$, $\forall\, x \in M$.
(iii) Inverse: For every element $x \in M$ there exists a unique element $\hat{x}$ such that $\phi(x, \hat{x}) = \phi(\hat{x}, x) = e$.
(iv) Associativity: $\phi(\phi(x, y), z) = \phi(x, \phi(y, z))$, $\forall\, x, y, z \in M$.
In general, any manifold equipped with a product that fulfills properties (i)–(iv) is called a Lie group manifold. Most of the group properties can be elucidated by analyzing the manifold near the identity element e. The product $\phi(x, y)$ induces a Lie bracket structure on the elements of the tangent space at the identity e. This tangent space is linear, and it is spanned by a set of Lie algebra vectors, also called Lie algebra generators. These basis vectors together with their bracket form a Lie algebra. In the geometric algebra framework, we can conveniently reveal the properties of a Lie algebra using bivector algebras; furthermore, bivector algebra makes it possible to represent Lie groups of the general linear group using few coefficients: for example, a 3D rotation matrix has nine real entries, whereas its isomorphic rotor counterpart has only four.
3.5.1 Lie Group of Rotors

Let us analyze the Lie group properties of the 3D geometric algebra $G_3$ using rotors. Choose a rotor, $R_\alpha$, and imagine a family of rotors, $R(\alpha)$, for which $R(0) = 1$:
$$R(\alpha) = R_\alpha. \tag{3.38}$$
This shows that the rotor can be obtained from the identity by a continuous set of rotor transformations. In fact, there are many possible paths connecting $R_\alpha$ to the identity, but there is only one path with the additional property that
$$R(\alpha + \beta) = R(\alpha)\, R(\beta). \tag{3.39}$$
This property defines a one-parameter subgroup of the rotor group, and it represents all rotations in a fixed oriented plane. Now consider a family of vectors $v(\alpha) = R v_0 \tilde{R}$, where $v_0$ is some fixed initial vector, and differentiate the relationship $R\tilde{R} = 1$:
$$\frac{d}{d\alpha}\left(R\tilde{R}\right) = R'\tilde{R} + R\tilde{R}' = 0, \tag{3.40}$$
where the prime denotes differentiation. By using these relations, we can compute
$$\frac{d}{d\alpha}\, v(\alpha) = R' v_0 \tilde{R} + R v_0 \tilde{R}' = (R'\tilde{R})\, v(\alpha) - v(\alpha)\, (R'\tilde{R}) = (2R'\tilde{R}) \cdot v(\alpha). \tag{3.41}$$
Here we make use of the inner product between a bivector and a vector. In this equation, the derivative of a vector should again be a vector, as is given by the inner product of $(2R'\tilde{R})$ with $v(\alpha)$. Since $R'\tilde{R}$ is a bivector, we define $B(\alpha)$ by
$$2R'\tilde{R} = -B(\alpha) \quad\Rightarrow\quad \frac{dR}{d\alpha} = R' = -\frac{1}{2}\, B(\alpha)\, R. \tag{3.42}$$
This result is valid for any parameterized set of rotors; restricted to the curve defined by Eq. 3.39, one gets the following equations:
$$\frac{d}{d\alpha}\, R(\alpha + \beta) = -\frac{1}{2}\, B(\alpha + \beta)\, R(\alpha + \beta) = -\frac{1}{2}\, B(\alpha + \beta)\, R(\alpha)\, R(\beta),$$
$$\frac{d}{d\alpha}\left[R(\alpha)\, R(\beta)\right] = -\frac{1}{2}\, B(\alpha)\, R(\alpha)\, R(\beta). \tag{3.43}$$
This result shows that the bivector B remains constant along this curve. Integrating Eq. 3.42, one gets
$$R(\alpha) = e^{-\alpha \frac{B}{2}}. \tag{3.44}$$
Let us find the equivalent bivector expression of a rotated vector
$$v(\alpha) = R(\alpha)\, v_0\, \tilde{R}(\alpha) = e^{-\alpha \frac{B}{2}}\, v_0\, e^{\alpha \frac{B}{2}}. \tag{3.45}$$
Differentiating it successively,
$$\frac{dv}{d\alpha} = e^{-\alpha \frac{B}{2}}\, (v_0 \cdot B)\, e^{\alpha \frac{B}{2}}, \qquad \frac{d^2 v}{d\alpha^2} = e^{-\alpha \frac{B}{2}}\, \left((v_0 \cdot B) \cdot B\right)\, e^{\alpha \frac{B}{2}}, \qquad \text{etc.} \tag{3.46}$$
One sees that each derivative introduces one additional inner product with B. Since this operation is grade-preserving, the resulting rotated vector can be expressed as a useful Taylor expansion
$$R(\alpha)\, v\, \tilde{R}(\alpha) = e^{-\alpha \frac{B}{2}}\, v\, e^{\alpha \frac{B}{2}} = v + \alpha\, v \cdot B + \frac{\alpha^2}{2!}\, (v \cdot B) \cdot B + \cdots. \tag{3.47}$$
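The expansion (3.47) can be checked numerically. In $G_3$, writing $B = Ib$ for an axis vector b, the inner product $v \cdot B$ reduces to the cross product $b \times v$, so the Taylor series becomes the matrix exponential of the cross-product map. The sketch below (helper names are ours; $\alpha$ is absorbed into b) compares the truncated series against the closed-form Rodrigues rotation:

```python
import numpy as np

def rotate_series(v, b, terms=30):
    """Sum v + v.B + (v.B).B/2! + ... with B = I b (Eq. 3.47);
    in 3D each inner product with B is the cross product b x (.)."""
    out = np.zeros(3)
    term = v.astype(float)
    for n in range(terms):
        out += term
        term = np.cross(b, term) / (n + 1)   # next Taylor term
    return out

def rotate_rodrigues(v, b):
    """Exact rotation of v by the angle |b| about the axis b/|b|."""
    theta = np.linalg.norm(b)
    n = b / theta
    return (np.cos(theta) * v
            + np.sin(theta) * np.cross(n, v)
            + (1.0 - np.cos(theta)) * n * np.dot(n, v))
```

Thirty terms are far more than needed for moderate angles; the factorials make the series converge extremely fast.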
3.5.2 Bivector Lie Algebra

It is easy to prove that the operation of commuting a multivector with a bivector is always grade-preserving. In particular, the commutator of two bivectors yields a third
bivector; thus, it follows that the space of bivectors is closed under the commutator product. This closed algebra is in fact a Lie algebra, which captures most of the properties of the associated Lie group of rotors. The Lie group of rotors is thus obtained from the bivector algebra by means of the exponential map. In a Lie algebra, elements are combined by applying the Lie bracket, which is antisymmetric and satisfies the so-called Jacobi identity. In bivector algebra, the Lie bracket is just the commutator product of bivectors, written $\times$. Given three bivectors X, Y, Z, the Jacobi identity is given by
$$(X \times Y) \times Z + (Z \times X) \times Y + (Y \times Z) \times X = 0. \tag{3.48}$$
Proving this identity is simple; it only requires expanding each commutator product. Given a set of basis bivectors $\{B_i\}$, the commutator of any pair of these bivectors returns a third bivector, which can be expanded as a linear combination of the basis. Therefore, we can express the commutator operation as follows:
$$B_j \times B_k = C^i_{jk}\, B_i. \tag{3.49}$$
The set of coefficients $C^i_{jk}$ is called the structure constants of the Lie algebra; they can be used to recover most properties of the corresponding Lie group. The classification of all possible Lie algebras can be carried out using the structure constants; the mathematician É. Cartan completed the solution of this problem.
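The structure constants of Eq. 3.49 can be computed mechanically. For the rotor group of $G_3$, the bivector basis can be modeled by the quaternion units (a faithful representation of this Lie algebra, up to the sign convention chosen for the basis); expanding each commutator on the basis fills in $C^i_{jk}$. The helper `qmul` and the array layout are our own choices:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

units = np.eye(4)[1:]        # i, j, k modeling the bivector basis
C = np.zeros((3, 3, 3))      # structure constants C[i, j, k]
for j in range(3):
    for k in range(3):
        # commutator product B_j x B_k = (B_j B_k - B_k B_j) / 2
        comm = 0.5 * (qmul(units[j], units[k]) - qmul(units[k], units[j]))
        C[:, j, k] = comm[1:]   # expand on the basis (Eq. 3.49)
```

In this sign convention the result is exactly the alternating tensor $\epsilon_{ijk}$, which is the content of exercise 3.18.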
3.5.3 Complex Structures and Unitary Groups

In geometric algebra, one can represent complex numbers by defining one basis vector for the real axis and another one, which squares to $-1$, for the imaginary axis. This suggests that an n-D complex space can have a natural realization in a 2n-D space. Suppose that the n-D space has the vector basis $\{e_i\}$; we now expand this basis set by a second set of basis vectors $\{\bar{e}_i\}$, which fulfill the following properties:
$$e_i \cdot e_j = \bar{e}_i \cdot \bar{e}_j = \delta_{ij}, \qquad e_i \cdot \bar{e}_j = 0, \qquad \forall\, i, j. \tag{3.50}$$
One introduces a complex structure through the so-called doubling bivector:
$$J = e_1\bar{e}_1 + e_2\bar{e}_2 + e_3\bar{e}_3 + \cdots + e_n\bar{e}_n = e_i \wedge \bar{e}_i. \tag{3.51}$$
This sum consists of n commuting blades of grade two; each one plays the role of an imaginary unit representing an oriented rotation plane. The doubling bivector satisfies the following conditions:
$$J \cdot \bar{e}_i = (e_j \wedge \bar{e}_j) \cdot \bar{e}_i = e_j\, \delta_{ij} = e_i, \qquad J \cdot e_i = (e_j \wedge \bar{e}_j) \cdot e_i = -\bar{e}_i. \tag{3.52}$$
This computation shows the interesting role of the doubling bivector J, which maps one n-D space onto the other. Other useful relations follow:
$$J \cdot (J \cdot e_i) = -J \cdot \bar{e}_i = -e_i, \qquad J \cdot (J \cdot \bar{e}_i) = J \cdot e_i = -\bar{e}_i, \tag{3.53}$$
thus for any vector v in the 2n-D space,
$$J \cdot (J \cdot v) = (v \cdot J) \cdot J = -v, \qquad \forall\, v. \tag{3.54}$$
Similarly as in Eq. 3.47, the Taylor expansion involving the doubling bivector J describes a series of coupled rotations in the $e_i \wedge \bar{e}_i$ planes. Using trigonometric identities, the Taylor expansion of $Rv\tilde{R}$ can be written as follows:
$$R v \tilde{R} = e^{-\theta \frac{J}{2}}\, v\, e^{\theta \frac{J}{2}} = v + \theta\, v \cdot J + \frac{\theta^2}{2!}\, (v \cdot J) \cdot J + \cdots$$
$$= \left(1 - \frac{\theta^2}{2!} + \frac{\theta^4}{4!} - \cdots\right) v + \left(\theta - \frac{\theta^3}{3!} + \cdots\right) v \cdot J = \cos\theta\; v + \sin\theta\; v \cdot J. \tag{3.55}$$
3.5.4 Hermitian Inner Product and Unitary Groups

In the study of unitary groups, one focuses on the Hermitian inner product, as it is left invariant under the group actions. Consider a pair of complex vectors:
$$X_i = x_i + iy_i, \qquad Y_i = z_i + iw_i \in \mathbb{C}. \tag{3.56}$$
Their Hermitian inner product is
$$\langle X|Y\rangle = X_i\, \bar{Y}_i = x_i z_i + y_i w_i + i\,(y_i z_i - x_i w_i). \tag{3.57}$$
We are looking for an analog in the 2n-D space. For that, let us first introduce the following vectors:
$$x = x_i e_i + y_i \bar{e}_i, \qquad y = z_i e_i + w_i \bar{e}_i. \tag{3.58}$$
The real and imaginary components of the Hermitian inner product are
$$x \cdot e_i\; y \cdot e_i + x \cdot \bar{e}_i\; y \cdot \bar{e}_i = x_i z_i + y_i w_i = x \cdot y,$$
$$x \cdot e_i\; y \cdot \bar{e}_i - x \cdot \bar{e}_i\; y \cdot e_i = \left(x \cdot e_i\; y - y \cdot e_i\; x\right) \cdot \bar{e}_i = \left[(y \wedge x) \cdot e_i\right] \cdot \bar{e}_i = (y \wedge x) \cdot (e_i \wedge \bar{e}_i) = (y \wedge x) \cdot J, \tag{3.59}$$
thus we can write the Hermitian inner product in the compact form
$$\langle x|y\rangle = x \cdot y - i\,(y \wedge x) \cdot J. \tag{3.60}$$
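Equation 3.60 is easy to validate numerically: for vectors encoded as in Eq. 3.58, the term $(y \wedge x) \cdot J$ reduces to $\sum_i (x \cdot e_i\; y \cdot \bar{e}_i - x \cdot \bar{e}_i\; y \cdot e_i)$, i.e., a difference of dot products between the two halves of the 2n-D coordinate vectors. A small sketch (variable names are ours):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
X = rng.normal(size=n) + 1j * rng.normal(size=n)   # X_i = x_i + i y_i
Y = rng.normal(size=n) + 1j * rng.normal(size=n)   # Y_i = z_i + i w_i

# Hermitian inner product in the convention of Eq. 3.57: sum X_i conj(Y_i)
hermitian = np.sum(X * np.conj(Y))

# Real 2n-D encoding (Eq. 3.58): x = x_i e_i + y_i ebar_i, etc.
x = np.concatenate([X.real, X.imag])
y = np.concatenate([Y.real, Y.imag])

dot = x @ y                                   # real part: x . y
yx_J = x[:n] @ y[n:] - x[n:] @ y[:n]          # (y ^ x) . J
reconstructed = dot - 1j * yx_J               # Eq. 3.60
```

The same encoding also shows directly that $\langle x|x\rangle$ is real, since the antisymmetric term $(x \wedge x) \cdot J$ vanishes.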
This is a mapping from the 2n-D space onto the complex numbers; note that $\langle x|x\rangle$ is real. In order to determine the invariance group of the Hermitian inner product, let us start with the equality
$$(x' \wedge y') \cdot J = (x \wedge y) \cdot J, \tag{3.61}$$
where $x' = Rx\tilde{R}$ and $y' = Ry\tilde{R}$. We compute
$$(x' \wedge y') \cdot J = \langle x' \wedge y'\; J\rangle = \langle Rx\tilde{R}\, Ry\tilde{R}\, J\rangle = \langle xy\, \tilde{R}JR\rangle = (x \wedge y) \cdot (\tilde{R}JR); \tag{3.62}$$
this equation holds for all x, y of the 2n-D space, thus the following equality has to be true:
$$J = \tilde{R}JR. \tag{3.63}$$
The rotor group that leaves J invariant satisfies this equality; it defines the unitary group, which is denoted as U(n). This little computation shows us the way we can formulate complex groups as subgroups of real rotation groups in 2n-D spaces. A rotor is given by
$$R = e^{-\frac{B}{2}}, \tag{3.64}$$
where the bivector generators of the unitary group satisfy
$$B \times J = 0. \tag{3.65}$$
This defines the bivector realization of the Lie algebra of the unitary group, and it is called u(n). In order to construct bivectors satisfying this relation, we resort to the Jacobi identity to prove that
$$\left[(x \cdot J) \wedge (y \cdot J)\right] \times J = (y \cdot J) \wedge x - (x \cdot J) \wedge y = -(x \wedge y) \times J, \tag{3.66}$$
and this leads to
$$\left[x \wedge y + (x \cdot J) \wedge (y \cdot J)\right] \times J = 0. \tag{3.67}$$
Note that any bivector of the form appearing on the left-hand side commutes with J. Now, trying all the combinations of the pairs $\{e_i, \bar{e}_i\}$, we find the following Lie algebra basis for u(n):
$$J_i = e_i \bar{e}_i, \qquad E_{ij} = e_i e_j + \bar{e}_i \bar{e}_j \quad (i < j = 1, \ldots, n), \qquad F_{ij} = e_i \bar{e}_j - \bar{e}_i e_j \quad (i < j = 1, \ldots, n). \tag{3.68}$$
It is easy to establish the closure of this algebra under the commutator product. The doubling bivector J belongs to this algebra, and it commutes with all of its other elements. Omitting the generator J, this algebra defines the special unitary group SU(n).
3.6 4D Geometric Algebra for 3D Kinematics

Usually, problems of robotics are treated in algebraic systems of 2D and 3D spaces. In the case of 3D rigid motion, or Euclidean transformation, we are confronted with a nonlinear mapping; however, if we employ homogeneous coordinates in a 4D geometric algebra, we can linearize the rigid motion in 3D Euclidean space. That is why we choose three basis vectors that square to one and a fourth vector that squares to zero, which provides dual copies of the multivectors of the 3D space. In other words, we extend the Euclidean geometric algebra $G_{3,0,0}$ to the special or degenerated geometric algebra $G_{3,0,1}$, which is spanned via the following basis:
$$\underbrace{1}_{\text{scalar}}, \quad \underbrace{\gamma_k}_{\text{4 vectors}}, \quad \underbrace{\gamma_2\gamma_3,\; \gamma_3\gamma_1,\; \gamma_1\gamma_2,\; \gamma_4\gamma_1,\; \gamma_4\gamma_2,\; \gamma_4\gamma_3}_{\text{6 bivectors}}, \quad \underbrace{I\gamma_k}_{\text{4 pseudovectors}}, \quad \underbrace{I}_{\text{unit pseudoscalar}}, \tag{3.69}$$
where $\gamma_4^2 = 0$ and $\gamma_k^2 = +1$ for $k = 1, 2, 3$. The unit pseudoscalar is $I = \gamma_1\gamma_2\gamma_3\gamma_4$, with
$$I^2 = (\gamma_1\gamma_2\gamma_3\gamma_4)(\gamma_1\gamma_2\gamma_3\gamma_4) = -(\gamma_3\gamma_4)(\gamma_3\gamma_4) = 0. \tag{3.70}$$
The motor algebra $G^+_{3,0,1}$, which is the even subalgebra of $G_{3,0,1}$, can be utilized to obtain linear 4D models of the 3D motion of points, lines, and planes.
3.6.1 Motor Algebra

The word motor is an abbreviation of "moment and vector." Clifford introduced motors under the name biquaternions [41]. Motors are isomorphic to dual quaternions, with the necessary condition $I^2 = 0$. They can be found in the special 4D even subalgebra of $G_{3,0,1}$ introduced in Sect. 3.6. This even subalgebra is denoted by $G^+_{3,0,1}$ and is spanned only via a bivector basis, as follows:
$$\underbrace{1}_{\text{scalar}}, \quad \underbrace{\gamma_2\gamma_3,\; \gamma_3\gamma_1,\; \gamma_1\gamma_2,\; \gamma_4\gamma_1,\; \gamma_4\gamma_2,\; \gamma_4\gamma_3}_{\text{6 bivectors}}, \quad \underbrace{I}_{\text{unit pseudoscalar}}. \tag{3.71}$$
This kind of basis structure also allows us to represent spinors, which are composed of scalar and bivector terms. Motors, then, are also spinors, and as such they represent a special kind of rotor. Because a Euclidean transformation includes both rotation and translation, we will show in the following section a spinor representation for both transformations in the definition of motors. But we must first show the relationship between motors and screw motion theory. Note that the bivector terms of the basis correspond to the same basis used for spanning 3D lines. Note also that the dual of a scalar is the pseudoscalar and that the duals of the first three basis bivectors are actually the next three bivectors, that is, $(\gamma_2\gamma_3)^* = I\gamma_2\gamma_3 = \gamma_4\gamma_1$. We mentioned in Sect. 3.3.1 that a rotor relates two vectors in 3D space. According to Clifford [41], a motor operation is necessary to convert the rotation axis of one rotor into the rotation axis of a second rotor. Each rotor can be geometrically represented as a rotation plane with the rotation axis normal to this plane. Figure 3.6a
Fig. 3.6 Screw motion about the line axis l ($t_s$: longitudinal displacement by d; $R_s$: rotation by the angle $\theta$): (a) motor relating two axis lines, (b) motor applied to an object, (c) degenerated motor relating two coplanar rotors (note: the indicated 3D vectors are represented as bivectors in the text)
depicts a motor action in detail. Note that the involved rotor axes are represented as line axes. In the figure, we first orient one axis parallel to the other by applying the rotor $R_s$. Then, we slide the rotated axis a distance d along the connecting axis, so that it ends up overlapping the axis of the second rotor. Altogether, this operation can be described as a twist about a screw with the line axis l, whose pitch equals $d/\theta$ for $\theta \neq 0$. A motor, then, is specified by the direction and position of the screw-axis line, the twist's angular magnitude, and the pitch. Figure 3.6b shows the action of a motor on a real object. In this case, the motor relates the rotation-axis line of the initial position of the object to the rotation-axis line of its final position. Note that in both figures the angle $\theta$ and the sliding distance d indicate how the rigid displacement takes place around and along the screw-axis line l, respectively. A degenerated motor can only rotate, not slide, along the line l, as Fig. 3.6c shows. In this case, therefore, the two axes are coplanar.
3.6.2 Motors, Rotors, and Translators in $G^+_{3,0,1}$

Since a rigid motion consists of the rotation and translation transformations, it should be possible to split a motor multiplicatively in terms of these two spinor transformations, which we will call a rotor and a translator. In the following discussion, we will denote all bivector components of a spinor by bold lowercase letters. Let us now express this procedure algebraically. First of all, let us consider a simple rotor in its Euler representation for a rotation with an angle $\theta$,
$$R = a_0 + a_1\gamma_2\gamma_3 + a_2\gamma_3\gamma_1 + a_3\gamma_1\gamma_2 = a_0 + \boldsymbol{a} = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\; \boldsymbol{n} = a_c + a_s \boldsymbol{n}, \tag{3.72}$$
where $\boldsymbol{n}$ is the unit 3D bivector of the rotation axis, spanned by the bivector basis $\gamma_2\gamma_3$, $\gamma_3\gamma_1$, $\gamma_1\gamma_2$, and $a_c, a_s \in \mathbb{R}$. Now, dealing with the rotor of a screw motion, the rotation-axis vector should be represented as a screw-axis line. For that, we must relate the rotation axis to a reference coordinate system at the distance $\boldsymbol{t}_c$. A 3D translation in motor algebra is represented by a spinor $T_c$, called a translator. If we apply a translator from the left to the rotor R, and then apply the translator's conjugate from the right, we get a modified rotor,
$$R_s = T_c R \tilde{T}_c = \left(1 + I\frac{\boldsymbol{t}_c}{2}\right)(a_0 + \boldsymbol{a})\left(1 - I\frac{\boldsymbol{t}_c}{2}\right) = a_0 + \boldsymbol{a} + I\frac{\boldsymbol{t}_c}{2}a_0 + I\frac{\boldsymbol{t}_c}{2}\boldsymbol{a} - Ia_0\frac{\boldsymbol{t}_c}{2} - I\boldsymbol{a}\frac{\boldsymbol{t}_c}{2}$$
$$= a_0 + \boldsymbol{a} + I\left(\frac{\boldsymbol{t}_c}{2}\boldsymbol{a} - \boldsymbol{a}\frac{\boldsymbol{t}_c}{2}\right) = a_0 + \boldsymbol{a} + I\,(\boldsymbol{a} \wedge \boldsymbol{t}_c). \tag{3.73}$$
Here, $\boldsymbol{t}_c$ is the 3D vector of translation, spanned by the bivector basis $\gamma_2\gamma_3$, $\gamma_3\gamma_1$, $\gamma_1\gamma_2$. Then, expressing the last equation in Euler terms, we get the spinor representation
$$R_s = a_0 + a_s\boldsymbol{n} + Ia_s\; \boldsymbol{n} \wedge \boldsymbol{t}_c = a_c + a_s(\boldsymbol{n} + I\boldsymbol{m}) = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\,(\boldsymbol{n} + I\boldsymbol{m}) = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\; \boldsymbol{l}. \tag{3.74}$$
This result is indeed interesting, because the new rotor $R_s$ can now be applied with respect to an axis line $\boldsymbol{l}$ expressed in dual terms of the direction $\boldsymbol{n}$ and the moment $\boldsymbol{m} = \boldsymbol{n} \wedge \boldsymbol{t}_c$. Now, to finally define the motor, let us slide the distance $\boldsymbol{t}_s = d\boldsymbol{n}$ along the rotation-axis line $\boldsymbol{l}$. Since a motor is applied from the left and its conjugate from the right, we should use half of $\boldsymbol{t}_s$ in the spinor expression of $T_s$ when we define the motor:
$$M = T_s R_s = \left(1 + I\frac{\boldsymbol{t}_s}{2}\right)(a_0 + \boldsymbol{a} + I\, \boldsymbol{a} \wedge \boldsymbol{t}_c) = \left(1 + I\frac{d\boldsymbol{n}}{2}\right)(a_c + a_s\boldsymbol{n} + Ia_s\, \boldsymbol{n} \wedge \boldsymbol{t}_c)$$
$$= a_c + a_s\boldsymbol{n} + Ia_s\, \boldsymbol{n} \wedge \boldsymbol{t}_c + I\frac{d}{2}a_c\boldsymbol{n} - I\frac{d}{2}a_s = \left(a_c - I\frac{d}{2}a_s\right) + \left(a_s + I\frac{d}{2}a_c\right)(\boldsymbol{n} + I\, \boldsymbol{n} \wedge \boldsymbol{t}_c)$$
$$= \left(a_c - I\frac{d}{2}a_s\right) + \left(a_s + I\frac{d}{2}a_c\right)\boldsymbol{l}. \tag{3.75}$$
Note that this expression of the motor makes explicit the unit line bivector of the screw-axis line $\boldsymbol{l}$. Now let us express a motor using the Euler representation. By substituting the constants $a_c = \cos\frac{\theta}{2}$ and $a_s = \sin\frac{\theta}{2}$ into the motor equation (3.75) and using the property of Eq. 3.1, we get
$$M = T_s R_s = \left(\cos\frac{\theta}{2} - I\frac{d}{2}\sin\frac{\theta}{2}\right) + \left(\sin\frac{\theta}{2} + I\frac{d}{2}\cos\frac{\theta}{2}\right)\boldsymbol{l} = \cos\left(\frac{\theta}{2} + I\frac{d}{2}\right) + \sin\left(\frac{\theta}{2} + I\frac{d}{2}\right)\boldsymbol{l}, \tag{3.76}$$
which is a dual-number representation of the spinor. Now, let us analyze the resulting expressions:
$$R = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\; \boldsymbol{n}, \qquad R_s = \cos\frac{\theta}{2} + \sin\frac{\theta}{2}\; \boldsymbol{l}, \qquad M = \cos\left(\frac{\theta}{2} + I\frac{d}{2}\right) + \sin\left(\frac{\theta}{2} + I\frac{d}{2}\right)\boldsymbol{l}. \tag{3.77}$$
We can see that the rotation axis $\boldsymbol{n}$ of the simple rotor R is changed to a rotation-axis line, so that $R_s$ now rotates about an axis line. And in the motor expression, the information about the sliding distance d is now made explicit in terms of dual arguments of the trigonometric functions. It is also interesting to note that the expression for the motor using dual angles simply extends the expression of $R_s$. If we expand the exponential function of the dual bivectors using a Taylor series, the result follows the general expression $e^{\alpha + I\beta} = e^{\alpha} + Ie^{\alpha}\beta = e^{\alpha}(1 + I\beta)$, which is a special case of Eq. 3.1. Once again, we obtain the motor expression as the spinor
$$M = e^{I\frac{\boldsymbol{t}_s}{2}}\, e^{\frac{\theta}{2}\boldsymbol{l}} = \left(1 + I\frac{\boldsymbol{t}_s}{2}\right) e^{\frac{\theta}{2}\boldsymbol{l}} = T_s R_s, \tag{3.78}$$
where $I\frac{\boldsymbol{t}_s}{2} = I\frac{1}{2}(t_1\gamma_2\gamma_3 + t_2\gamma_3\gamma_1 + t_3\gamma_1\gamma_2) = \frac{1}{2}(t_1\gamma_4\gamma_1 + t_2\gamma_4\gamma_2 + t_3\gamma_4\gamma_3)$. If we want to express the motor using only rotors, in a dual spinor representation, we proceed as follows:
$$M = T_s R_s = \left(1 + I\frac{\boldsymbol{t}_s}{2}\right) R_s = R_s + I\frac{\boldsymbol{t}_s}{2} R_s. \tag{3.79}$$
Let us consider carefully the resulting dual part of the motor. This is the geometric product of the bivector $\frac{\boldsymbol{t}_s}{2}$ and the rotor $R_s$. Since both are expressed in terms of the same bivector basis, their geometric product will also be expressed in this basis, and it can be considered as a new rotor $R'_s$. Thus, we can further write
$$M = R_s + I\frac{\boldsymbol{t}_s}{2} R_s = R_s + IR'_s. \tag{3.80}$$
In this equation, the line axes of the rotors are skewed (see Fig. 3.6a); that is, they represent the general case of non-coplanar rotors. If the sliding distance $\boldsymbol{t}_s$ is zero, then the motor degenerates to a rotor:
$$M = T_s R_s = \left(1 + I\frac{\boldsymbol{0}}{2}\right) R_s = R_s. \tag{3.81}$$
In this case, that is, when the two generating axis lines of the motor are coplanar, we get the so-called degenerated motor (see Fig. 3.6c). Finally, the bivector $\boldsymbol{t}_s$ can be expressed in terms of the rotors using previous results:
$$I\frac{\boldsymbol{t}_s}{2} R_s = IR'_s \quad\Rightarrow\quad \frac{\boldsymbol{t}_s}{2} = R'_s \tilde{R}_s; \tag{3.82}$$
therefore,
$$\boldsymbol{t}_s = 2R'_s \tilde{R}_s. \tag{3.83}$$
Figure 3.6 shows that the 3D vector $\boldsymbol{t}$, expressed in the bivector basis, is referred to the rotation axis of the rotor, and that $\boldsymbol{t}_s$ is a bivector along the motor-axis line. Thus, $\boldsymbol{t}$, considered here as a bivector, can be computed in terms of the bivectors $\boldsymbol{t}_c$ and $\boldsymbol{t}_s$, as follows:
$$\boldsymbol{t} = \boldsymbol{t}_\perp + \boldsymbol{t}_\parallel = (\boldsymbol{t}_c - R_s \boldsymbol{t}_c \tilde{R}_s) + (\boldsymbol{t} \cdot \boldsymbol{n})\boldsymbol{n} = (\boldsymbol{t}_c - R_s \boldsymbol{t}_c \tilde{R}_s) + d\boldsymbol{n} = \boldsymbol{t}_c - R_s \boldsymbol{t}_c \tilde{R}_s + \boldsymbol{t}_s = \boldsymbol{t}_c - R_s \boldsymbol{t}_c \tilde{R}_s + 2R'_s \tilde{R}_s. \tag{3.84}$$
So far, we have analyzed the motor from a geometrical point of view. Next, we will look at the motor’s relevant algebraic properties.
3.6.3 Properties of Motors

A general motor can be expressed as
$$M_\alpha = \alpha M, \tag{3.85}$$
where $\alpha \in \mathbb{R}$ and M is a unit motor, as explained in the previous sections. In this section, we will employ unit motors. The norm of a motor M is defined as follows:
$$|M| = M\tilde{M} = T_s R_s \tilde{R}_s \tilde{T}_s = \left(1 + I\frac{\boldsymbol{t}_s}{2}\right) R_s \tilde{R}_s \left(1 - I\frac{\boldsymbol{t}_s}{2}\right) = 1 + I\frac{\boldsymbol{t}_s}{2} - I\frac{\boldsymbol{t}_s}{2} = 1, \tag{3.86}$$
where $\tilde{M}$ is the conjugate motor and 1 is the identity of the motor multiplication. Now, using Eq. 3.80 and considering the unit motor magnitude, we find two useful properties, expressed by
$$|M| = M\tilde{M} = (R_s + IR'_s)(\tilde{R}_s + I\tilde{R}'_s) = R_s\tilde{R}_s + I\left(R'_s\tilde{R}_s + R_s\tilde{R}'_s\right) = 1. \tag{3.87}$$
These equations require the following constraints:
$$R_s\tilde{R}_s = 1, \tag{3.88}$$
$$R'_s\tilde{R}_s + R_s\tilde{R}'_s = 0. \tag{3.89}$$
Now we can show that the combination of two rigid motions can be expressed using two consecutive motors. The resultant motor describes the overall displacement, namely,
$$M_c = M_a M_b = (R_{sa} + IR'_{sa})(R_{sb} + IR'_{sb}) = R_{sa}R_{sb} + I\left(R_{sa}R'_{sb} + R'_{sa}R_{sb}\right) = R_{sc} + IR'_{sc}. \tag{3.90}$$
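The composition rule (3.90) can be exercised in the isomorphic dual-quaternion representation (Sect. 3.6.1). The sketch below uses our own helper names and the convention $M = T_s R_s = r + I\,(t/2)\,r$, with r a unit quaternion and t a pure quaternion; composing two motors then reproduces exactly the rotation and translation obtained by chaining the rigid maps $x \mapsto RxR^{-1} + t$:

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as (w, x, y, z)."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def motor(axis, angle, t):
    """Motor M = T R as a dual quaternion (real, dual): the real part
    is the rotor r, the dual part is (t/2) r with t pure."""
    n = axis / np.linalg.norm(axis)
    r = np.concatenate(([np.cos(angle / 2)], np.sin(angle / 2) * n))
    tq = np.concatenate(([0.0], t))
    return r, 0.5 * qmul(tq, r)

def dq_mul(a, b):
    """Dual-quaternion product: real parts multiply; dual parts combine
    as in Eq. 3.90, since I^2 = 0."""
    return qmul(a[0], b[0]), qmul(a[0], b[1]) + qmul(a[1], b[0])

def translation_of(m):
    """Recover t from M = r + eps (t/2) r as 2 * dual * conj(real)."""
    return 2.0 * qmul(m[1], qconj(m[0]))[1:]
```

Note the split visible in `dq_mul`: the rotation parts combine multiplicatively, while the dual (translation-carrying) parts combine additively, exactly as stated after Eq. 3.90.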
Note that, on the one hand, the pure rotations combine multiplicatively, while, on the other hand, the dual parts containing the translation combine additively. Using Eq. 3.80, let us express a motor in terms of dual spinors:
$$M = T_s R_s = R_s + IR'_s = (a_0 + a_1\gamma_2\gamma_3 + a_2\gamma_3\gamma_1 + a_3\gamma_1\gamma_2) + I(b_0 + b_1\gamma_2\gamma_3 + b_2\gamma_3\gamma_1 + b_3\gamma_1\gamma_2) = (a_0 + \boldsymbol{a}) + I(b_0 + \boldsymbol{b}). \tag{3.91}$$
We can use another notation to emphasize the components of the real and dual parts of the motor, as follows:
$$M = (a_0, \boldsymbol{a}) + I(b_0, \boldsymbol{b}). \tag{3.92}$$
Here, each term within the brackets consists of a scalar part and a 3D bivector. A motor expressed in terms of a translator and a rotor is applied, as in the case of a rotor, from the left, with its conjugate from the right. These left and right operations, called motor reflections, are used to build an automorphism equivalent to the screw. Yet, by conjugating only the rotor or only the translator for the second reflection, we can derive different types of automorphisms. By changing the sign of the scalar and bivector parts in the real and dual components of the motor, we get the following variations:
$$M = (a_0 + \boldsymbol{a}) + I(b_0 + \boldsymbol{b}) = T_s R_s,$$
$$\tilde{M} = (a_0 - \boldsymbol{a}) + I(b_0 - \boldsymbol{b}) = \tilde{R}_s \tilde{T}_s,$$
$$\bar{M} = (a_0 + \boldsymbol{a}) - I(b_0 + \boldsymbol{b}) = R_s \tilde{T}_s,$$
$$\bar{\tilde{M}} = (a_0 - \boldsymbol{a}) - I(b_0 - \boldsymbol{b}) = \tilde{R}_s T_s. \tag{3.93}$$
The first, second, and fourth versions will be used for modeling the motion of points, lines, and planes, respectively. Using Eq. 3.93, it is now straightforward to compute the expressions for the individual components:
$$a_0 = \frac{1}{4}\left(M + \tilde{M} + \bar{M} + \bar{\tilde{M}}\right), \qquad \boldsymbol{a} = \frac{1}{4}\left(M - \tilde{M} + \bar{M} - \bar{\tilde{M}}\right),$$
$$Ib_0 = \frac{1}{4}\left(M + \tilde{M} - \bar{M} - \bar{\tilde{M}}\right), \qquad I\boldsymbol{b} = \frac{1}{4}\left(M - \tilde{M} - \bar{M} + \bar{\tilde{M}}\right). \tag{3.94}$$
3.7 4D Geometric Algebra for Projective 3D Space

To this point, we have dealt with transformations in three-dimensional space. When we use homogeneous coordinates, we increase the dimension of the vector space by one; as a result, the transformation of 3D motion becomes linear. Let us now model the projective 3D space, $P^3$. This space corresponds to the homogeneously extended space $\mathbb{R}^4$. In real applications, it is important to regard the signature of the modeled space to facilitate the computations. In the case of the modeling of the projective plane using homogeneous coordinates, we adopt $G_{3,0,0}$ of the ordinary space, $E^3$, which has the standard Euclidean signature. For the four-dimensional space $\mathbb{R}^4$, however, we cannot keep the same Euclidean signature; we adopt instead the geometric algebra $G_{1,3,0}$, which is spanned with the following basis:
$$\underbrace{1}_{\text{scalar}}, \quad \underbrace{\gamma_k}_{\text{4 vectors}}, \quad \underbrace{\gamma_2\gamma_3,\; \gamma_3\gamma_1,\; \gamma_1\gamma_2,\; \gamma_4\gamma_1,\; \gamma_4\gamma_2,\; \gamma_4\gamma_3}_{\text{6 bivectors}}, \quad \underbrace{I\gamma_k}_{\text{4 pseudovectors}}, \quad \underbrace{I}_{\text{unit pseudoscalar}}, \tag{3.95}$$
where $\gamma_4^2 = +1$ and $\gamma_k^2 = -1$ for $k = 1, 2, 3$. The unit pseudoscalar is $I = \gamma_1\gamma_2\gamma_3\gamma_4$, with
$$I^2 = (\gamma_1\gamma_2\gamma_3\gamma_4)(\gamma_1\gamma_2\gamma_3\gamma_4) = -(\gamma_3\gamma_4)(\gamma_3\gamma_4) = -1. \tag{3.96}$$
The geometric algebras $G_{3,0,0}$ and $G_{1,3,0}$ will be used in Chap. 9 for the geometric modeling of the image plane and the visual 3D space. In space-time algebra, the fourth basis vector $\gamma_4$ of $G_{1,3,0}$ is selected as the time axis for applications of the projective split [115]. This helps to associate multivectors of the 4D space with multivectors of the 3D space. The role and use of the projective split for a variety of problems involving the algebra of incidence will also be discussed in Chap. 7.
3.8 Conclusion

This chapter gives an outline of geometric algebra. In particular, it explains the geometric product and the meaning of multivectors. The geometric approach adopted is a version of Clifford algebra, and it is used throughout the book as the unifying language for the design of artificial perception–action systems. The chapter presents the geometric algebras of the plane and of 3D space, where we can find the planar and 3D quaternions. Finally, we introduce the 4D geometric algebras useful for computations involving dual quaternions (motors) and projective geometry. The chapter offers various exercises, so that the reader can easily start to learn to compute in Clifford algebra.
3.9 Exercises

3.1 Given $a = e_2 + 2e_1e_2$, $b = e_1 + 2e_2$, and $c = 3e_1 - e_2 \in G_{2,0,0}$, compute ab, ba, and bac.

3.2 Given the coordinates (2,5), (5,7), and (6,0) for the corners of a triangle, compute in $G_{2,0,0}$ the triangle's area using multivectors.

3.3 Given $a = e_1 - 2e_2$, $b = e_1 + e_2$, and $v = 5e_1 - e_2 \in G_{2,0,0}$, compute $\alpha$ and $\beta$ for $v = \alpha a + \beta b$.

3.4 Given $a = 8e_1 - e_2$ and $b = 2e_1 + 2e_2 \in G_{2,0,0}$, compute $a_\parallel$ and $a_\perp$ with respect to b.

3.5 For each $x \in G_{2,0,0}$, prove $x\bar{x} = \bar{x}x$, and compute the inverse $x^{-1}$ when $x\bar{x} \neq 0$.

3.6 For a Euclidean unit 2-blade, $I^2 = -1$; interpret this geometrically in terms of versors.

3.7 Prove, using multivectors in $G_{2,0,0}$, the following sine identity:
$$\frac{\sin\alpha}{|a|} = \frac{\sin\beta}{|b|} = \frac{\sin\gamma}{|c|}.$$

3.8 Show in $G_{3,0,0}$ that the pseudoscalar I commutes with $e_1$, $e_2$, and $e_3$. Compute the volume of the parallelepiped spanned by the vectors $a = 3e_1 - 4e_2 + 4e_3$, $b = 2e_1 + 4e_2 - 2e_3$, and $c = 4e_1 - 2e_2 - 3e_3$.

3.9 Compute AB in $G_{3,0,0}$ if $A = 3e_1e_2 + 6e_2e_3$ and $B = a \wedge b$, where $a = 3e_1 - 4e_2 + 5e_3$ and $b = 2e_1 + 4e_2 - 2e_3$.
3.10 Given in $G_{3,0,0}$ the vector $a = 3e_1 + 5e_2 + 6e_3$ and the bivector $B = 3e_2e_3 + 6e_3e_1 + 2e_1e_2$, compute the parallel and orthogonal projections of a with respect to the plane B. Then compute the cross product of these projected components and interpret geometrically the dual relationship of the result with respect to B.

3.11 Show in $G_{3,0,0}$ that the geometric product of a bivector A and any vector x can be decomposed as follows:
$$Ax = A \cdot x + A \wedge x = \frac{1}{2}(Ax - xA) + \frac{1}{2}(Ax + xA).$$

3.12 Given $x = 1 + a + B$, where $a \in \mathbb{R}^3$ and $B \in \bigwedge^2 \mathbb{R}^3$, the outer inverse of x is $x^{\wedge(-1)} = 1 - a - B + \alpha\, a \wedge B$, where $\alpha \in \mathbb{R}$.
(a) Compute $\alpha$. Hint: Use the power series or $x \wedge x^{\wedge(-1)} = 1$. The outer square root of x is $x^{\wedge(\frac{1}{2})} = 1 + \frac{1}{2}a + \frac{1}{2}B + \beta\, a \wedge B$, where $\beta \in \mathbb{R}$.
(b) Compute $\beta$. Hint: $x^{\wedge(\frac{1}{2})} \wedge x^{\wedge(\frac{1}{2})} = x$.
(c) Give a geometric interpretation of the outer inverse and of the outer square root and suggest their applications.

3.13 Using CLICAL [124] in $G_{3,0,0}$, rotate the vector $r = 2e_1 + 3e_2 + 2e_3$ about the axis $n = 1.4e_1 + 1.9e_2$ with angle $\phi = |n|$. Since $|\phi| \leq \pi$, the rotation is well defined.

3.14 Using CLICAL [124] in $G_{3,0,0}$, rotate the vector $r = 2e_1 + 3e_2 + 2e_3$ about the axis $n = 1.4e_1 + 1.9e_2$ with angle $\phi = |n|$. Since $|\phi| \leq \pi$, the rotation is well defined.

3.15 Two consecutive rotations in $G_{3,0,0}$, one about the axis a with the angle $\alpha = |a|$ and another about the axis b with the angle $\beta = |b|$, are equivalent to a rotation about an axis c. Prove this statement using the Rodrigues formula,
$$c' = \frac{a' + b' - a' \times b'}{1 - a' \cdot b'},$$
where $a' = \frac{a}{\alpha}\tan\frac{\alpha}{2}$ and $\alpha = |a|$. (Hint: Compare the scalar and bivector terms of $e^{\frac{\gamma}{2}Ic} = e^{\frac{\alpha}{2}Ia}\, e^{\frac{\beta}{2}Ib}$.)
3.16 Using the vectors a and b of $G_{3,0,0}$, prove that the rotation of a to b can be represented by the rotor
$$R = (ab)^{\frac{1}{2}} = \frac{a(b + a)}{|a + b|} = \frac{(a + b)b}{|b + a|}.$$
Since the norm $|a + b| = \left[2(1 + a \cdot b)\right]^{\frac{1}{2}}$ is not relevant, you can write R as follows:
$$R \doteq (a + b)b = a(b + a) = 1 + ab.$$
Also prove this equation. You can interpret the symbol $\doteq$ as a projective identity, that is, an identity up to a scalar factor. (Hint: Each rotation can be represented by two reflections.)

3.17 Given the rotor $R_1 = e^{-B_1/2}$ and the rotor $R_2 = e^{-B_2/2}$, show that their product is
$$R_3 = R_2 R_1 = e^{-B_3/2}. \tag{3.97}$$
3.18 Show that, for an orthonormal bivector basis, the structure constants $C^i_{jk}$ of the Lie algebra of the 3D rotation group are simply $\epsilon_{ijk}$.
3.19 Given the bivectors X, Y, and Z, prove the Jacobi identity:
$$(X \times Y) \times Z + (Z \times X) \times Y + (Y \times Z) \times X = 0. \tag{3.98}$$
3.20 Lie algebra: show that the bivector generators of the unitary group are
$$E_{ij} = e_i e_j + f_i f_j \quad (i < j = 1, \ldots, n), \qquad F_{ij} = e_i f_j - f_i e_j \quad (i < j = 1, \ldots, n), \qquad J_i = e_i f_i. \tag{3.99}$$
Also show that this algebra is closed under the commutator product, and give some examples as well.

3.21 The dihedral angle between two planes $\hat{X}$ and $\hat{Y}$ is defined as shown in Fig. 3.7a. Given the unit bivectors $\hat{X}$ and $\hat{Y}$ and $\hat{z}$, the unit vector along the intersection line, prove that
$$\hat{X}\hat{Y} = e^{-Iz}, \tag{3.100}$$
where $z = \theta\hat{z}$.

3.22 Consider the spherical triangle with corners described by three unit vectors going from the origin to the surface of a unit sphere; see Fig. 3.7b. The lengths of the three arcs are given by |A|, |B|, and |C|; prove the following formulas:
$$\hat{a}\hat{b} = e^{-C}, \qquad \hat{b}\hat{c} = e^{-A}, \qquad \hat{c}\hat{a} = e^{-B}.$$
Fig. 3.7 (a) Dihedral angle between two planes. (b) The spherical triangle
3.23 Following Exercise 3.22, the angles $\alpha$, $\beta$, and $\gamma$ between the planes are called the dihedral angles; prove that
$$\hat{B}\hat{A} = e^{Ic},\quad \|c\| = \gamma,$$
$$\hat{C}\hat{B} = e^{Ia},\quad \|a\| = \alpha,$$
$$\hat{C}\hat{A} = e^{Ib},\quad \|b\| = \beta.$$
Also prove that
$$e^{C}e^{A}e^{B} = 1,\qquad e^{Ic}e^{Ib}e^{Ia} = 1.$$
(3.101)
3.24 Following Exercise 3.23, take the scalar part of the equation $e^{Ic} = e^{Ib}e^{Ia}$ and prove the cosine law for the angles in spherical trigonometry, namely
$$\cos(\gamma) = -\cos(\alpha)\cos(\beta) + \sin(\alpha)\sin(\beta)\cos(|C|).$$
(3.102)
Since this formulation is more advantageous than traditional formulations, suggest some applications using this equation.
3.25 Using a sphere, draw the rotated rotor $R_2R_1\tilde{R}_2$. What is the geometric meaning of this operation with respect to the action of this sequence of rotors acting on a vector?
3.26 Given the bivector $B$, and two general multivectors $X$ and $Y$, prove that
$$B\times(XY) = (B\times X)Y + X(B\times Y),$$
(3.103)
hence show that
$$B\times(v\wedge V_r) = (B\times v)\wedge V_r + v\wedge(B\times V_r).$$
(3.104)
By using this result, establish the fact that the operation of commuting with a bivector is grade preserving.
3.27 Given linear functions $f(x)$ and $g(x)$ and an orthonormal frame $\{e_k\}$, one can form the following matrices:
$$f_{ij} = e_i\cdot f(e_j),$$
$$g_{ij} = e_i\cdot g(e_j).$$
(3.105) (3.106)
Prove that the matrix $[f_{ij}]$ is nothing else than the transformation matrix of $f$. Furthermore, prove that the product transformation $h = fg$ is determined by the multiplication of the matrices $[f_{ij}]$ and $[g_{ij}]$.
3.28 Explain the geometric meaning of the projection $P_R(x) = (x\cdot R)R^{-1}$.
3.29 In $\mathbb{R}^{4,0}$ with an associated orthonormal basis $\{e_i\}_{i=1}^{4}$, perform a rotation in the plane $e_1\wedge e_2$, followed by a rotation in the plane $e_3\wedge e_4$. Compute the rotor of this rotor composition and show that its exponent is a bivector, not a 2-blade. The resulting rotor is not of a simple form consisting of a scalar plus a 2-blade, or even a scalar plus a bivector. Why?
3.30 Using the vectors $a$ and $b$ of $\mathcal{G}_{3,0,0}$, prove that the rotation of $a$ to $b$ can be represented by the rotor
$$R = (ab)^{\frac{1}{2}} = \frac{a(b+a)}{|a+b|} = \frac{(a+b)b}{|b+a|}.$$
Since the norm $|a+b| = [2(1 + a\cdot b)]^{\frac{1}{2}}$ is not relevant, you can write $R$ as follows:
$$R \doteq (a+b)b = a(b+a) = 1 + ab.$$
Also prove this equation. You can interpret the symbol $\doteq$ as a projective identity, or an identity up to a scalar factor. (Hint: each rotation can be represented by two reflections.)
3.31 Check by hand whether Eqs. 3.77 are correct. Note that the equations have a similar format, and that in the equation for the motor the angle is changed to a dual angle and the rotation axis $n$ is changed to a screw line $l = n + Im$.
Chapter 4
Kinematics of the 2D and 3D Spaces
4.1 Introduction This chapter presents the geometric algebra framework for dealing with 3D kinematics. The reader will see the usefulness of this mathematical approach for applications in computer vision and kinematics. We start with an introduction to 4D geometric algebra for 3D kinematics. Then we reformulate, using 3D and 4D geometric algebras, the classic model for the 3D motion of vectors. Finally, we compare both models, that is, the one using 3D Euclidean geometric algebra and our model, which uses 4D motor algebra.
4.2 Representation of Points, Lines, and Planes Using 3D Geometric Algebra The modeling of points, lines, and planes in 3D Euclidean space will be done using the Euclidean geometric algebra $\mathcal{G}_{3,0,0}$, where the pseudoscalar satisfies $I^2 = -1$. A point in 3D space represents a position and thus can simply be spanned using the vector basis of $\mathcal{G}_{3,0,0}$:
$$x = xe_1 + ye_2 + ze_3,$$
(4.1)
where $x, y, z\in\mathbb{R}$. In classical vector calculus, a line is described by a position vector $x$ that touches any point of the line and by a vector $n$ for the line direction, that is, $l = x + \alpha n$, where $\alpha\in\mathbb{R}$. In geometric algebra, we employ a multivector concept, and we can thus compactly represent in $\mathcal{G}_{3,0,0}$ any line, using a vector $n$ for its direction and a bivector $m$ for the orientation of the plane within which the line lies. Thus,
$$l = n + x\wedge n = n + m.$$
(4.2)
Note that the moment bivector $m$ is computed as the outer product of the position vector $x$ and the line direction vector $n$. We can also compute $m$ as the dual of a vector, that is, $m = I(x\times n)$. E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9_4, © Springer-Verlag London Limited 2010
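As a quick numerical illustration (a numpy sketch, not the CLICAL package used elsewhere in this book), the line $l = n + m$ can be stored as the pair $(n, m)$, with the moment bivector represented by its dual vector $x\times n$; the moment is then independent of which point of the line is chosen:

```python
import numpy as np

def line_from_point_dir(x, n):
    """Line l = n + m in G3: direction vector n plus moment bivector m = x ^ n.
    The bivector m is stored by its dual vector, i.e., the cross product x x n."""
    n = np.asarray(n, float)
    m = np.cross(np.asarray(x, float), n)   # dual coefficients of m = x ^ n
    return n, m

# Two different points on the same line give the same moment bivector.
x0 = np.array([1.0, 2.0, 0.5])
d = np.array([0.0, 3.0, 4.0])
n1, m1 = line_from_point_dir(x0, d)
n2, m2 = line_from_point_dir(x0 + 2.5 * d, d)  # shift the point along the line
assert np.allclose(m1, m2)                     # moment is point-independent
assert np.isclose(np.dot(m1, n1), 0.0)         # moment is orthogonal to the direction
```

The helper name `line_from_point_dir` is hypothetical; the book itself works symbolically with multivectors rather than coordinate arrays.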
The representation of the plane is even more striking. The plane is a geometric entity one grade higher than the line, so we would expect the multivector representation of the plane to be a natural grade extension of that of the line. In classical vector calculus, a plane is described in terms of the Hesse distance, which represents the distance from the origin to the plane, and a vector that indicates the plane orientation, that is, $\{d, n\}$. Note that this description is composed of two separate attributes, which come from the equation $n_xx + n_yy + n_zz - d = \mathbf{n}^T\mathbf{x} - d = 0$. Once again, using geometric algebra we can express the plane more compactly and with clearer geometric sense. In $\mathcal{G}_{3,0,0}$, for example, the extension of the line expression to a plane would be expressed in terms of a bivector $n$ and a trivector $Id$:
$$h = n + x\wedge n = n + Id,$$
(4.3)
where the bivector $n$ indicates the plane orientation, and the outer product of the position vector $x$ and the bivector $n$ builds a trivector that can be expressed using the Hesse distance, a scalar value, and the unit pseudoscalar $I$. Note that the trivector represents a volume, whereas the scalar $d$ represents the Hesse distance. Figure 4.1 compares the different representations of points, lines, and planes using classical vector calculus, the Euclidean geometric algebra $\mathcal{G}_{3,0,0}$, and the motor algebra $\mathcal{G}^+_{3,0,1}$.
Fig. 4.1 Representations of (a) points, (b) lines, and (c) planes using vector calculus, $\mathcal{G}_{3,0,0}$, and $\mathcal{G}^+_{3,0,1}$
4.3 Representation of Points, Lines, and Planes Using Motor Algebra In this section, we will model points, lines, and planes in 4D space using the special motor algebra $\mathcal{G}^+_{3,0,1}$, which spans the 4D line space using a bivector basis. For the case of the point representation, we proceed by embedding a 3D point in the hyperplane $X_4 = 1$, so that the equation of the point $X\in\mathcal{G}^+_{3,0,1}$ reads
$$X = 1 + x_1\gamma_4\gamma_1 + x_2\gamma_4\gamma_2 + x_3\gamma_4\gamma_3 = 1 + I(x_1\gamma_2\gamma_3 + x_2\gamma_3\gamma_1 + x_3\gamma_1\gamma_2) = 1 + Ix,$$
(4.4)
or $X = (1, 0) + I(0, x)$. We can see that in this expression the real part consists of the scalar 1, and the dual part of a 3D bivector. Since we are working in the algebra $\mathcal{G}^+_{3,0,1}$, spanned only by bivectors and scalars, we see that this special geometric algebra is the most appropriate system for line modeling. Unlike the line representation, the point and the plane are in some sense asymmetric representations with respect to the scalar and bivector parts. Let us now rewrite the line equation (4.2) of $\mathcal{G}_{3,0,0}$ in the degenerate geometric algebra $\mathcal{G}^+_{3,0,1}$. We can express the vector and the dual vector of Eq. 4.2 in $\mathcal{G}^+_{3,0,1}$ as a bivector and a dual bivector. Since the product of the unit pseudoscalar, $I = \gamma_1\gamma_2\gamma_3\gamma_4$, and any dual bivector built from the basis $\{\gamma_4\gamma_1, \gamma_4\gamma_2, \gamma_4\gamma_3\}$ is zero, we must select the bivector basis $\{\gamma_2\gamma_3, \gamma_3\gamma_1, \gamma_1\gamma_2\}$ for representing the line
$$L = n + Im.$$
(4.5)
In this case, the bivectors for the line direction and the moment are computed using two bivector points, $x_1$ and $x_2$, lying on the line, as follows:
$$n = x_2 - x_1 = (x_{21} - x_{11})\gamma_2\gamma_3 + (x_{22} - x_{12})\gamma_3\gamma_1 + (x_{23} - x_{13})\gamma_1\gamma_2 = L_{n1}\gamma_2\gamma_3 + L_{n2}\gamma_3\gamma_1 + L_{n3}\gamma_1\gamma_2,$$
$$m = x_1\times x_2 = (x_{12}x_{23} - x_{13}x_{22})\gamma_2\gamma_3 + (x_{13}x_{21} - x_{11}x_{23})\gamma_3\gamma_1 + (x_{11}x_{22} - x_{12}x_{21})\gamma_1\gamma_2 = L_{m1}\gamma_2\gamma_3 + L_{m2}\gamma_3\gamma_1 + L_{m3}\gamma_1\gamma_2.$$
(4.6)
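The coefficients $(L_{n1}, L_{n2}, L_{n3}, L_{m1}, L_{m2}, L_{m3})$ of Eq. 4.6 are the Plücker coordinates of the line, so a plain vector-algebra sketch (using numpy arrays for the dual vectors of the bivectors) can build and check them:

```python
import numpy as np

def pluecker_line(x1, x2):
    """Line L = n + I m from two points (Eq. 4.6, written with vector duals):
    direction n = x2 - x1 and moment m = x1 x x2."""
    x1, x2 = np.asarray(x1, float), np.asarray(x2, float)
    return x2 - x1, np.cross(x1, x2)

n, m = pluecker_line([1, 0, 0], [1, 1, 1])
# Any point x on the line satisfies x x n = m (the moment does not depend
# on which point of the line is used).
for lam in (0.0, 0.7, -2.0):
    x = np.array([1.0, 0, 0]) + lam * n
    assert np.allclose(np.cross(x, n), m)
```

This is only a coordinate-level check; the algebraic manipulation in the text stays entirely within the bivector basis of $\mathcal{G}^+_{3,0,1}$.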
This line representation using dual numbers is easy to understand and to manipulate algebraically, and it is fully equivalent to the representation in terms of Plücker coordinates. Using bracket notation, the line equation becomes $L = (0, n) + I(0, m)$, where $n$ and $m$ are spanned with a 3D bivector basis. For the equation of the plane, we proceed in a similar manner as for Eq. 4.3. We represent the orientation of the plane via the bivector $n$ and the outer product between a bivector touching the plane and its orientation $n$. This outer product results
in a quadrivector, which we can express as the Hesse distance $d = (x\cdot n)$ multiplied by the unit pseudoscalar:
$$H = n + x\wedge n = n + I(x\cdot n) = n + Id,$$
(4.7)
or $H = (0, n) + I(d, 0)$. Note that the plane equation is the dual of the point equation:
$$H = (d + In)^{\ast} = (In)^{\ast} + (d)^{\ast} = n + Id,$$
(4.8)
where the plane orientation is given by the unit bivector $n$ and the Hesse distance by the scalar $d$.
4.4 Representation of Points, Lines, and Planes Using 4D Geometric Algebra We can also represent points, lines, and planes using the entire 4D geometric algebra $\mathcal{G}_{3,0,1}$. As opposed to the previous representations, which use only bivectors, this representation uses the vector and trivector bases. The point expressed in terms of trivectors is given by
$$X = \gamma_1\gamma_2\gamma_3 + x_1\gamma_2\gamma_3\gamma_4 + x_2\gamma_3\gamma_1\gamma_4 + x_3\gamma_1\gamma_2\gamma_4.$$
(4.9)
The equation of the line using the bivector basis is exactly the same as Eq. 4.5:
$$L = L_{n1}\gamma_2\gamma_3 + L_{n2}\gamma_3\gamma_1 + L_{n3}\gamma_1\gamma_2 + L_{m1}\gamma_4\gamma_1 + L_{m2}\gamma_4\gamma_2 + L_{m3}\gamma_4\gamma_3 = L_{n1}\gamma_2\gamma_3 + L_{n2}\gamma_3\gamma_1 + L_{n3}\gamma_1\gamma_2 + I(L_{m1}\gamma_2\gamma_3 + L_{m2}\gamma_3\gamma_1 + L_{m3}\gamma_1\gamma_2) = n + Im.$$
(4.10)
The equation of the plane is spanned using basis vectors in terms of the normal of the plane and the Hesse distance:
$$H = n_x\gamma_1 + n_y\gamma_2 + n_z\gamma_3 + d\gamma_4.$$
(4.11)
Note that in this equation the multivector bases of the point and plane have been swapped. This equation corresponds to the dual of the point equation (4.9) and makes use of a vector as the dual of each trivector:
$$H = (d\gamma_1\gamma_2\gamma_3 + n_x\gamma_2\gamma_3\gamma_4 + n_y\gamma_3\gamma_1\gamma_4 + n_z\gamma_1\gamma_2\gamma_4)^{\ast} = d(\gamma_1\gamma_2\gamma_3)^{\ast} + n_x(\gamma_2\gamma_3\gamma_4)^{\ast} + n_y(\gamma_3\gamma_1\gamma_4)^{\ast} + n_z(\gamma_1\gamma_2\gamma_4)^{\ast} = n_x\gamma_1 + n_y\gamma_2 + n_z\gamma_3 + d\gamma_4.$$
(4.12)
Note that the dual operation is actually not carried out via the pseudoscalar $I = \gamma_1\gamma_2\gamma_3\gamma_4$, because this would lead to the square of $\gamma_4$, which equals zero. In order to explain the relation between Eqs. 4.9 and 4.11, we simply relate, in the dual sense, each basis vector with a basis trivector. Since Eqs. 4.4 and 4.7 are the duals of Eqs. 4.9 and 4.11, we can reconsider Fig. 4.1c, now using a trivector coordinate basis for depicting the point equation (4.9), and similarly Fig. 4.1a, now using a vector coordinate basis for the plane equation (4.11). The following sections are concerned with modeling the motion of basic geometric entities in 3D and 4D spaces. By comparing these motion models, we will show the power of geometric algebra in representing and linearizing the translation transformation in 4D geometric algebra.
4.5 Motion of Points, Lines, and Planes in 3D Geometric Algebra The 3D motion of a point $x$ in $\mathcal{G}_{3,0,0}$ is given by the following equation:
$$x' = Rx\tilde{R} + t.$$
(4.13)
Using Eq. 4.2, the motion equation of the line can be expressed as follows:
$$\begin{aligned}
l' &= n' + m' = n' + x'\wedge n'\\
&= Rn\tilde{R} + (Rx\tilde{R} + t)\wedge(Rn\tilde{R})\\
&= Rn\tilde{R} + Rx\tilde{R}\wedge Rn\tilde{R} + t\wedge Rn\tilde{R}\\
&= Rn\tilde{R} + \frac{t}{2}\,Rn\tilde{R} - Rn\tilde{R}\,\frac{t}{2} + R(x\wedge n)\tilde{R}\\
&= Rn\tilde{R} + \frac{t}{2}\,Rn\tilde{R} - Rn\tilde{R}\,\frac{t}{2} + Rm\tilde{R},
\end{aligned}$$
(4.14)
where $x'$ stands for the rotated and shifted position vector, $n'$ for the rotated orientation vector, and $m'$ for the new line moment. The model of the motion of the plane in $\mathcal{G}_{3,0,0}$ can be expressed in terms of the multivector Hesse equation (4.3), as follows:
$$\begin{aligned}
h' &= n' + Id' = n' + x'\wedge n'\\
&= Rn\tilde{R} + (Rx\tilde{R} + t)\wedge(Rn\tilde{R})\\
&= Rn\tilde{R} + Rx\tilde{R}\wedge Rn\tilde{R} + t\wedge Rn\tilde{R}\\
&= Rn\tilde{R} + t\wedge Rn\tilde{R} + R(x\wedge n)\tilde{R}\\
&= Rn\tilde{R} + t\wedge Rn\tilde{R} + R(Id)\tilde{R}\\
&= Rn\tilde{R} + I\,t\cdot Rn\tilde{R} + Id\\
&= Rn\tilde{R} + I(t\cdot Rn\tilde{R} + d),
\end{aligned}$$
(4.15)
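Equation 4.15 says concretely: rotate the plane orientation and shift the Hesse distance by the projection of the translation onto the rotated normal. A small numpy sketch (a rotation matrix standing in for the rotor sandwich, with the bivector $n$ represented by its dual normal vector) verifies this against moving a point of the plane with Eq. 4.13:

```python
import numpy as np

def rotmat(axis, angle):
    """Rotation matrix standing in for the rotor sandwich R n R~ of Eq. 4.15."""
    a = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * K @ K

def move_plane(n, d, R, t):
    """Rigid motion of the plane h = n + Id (Eq. 4.15): n' = Rn, d' = t.(Rn) + d."""
    n2 = R @ n
    return n2, np.dot(t, n2) + d

# A point x on the plane (x . n = d) must stay on the transformed plane.
n = np.array([0.0, 0.0, 1.0]); d = 2.0
x = np.array([3.0, -1.0, 2.0])                 # satisfies x . n = d
R = rotmat([1, 1, 0], 0.7); t = np.array([0.5, -2.0, 1.0])
n2, d2 = move_plane(n, d, R, t)
x2 = R @ x + t                                  # move the point with Eq. 4.13
assert np.isclose(np.dot(x, n), d)
assert np.isclose(np.dot(x2, n2), d2)
```

The function names here are hypothetical conveniences; the point of the check is that $x'\cdot n' = d'$ holds for every point of the moved plane.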
where $n'$ stands for the rotated bivector plane orientation, $x'$ stands for the rotated and shifted position vector, and $d'$ for the new Hesse distance. Here, we use the concept of duality to claim that $t\wedge Rn\tilde{R} = I\,t\cdot Rn\tilde{R} = (It)\cdot Rn\tilde{R}$.
4.6 Motion of Points, Lines, and Planes Using Motor Algebra The modeling of the 3D motion of geometric primitives using the motor algebra $\mathcal{G}^+_{3,0,1}$ takes place in a 4D space where rotation and translation are applied as multiplicative operators, so that the 3D general motion becomes linear. Having created a linear model, we can then compute the unknown rotation and translation simultaneously. This is useful for the hand–eye problem, or when we apply the motor extended Kalman filter (see Chap. 18). For the modeling of the point motion, we use the point representation of Eq. 4.4 and the motor relations given in Eq. 3.93, with $I^2 = 0$:
$$\begin{aligned}
X' &= 1 + Ix' = MX\widetilde{\bar{M}} = M(1 + Ix)\widetilde{\bar{M}}\\
&= T_sR_s(1 + Ix)\tilde{R}_s\bar{\tilde{T}}_s\\
&= \left(1 + I\frac{t_s}{2}\right)R_s(1 + Ix)\tilde{R}_s\left(1 + I\frac{t_s}{2}\right)\\
&= \left(1 + I\frac{t_s}{2}\right)\left(1 + IR_sx\tilde{R}_s\right)\left(1 + I\frac{t_s}{2}\right)\\
&= 1 + I\frac{t_s}{2} + IR_sx\tilde{R}_s + I\frac{t_s}{2}\\
&= 1 + I(R_sx\tilde{R}_s + t_s).
\end{aligned}$$
(4.16)
Note that the dual part of this equation in 4D space is fully equivalent to Eq. 4.13, which is in 3D space. The motion of a 3D line, or screw motion, can be seen as the rotation of the line about the axis line $L_s$ and its translation along this axis line, as depicted in Fig. 4.2. Note that in the figure the line $L_s$ is shifted a distance $t_c$ from the origin. Now, using line equation (4.5), we can express the motion of a 3D line as follows:
$$L' = n' + Im' = ML\tilde{M} = T_sR_s(n + Im)\tilde{R}_s\tilde{T}_s.$$
This equation can be further expressed purely in terms of rotors, as follows:
$$L' = \left(1 + I\frac{t_s}{2}\right)R_s(n + Im)\tilde{R}_s\left(1 - I\frac{t_s}{2}\right) = \left(R_sn\tilde{R}_s + IR_sm\tilde{R}_s + I\frac{t_s}{2}R_sn\tilde{R}_s\right)\left(1 - I\frac{t_s}{2}\right)$$
(4.17)
Fig. 4.2 The screw motion of a line
$$= R_sn\tilde{R}_s + I\left(\frac{t_s}{2}R_sn\tilde{R}_s - R_sn\tilde{R}_s\frac{t_s}{2} + R_sm\tilde{R}_s\right) = n' + I\left(\frac{t_s}{2}\,n' - n'\,\frac{t_s}{2} + R_sm\tilde{R}_s\right).$$
(4.18)
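In vector terms, the dual part of Eq. 4.18 is the familiar Plücker line transform: the direction rotates, and the moment picks up the term $t\times(Rn)$ from the commutator of $t_s$ with the rotated direction. A numpy sketch (rotation matrix in place of the rotor sandwich, dual vectors in place of the bivectors) can be checked by transforming two points of the line:

```python
import numpy as np

def rodrigues(axis, angle):
    a = np.asarray(axis, float) / np.linalg.norm(axis)
    K = np.array([[0, -a[2], a[1]], [a[2], 0, -a[0]], [-a[1], a[0], 0]])
    return np.eye(3) + np.sin(angle) * K + (1 - np.cos(angle)) * K @ K

def move_line(n, m, R, t):
    """Motor action on a line L = n + Im (Eq. 4.18), written with the dual
    vectors of the bivectors: n' = Rn, m' = Rm + t x (Rn)."""
    n2 = R @ n
    return n2, R @ m + np.cross(t, n2)

# Consistency check: transform two points of the line and rebuild (n, m).
x1 = np.array([1.0, 0.0, 2.0]); x2 = np.array([2.0, 1.0, 2.0])
n, m = x2 - x1, np.cross(x1, x2)
R = rodrigues([0.3, -1.0, 0.2], 1.1); t = np.array([0.4, 0.0, -1.5])
n2, m2 = move_line(n, m, R, t)
y1, y2 = R @ x1 + t, R @ x2 + t
assert np.allclose(n2, y2 - y1)
assert np.allclose(m2, np.cross(y1, y2))
```

This is only a coordinate shadow of the motor product; the advantage claimed in the text is that the motor form keeps rotation and translation in one linear multiplicative operator.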
Note that in this equation, before we merge the bivector $\frac{t_s}{2}$ with the rotor $R_s$ or $\tilde{R}_s$, the real and dual parts are fully equivalent to the elements of line equation (4.14) of $\mathcal{G}_{3,0,0}$. Equation 4.18 is very useful as a linear algorithm to estimate rotation and translation simultaneously, as in the case of hand–eye calibration [11] or in the motor extended Kalman filter presented in Chap. 18. The transformation of a plane under a rigid motion in $\mathcal{G}^+_{3,0,1}$ can be seen as the motion of the dual of the point. Thus, Eq. 4.7 can be utilized to express the motion equation of the plane:
$$\begin{aligned}
H' &= n' + Id' = MH\widetilde{\bar{M}} = M(n + Id)\widetilde{\bar{M}}\\
&= T_sR_s(n + Id)\tilde{R}_s\bar{\tilde{T}}_s\\
&= \left(1 + I\frac{t_s}{2}\right)\left(R_sn\tilde{R}_s + Id\right)\left(1 + I\frac{t_s}{2}\right)\\
&= R_sn\tilde{R}_s + I\left(\frac{t_s}{2}R_sn\tilde{R}_s + R_sn\tilde{R}_s\frac{t_s}{2} + d\right)\\
&= R_sn\tilde{R}_s + I\left(t_s\cdot(R_sn\tilde{R}_s) + d\right).
\end{aligned}$$
(4.19)
Note that the real part and the dual part of this expression are fully equivalent to the bivector and trivector parts of Eq. 4.15 in G3;0;0 .
4.7 Motion of Points, Lines, and Planes Using 4D Geometric Algebra For the modeling of the motion of points, lines, and planes in $\mathcal{G}_{3,0,1}$, only the automorphism equivalent of the screw and its conjugate is required. The motion of the point is given by
$$\begin{aligned}
X' &= MX\tilde{M} = T_sR_s(X)\tilde{R}_s\tilde{T}_s\\
&= \left(1 + I\frac{t_s}{2}\right)R_s(\gamma_1\gamma_2\gamma_3 + Ix)\tilde{R}_s\left(1 - I\frac{t_s}{2}\right)\\
&= \left(1 + \gamma_4\frac{t}{2}\right)(\gamma_1\gamma_2\gamma_3 + IR_sx\tilde{R}_s)\left(1 - \gamma_4\frac{t}{2}\right)\\
&= \gamma_1\gamma_2\gamma_3 + I\frac{t_s}{2} + IR_sx\tilde{R}_s + I\frac{t_s}{2}\\
&= \gamma_1\gamma_2\gamma_3 + I(R_sx\tilde{R}_s + t_s),
\end{aligned}$$
(4.20)
where $t_s = t_x\gamma_2\gamma_3 + t_y\gamma_3\gamma_1 + t_z\gamma_1\gamma_2$, $x = x_x\gamma_1 + x_y\gamma_2 + x_z\gamma_3$, and $t = t_x\gamma_1 + t_y\gamma_2 + t_z\gamma_3$. Note that the dual part of this equation in 4D space is fully equivalent to Eq. 4.13 in 3D space. The equation of the motion of the line is exactly the same as Eq. 4.18. The motion of the plane is given by
$$\begin{aligned}
H' &= n' + \gamma_4d' = n'_x\gamma_1 + n'_y\gamma_2 + n'_z\gamma_3 + d'\gamma_4 = MHM\tilde{M}\\
&= T_sR_s(n + \gamma_4d)\tilde{R}_s\tilde{T}_s = \left(1 + I\frac{t_s}{2}\right)\left(R_sn\tilde{R}_s + \gamma_4d\right)\left(1 - I\frac{t_s}{2}\right)\\
&= R_sn\tilde{R}_s + I\left(\frac{t_s}{2}R_sn\tilde{R}_s - R_sn\tilde{R}_s\frac{t_s}{2}\right) + \gamma_4d\\
&= R_sn\tilde{R}_s + I(t_s\wedge R_sn\tilde{R}_s) + \gamma_4d\\
&= n' + \gamma_4(I_3t_s)\wedge n' + \gamma_4d = n' + \gamma_4(t_s\cdot n') + \gamma_4d = n' + \gamma_4(t\cdot n' + d).
\end{aligned}$$
(4.21)
The real and dual parts of this expression are equivalent in a higher dimension to the bivector and trivector parts of Eq. 4.15 in G3;0;0 .
4.8 Spatial Velocity of Points, Lines, and Planes This section begins with the classic formulation of the spatial velocity of a rigid body using matrices; later, in Sect. 4.8.3, we will represent this velocity using the more advantageous techniques of motor algebra.
4.8.1 Rigid-Body Spatial Velocity Using Matrices The 3D motion of a rigid body comprises a rotation and a translation and can be computed as a relative motion between a fixed world-coordinate frame and a frame attached to the object. In the case of pure rotations, the body frame is set at the origin of the fixed spatial frame. Translations displace the object frame away from the spatial frame. The velocity of a single particle of the object is given by
$$v_x(t) = \frac{dx(t)}{dt}.$$
(4.22)
Let us first consider the trajectory of a continuous rotational motion given by $R(t): \mathbb{R}\to SO(3)$, which satisfies the constraint
$$R(t)R^T(t) = I.$$
(4.23)
By computing the derivative of this equation with respect to time $t$ and passing a term to the right, we get a skew-symmetric matrix $\hat{w}(t)$:
$$\dot{R}(t)R^T(t) + R(t)\dot{R}^T(t) = 0 \;\Rightarrow\; \dot{R}(t)R^T(t) = -R(t)\dot{R}^T(t) = \hat{w}(t). \quad (4.24)$$
We get the expression for the derivative of the 3D rotation by multiplying both sides of the last equation by $R(t)$:
$$\dot{R}(t) = \hat{w}(t)R(t).$$
(4.25)
Considering that at $t_0$ there is not yet a rotation, $R(t_0) = I$, the first-order approximation to a rotation matrix is given by
$$R(t_0 + dt) \approx I + \hat{w}(t_0)dt.$$
(4.26)
The linear space of all skew-symmetric matrices is commonly denoted by
$$so(3) = \{\hat{w}\in\mathbb{R}^{3\times3} \mid w\in\mathbb{R}^3\}.$$
(4.27)
In Eq. 4.24, $R(t)$ can be interpreted as the state transition matrix of the following linear ordinary differential equation (ODE):
$$\dot{x}(t) = \hat{w}x(t),\quad x(t)\in\mathbb{R}^3,$$
(4.28)
whose solution is given by
$$x(t) = e^{\hat{w}t}x(0),$$
(4.29)
and the matrix $e^{\hat{w}t}$ can be expanded as a Maclaurin series as follows:
$$e^{\hat{w}t} = I + \hat{w}t + \frac{(\hat{w}t)^2}{2!} + \cdots + \frac{(\hat{w}t)^n}{n!} + \cdots.$$
(4.30)
According to the Rodrigues formula for a rotation matrix, the last equation can also be written in the more compact form
$$e^{\hat{w}t} = I + \frac{\hat{w}}{\|w\|}\sin(\|w\|t) + \frac{\hat{w}^2}{\|w\|^2}(1 - \cos(\|w\|t)).$$
(4.31)
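The series (4.30) and the closed form (4.31) can be compared numerically. A short numpy sketch (helper names are our own, not from the book):

```python
import numpy as np

def hat(w):
    """so(3) element: the skew-symmetric matrix of w (Eq. 4.27)."""
    wx, wy, wz = w
    return np.array([[0, -wz, wy], [wz, 0, -wx], [-wy, wx, 0]])

def rot_rodrigues(w, t=1.0):
    """Closed form of e^{hat(w) t} (Eq. 4.31)."""
    th = np.linalg.norm(w)
    if th == 0:
        return np.eye(3)
    W = hat(w) / th
    return np.eye(3) + W * np.sin(th * t) + W @ W * (1 - np.cos(th * t))

def rot_series(w, t=1.0, terms=30):
    """Truncated Maclaurin series of e^{hat(w) t} (Eq. 4.30)."""
    A, term = np.eye(3), np.eye(3)
    for k in range(1, terms):
        term = term @ (hat(w) * t) / k
        A = A + term
    return A

w = np.array([0.3, -0.4, 0.5])
R = rot_rodrigues(w, 2.0)
assert np.allclose(R, rot_series(w, 2.0))   # both forms agree
assert np.allclose(R @ R.T, np.eye(3))      # R is orthogonal
assert np.isclose(np.linalg.det(R), 1.0)    # with determinant +1
```

Thirty series terms are far more than needed here; the Rodrigues form is exact and is what one would use in practice.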
Due to the uniqueness of the solution of Eq. 4.28, and assuming $R(0) = I$ as the initial condition of Eq. 4.24, we get
$$R(t) = e^{\hat{w}t}R(0) = e^{\hat{w}t}.$$
(4.32)
Interestingly enough, we have derived the exponential relation between the linear space $so(3)$ and $SO(3)$,
$$\exp: so(3)\to SO(3);\quad \hat{w}\mapsto e^{\hat{w}},$$
(4.33)
where the inverse of the exponential map is given by $\hat{w} = \log(R)$. The motion of a rigid body can be represented as a transformation matrix using homogeneous coordinates. The set of all motion matrices forms a group that is no longer Euclidean; thus, the formulation of the rigid-body velocity cannot be done as a straightforward extension of Eq. 4.22 for the velocity of a single particle. One can derive the velocity equations of the rigid body based on the parametrization of motion using homogeneous matrices. Let us represent the trajectory of a body as a time-dependent curve using a homogeneous matrix:
$$g(t) = \begin{bmatrix} R(t) & t(t)\\ 0^T & 1\end{bmatrix}\in\mathbb{R}^{4\times4}. \quad (4.34)$$
The inverse of $g(t)$ is given by
$$g^{-1}(t) = \begin{bmatrix} R(t) & t(t)\\ 0^T & 1\end{bmatrix}^{-1} = \begin{bmatrix} R^T(t) & -R^T(t)t(t)\\ 0^T & 1\end{bmatrix}.$$
(4.35)
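Equation 4.35 is worth coding directly, since it inverts the motion with one transpose and one matrix–vector product instead of a generic $4\times4$ inversion (a numpy sketch; `se3_inverse` is our own name):

```python
import numpy as np

def se3_inverse(g):
    """Inverse of a homogeneous rigid motion (Eq. 4.35): uses R^{-1} = R^T,
    avoiding a generic 4x4 inversion."""
    R, t = g[:3, :3], g[:3, 3]
    gi = np.eye(4)
    gi[:3, :3] = R.T
    gi[:3, 3] = -R.T @ t
    return gi

# Example motion: rotation about e3 by 0.9 rad plus a translation.
c, s = np.cos(0.9), np.sin(0.9)
g = np.array([[c, -s, 0, 1.0],
              [s,  c, 0, 2.0],
              [0,  0, 1, 3.0],
              [0,  0, 0, 1.0]])
assert np.allclose(g @ se3_inverse(g), np.eye(4))
assert np.allclose(se3_inverse(g) @ g, np.eye(4))
```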
In analogy to the case of pure rotation (see Eqs. 4.24–4.28), we will obtain a matrix consisting of the instantaneous spatial angular-velocity component, expressed as an antisymmetric matrix $\hat{w}$, and the linear velocity component $v$; for this we start
by differentiating the following identity with respect to time $t$:
$$\frac{d\,g(t)g^{-1}(t)}{dt} = \frac{dI}{dt} = 0,\qquad \dot{g}(t)g^{-1}(t) + g(t)\dot{g}^{-1}(t) = 0.$$
(4.36)
Now, passing one term to the right, we get the matrix $\hat{V}(t)$:
$$\dot{g}(t)g^{-1}(t) = -g(t)\dot{g}^{-1}(t) = \hat{V}(t)\in\mathbb{R}^{4\times4},$$
(4.37)
which equals
$$\hat{V} = \dot{g}(t)g^{-1}(t) = \begin{bmatrix}\dot{R}(t) & \dot{t}(t)\\ 0^T & 0\end{bmatrix}\begin{bmatrix}R^T(t) & -R^T(t)t(t)\\ 0^T & 1\end{bmatrix} = \begin{bmatrix}\dot{R}(t)R^T(t) & -\dot{R}(t)R^T(t)t(t) + \dot{t}(t)\\ 0^T & 0\end{bmatrix} = \begin{bmatrix}\hat{w}(t) & v(t)\\ 0^T & 0\end{bmatrix}.$$
(4.38)
One gets the expression for the derivative of the 3D motion by multiplying both sides of the last equation by $g(t)$:
$$\dot{g}(t) = (\dot{g}(t)g^{-1}(t))g(t) = \hat{V}g(t).$$
(4.39)
$\hat{V}$ can be viewed as the tangent vector along the curve of $g(t)$ and can be used to approximate $g(t)$ locally:
$$g(t + dt) \approx g(t) + \hat{V}g(t)dt = (I + \hat{V}dt)g(t).$$
(4.40)
The $4\times4$ matrix of the form of $\hat{V}$ is called a twist. The set of all twists builds a linear space denoted by
$$se(3) = \left\{\hat{V} = \begin{bmatrix}\hat{w} & v\\ 0^T & 0\end{bmatrix}\;\middle|\;\hat{w}\in so(3),\; v\in\mathbb{R}^3\right\}\subset\mathbb{R}^{4\times4}.$$
(4.41)
$se(3)$ is called the tangent space, or Lie algebra, of the matrix Lie group $SE(3)$. If, in Eq. 4.39, $\hat{V}$ is considered constant, we have a time-invariant linear ordinary differential equation, which can be integrated to get the following expression:
$$g(t) = e^{\hat{V}t}g(0).$$
(4.42)
If we assume the initial condition $g(0) = I$, we obtain
$$g(t) = e^{\hat{V}t}.$$
(4.43)
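The twist exponential can be implemented in closed form and checked against the series expansion. A numpy sketch, assuming the general-$w$ normalization $\|w\|^2$ in the translation term (the closed form below reproduces Eqs. 4.46–4.47 for the case where $\hat{w}$ is not normalized):

```python
import numpy as np

def hat3(w):
    wx, wy, wz = w
    return np.array([[0, -wz, wy], [wz, 0, -wx], [-wy, wx, 0]])

def exp_twist(w, v):
    """Closed form of e^{V} for the twist V = [[hat(w), v], [0, 0]]."""
    g = np.eye(4)
    th = np.linalg.norm(w)
    if th == 0:                       # pure translation (Eq. 4.47)
        g[:3, 3] = v
        return g
    W = hat3(w)
    R = np.eye(3) + W / th * np.sin(th) + W @ W / th**2 * (1 - np.cos(th))
    g[:3, :3] = R
    g[:3, 3] = ((np.eye(3) - R) @ W @ v + np.outer(w, w) @ v) / th**2
    return g

def exp_series(V, terms=40):
    """Truncated Maclaurin series of e^{V} (Eq. 4.45)."""
    A, term = np.eye(4), np.eye(4)
    for k in range(1, terms):
        term = term @ V / k
        A = A + term
    return A

w, v = np.array([0.2, 0.5, -0.1]), np.array([1.0, 0.0, 2.0])
V = np.zeros((4, 4)); V[:3, :3] = hat3(w); V[:3, 3] = v
assert np.allclose(exp_twist(w, v), exp_series(V))
```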
As a result, we can claim that the exponential map defines a transformation from $se(3)$ to $SE(3)$, namely,
$$\exp: se(3)\to SE(3);\quad \hat{V}\mapsto e^{\hat{V}}.$$
(4.44)
This exponential twist can be further expanded using the Maclaurin series:
$$e^{\hat{V}t} = I + \hat{V}t + \frac{(\hat{V}t)^2}{2!} + \cdots + \frac{(\hat{V}t)^n}{n!} + \cdots.$$
(4.45)
Using the Rodrigues formula (4.31) and certain additional properties of the exponential matrix, one can establish the following relationship:
$$e^{\hat{V}} = \begin{bmatrix} e^{\hat{w}} & \dfrac{(I - e^{\hat{w}})\hat{w}v + ww^Tv}{\|w\|}\\ 0^T & 1\end{bmatrix},\quad\text{if } w\neq 0.$$
(4.46)
If $w = 0$, the exponential of $\hat{V}$ becomes
$$e^{\hat{V}} = \begin{bmatrix} I & v\\ 0^T & 1\end{bmatrix}.$$
(4.47)
Let us study the rigid motion with respect to a spatial frame $X$ and another frame $Y$ attached to a rigid body (see Fig. 4.3). Let us again consider the equation of the twist derived above:
$$\hat{V}^s_{xy} = \begin{bmatrix}\dot{R}_{xy}R^T_{xy} & -\dot{R}_{xy}R^T_{xy}t_{xy} + \dot{t}_{xy}\\ 0^T & 0\end{bmatrix} = \begin{bmatrix}\hat{w}^s_{xy} & v^s_{xy}\\ 0^T & 0\end{bmatrix}. \quad (4.48)$$
Fig. 4.3 Linear and angular velocities of a body with respect to the spatial frame X and the body frame Y
Note that the physical meaning of the linear velocity component is not very intuitive; due to the antisymmetric matrix $\dot{R}_{xy}R^T_{xy} = \hat{w}^s_{xy}$, there is a component orthogonal to the actual translation and parallel to the velocity with respect to the object frame, namely $(\hat{w}^s_{xy}t_{xy} = w^s_{xy}\times t_{xy}) \parallel \dot{t}_{xy}$ (see Fig. 4.3). We will see below that only with the bivector notation of the motor algebra can we achieve a much more intuitive representation of the angular and linear velocities than when we use transformation matrices. The rigid-body spatial velocity, $V^s_{xy}$, is a vector comprising the instantaneous spatial angular velocity, $w^s_{xy}$, and the linear component, $v^s_{xy}$:
$$V^s_{xy} = \begin{bmatrix} w^s_{xy}\\ v^s_{xy}\end{bmatrix}.$$
(4.49)
The tangential velocity of a point $p_x$ attached to the rigid body, measured with respect to the spatial frame $X$, is given by
$$v^s_{p_x} = \hat{V}^s_{xy}\begin{bmatrix}p_x\\ 1\end{bmatrix} = \hat{w}^s_{xy}p_x + v^s_{xy} = w^s_{xy}\times p_x + v^s_{xy}.$$
(4.50)
After this review of well-known motion formulas using matrices, we will represent the rigid-body spatial velocity now using the more advantageous techniques of motor algebra.
4.8.2 Angular Velocity Using Rotors The angular momentum of a particle with momentum $m$ and position vector $x$ is usually defined in 3D by using the cross product
$$L = x\times m.$$
(4.51)
In geometric algebra, one replaces axial vectors with bivectors; thus, we rewrite the last equation using a bivector:
$$L = x\wedge m.$$
(4.52)
This formula substitutes the old notion of angular momentum as “an axial vector” with a geometric expression that describes the angular momentum as a particle sweeping out a plane (see Fig. 4.4a). Since angular momentum is described as a bivector, the angular velocity must be represented as a bivector as well. To do so, we resort to a rotor equation. Suppose
Fig. 4.4 (a) (left) The particle sweeps out the plane $L = x\wedge m$. (b) (right) Rotating frame $\{e_k\}$ with $R$. Bivector angular velocity $\Omega$
the orthonormal frame $\{u_k\}$ is rotating in 3D space and is related to another frame via a rotor $R$:
$$u_k = R(t)e_k\tilde{R}(t).$$
(4.53)
Traditionally, the angular-velocity vector, $w$, is defined using the cross product
$$\dot{u}_k = w\times u_k = -I(w\wedge u_k) = (-Iw)\cdot u_k.$$
(4.54)
In this equation, the (space) angular-velocity bivector is
$$\Omega_S = -I_3w,$$
(4.55)
where $I_3\in\mathcal{G}_{3,0,0}$ is the pseudoscalar. The sign ensures the orientation sense followed by the involved rotor. Now, let us analyze the time dependency with respect to the frame $\{u_k\}$:
$$\dot{u} = \dot{R}e_k\tilde{R} + Re_k\dot{\tilde{R}} = \dot{R}\tilde{R}u_k + u_kR\dot{\tilde{R}}.$$
(4.56)
Since $R\tilde{R} = 1$, we derive
$$\partial_t(R\tilde{R}) = \dot{R}\tilde{R} + R\dot{\tilde{R}} = 0,$$
PQ PR Q D RR: R
(4.58)
which leads to
We substitute Eq. 4.58 into Eq. 4.56; the result is simply the inner product of a bivector with a vector:
$$\dot{u} = \dot{R}\tilde{R}u_k - u_k\dot{R}\tilde{R} = (2\dot{R}\tilde{R})\cdot u_k.$$
(4.59)
Comparing Eqs. 4.54 and 4.59, we find an expression for the angular-velocity bivector in terms of the rotor:
$$\Omega_S = 2\dot{R}\tilde{R}.$$
(4.60)
If we multiply from the right by a rotor and divide by 2, we easily obtain the dynamic equation reduced to a rotor equation,
$$\dot{R} = \frac{1}{2}\Omega_SR, \quad (4.61)$$
or
$$\dot{\tilde{R}} = -\frac{1}{2}\tilde{R}\,\Omega_S. \quad (4.62)$$
The body angular velocity $\Omega_B$ related to a fixed-space frame is transformed back as follows:
$$\Omega_S = R\,\Omega_B\tilde{R}.$$
(4.63)
Replacing the last equation into Eq. 4.61 gives
$$\dot{R} = \frac{1}{2}R\,\Omega_B = \frac{1}{2}\Omega_SR, \quad (4.64)$$
and into Eq. 4.62 gives
$$\dot{\tilde{R}} = -\frac{1}{2}\Omega_B\tilde{R}. \quad (4.65)$$
Assuming that the rotor motion is constant through time, such as for a body rotating with a fixed angular velocity, we fix $\Omega_S$ constant, and the rotor equation (4.64) can be integrated to give
$$R(t) = e^{\Omega_S\frac{t}{2}}R(0),$$
(4.66)
which represents a rotor that rotates with a constant-frequency rotation in the righthand sense.
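A concrete way to exercise Eq. 4.66 is to represent rotors as unit quaternions (an isomorphic carrier for the even subalgebra of $\mathcal{G}_{3,0,0}$) and check that the integrated rotor rotates by $|w|t$ about the axis of $w$. A numpy sketch with our own helper names, leaving the overall sign convention of the exponent aside (both conventions give the same test result for a half turn):

```python
import numpy as np

def quat_mul(a, b):
    w1, x1, y1, z1 = a; w2, x2, y2, z2 = b
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def rotor_exp(w, t):
    """R(t) = e^{Omega_S t/2}: a rotor for constant angular velocity w;
    after time t it rotates by |w| t about the axis of w."""
    th = np.linalg.norm(w) * t
    if th == 0:
        return np.array([1.0, 0, 0, 0])
    axis = w / np.linalg.norm(w)
    return np.concatenate(([np.cos(th / 2)], np.sin(th / 2) * axis))

def rotate(q, v):
    """Sandwich product q v q~ acting on a vector v."""
    qc = q * np.array([1.0, -1, -1, -1])
    return quat_mul(quat_mul(q, np.concatenate(([0.0], v))), qc)[1:]

w = np.array([0.0, 0.0, np.pi])   # half a turn per unit time about e3
R = rotor_exp(w, 1.0)
assert np.allclose(rotate(R, [1.0, 0, 0]), [-1.0, 0, 0])  # e1 -> -e1 after t = 1
```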
4.8.3 Rigid-Body Spatial Velocity Using Motor Algebra In motor algebra, we represent the rigid motion of lines by Eq. 4.17. Lines are represented in terms of two dual bivector bases. Following the same idea as in the case of pure rotor motion, we represent the relationship of the moving frames as follows:
$$l'_k(t) = M(t)\,l_k\,\tilde{M}(t),$$
(4.67)
where $l'_k$ and $l_k$ are the coefficients of the two frames. Taking its time derivative,
$$\dot{l}'_k = \dot{M}l_k\tilde{M} + Ml_k\dot{\tilde{M}},$$
(4.68)
and substituting $Ml_k = l'_kM$, one gets
$$\dot{l}'_k = \dot{M}\tilde{M}l'_k + l'_kM\dot{\tilde{M}}.$$
(4.69)
Taking the time derivative of the identity $M\tilde{M} = 1$, we get the useful relations
$$\partial_t(M\tilde{M}) = 0,\qquad \dot{M}\tilde{M} + M\dot{\tilde{M}} = 0,\qquad \dot{M}\tilde{M} = -M\dot{\tilde{M}},$$
(4.70)
which are substituted in Eq. 4.69 to yield
$$\dot{l}'_k(t) = \dot{M}\tilde{M}l'_k - l'_k\dot{M}\tilde{M}.$$
(4.71)
Since the right side of the last equation is the inner product between a bivector and the vector basis coefficients, we rewrite it as follows:
$$\dot{l}'_k = (2\dot{M}\tilde{M})\cdot l'_k.$$
(4.72)
Let us call the bivector
$$V_S = 2\dot{M}\tilde{M} \quad (4.73)$$
the spatial velocity; as we will show next, it comprises a bivector for the linear velocity and a bivector for the angular velocity. Recalling that $M = TR$, we proceed as follows:
$$\begin{aligned}
V_S &= 2\dot{M}\tilde{M} = 2(\dot{T}R + T\dot{R})\tilde{R}\tilde{T} = 2\dot{T}R\tilde{R}\tilde{T} + 2T\dot{R}\tilde{R}\tilde{T}\\
&= 2\dot{T}\tilde{T} + 2\left(1 + I\frac{t_S}{2}\right)\frac{1}{2}\Omega_S\left(1 - I\frac{t_S}{2}\right)
\end{aligned}$$
$$\begin{aligned}
&= I\dot{t}_S\left(1 - I\frac{t_S}{2}\right) + \left(\Omega_S + I\frac{t_S}{2}\Omega_S\right)\left(1 - I\frac{t_S}{2}\right)\\
&= I\dot{t}_S + \Omega_S + I\frac{t_S}{2}\Omega_S - I\,\Omega_S\frac{t_S}{2}\\
&= \Omega_S + I\dot{t}_S = \Omega_S + Iv_S,
\end{aligned}$$
(4.74)
where the angular-velocity bivector $\Omega_S$ is the dual of the linear-velocity bivector $v_S$. Multiplying Eq. 4.73 from the right by $M$ and dividing by 2, we get the dynamic motor equation
$$\dot{M} = \frac{1}{2}V_SM.$$
(4.75)
Assuming that the screw motion of the body is constant through time, the dynamic motor equation (4.75) can be integrated to give
$$M(t) = e^{\frac{V_S}{2}t}M(0) = e^{\frac{\Omega_S + Iv_S}{2}t}M(0).$$
(4.76)
This represents a motor that rotates with a constant-frequency rotation in the righthand sense and has a constant linear velocity as well. Finally, compare the different algebraic treatments to get Eqs. 4.24, 4.39, 4.61, and 4.75 for the analysis of the rigid-body spatial velocity using matrices, rotor, and motor algebras.
4.8.4 Point, Line, and Plane Spatial Velocities Using Motor Algebra According to Eq. 4.16, the motion of a point is represented as
$$X = 1 + Ix = MX_0\widetilde{\bar{M}},$$
(4.77)
where $\widetilde{\bar{M}} = \tilde{R}\bar{\tilde{T}}$. If we take the time derivative of this equation, we get
$$\begin{aligned}
\dot{X} &= \dot{M}X_0\widetilde{\bar{M}} + MX_0\dot{\widetilde{\bar{M}}}\\
&= \frac{1}{2}(\Omega_SX_0 - X_0\Omega_S) + \frac{1}{2}(I\dot{t}_SX_0 + X_0I\dot{t}_S)\\
&= \Omega_S\wedge X_0 + (Iv_S)\cdot X_0 = Iw + (Iv_S)\cdot X_0. \quad (4.78)
\end{aligned}$$
According to Eqs. 4.5 and 4.7, the motion equations of a line and a plane are represented as
$$L = n + Im = ML_0\tilde{M},\qquad H = n + Id = MH_0\widetilde{\bar{M}}.$$
(4.79)
If we take the time derivative of these equations and follow simple algebraic steps similar to those for the motion equation of the point, we get
$$\dot{L} = \Omega_S\wedge L, \quad (4.80)$$
$$\dot{H} = \Omega_S\wedge H + (I\dot{t}_S)\cdot H = \Omega_S\wedge H + (Iv_S)\cdot n. \quad (4.81)$$
Let us now consider the velocity $V_S$ of a composition of two motors, $M = M_2M_1$. According to Eq. 4.73, we get
$$V_S = 2\dot{M}\tilde{M} = 2(\dot{M}_2M_1 + M_2\dot{M}_1)(\widetilde{M_2M_1}) = 2\dot{M}_2\tilde{M}_2 + M_2(2\dot{M}_1\tilde{M}_1)\tilde{M}_2 = V_2 + M_2V_1\tilde{M}_2. \quad (4.82)$$
In general, for a sequence of $n$ motors, the overall velocity $V_S$ is given as follows:
$$V_S = V_n + M_nV_{n-1}\tilde{M}_n + M_nM_{n-1}V_{n-2}\tilde{M}_{n-1}\tilde{M}_n + \cdots + M_n\cdots M_2V_1\tilde{M}_2\cdots\tilde{M}_n = \sum_{i=1}^{n}\left(\prod_{j=i+1}^{n}M_j\right)V_i\left(\prod_{j=i+1}^{n}\tilde{M}_j\right). \quad (4.83)$$
4.9 Incidence Relations Between Points, Lines, and Planes The geometric relations between points, lines, and planes expressed in terms of incidence relations are very useful when we are dealing with the geometry of configurations and the relative motion of objects. Blaschke introduced the basic relations of incidence using dual quaternions [25]. These relations can also be formulated for the representations of points, lines, and planes in the motor algebra $\mathcal{G}^+_{3,0,1}$ or in the 4D degenerate geometric algebra $\mathcal{G}_{3,0,1}$. In general, the incidence relations between a point, a line, and a plane are given by
$$P\tilde{L} + L\tilde{P} = 2I(x\cdot n), \quad (4.84)$$
$$P\tilde{\Pi} + \tilde{\Pi}P = 2I(d - x\cdot n), \quad (4.85)$$
$$L\tilde{\Pi} + \tilde{\Pi}L = 2(n_L\cdot n + m_L\cdot n), \quad (4.86)$$
where $\tilde{P}$ and $\tilde{L}$ are the conjugates (reversions) of $P$ and $L$, respectively.
Thus, a point $P$ lying on a line $L$ fulfills the equation
$$P\tilde{L} + L\tilde{P} = 0.$$
(4.87)
The distance $d$ of a point relative to a plane $\Pi$ is given by
$$P\tilde{\Pi} + \tilde{\Pi}P = d\gamma_1\gamma_2\gamma_3\gamma_4.$$
(4.88)
If $d = 0$, the point lies on the plane; if $d$ is negative, it is behind the plane; and if $d$ is positive, it is in front of the plane. The intersection point $P$ of a line $L$ crossing a plane $\Pi$ can be computed as
$$L\tilde{\Pi} + \tilde{\Pi}L = P.$$
(4.89)
In this case, if the line is parallel to the plane, the equation equals zero. Incidence relations are fundamental in projective geometry; we will study them in more detail in Chaps. 5 and 9.
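For intuition, the meet of Eq. 4.89 has a classical Plücker-coordinate counterpart: a line with direction $n_l$ and moment $m_l$ crosses the plane $n_p\cdot x = d$ at $x = (n_p\times m_l + d\,n_l)/(n_p\cdot n_l)$, with a vanishing denominator exactly in the parallel case. A numpy sketch of this vector-algebra analogue (not the motor-algebra product itself):

```python
import numpy as np

def meet_line_plane(n_l, m_l, n_p, d):
    """Intersection point of the Plücker line (direction n_l, moment m_l = x0 x n_l)
    with the plane n_p . x = d. Returns None when the line is parallel to the plane."""
    denom = np.dot(n_p, n_l)
    if np.isclose(denom, 0.0):
        return None
    return (np.cross(n_p, m_l) + d * n_l) / denom

# Line through (1, 0, 0) with direction (0, 1, 1); plane z = 2.
x0, dvec = np.array([1.0, 0, 0]), np.array([0.0, 1, 1])
n_l, m_l = dvec, np.cross(x0, dvec)
p = meet_line_plane(n_l, m_l, np.array([0.0, 0, 1]), 2.0)
assert np.allclose(p, [1.0, 2.0, 2.0])
assert meet_line_plane(n_l, m_l, np.array([0.0, 1, -1]), 0.0) is None  # parallel case
```

One can verify the formula directly: $n_p\cdot x = d$ by construction, and $x\times n_l = m_l$ because $m_l$ is orthogonal to $n_l$.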
4.9.1 Flags of Points, Lines, and Planes When dealing with geometric configurations, as in object modeling or robot navigation planning, it is useful to resort to a type of geometric indicator that allows us to detect whether a geometric condition is fulfilled or not. These kinds of indicators are called flags, and they are expressions relating points, lines, and planes that share some common attributes. Flags generate varieties [169]. We can express a point touching a line as the equation of the so-called point–line flag:
$$F_{PL} = P + L.$$
(4.90)
If $P\tilde{P} = 1$ and $L\tilde{L} = 1$, then $F_{PL}\tilde{F}_{PL} = 1$. If we have a point that touches a line and also a plane, and in which the orientation of the plane is parallel to the line, then we can represent the line–plane flag as
$$F_{L\Pi} = L + \Pi.$$
(4.91)
In this equation, if $L\tilde{L} = 1$ and $\Pi\tilde{\Pi} = 1$, then $F_{L\Pi}\tilde{F}_{L\Pi} = 1$. Finally, if a point touches a line and a plane, and the line has the same orientation as the normal to the plane, we can assign to this geometry the following point–line–plane flag:
$$F_{PL\Pi} = P + L + \Pi.$$
(4.92)
4.10 Conclusion

This chapter presented Clifford, or geometric, algebra for computations in visually guided robotics. Looking for suitable ways of representing algebraic relations of geometric primitives, we considered complex and dual numbers within the geometric algebra framework. It turns out that in this framework the algebra of motors is well suited to express 3D kinematics; with it, we can linearize the nonlinear 3D rigid motion transformation. In this chapter, the geometric primitives (points, lines, and planes) were represented using the 3D Euclidean geometric algebra and the 4D motor algebra. Next, the rigid motions of these geometric primitives were expressed elegantly using rotors, motors, and concepts of duality. In the algebra of motors, we extend the 3D Euclidean space representation to a 4D space by means of a dual copy of scalars, vectors, and rotors or quaternions. Finally, we formulated incidence relations between points, lines, and planes using the motor algebra framework.
4.11 Exercises

4.1 Split the vector $v = 2e_1 + 7e_2 + 10e_3$ into components along $x = 3e_1 - 1.5e_2$, $y = 2e_1 + 4e_2 - 2e_3$, and $z = 5e_1 + 3e_2 + 3e_3$; in other words, compute the coefficients $\alpha, \beta, \gamma$ of the equation $v = \alpha x + \beta y + \gamma z$. Draw in three dimensions the vectors and the volume $x\wedge y\wedge z$.

4.2 The outer square root of a multivector $M$ is the multivector $x$ that solves the equation $M = x\wedge x$. If $M = \alpha + a$, where $\alpha \in \mathbb{R}$, we can write $x = \sqrt{\alpha}\left(1 + \frac{a}{\alpha}\right)^{\frac12}$.
This expression can be expanded using the binomial series:
$$\left(1 + \frac{a}{\alpha}\right)^{\frac12} = 1 + \frac12\,\frac{a}{\alpha} + \frac{\frac12\left(\frac12 - 1\right)}{2!}\,\frac{a\wedge a}{\alpha^2} + \frac{\frac12\left(\frac12 - 1\right)\left(\frac12 - 2\right)}{3!}\,\frac{a\wedge a\wedge a}{\alpha^3} + \cdots$$
The number of terms of this series should be at least equal to the dimension of the geometric algebra under consideration. Write a program file for CLICAL to compute the outer square root of $M$, where $\mathrm{Re}(M) > 0$. Choose any $M$ and verify the identity $x\wedge x = M$.

4.3 Rotate in 4D Euclidean space, using CLICAL [124], the vector $v = e_1 + e_2 + e_3$: first in the plane $e_1e_2$ by the angle $\frac{\pi}{4}$, and then in the plane $e_4e_2$ by the angle $\frac{\pi}{5}$.

4.4 Given $x = x_1e_1 + x_2e_2 + x_3e_3$, $y = y_1e_1 + y_2e_2 + y_3e_3$, $X = xe_1e_2e_3$, and $Y = ye_1e_2e_3$, compute the bivectors $\frac12(1 + e_1e_2e_3e_4)X$ and $\frac12(1 - e_1e_2e_3e_4)Y$ and show that they commute.

4.5 Given the bivector $B = \alpha e_1e_2 + \beta e_4e_3$, compute $B\cdot B$, $B\wedge B$, and $B\times B$, and explain what kinds of multivectors result.
4.6 For this problem, use the same multivectors $X$ and $Y$ that you used in exercise 4.4. Given $Z = \frac12(1 + e_1e_2e_3e_4)X + \frac12(1 - e_1e_2e_3e_4)Y$, express $\exp(Z)$ using $|x|$ and $|y|$. What are the rotation angles of the rotation $\mathbb{R}^4 \to \mathbb{R}^4$, $u \to zuz^{-1}$, where $z = \exp(Z)$?

4.7 Given in $G_{3,0,0}$ the points $a = 0.5e_1 + 2.0e_2 + 1.3e_3$, $b = 1.5e_1 + 1.2e_2 + 2.3e_3$, and $c = 0.7e_1 + 1.2e_2 - 0.3e_3$, compute the line $l$ crossing $a$ and $b$ and the plane $\pi$ passing through the points $a$, $b$, and $c$.

4.8 Using the software package CLIFFORD 4.0 [1] and the point $a$, line $l$, and plane $\pi$ from exercise 4.7, compute in $G_{3,0,0}$ new values for the point, line, and plane after they undergo a rigid motion given by the translation $t = 1.0e_1 - 2.7e_2 + 5.3e_3$. Also apply the rotor $R = \cos(\frac{\theta}{2}) + \sin(\frac{\theta}{2})n$, where $\theta = \frac{\pi}{6}$ and $n$ is the unit 3D bivector of the rotation axis, given by $n = 0.7e_2e_3 + 1.2e_3e_1 + 0.9e_1e_2$.

4.9 Express the point $a$, line $l$, and plane $\pi$ given in exercise 4.7 in the motor algebra $G^+_{3,0,1}$.

4.10 Using the software package CLIFFORD 4.0 and the point, line, and plane given in exercise 4.7, compute their new values after they undergo the rigid motion given by the translator $T = 1 + \frac{I}{2}\left(1.0\,\gamma_2\gamma_3 - 2.7\,\gamma_3\gamma_1 + 5.3\,\gamma_1\gamma_2\right)$ and the rotor $R_s = \cos(\frac{\theta}{2}) + \sin(\frac{\theta}{2})l$, where $\theta = \frac{\pi}{6}$ and $l$ is the screw-axis line. Compare your results with the results of exercise 4.8.

4.11 Explain the following form of Taylor's expansion formula of a function $G$ around the location $x$:
$$G(x + p) = e^{p\cdot\partial_x}G(x).$$
(4.93)
Compare this formula with Eq. 4.76 and draw your conclusions regarding the Lie derivative operator in the exponent.

4.12 A particle in 3D space moves along a curve $x(t)$ such that the speed $|v| = |\dot{x}|$ is constant. Prove that there exists a bivector $\Omega$ such that $\dot{v} = \Omega\cdot v$. Give an explicit formula for the bivector $\Omega$. Is this bivector unique?

4.13 Imagine you measure the components of the position vector $x$ in a rotating frame $\{f_i\}$. Referring this frame to a fixed frame, prove that the components of $x$ are given by
$$x_i = e_i\cdot(Rx\tilde{R}). \qquad(4.94)$$
Now differentiate this expression twice and prove that one can write
$$f_i\ddot{x}_i = \ddot{x} + \Omega\cdot(\Omega\cdot x) + 2\Omega\cdot\dot{x} + \dot{\Omega}\cdot x.$$
Using this, deduce the expressions for the centrifugal, Coriolis, and Euler forces in terms of the angular velocity bivector $\Omega$.

4.14 Express the geometric objects of exercise 4.7 in the degenerate algebra $G_{3,0,1}$: the point $a$ using trivectors, the line $l$ using bivectors, and the plane $\pi$ using vectors.

4.15 Using the software package CLIFFORD 4.0 and the point $a$, line $l$, and plane from exercise 2.11, compute new values for the point, line, and plane after they undergo the rigid motion given by the translator $T = 1 + \frac{I}{2}\left(1.0\,\gamma_2\gamma_3 - 2.7\,\gamma_3\gamma_1 + 5.3\,\gamma_1\gamma_2\right)$. Also apply the rotor $R_s = \cos(\frac{\theta}{2}) + \sin(\frac{\theta}{2})l$, where $\theta = \frac{\pi}{6}$ and $l$ is the screw-axis line. Compare your results with the results of exercises 4.8 and 4.10.

4.16 Prove Rodrigues' formula for 3D rigid motion using the motor algebra. Compare your results with the results from exercise 2.18. Why is the motor algebra expression superior?

4.17 In the degenerate algebra $G_{3,0,1}$, choose a point $P$ using trivectors and a line $L$ using bivectors, and prove that the expression
$$P\tilde{L} + L\tilde{P}$$
equals zero if the point lies on the line. Prove the same relation using representations of the point and line in the motor algebra $G^+_{3,0,1}$.

4.18 In the degenerate algebra $G_{3,0,1}$, for the point $P$ using trivectors and the plane $\Pi$ using vectors, show that the expression
$$P\tilde{\Pi} + \tilde{\Pi}P = d\,\gamma_1\gamma_2\gamma_3\gamma_4$$
describes a particular geometric configuration, depending upon the value of $d$. If $d = 0$, the point lies on the plane; if $d \neq 0$, $d$ indicates the distance of the point to the plane; and if $d$ is negative, the point is behind the plane. Using CLIFFORD 4.0 and some points of 3D space, check this equation. Check the equation again using the representations of the point and plane in the motor algebra $G^+_{3,0,1}$.
4.19 In the degenerate algebra $G_{3,0,1}$, for the line $L$ using bivectors and the plane $\Pi$ using vectors, show that the expression
$$L\tilde{\Pi} + \tilde{\Pi}L$$
describes a particular geometric configuration. If this expression equals zero, the line lies on the plane or is parallel to it; if not, the equation yields the intersection point. This equation can be seen as a kind of meet equation when using a degenerate algebra. Using CLIFFORD 4.0 and some points of 3D space, check the equation. Check the same equation using the representations of the line and plane in the motor algebra $G^+_{3,0,1}$.

4.20 Barycentric coordinates: three points $x$, $y$, $z$ in general position (lying on a plane) can be used to describe any point $p$ on this plane:
$$p = \alpha x + \beta y + \gamma z.$$
Using normalized points, the point $p$ can be represented as an affine combination. The scalars $\alpha$, $\beta$, and $\gamma$ are known as barycentric coordinates, and they can be computed using the relative vectors $r = x - z$, $s = y - z$, and $t = p - z$:
$$\alpha = \frac{t\wedge s}{r\wedge s}, \qquad \beta = \frac{t\wedge r}{s\wedge r}, \qquad \gamma = 1 - \frac{t\wedge(s - r)}{r\wedge s}.$$
Interpret these equations geometrically in terms of areas in the plane for the case when $p$ lies inside the triangle formed by $x$, $y$, and $z$. What are the barycentric coordinates of the center of gravity?

4.21 Construct the dual representation of the mid-plane between the points $x$ and $y$.

4.22 Flags: flags are very useful when we are interested in relating geometric points, lines, and planes. Show in the degenerate algebra $G_{3,0,1}$ that the plane perpendicular to the line $L$ passing through the point $P$ is given by
$$\Pi_\perp = \frac12(\tilde{P}L - \tilde{L}P).$$
This equation is a kind of extension that relates the mapping of the flags
$$f_{PL} = \frac{1}{\sqrt2}(P + L) \;\longrightarrow\; f_{P\Pi} = \frac{1}{\sqrt2}(P + \Pi_\perp).$$
The inverse of this mapping relates the plane to the line perpendicular to the plane passing through the point; that is,
$$L_\perp = \frac12(\tilde{\Pi}P - \tilde{P}\Pi).$$
Check that $(L_\perp)_\perp = L$.
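The barycentric formulas of exercise 4.20 can be checked numerically. In the plane, the wedge of two vectors is the signed area they span, so the ratios above are ratios of triangle areas. The following sketch (our code, not part of the book) verifies that the centroid of a triangle gets coordinates $(\frac13, \frac13, \frac13)$:

```python
# Numeric sketch of the barycentric-coordinate formulas of exercise 4.20.
# In the plane, the wedge r ^ s is the signed area r_x s_y - r_y s_x, so
# alpha, beta, gamma below are ratios of triangle areas.
def wedge(a, b):
    return a[0]*b[1] - a[1]*b[0]

def barycentric(p, x, y, z):
    r = (x[0] - z[0], x[1] - z[1])
    s = (y[0] - z[0], y[1] - z[1])
    t = (p[0] - z[0], p[1] - z[1])
    alpha = wedge(t, s) / wedge(r, s)
    beta  = wedge(t, r) / wedge(s, r)
    gamma = 1.0 - alpha - beta
    return alpha, beta, gamma

x, y, z = (0.0, 0.0), (4.0, 0.0), (0.0, 3.0)
g = ((x[0] + y[0] + z[0]) / 3, (x[1] + y[1] + z[1]) / 3)   # centroid
print(barycentric(g, x, y, z))   # expect (1/3, 1/3, 1/3)
```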
Chapter 5
Lie Algebras and the Algebra of Incidence Using the Null Cone and Affine Plane
5.1 Introduction

In this chapter, we give the fundamentals of Lie algebra and the algebra of incidence using the computational frameworks of the null cone and the $n$-dimensional affine plane. Using Lie algebra within this computational framework has the advantage of being easily accessible to the reader, because there is a direct translation from the familiar matrix representations to representations using bivectors of geometric algebra. Generally speaking, Lie group theory is the appropriate tool for the study and analysis of the action of a group on a manifold. Since we are interested in purely geometric computations, the use of geometric algebra is appropriate for carrying out computations in a coordinate-free manner by using a bivector representation of the most important Lie algebras. We will represent Lie operators using bivectors for the computation of a variety of invariants. This chapter benefits from work done in collaboration with Sobczyk [18, 179].

It is usual to use a geometric algebra $G_{p,q,0}$ with a Minkowski metric for computations of projective geometry and the algebra of incidence, and separately a degenerate algebra for the computation of rigid motion, for example, the motor algebra $G^+_{3,0,1}$. In contrast, here we use the affine plane framework, which allows us to make computations involving both the algebra of incidence and Euclidean rigid transformations.

The organization of this chapter is as follows. Section 5.2 introduces the geometric algebra of reciprocal null cones. Section 5.3 explains the computational frameworks of the horosphere and affine plane. Section 5.4 examines the basic properties of the general linear group from the perspective of geometric algebra. Section 5.5 shows the computation of rigid motion in the affine plane. Section 5.6 studies the Lie group and Lie algebra of the affine plane. Section 5.7 presents the algebra of incidence in the $n$-dimensional affine plane.
Concluding remarks are given in Section 5.8 and exercises are in Section 5.9.
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9 5, c Springer-Verlag London Limited 2010
5.2 Geometric Algebra of Reciprocal Null Cones

This section introduces the $2^{2n}$-dimensional geometric algebra $G_{n,n}$. This geometric algebra is best understood by considering the properties of two $2^n$-dimensional reciprocal Grassmann subalgebras. These subalgebras are generated by the vector bases of the reciprocal null cones $N$ and $\bar{N}$. Let us start by explaining the meaning of null cones.
5.2.1 Reciprocal Null Cones

The reciprocal null cones $N$ and $\bar{N}$ are real linear $n$-dimensional vector spaces whose vectors square to zero; that is, for $x \in N$, $x^2 = 0$, and for $\bar{x} \in \bar{N}$, $\bar{x}^2 = 0$. Accordingly, for $x, y \in N$ and $z = x + y$, the associative geometric product gives
$$z^2 = (x+y)^2 = (x+y)(x+y) = x^2 + xy + yx + y^2 = xy + yx = 0. \qquad(5.1)$$
This result indicates that the symmetric part of the geometric product of two vectors is simply zero. Thus, the geometric product of two vectors equals their outer product:
$$xy = x\cdot y + x\wedge y = \frac12(xy + yx) + \frac12(xy - yx) = \frac12(xy - yx) = x\wedge y. \qquad(5.2)$$
Similarly, for the vectors in the reciprocal null cone,
$$\bar{x}\bar{y} = \bar{x}\cdot\bar{y} + \bar{x}\wedge\bar{y} = \frac12(\bar{x}\bar{y} + \bar{y}\bar{x}) + \frac12(\bar{x}\bar{y} - \bar{y}\bar{x}) = \frac12(\bar{x}\bar{y} - \bar{y}\bar{x}) = \bar{x}\wedge\bar{y}. \qquad(5.3)$$
The reciprocal vector bases $\{e\}$ and $\{\bar{e}\}$ span the reciprocal null cones:
$$N = \mathrm{span}\{e_1, \ldots, e_n\}, \qquad \bar{N} = \mathrm{span}\{\bar{e}_1, \ldots, \bar{e}_n\}. \qquad(5.4)$$
The neutral pseudo-Euclidean space $\mathbb{R}^{n,n}$ is the linear space spanned by the null cones:
$$\mathbb{R}^{n,n} = \mathrm{span}\{N, \bar{N}\} = \{x + \bar{x} \mid x \in N,\ \bar{x} \in \bar{N}\}. \qquad(5.5)$$
The reciprocal vector bases satisfy the following defining inner-product relations:
$$e_i\cdot\bar{e}_j = \bar{e}_j\cdot e_i = \delta_{ij} \qquad(5.6)$$
and
$$e_i\cdot e_j = 0, \qquad \bar{e}_i\cdot\bar{e}_j = 0, \qquad(5.7)$$
for all $i, j = 1, 2, \ldots, n$. The inner-product relations of Eqs. 5.6 and 5.7 tell us that the reciprocal basis vectors $e_i$ and $\bar{e}_i$ are all null vectors and are mutually orthogonal. Because of Eq. 5.6, the reciprocal vector bases are said to be dual, because they satisfy the relationship $\{e\}\cdot\{\bar{e}\} = \mathrm{id}$, where id stands for the identity. The outer-product defining relations are
$$e_i\wedge e_j = e_ie_j = -e_je_i, \qquad \bar{e}_i\wedge\bar{e}_j = \bar{e}_i\bar{e}_j = -\bar{e}_j\bar{e}_i. \qquad(5.8)$$
Note that the geometric product of two vectors equals the outer product only when both belong either to the null cone $N$ or to the reciprocal cone $\bar{N}$. This is no longer true for arbitrary $x, y \in \mathbb{R}^{n,n}$, owing to the dual relationship of the reciprocal bases expressed by Eqs. 5.7 and 5.8.
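The defining relations 5.6–5.8 can be checked in a concrete matrix model. The sketch below is our construction, not part of the book's formalism: for $n = 2$ it realizes the null bases as rescaled fermionic creation/annihilation matrices (a Jordan–Wigner construction), whose symmetrized products reproduce exactly the stated inner-product relations.

```python
import numpy as np

# Jordan-Wigner matrix model (an illustration, not the book's construction)
# of the null bases for n = 2: with fermionic matrices a_i and a_i^+, the
# vectors e_i = sqrt(2) a_i and ebar_i = sqrt(2) a_i^+ satisfy
#   e_i . ebar_j = (e_i ebar_j + ebar_j e_i)/2 = delta_ij   (Eq. 5.6)
#   e_i . e_j = ebar_i . ebar_j = 0                         (Eq. 5.7)
a  = np.array([[0, 1], [0, 0]], dtype=float)   # single-mode annihilator
I2 = np.eye(2)
Z  = np.diag([1.0, -1.0])                      # parity factor

a1 = np.kron(a, I2)
a2 = np.kron(Z, a)
e    = [np.sqrt(2) * a1, np.sqrt(2) * a2]
ebar = [np.sqrt(2) * a1.T, np.sqrt(2) * a2.T]

def inner(u, v):                               # symmetrized product
    return 0.5 * (u @ v + v @ u)

for i in range(2):
    for j in range(2):
        assert np.allclose(inner(e[i], ebar[j]), (i == j) * np.eye(4))
        assert np.allclose(inner(e[i], e[j]), 0)      # null, orthogonal
        assert np.allclose(inner(ebar[i], ebar[j]), 0)
print("Eqs. 5.6-5.8 hold in the matrix model")
```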
5.2.2 The Universal Geometric Algebra $G_{n,n}$

The vector bases of the reciprocal null cones $N$ and $\bar{N}$ generate the $2^n$-dimensional subalgebra $G_N = \mathrm{gen}\{e_1, \ldots, e_n\}$ and the $2^n$-dimensional reciprocal subalgebra $G_{\bar{N}} = \mathrm{gen}\{\bar{e}_1, \ldots, \bar{e}_n\}$, which have the structure of Grassmann algebras. The geometric algebra $G_{n,n}$ is built by the direct product of these $2^n$-dimensional Grassmann subalgebras:
$$G_{n,n} = G_N \otimes G_{\bar{N}} = \mathrm{gen}\{e_1, \ldots, e_n, \bar{e}_1, \ldots, \bar{e}_n\}. \qquad(5.9)$$
When $n$ is countably infinite, we call $G_{\infty,\infty}$ the universal geometric algebra. The universal algebra $G_{\infty,\infty}$ contains all of the algebras $G_{n,n}$ as subalgebras. The reciprocal bases $\{e\} \subset N$ and $\{\bar{e}\} \subset \bar{N}$ are also called dual, because they fulfill Eq. 5.6: $\{e\}\cdot\{\bar{e}\} = \mathrm{id}$. They generate the $k$-vector bases of $G_{n,n}$,
$$\big\{1,\ \{e\},\ \{\bar{e}\},\ \{e_ie_j\},\ \{\bar{e}_i\bar{e}_j\},\ \{e_i\bar{e}_j\},\ \{e_ie_j\bar{e}_k\},\ \ldots,\ \{e_{j_1}\cdots\bar{e}_{j_i}\bar{e}_{j_l}\cdots e_{j_k}\},\ I,\ \bar{I}\big\}, \qquad(5.10)$$
consisting of scalars, vectors, bivectors, trivectors, ..., and the dual pseudoscalars $I = e_1\wedge e_2\wedge e_3\cdots\wedge e_n$ and $\bar{I} = \bar{e}_1\wedge\bar{e}_2\wedge\bar{e}_3\cdots\wedge\bar{e}_n$, which satisfy $I\cdot\bar{I} = 1$. Note that the $\binom{2n}{k}$-dimensional bases of $k$-vectors $\{e_{j_1}\cdots\bar{e}_{j_i}\bar{e}_{j_l}\cdots e_{j_k}\}$, for the sets of indices $1 \le j_1 < j_2 < \cdots < j_k \le 2n$, are generated by the different combinations of $e$'s and $\bar{e}$'s.
5.2.3 The Lie Algebra of Null Spaces

Similarly as in Sect. 3.5.3, we introduce the balanced analog of the complex "doubling" bivector by defining
$$K = \sum_{i=1}^n e_i\wedge\bar{e}_i = e_1\wedge\bar{e}_1 + e_2\wedge\bar{e}_2 + \cdots + e_n\wedge\bar{e}_n. \qquad(5.11)$$
This bivector has the following properties:
$$e_i\cdot K = e_i, \qquad \bar{e}_i\cdot K = -\bar{e}_i, \qquad(5.12)$$
and thus
$$(x\cdot K)\cdot K = K\cdot(K\cdot x) = x \qquad \forall x. \qquad(5.13)$$
The difference between the doubling bivector $J$ of Section 3.5.3 and $K$ is crucial, because $J$ generates a complex structure while $K$ generates, instead, a null structure. To see this clearly, let us take any vector $x \in G^1_{n,n}$ and define
$$p = x \pm x\cdot K, \qquad(5.14)$$
and take its square,
$$p^2 = x^2 \pm 2x\cdot(x\cdot K) + (x\cdot K)^2 = x^2 - \langle xKKx\rangle = x^2 - \big[(x\cdot K)\cdot K\big]\cdot x = x^2 - x^2 = 0, \qquad(5.15)$$
as expected for a null vector. Recalling the concept of the projective split described in Sect. 1.2.5, in a similar fashion here the bivector $K$ splits any vector $x \in G^1_{n,n}$ into two separate null vectors,
$$x = x_+ + x_-, \qquad(5.16)$$
where
$$x_+ = \frac12(x + x\cdot K), \qquad x_- = \frac12(x - x\cdot K). \qquad(5.17)$$
We see that the space of vectors $G^1_{n,n}$ decomposes into a direct sum of the two null spaces $N$ and $\bar{N}$, where $x_+ \in N$. The vectors in $N$ fulfill
$$x_+\cdot K = x_+ \qquad \forall x_+ \in N. \qquad(5.18)$$
According to Eq. 5.15, we can see that all vectors $x_+ \in N$ also square to zero. The space $N$ defines a Grassmann algebra.
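The split of Eqs. 5.16–5.18 can be illustrated at the coordinate level. In the sketch below (our code), a vector of $\mathbb{R}^{n,n}$ is stored as its $\{e\}$- and $\{\bar{e}\}$-coordinate blocks; the quadratic form follows from Eqs. 5.6–5.7, and $x\cdot K$ acts as $+1$ on the $\{e\}$-block and $-1$ on the $\{\bar{e}\}$-block, which is one self-consistent sign convention for Eq. 5.12.

```python
# Coordinate sketch (our convention) of the null split, Eqs. 5.14-5.18.
# A vector is (u, v): u holds the {e}-coordinates, v the {ebar}-coordinates.
# By Eqs. 5.6-5.7 the quadratic form is x.y = u.v' + v.u', and x.K acts as
# +1 on u and -1 on v.
def inner(x, y):
    u, v = x
    up, vp = y
    return sum(a*b for a, b in zip(u, vp)) + sum(a*b for a, b in zip(v, up))

def dot_K(x):
    u, v = x
    return (u, tuple(-c for c in v))

def split(x):
    u, v = x
    zeros = tuple(0.0 for _ in u)
    return (u, zeros), (zeros, v)     # x_plus in N, x_minus in Nbar

x = ((1.0, 2.0), (3.0, -4.0))         # a generic vector of R^{2,2}
xp, xm = split(x)
assert inner(xp, xp) == 0 and inner(xm, xm) == 0   # both parts are null
assert dot_K(xp) == xp                             # Eq. 5.18
p = (tuple(a + b for a, b in zip(x[0], dot_K(x)[0])),
     tuple(a + b for a, b in zip(x[1], dot_K(x)[1])))
assert inner(p, p) == 0                            # p = x + x.K is null
print("null split verified")
```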
Mappings Acting on the Null Space

Every linear function $x \to f(x)$ acting on an $n$-dimensional vector space can be represented in the null space $N$ using an operator $O$ as follows:
$$x_+ \to Ox_+O^{-1}, \qquad(5.19)$$
where $O$ is built as the geometric product of an even number of unit vectors. Here, vectors $x \in G_n$ are mapped to null vectors $x_+$ in $G_{n,n}$, which in turn are acted on by the multivector $O$ so that the following is true:
$$f(x) + f(x)\cdot K = O(x + x\cdot K)O^{-1}. \qquad(5.20)$$
This equation defines a map between linear functions $f(x)$ and multivectors $O \in G_{n,n}$. Note that the map is not quite an isomorphism, because both $O$ and $-O$ generate the same function; that is, $O$ forms a double-cover representation. The map will work only if the action of $O$ does not leave the space $N$; for that, $O$ must fulfill Eq. 5.18:
$$(Ox_+O^{-1})\cdot K = Ox_+O^{-1}. \qquad(5.21)$$
After a simple algebraic manipulation involving the inner product between vectors and bivectors, we get
$$x_+ = O^{-1}\big[(Ox_+O^{-1})\cdot K\big]O = \frac12O^{-1}\big(Ox_+O^{-1}K - KOx_+O^{-1}\big)O = x_+\cdot(O^{-1}KO). \qquad(5.22)$$
In this equation, according to Eq. 5.18, it is required that
$$O^{-1}KO = K, \quad\text{that is,}\quad OK = KO. \qquad(5.23)$$
Since $O$ is built as a product of an even number of unit vectors, it follows that $O\tilde{O} = \pm1$. The subgroup that fulfills $O\tilde{O} = 1$ consists of the rotors in $G_{n,n}$, and their generators (Lie algebra basis elements) are bivectors. Thus, the condition $RK\tilde{R} = K$ is the direct analog of the condition that defined the unitary group expressed in terms of rotors, which in that case leaves $J$ invariant; see Eq. 3.63.

The Lie Algebra

The bivector, or Lie algebra, generators are the set of bivectors that commute with $K$. The Jacobi identity guarantees that the commutator of two bivectors that commute with $K$ yields a third bivector that in turn also commutes with $K$. Similarly to the case of the unitary group of Eq. 3.66, we formulate the following algebraic constraint:
$$\big[(x\cdot K)\wedge(y\cdot K)\big]\cdot K = x\wedge(y\cdot K) + (x\cdot K)\wedge y = (x\wedge y)\cdot K, \qquad(5.24)$$
after passing the rightmost element to the left of the equation,
$$\big[x\wedge y - (x\cdot K)\wedge(y\cdot K)\big]\cdot K = 0. \qquad(5.25)$$
By using this constraint, we can again try all combinations of $\{e_i, \bar{e}_i\}$ to produce the bivector basis of the Lie algebra of the general linear group:
$$K_i = e_i\wedge\bar{e}_i, \qquad E_{ij} = e_i\wedge e_j - \bar{e}_i\wedge\bar{e}_j \quad (i < j = 1, \ldots, n), \qquad \bar{E}_{ij} = e_i\wedge\bar{e}_j - \bar{e}_i\wedge e_j \quad (i < j = 1, \ldots, n). \qquad(5.26)$$
Note that the difference in structure between the Lie algebra of the linear group and that of the unitary group is due only to the different signatures of their underlying spaces. In the case of the conformal geometric algebras presented in Chap. 6, the Lie algebras are related to a space with pseudo-Euclidean signature.
5.2.4 The Standard Bases of $G_{n,n}$

In the previous sections, we have discussed in some detail the orthogonal null vector bases $\{e_i, \bar{e}_i\}$ of the geometric algebra $G_{n,n}$. Using these orthogonal null vector bases, we can construct new orthonormal bases $\{\sigma\}, \{\eta\}$ of $G_{n,n}$:
$$\sigma_i = \frac{1}{\sqrt2}(e_i + \bar{e}_i), \qquad \eta_i = \frac{1}{\sqrt2}(e_i - \bar{e}_i), \qquad(5.27)$$
for $i = 1, 2, 3, \ldots, n$. According to the properties of Eqs. 5.6 to 5.8, these basis vectors satisfy, for $i \neq j$,
$$\sigma_i^2 = 1, \quad \sigma_i\cdot\sigma_j = \delta_{ij}, \quad \sigma_i\sigma_j = -\sigma_j\sigma_i, \qquad \eta_i^2 = -1, \quad \eta_i\cdot\eta_j = -\delta_{ij}, \quad \eta_i\eta_j = -\eta_j\eta_i, \qquad \sigma_i\eta_j = -\eta_j\sigma_i, \quad \sigma_i\cdot\eta_j = 0. \qquad(5.28)$$
The basis $\{\sigma\}$ spans a real Euclidean vector space $\mathbb{R}^n$ and generates the geometric subalgebra $G_{n,0}$, whereas $\{\eta\}$ spans an anti-Euclidean space $\mathbb{R}^{0,n}$ and generates the geometric subalgebra $G_{0,n}$. We can now express the geometric algebra $G_{n,n}$ as the direct product of these geometric subalgebras:
$$G_{n,n} = G_{n,0} \otimes G_{0,n} = \mathrm{gen}\{\sigma_1, \ldots, \sigma_n, \eta_1, \ldots, \eta_n\}. \qquad(5.29)$$
The dual pseudoscalars are given by $I = \sigma_1\wedge\sigma_2\wedge\sigma_3\cdots\wedge\sigma_n$ and $\bar{I} = \eta_1\wedge\eta_2\wedge\eta_3\cdots\wedge\eta_n$, which satisfy $I\bar{I} = 1$.
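The signature statement of Eqs. 5.27–5.28 can be verified in the same matrix model used earlier for the null bases. The following sketch (our illustration, not the book's code) builds $\sigma_i$ and $\eta_i$ from matrix realizations of $e_i, \bar{e}_i$ and checks that they square to $+1$ and $-1$, respectively, and that distinct basis vectors anticommute:

```python
import numpy as np

# Matrix check (illustration only) of Eqs. 5.27-5.28 for n = 2:
# sigma_i = (e_i + ebar_i)/sqrt(2) squares to +1,
# eta_i   = (e_i - ebar_i)/sqrt(2) squares to -1.
a  = np.array([[0, 1], [0, 0]], dtype=float)
Z  = np.diag([1.0, -1.0])
a1, a2 = np.kron(a, np.eye(2)), np.kron(Z, a)
e    = [np.sqrt(2) * a1, np.sqrt(2) * a2]
ebar = [m.T for m in e]

sigma = [(e[i] + ebar[i]) / np.sqrt(2) for i in range(2)]
eta   = [(e[i] - ebar[i]) / np.sqrt(2) for i in range(2)]

I4 = np.eye(4)
for i in range(2):
    assert np.allclose(sigma[i] @ sigma[i], I4)    # sigma_i^2 = +1
    assert np.allclose(eta[i] @ eta[i], -I4)       # eta_i^2  = -1
# distinct basis vectors anticommute (Eq. 5.28)
assert np.allclose(sigma[0] @ sigma[1] + sigma[1] @ sigma[0], 0)
assert np.allclose(sigma[0] @ eta[1] + eta[1] @ sigma[0], 0)
print("signature (2,2) confirmed")
```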
5.2.5 Representations and Operations Using Bivector Matrices

In this section, we use a notation that extends the familiar addition and multiplication of matrices of real numbers to matrices consisting of vectors and bivectors. The notation is somewhat similar to the Einstein summation convention of tensor calculus, and it can be used to directly express the close relationships that exist between Clifford algebra and matrix algebra [179]. We begin by writing the Witt basis of null vectors $\{e\}$ and the corresponding reciprocal basis $\{\bar{e}\}$ of $G_{n,n}$ in row form and column form, respectively:
$$\{e\} = [\,e_1\; e_2\; \cdots\; e_n\,], \qquad \{\bar{e}\} = \begin{bmatrix}\bar{e}_1\\ \bar{e}_2\\ \vdots\\ \bar{e}_n\end{bmatrix}. \qquad(5.30)$$
Taking advantage of the usual matrix multiplication between a column and a row and the properties of the geometric product, we get
$$\{\bar{e}\}\{e\} = \{\bar{e}\}\cdot\{e\} + \{\bar{e}\}\wedge\{e\} = I + \begin{bmatrix}\bar{e}_1\wedge e_1 & \bar{e}_1\wedge e_2 & \cdots & \bar{e}_1\wedge e_n\\ \bar{e}_2\wedge e_1 & \bar{e}_2\wedge e_2 & \cdots & \bar{e}_2\wedge e_n\\ \vdots & \vdots & \ddots & \vdots\\ \bar{e}_n\wedge e_1 & \bar{e}_n\wedge e_2 & \cdots & \bar{e}_n\wedge e_n\end{bmatrix}, \qquad(5.31)$$
where $I$ is the $n\times n$ identity matrix. Similarly,
$$\{e\}\{\bar{e}\} = \{e\}\cdot\{\bar{e}\} + \{e\}\wedge\{\bar{e}\} = \sum_{i=1}^n e_i\cdot\bar{e}_i + \sum_{i=1}^n e_i\wedge\bar{e}_i \qquad(5.32)$$
$$= n + \sum_{i=1}^n e_i\wedge\bar{e}_i. \qquad(5.33)$$
In terms of the null cone bases, a vector $x \in N$ is given by
$$x = \{e\}x_{\{e\}} = (e_1, e_2, \ldots, e_n)\begin{bmatrix}x_1\\ x_2\\ \vdots\\ x_n\end{bmatrix} \qquad(5.34)$$
$$= (e_1, e_2, \ldots, e_n)\begin{bmatrix}\bar{e}_1\cdot x\\ \bar{e}_2\cdot x\\ \vdots\\ \bar{e}_n\cdot x\end{bmatrix} = \{e\}(\{\bar{e}\}\cdot x) = \sum_{i=1}^n x_ie_i. \qquad(5.35)$$
The vectors $x \in N$ behave like column vectors, and the vectors $\bar{x} \in \bar{N}$ like row vectors. This property makes it possible to define the transpose of the vector $x$ as follows:
$$x^T = (\{e\}x_{\{e\}})^T = x_{\{e\}}^T\{e\}^T = (x_1\; x_2\; \cdots\; x_n)\begin{bmatrix}\bar{e}_1\\ \bar{e}_2\\ \vdots\\ \bar{e}_n\end{bmatrix}. \qquad(5.36)$$
Note that, using the transpose operation, it is possible to move between the null cones $N$ and $\bar{N}$.
5.2.6 Bivector Representation of Linear Operators

One important application of the bivector matrix representation is the representation of a linear operator $f \in \mathrm{End}(N)$. Recall the basic Clifford algebra identity
$$(a\wedge b)\cdot x = (b\cdot x)a - (a\cdot x)b \qquad(5.37)$$
between the bivector $a\wedge b$ and the vector $x$. We can use the reciprocal bases to express the vector $x$ in the form
$$x = \{e\}x_{\{e\}} = (\{e\}\wedge\{\bar{e}\})\cdot x, \qquad(5.38)$$
where $x_{\{e\}}$ represents the column vector of the components of $x$:
$$x_{\{e\}} = \begin{bmatrix}x_1\\ \vdots\\ x_n\end{bmatrix} = \begin{bmatrix}\bar{e}_1\cdot x\\ \vdots\\ \bar{e}_n\cdot x\end{bmatrix} = \{\bar{e}\}\cdot x. \qquad(5.39)$$
This is the key idea of the bivector representation of a linear operator. Let $f \in \mathrm{End}(N)$; we then have the following relationships:
$$f(x) = f(\{e\}x_{\{e\}}) = \{e\}Fx_{\{e\}} = \big[(\{e\}F)\wedge\{\bar{e}\}\big]\cdot x = F\cdot x, \qquad(5.40)$$
where the bivector $F \in G$ is defined by
$$F = (\{e\}F)\wedge\{\bar{e}\} = \sum_{i=1}^n\sum_{j=1}^n f_{ij}\,e_i\wedge\bar{e}_j = \{e\}\wedge(F\{\bar{e}\}). \qquad(5.41)$$
Thus, a linear operator $f \in \mathrm{End}(N)$ can now be pictured as a linear mapping $f: N \to N$ of the null cone $N$ onto itself. Furthermore, it can be represented in the bivector form $f(x) = F\cdot x$, where $F = (\{e\}F)\wedge\{\bar{e}\}$ is a bivector in the enveloping geometric algebra $G$. As an example, we can show for the linear operator $T$ its representation as a bivector matrix:
$$T = \begin{bmatrix}t_{11} & t_{12} & t_{13} & t_{14}\\ t_{21} & t_{22} & t_{23} & t_{24}\\ t_{31} & t_{32} & t_{33} & t_{34}\\ t_{41} & t_{42} & t_{43} & t_{44}\end{bmatrix} \;\longrightarrow\; \begin{bmatrix}t_{11}\,e_1\wedge\bar{e}_1 & t_{12}\,e_1\wedge\bar{e}_2 & t_{13}\,e_1\wedge\bar{e}_3 & t_{14}\,e_1\wedge\bar{e}_4\\ t_{21}\,e_2\wedge\bar{e}_1 & t_{22}\,e_2\wedge\bar{e}_2 & t_{23}\,e_2\wedge\bar{e}_3 & t_{24}\,e_2\wedge\bar{e}_4\\ t_{31}\,e_3\wedge\bar{e}_1 & t_{32}\,e_3\wedge\bar{e}_2 & t_{33}\,e_3\wedge\bar{e}_3 & t_{34}\,e_3\wedge\bar{e}_4\\ t_{41}\,e_4\wedge\bar{e}_1 & t_{42}\,e_4\wedge\bar{e}_2 & t_{43}\,e_4\wedge\bar{e}_3 & t_{44}\,e_4\wedge\bar{e}_4\end{bmatrix}. \qquad(5.42)$$
Now, by considering $f, g \in gl(N)$ in the bivector form $f(x) = F\cdot x$ and $g(x) = G\cdot x$, and by calculating the commutator $[f, g]$, we find
$$[f, g](x) = F\cdot(G\cdot x) - G\cdot(F\cdot x) = (F\times G)\cdot x, \qquad(5.43)$$
where the commutator product of bivectors is defined by $F\times G = \frac12(FG - GF)$. Thus, the Lie bracket of the linear operators $f$ and $g$ becomes the commutator product of their respective bivectors $F$ and $G$.
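At the coordinate level, Eqs. 5.40–5.42 say that the bivector $F = \sum_{ij}f_{ij}\,e_i\wedge\bar{e}_j$ acts on a null-cone vector exactly as the matrix $F$ acts on its coordinates, via the identity (5.37). The short sketch below (our code, with hypothetical helper names) makes that reduction explicit:

```python
# Coordinate sketch of Eqs. 5.40-5.42: the bivector sum_ij f_ij e_i ^ ebar_j
# acts on x = sum_j x_j e_j through (a ^ b).x = (b.x)a - (a.x)b. Since
# e_i.x = 0 and ebar_j.x = x_j on the null cone N (Eqs. 5.6-5.7), each term
# contributes f_ij x_j e_i, i.e. the ordinary matrix action F x.
def bivector_action(fmat, x):
    n = len(x)
    out = [0.0] * n
    for i in range(n):
        for j in range(n):
            # (e_i ^ ebar_j).x = (ebar_j.x) e_i - (e_i.x) ebar_j = x_j e_i
            out[i] += fmat[i][j] * x[j]
    return out

fmat = [[1.0, 2.0], [3.0, 4.0]]
x = [5.0, -1.0]
print(bivector_action(fmat, x))   # identical to the matrix product F x
```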
5.3 Horosphere and n-Dimensional Affine Plane

This section briefly explains the computational frameworks of the horosphere and the $n$-dimensional affine plane, which are useful in the study of conformal transformations [157]. This kind of transformation preserves the angles between tangent vectors at each point. A common conformal transformation is the one defined by any analytic function in the complex plane. Conformal transformations also exist in the pseudo-Euclidean space $\mathbb{R}^{p,q}$. Since conformal transformations are nonlinear, it is desirable to linearize them.
One way to do so is by moving up from the affine plane, $A_e(\mathbb{R}^{p,q})$, to the $(p,q)$-horosphere, $H^{p,q}_e(\mathbb{R}^{p+1,q+1})$. The $n$-dimensional affine plane $A_e(\mathbb{R}^{p,q})$ is a homogeneous representation of the points $x \in \mathbb{R}^{p,q}$. It extends $\mathbb{R}^{p,q}$ to a projective space with signature $\mathbb{R}^{p,q,1}$ using a null vector $e$ as follows:
$$x_h = x + e \in A_e(\mathbb{R}^{p,q}). \qquad(5.44)$$
The $(p,q)$-horosphere is normally defined by
$$H^{p,q}_e(\mathbb{R}^{p+1,q+1}) = \left\{\frac12 x_h\bar{e}x_h \;\middle|\; x_h \in A_e(\mathbb{R}^{p,q})\right\} \subset \mathbb{R}^{p+1,q+1}, \qquad(5.45)$$
where the space $\mathbb{R}^{p,q}$ has been extended to $\mathbb{R}^{p+1,q+1}$ in order to make available two null vectors, $e = e_{n+1}$ and $\bar{e} = \bar{e}_{n+1}$. The conformal representation $x_c$ of both a point $x \in \mathbb{R}^{p,q}$ and a point $x_h \in A_e(\mathbb{R}^{p,q})$ is given by
$$x_c = \frac12 x_h\bar{e}x_h = \frac12\big[(x_h\cdot\bar{e})x_h + (x_h\wedge\bar{e})x_h\big] = x_h - \frac12 x_h^2\bar{e} = x - \frac12 x_h^2\bar{e} + e = \exp\Big(\frac{x\bar{e}}{2}\Big)\,e\,\exp\Big(-\frac{x\bar{e}}{2}\Big). \qquad(5.46)$$
This equation tells us that all points on $H^{p,q}_e$ can be obtained by a simple rotation of $e$ with respect to the plane indicated by the bivector $x\bar{e}$. The points of the horosphere can be projected down into the affine plane by applying the simple formula
$$x_h = (x_c\wedge\bar{e})\cdot e, \qquad(5.47)$$
and into the space $\mathbb{R}^{p,q}$ by using
$$x = (x_c\wedge e\wedge\bar{e})\cdot(e\wedge\bar{e}). \qquad(5.48)$$
Figure 5.1 depicts
$$A_e^2 = \{x_h \mid x_h = x + e,\; x \in \mathbb{R}^2\}, \qquad H_e^2 = \Big\{x_c = \frac12 x_h\bar{e}x_h \;\Big|\; x_h \in A_e^2\Big\}. \qquad(5.49)$$
Since $x \in \mathbb{R}^2 = \mathrm{span}\{\sigma_1, \sigma_2\}$, $e = \frac12(\sigma_3 + \eta)$, and $\bar{e} = \sigma_3 - \eta$, any point on the horosphere in these terms is given by
$$x_c = x - \frac12 x^2\bar{e} + e = x - \frac12(x^2 - 1)\sigma_3 + \frac12(x^2 + 1)\eta = x + x_3\sigma_3 + x_4\eta. \qquad(5.50)$$
Fig. 5.1 Horosphere of $\mathbb{R}^2$ with triangles of the 2D affine plane projected into the horosphere
In order to be able to depict the horosphere in 3D, in Fig. 5.1 we have ignored the coordinate $x_3$, using instead the condition that $\eta$ is orthogonal to $\sigma_1$ and $\sigma_2$. In Sect. 5.7, we will use the frameworks of the $n$-dimensional affine plane and the horosphere for computations of incidence algebra.
5.4 The General Linear Group

The general linear group, $GL(N)$, is defined to be the subset of all endomorphisms $f \in \mathrm{End}(N)$ with the property that $f \in GL(N)$ if and only if $\det(f) \neq 0$ [20]. The determinant of $f$ is defined in the algebra $G_N$ by
$$f(e_1)\wedge f(e_2)\wedge\cdots\wedge f(e_n) = \det(F)\,e_1\wedge e_2\wedge\cdots\wedge e_n, \qquad(5.51)$$
where $\det(F)$ is just the ordinary determinant of the matrix of $f$ with respect to the basis $\{e\}$. Choosing the basis $\{e\}$ makes explicit the isomorphism between the general linear groups $GL(N)$ and $GL(n,\mathbb{C})$. The latter corresponds to the general linear group of all complex $n\times n$ matrices $F$ with $\det F \neq 0$. The theory of Lie groups and their corresponding Lie algebras can be considered to be largely the study of the group-manifold $GL(n,\mathbb{C})$, since any Lie group is isomorphic to a subgroup of $GL(n,\mathbb{C})$ [65, p. 501].

Since we have referred to $GL(N)$ as a manifold, we must be careful to give it the structure of an $n^2$-dimensional topological metric space. We define the inner product $\langle f, g\rangle$ of $f, g \in GL(N)$ to be the usual Hermitian positive definite inner product
$$\langle f, g\rangle = \sum_{j=1}^n\sum_{i=1}^n f_{ij}\bar{g}_{ij},$$
where $f_{ij}, g_{ij} \in \mathbb{C}$ are the components of the matrices $F$ and $G$ of $f$ and $g$, respectively, with respect to the basis $\{e\}$. The positive definite norm $|f|$ of $f \in GL(N)$ is defined by
$$|f|^2 = \langle f, f\rangle = \sum_{j=1}^n\sum_{i=1}^n f_{ij}\bar{f}_{ij}$$
and is clearly zero if and only if $f = 0$.
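This Hermitian inner product and norm are the Frobenius ones familiar from matrix analysis; a quick numeric check (our code) with numpy:

```python
import numpy as np

# The Hermitian inner product and norm above are the Frobenius ones:
# <f, g> = sum_ij f_ij conj(g_ij), |f|^2 = <f, f>.
F = np.array([[1 + 2j, 0], [3j, 4]])
G = np.array([[2, 1j], [1, 1 - 1j]])

ip = np.sum(F * np.conj(G))           # <f, g>
norm2 = np.sum(F * np.conj(F)).real   # |f|^2
assert np.isclose(norm2, np.linalg.norm(F, 'fro')**2)
print(ip, norm2)
```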
The crucial relationship between a Lie group and its corresponding Lie algebra is almost an immediate consequence of the properties of the exponential of a linear operator $\mathbf{f} \in \mathrm{End}(N)$. The exponential mapping may be directly defined by the usual Taylor series
$$e^{\mathbf{f}} = \sum_{i=0}^\infty \frac{\mathbf{f}^i}{i!},$$
where convergence is with respect to the norm $|f|$. Note that $\mathbf{f}^0 = 1$ is the identity operator on $N$, and that $\mathbf{f}^k$ is the composition of $\mathbf{f}$ with itself $k$ times. The logarithm of a linear operator, $\mathbf{f} = \log(f)$, exists and is well defined for any $f \in GL(N)$. The logarithm can also be defined in terms of an infinite series or, more directly, in terms of the spectral form of $f$ [177]. Since the logarithm is the inverse function of the exponential function, we can write $f = e^{\mathbf{f}}$ for any $f \in GL(N)$. The logarithmic form $f = e^{\mathbf{f}}$ of $f \in GL(N)$ is useful for defining the one-parameter group $\{f_t\}$ of the operator $f \in GL(N)$:
$$f_t(x) = e^{t\mathbf{f}}x. \qquad(5.52)$$
The one-parameter group $\{f_t\}$ is continuously connected to the identity in the sense that $f_0(x) = x$ and $f_1(x) = f(x)$. Note that
$$f_0'(x) = \mathbf{f}e^{t\mathbf{f}}\big|_{t=0}(x) = \mathbf{f}(x), \qquad(5.53)$$
so $\mathbf{f}$ is tangent to $f_t$ at the identity. The reason that $\{f_t\}$ is called a one-parameter group is that it satisfies the basic additive property
$$f_sf_t = e^{s\mathbf{f}}e^{t\mathbf{f}} = e^{(s+t)\mathbf{f}} = f_{s+t}. \qquad(5.54)$$
Since each linear operator $\mathbf{f} \in \mathrm{End}(N)$ can be represented, according to Eq. 5.40, in the bivector form $\mathbf{f}(x) = F\cdot x$, we can express the one-parameter group $g_tx = e^{t\mathbf{f}}x$ of the skew-symmetric transformation $\mathbf{f}(x) = F\cdot x$ in the form
$$g_tx = e^{t\mathbf{f}}x = e^{\frac{t}{2}F}xe^{-\frac{t}{2}F}. \qquad(5.55)$$
This equation can be proved by showing that the terms of the Taylor series expansions of both sides of Eq. 5.55 are identical at $t = 0$. We begin with the claim
$$e^{t\mathbf{f}}x = e^{\frac{t}{2}F}xe^{-\frac{t}{2}F}. \qquad(5.56)$$
Clearly, for $t = 0$, we have $e^{0\mathbf{f}}x = e^{0F}xe^{0F} = x$. Next, taking the first derivative of both sides of Eq. 5.56, we get
$$e^{t\mathbf{f}}\mathbf{f}(x) = \frac12Fe^{\frac{t}{2}F}xe^{-\frac{t}{2}F} - \frac12e^{\frac{t}{2}F}xe^{-\frac{t}{2}F}F = e^{\frac{t}{2}F}(F\cdot x)e^{-\frac{t}{2}F}. \qquad(5.57)$$
Setting $t$ equal to zero gives the identity $\mathbf{f}(x) = F\cdot x$.
Taking the derivative of both sides of (5.57) gives
$$e^{t\mathbf{f}}\mathbf{f}^2(x) = \frac12Fe^{\frac{t}{2}F}(F\cdot x)e^{-\frac{t}{2}F} - \frac12e^{\frac{t}{2}F}(F\cdot x)e^{-\frac{t}{2}F}F = e^{\frac{t}{2}F}\big[F\cdot(F\cdot x)\big]e^{-\frac{t}{2}F},$$
and setting $t$ equal to zero gives the identity $\mathbf{f}^2(x) = F\cdot(F\cdot x)$. Continuing to take successive derivatives of (5.56) gives
$$e^{t\mathbf{f}}\mathbf{f}^k(x) = e^{\frac{t}{2}F}(F^k\!:x)\,e^{-\frac{t}{2}F}, \qquad(5.58)$$
where $F^k\!:x$ is defined recursively by $F^1\!:x = F\cdot x$ and
$$F^k\!:x = F\cdot(F^{k-1}\!:x). \qquad(5.59)$$
Finally, setting $t$ equal to zero in Eq. 5.58 gives the identity $\mathbf{f}^k(x) = F^k\!:x$. Equation 5.59 is interesting because it expresses the powers of a linear operator in terms of "powers" of its defining bivector. It is clear that each bivector defines a unique skew-symmetric linear operator and, conversely, that each skew-symmetric linear operator defines a unique bivector (see Eq. 5.40). Thus, the structure of a bivector determines, and is uniquely determined by, the structure of the corresponding linear operator. The proof of the above theorem is attributable to Marcel Riesz [162].
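A familiar special case of Eq. 5.55 can be checked numerically: in ordinary 3D, the one-parameter rotation group acting on vectors equals a rotor (quaternion) sandwich $R\,v\,\tilde{R}$, with $R = \cos(t/2) + \sin(t/2)\,n$ for a unit axis $n$. The sketch below (our code, not the book's) compares the sandwich against the Rodrigues rotation formula:

```python
import math

# Special case of the one-parameter group formula in ordinary 3D: the
# rotor (quaternion) sandwich R v R~ equals the rotation by angle t
# about the unit axis n (Rodrigues' formula).
def qmul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def rotor_apply(t, n, v):
    r = (math.cos(t/2),) + tuple(math.sin(t/2) * c for c in n)
    rt = (r[0], -r[1], -r[2], -r[3])            # reversion
    return qmul(qmul(r, (0.0,) + tuple(v)), rt)[1:]

def rodrigues(t, n, v):                          # matrix one-parameter group
    c, s = math.cos(t), math.sin(t)
    nxv = (n[1]*v[2] - n[2]*v[1], n[2]*v[0] - n[0]*v[2], n[0]*v[1] - n[1]*v[0])
    ndv = sum(p*q for p, q in zip(n, v))
    return tuple(c*vi + s*ci + (1 - c)*ndv*ni
                 for vi, ci, ni in zip(v, nxv, n))

t, n, v = 0.7, (0.0, 0.0, 1.0), (1.0, 2.0, 3.0)
a, b = rotor_apply(t, n, v), rodrigues(t, n, v)
print(max(abs(p - q) for p, q in zip(a, b)))     # agreement up to rounding
```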
5.4.1 The General Linear Algebra gl(N) of the General Linear Lie Group GL(N)

We can now define the general linear Lie algebra $gl(N)$ of the general linear Lie group $GL(N)$. As a set, $gl(N) = \mathrm{End}(N)$, which is just the set of all tangent operators $\mathbf{f} = \log(f) \in \mathrm{End}(N)$ to the one-parameter groups $f_t = e^{t\mathbf{f}}$ defined for each $f \in GL(N)$. But to complete the definition of $gl(N)$, we must specify the algebraic operations of addition and multiplication that allow $\mathrm{End}(N)$ to be seen as the Lie algebra $gl(N)$. Addition requires only the ordinary addition of linear operators, but multiplication is defined by the Lie bracket $[\mathbf{f}, \mathbf{g}]$ for $\mathbf{f}, \mathbf{g} \in gl(N)$. We will give an analytic definition of the Lie bracket [137, p. 3], which directly ties it to the group structure of $GL(N)$:
$$[\mathbf{f}, \mathbf{g}] = \frac{d}{d(t^2)}\,f_tg_tf_{-t}g_{-t}\Big|_{t=0} = \frac{1}{2t}\frac{d}{dt}\,f_tg_tf_{-t}g_{-t}\Big|_{t=0}.$$
Evaluating the Lie bracket by using the Taylor series expansions $g_t = 1 + t\mathbf{g} + \cdots$ and $f_{-t} = 1 - t\mathbf{f} + \cdots$, we find
$$\begin{aligned}
[\mathbf{f}, \mathbf{g}] &= \frac{1}{2t}\frac{d}{dt}\big(f_tg_tf_{-t}g_{-t}\big)\Big|_{t=0}\\
&= \frac{1}{2t}\big(\mathbf{f}f_tg_tf_{-t}g_{-t} + f_t\mathbf{g}g_tf_{-t}g_{-t} - f_tg_t\mathbf{f}f_{-t}g_{-t} - f_tg_tf_{-t}\mathbf{g}g_{-t}\big)\Big|_{t=0}\\
&= \frac{1}{2t}f_t(\mathbf{f}g_t - g_t\mathbf{f})f_{-t}g_{-t}\Big|_{t=0} + \frac{1}{2t}f_tg_t(\mathbf{g}f_{-t} - f_{-t}\mathbf{g})g_{-t}\Big|_{t=0}\\
&= \frac12(\mathbf{f}\mathbf{g} - \mathbf{g}\mathbf{f}) + \frac12(-\mathbf{g}\mathbf{f} + \mathbf{f}\mathbf{g})\\
&= \mathbf{f}\mathbf{g} - \mathbf{g}\mathbf{f}.
\end{aligned} \qquad(5.60)$$
We have thus demonstrated that the Lie bracket, defined analytically above, reduces simply to the commutator product of the linear operators $\mathbf{f}$ and $\mathbf{g}$ in $gl(N)$. As such, it is not difficult to show that it satisfies the famous Jacobi identity, which is equivalent to the distributive law
$$[\mathbf{f}, [\mathbf{g}, \mathbf{h}]] = [[\mathbf{f}, \mathbf{g}], \mathbf{h}] + [\mathbf{g}, [\mathbf{f}, \mathbf{h}]]. \qquad(5.61)$$
When we choose a particular basis $\{e\}$ of $N$, the isomorphism between the general linear Lie algebra $gl(n, C)$ and $gl(N)$ becomes explicit, and the Lie bracket of linear operators becomes the Lie bracket of $n\times n$ matrices,
$$[f,g]\{e\} = fg\{e\} - gf\{e\} = \{e\}(FG - GF) = \{e\}[F,G], \tag{5.62}$$
where $[F,G]$ is the commutator product of the matrices $F$ and $G$. Alternatively, using the bivector representation of Eq. 5.40, the Lie bracket of linear operators is expressed in terms of the Lie bracket of the bivectors of the operators (5.43).
5.4.2 The Orthogonal Groups

The simplest well-known example of an orthogonal group is $SO(2)$, which is a subgroup of the general linear group $GL(N^2)$. As a matrix group, it is generated by all $2\times 2$ matrices of the form
$$X_\theta = \begin{pmatrix}\cos\theta & -\sin\theta\\ \sin\theta & \cos\theta\end{pmatrix}. \tag{5.63}$$
The matrix $X_\theta$ generates a counterclockwise rotation in the $xy$-plane through the angle $\theta$. Using (5.40), we get the corresponding bivector representation
$$X_\theta = \cos\theta\, e_1\wedge\bar e_1 - \sin\theta\, e_1\wedge\bar e_2 + \sin\theta\, e_2\wedge\bar e_1 + \cos\theta\, e_2\wedge\bar e_2. \tag{5.64}$$
For matrices $X_{\theta_1}, X_{\theta_2} \in SO(2)$, the group operation is ordinary matrix multiplication, $X_{\theta_2}X_{\theta_1} = X_{\theta_1+\theta_2}$. For the bivector representation $X_{\theta_1}, X_{\theta_2} \in SO(2)$, the group operation is defined by the generalized dot product, that is, for $x \in N^2$,
$$(X_{\theta_1} \odot X_{\theta_2})\cdot x \equiv X_{\theta_2}\cdot(X_{\theta_1}\cdot x) = X_{\theta_1+\theta_2}\cdot x. \tag{5.65}$$
Note that the bivectors $X_\theta$ are in $G^2_{n,n}$. Taking the derivatives of the matrix $X_\theta$ and the bivector $X_\theta$ with respect to $\theta$ and evaluating at $\theta = 0$ gives the corresponding generators of the associated Lie algebra $so(2)$. As a matrix Lie algebra under the bracket operation of matrices, we find the generator
$$\frac{dX_\theta}{d\theta}\Big|_{\theta\to 0} = \begin{pmatrix}0 & -1\\ 1 & 0\end{pmatrix}. \tag{5.66}$$
As a bivector Lie algebra under the bracket operation of bivectors, we use Eq. 5.64 to find the bivector generator
$$B = \frac{dX_\theta}{d\theta}\Big|_{\theta\to 0} = -e_1\wedge\bar e_2 + e_2\wedge\bar e_1. \tag{5.67}$$
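A quick numerical illustration of the correspondence between (5.63) and (5.66): exponentiating the generator reproduces the rotation matrix, and differentiating the subgroup at zero recovers the generator. The sample angle and the series exponential are our own choices:

```python
import numpy as np

def expm(A, terms=60):
    # Truncated power series for the matrix exponential.
    S = np.eye(len(A)); T = np.eye(len(A))
    for k in range(1, terms):
        T = T @ A / k
        S = S + T
    return S

L = np.array([[0., -1], [1, 0]])                 # generator (5.66)
theta = 0.7
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta),  np.cos(theta)]])  # rotation matrix (5.63)
assert np.allclose(expm(theta * L), R)

# The generator is recovered as the derivative of X_theta at theta = 0
h = 1e-6
Rh = np.array([[np.cos(h), -np.sin(h)], [np.sin(h), np.cos(h)]])
assert np.allclose((Rh - np.eye(2)) / h, L, atol=1e-5)
```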
The spinor group $Spin(2)$ is defined by taking the exponential of the bivector (see Eq. 5.67),
$$Spin(2) = \Bigl\{\exp\Bigl(\frac{1}{2}\theta B\Bigr)\ \Big|\ \theta \in \mathcal{R}\Bigr\}.$$
According to Sobczyk [178], the exponential $\exp(\frac{1}{2}\theta B)$ can be calculated by noting that the bivector $B$ satisfies the minimal polynomial
$$B^3 + 4B = B(B - 2i)(B + 2i) = 0,$$
which implies the decomposition $B = 0\,p_1 + 2i\,p_2 - 2i\,p_3$, where the mutually annihilating idempotents are defined by
$$p_1 = \frac{B^2 + 4}{4},\qquad p_2 = -\frac{1}{8}B(B + 2i),\qquad p_3 = -\frac{1}{8}B(B - 2i).$$
Using this decomposition, we find that
$$\exp\Bigl(\frac{1}{2}\theta B\Bigr) = e^{0}p_1 + \exp(i\theta)p_2 + \exp(-i\theta)p_3 = p_1 + \cos\theta\,(p_2 + p_3) + \sin\theta\, i(p_2 - p_3) = p_1 + \cos\theta\,(p_2 + p_3) + \sin\theta\,\frac{B}{2}. \tag{5.68}$$
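The spectral method of Eq. 5.68 can be tested with any operator having the same minimal polynomial $B^3 + 4B = 0$. The matrix below is an assumed stand-in with eigenvalues $0, \pm 2i$, not the bivector itself:

```python
import numpy as np

def expm(A, terms=60):
    S = np.eye(len(A), dtype=complex); T = np.eye(len(A), dtype=complex)
    for k in range(1, terms):
        T = T @ A / k
        S = S + T
    return S

B = np.array([[0., -2, 0], [2, 0, 0], [0, 0, 0]], dtype=complex)
I = np.eye(3, dtype=complex)
assert np.allclose(B @ B @ B + 4 * B, 0)         # minimal polynomial

p1 = (B @ B + 4 * I) / 4
p2 = -B @ (B + 2j * I) / 8
p3 = -B @ (B - 2j * I) / 8
for p in (p1, p2, p3):
    assert np.allclose(p @ p, p)                 # idempotent
assert np.allclose(p1 @ p2, 0) and np.allclose(p2 @ p3, 0)   # mutually annihilating
assert np.allclose(p1 + p2 + p3, I)
assert np.allclose(2j * p2 - 2j * p3, B)         # spectral decomposition of B

theta = 0.8
lhs = expm(0.5 * theta * B)
rhs = p1 + np.cos(theta) * (p2 + p3) + np.sin(theta) * B / 2
assert np.allclose(lhs, rhs)                     # Eq. (5.68)
```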
The group action of the spinor group $Spin(2)$ is given by
$$x' = \exp\Bigl(\frac{1}{2}\theta B\Bigr)\, x\, \exp\Bigl(-\frac{1}{2}\theta B\Bigr),$$
where $x = \{e\}x_{\{e\}} = x_1 e_1 + x_2 e_2$. We say that $Spin(2)$ is a "double covering" of the orthogonal group $SO(2)$ because the spinors $\pm\exp(\frac{1}{2}\theta B)$ represent the same group element. Note that we can now formulate the easy rule for the composition of two group elements, $\exp(\frac{1}{2}\theta_1 B)$ and $\exp(\frac{1}{2}\theta_2 B)$:
$$\exp\Bigl(\frac{1}{2}\theta_1 B\Bigr)\exp\Bigl(\frac{1}{2}\theta_2 B\Bigr) = \exp\Bigl(\frac{1}{2}(\theta_1 + \theta_2)B\Bigr).$$
If we are only interested in the group $SO(2)$, a more natural place to carry out the calculations is the Euclidean space $\mathcal{R}^2$. We project the null cone $N^2$ down to $\mathcal{R}^2$ by using the reciprocal pseudoscalars $I_2$ and $\bar I_2$ defined by
$$I_2 = \sigma_1\sigma_2 \qquad\text{and}\qquad \bar I_2 = (2-\sqrt{2})^2\,(\bar e_2 + \sigma_2)(\bar e_1 + \sigma_1).$$
Thus, for $x = \{e\}x_{\{e\}} = x_1 e_1 + x_2 e_2 \in N^2$, the projection $x' = P_I(x)$ gives
$$x' = P_I(x) = (x\cdot \bar I_2)\cdot I_2 = x_1\sigma_1 + x_2\sigma_2 \in \mathcal{R}^2.$$
Note that this projection is invertible, in the sense that we can find $P_{I'}$ such that $x = P_{I'}(x')$. The projection $P_{I'}$ is specified by
$$x = P_{I'}(x') = (x'\cdot \bar I_2)\cdot I' = x_1 e_1 + x_2 e_2,$$
where $\bar I_2$ is defined as before and where $I' = e_1 e_2$. In $\mathcal{R}^2$, the generator of rotations is the simple bivector $\sigma_2\sigma_1$. This bivector can be obtained from the bivector (5.67) in $spin(2,2)$ by the simple projection $P_I(B) = I_2^{-1}(\bar I_2\cdot B) = \sigma_2\sigma_1 = -I_2$ onto the Lie algebra $so(2)$. For $x' = x_1\sigma_1 + x_2\sigma_2 \in \mathcal{R}^2$, the equivalent rotation is given by
$$y' = \exp\Bigl(\frac{1}{2}\theta\,\sigma_2\sigma_1\Bigr)\, x'\, \exp\Bigl(-\frac{1}{2}\theta\,\sigma_2\sigma_1\Bigr).$$
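The rotor sandwich in $\mathcal{R}^2$ can be illustrated with a $2\times 2$ real matrix model of the Euclidean geometric algebra $G(\mathcal{R}^2)$; the particular matrix assignments for $\sigma_1$, $\sigma_2$ and the sign convention of the sandwich are our assumptions for illustration:

```python
import numpy as np

# Matrix model of G(R^2): Cl(2,0) is isomorphic to the 2x2 real matrices.
s1 = np.array([[1., 0], [0, -1]])
s2 = np.array([[0., 1], [1, 0]])
assert np.allclose(s1 @ s1, np.eye(2)) and np.allclose(s2 @ s2, np.eye(2))
assert np.allclose(s1 @ s2 + s2 @ s1, 0)       # orthogonal vectors anticommute

J = s2 @ s1                                    # rotation generator sigma2 sigma1
theta = 0.9
R = np.cos(theta/2) * np.eye(2) + np.sin(theta/2) * J   # rotor exp(theta/2 s2 s1)
x = 3.0 * s1 + 4.0 * s2                        # the vector x' = 3 sigma1 + 4 sigma2
y = R @ x @ np.linalg.inv(R)                   # sandwich product

# compare with the ordinary counterclockwise rotation of the coordinates (3, 4)
c, s = np.cos(theta), np.sin(theta)
expected = (3*c - 4*s) * s1 + (3*s + 4*c) * s2
assert np.allclose(y, expected)
```

With this model the sandwich acts on vectors exactly as the rotation matrix (5.63) acts on coordinates.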
The above ideas can be immediately generalized to the general linear Lie group $GL(N^n)$ of the null cone $N^n$ and to the orthogonal subgroups $SO(p,q)$, where $p + q = n$. The orthogonal group $SO(p,q)$ acts on the space $\mathcal{R}^{p,q}$. Thus, if we wish to work in this Lie group or in the corresponding Lie algebra, we first project the null cone $N^n$ onto $\mathcal{R}^{p,q}$ by using the reciprocal vector basis elements, then carry out the rotation, and finally return to the null cone by using the inverse projection.
5.5 Computing Rigid Motion in the Affine Plane

A rotation in the affine $n$-plane $A^n_e = A_e(\mathcal{R}^n)$, just as in the Euclidean space $\mathcal{R}^n$, is the product of two reflections through two intersecting hyperplanes. If the unit normal vectors to these hyperplanes are $m$ and $n$, respectively, then the versor of the rotation is given by
$$R = mn = e^{\frac{\theta}{2}B} = \cos\frac{\theta}{2} + B\sin\frac{\theta}{2}, \tag{5.69}$$
where $B$ is the unit bivector defining the plane of the rotation. A translation of the vector $x_h \in A^n_e$, along the vector $t \in \mathcal{R}^n$, to the vector $x_h' = x_h + t \in A^n_e$, is effected by the versor
$$T = \exp\Bigl(\frac{1}{2}t\bar e\Bigr) = 1 + \frac{1}{2}t\bar e \tag{5.70}$$
when it is followed by the projection $P_A(x') \equiv (x'\wedge \bar e)\cdot e$, which brings the horosphere back into the affine plane. Thus, for $x_h \in A^n_e$, we get
$$\begin{aligned}
x' = T x_h T^{-1} &= \exp\Bigl(\frac{1}{2}t\bar e\Bigr)\, x_h\, \exp\Bigl(-\frac{1}{2}t\bar e\Bigr)\\
&= \Bigl(1 + \frac{1}{2}t\bar e\Bigr) x_h \Bigl(1 - \frac{1}{2}t\bar e\Bigr) = x_h + \frac{1}{2}t\bar e x_h - \frac{1}{2}x_h t\bar e - \frac{1}{4}t\bar e x_h t\bar e\\
&= x_h + t + t\cdot(\bar e\wedge x_h) - \frac{1}{2}t^2\bar e\\
&= x_h + t - \Bigl(t\cdot x_h + \frac{1}{2}t^2\Bigr)\bar e.
\end{aligned} \tag{5.71}$$
Applying $P_A$ to this result, we get the expected translated vector
$$x_h' = P_A(x') = P_A\Bigl(x_h + t - \Bigl(t\cdot x_h + \frac{1}{2}t^2\Bigr)\bar e\Bigr) = x_h + t. \tag{5.72}$$
The advantage of carrying out translations in the affine plane rather than on the horosphere is that the affine plane is basically still a linear model of Euclidean space, whereas the horosphere is a more complicated nonlinear model. Combining the versors for a rotation and a translation, we get the expression for the versor $M = TR$ of a rigid motion. For $x_h \in A^n_e$, we then find that
$$x_h' = P_A[M x_h M^{-1}] = P_A[TRx_h R^{-1}T^{-1}]. \tag{5.73}$$
Equivalently, we will often write $M^{-1} = \widetilde{M}$, expressing $M^{-1}$ in terms of the operation of conjugation. Whenever a calculation involves a translation, we must always apply the projection $P_A$ to guarantee that our end result will be in the affine plane. The above calculations can be checked with the Clifford algebra calculator CLICAL 4.0 [124]. Comparisons can also be made to the corresponding calculations made by Hestenes and Li [120] on the horosphere.
5.6 The Lie Algebra of the Affine Plane

The Lie algebra of the neutral affine plane $A^3_e(N^2)$ is useful in the analysis of visual invariants [98], so we will begin with its treatment here. The well-known matrix representation of the Lie group of affine transformations in the plane has six independent parameters, or degrees of freedom, and consists of all matrices of the form
$$g(A,v) = \begin{pmatrix}a_{11} & a_{12} & a\\ a_{21} & a_{22} & b\\ 0 & 0 & 1\end{pmatrix}, \tag{5.74}$$
where $\det g(A,v) = \det A \neq 0$. The one-parameter subgroups are generated by the matrices
$$T_x = \begin{pmatrix}1 & 0 & x\\ 0 & 1 & 0\\ 0 & 0 & 1\end{pmatrix},\qquad T_y = \begin{pmatrix}1 & 0 & 0\\ 0 & 1 & y\\ 0 & 0 & 1\end{pmatrix},$$
$$D_u = \begin{pmatrix}e^u & 0 & 0\\ 0 & e^u & 0\\ 0 & 0 & 1\end{pmatrix},\qquad R_\theta = \begin{pmatrix}\cos\theta & -\sin\theta & 0\\ \sin\theta & \cos\theta & 0\\ 0 & 0 & 1\end{pmatrix},$$
$$S_v = \begin{pmatrix}e^v & 0 & 0\\ 0 & e^{-v} & 0\\ 0 & 0 & 1\end{pmatrix},\qquad H_\phi = \begin{pmatrix}\cosh\phi & \sinh\phi & 0\\ \sinh\phi & \cosh\phi & 0\\ 0 & 0 & 1\end{pmatrix}. \tag{5.75}$$
Using Eq. 5.53, we obtain the matrix representation of the Lie algebra basis generators by taking the derivative of Eq. 5.75 and evaluating the parameter at zero:
$$L_x = \begin{pmatrix}0 & 0 & 1\\ 0 & 0 & 0\\ 0 & 0 & 0\end{pmatrix},\qquad L_y = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & 1\\ 0 & 0 & 0\end{pmatrix},$$
$$L_s = \begin{pmatrix}1 & 0 & 0\\ 0 & 1 & 0\\ 0 & 0 & 0\end{pmatrix},\qquad L_r = \begin{pmatrix}0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix},$$
$$L_b = \begin{pmatrix}1 & 0 & 0\\ 0 & -1 & 0\\ 0 & 0 & 0\end{pmatrix},\qquad L_B = \begin{pmatrix}0 & 1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix}. \tag{5.76}$$
The above matrix Lie group and matrix Lie algebra can be directly translated into the corresponding Lie group and Lie algebra of the affine plane $A^3_e(N^2)$. Each of the matrix generators in (5.75) and (5.76) can be replaced by its corresponding bivector representation (5.40). For example, the bivector representations of the generators of the Lie algebra are
$$\begin{aligned}
\mathbf{L}_x &= \mathrm{bivector}(L_x) = e_1\wedge\bar e_3, & \mathbf{L}_y &= \mathrm{bivector}(L_y) = e_2\wedge\bar e_3,\\
\mathbf{L}_s &= \mathrm{bivector}(L_s) = e_1\wedge\bar e_1 + e_2\wedge\bar e_2, & \mathbf{L}_r &= \mathrm{bivector}(L_r) = e_2\wedge\bar e_1 - e_1\wedge\bar e_2,\\
\mathbf{L}_b &= \mathrm{bivector}(L_b) = e_1\wedge\bar e_1 - e_2\wedge\bar e_2, & \mathbf{L}_B &= \mathrm{bivector}(L_B) = e_1\wedge\bar e_2 + e_2\wedge\bar e_1.
\end{aligned} \tag{5.77}$$
Expanding these bivector generators in the standard basis (5.27), we get
$$\begin{aligned}
\mathbf{L}_x &= \tfrac{1}{2}\sigma_1\sigma_3 - \tfrac{1}{2}\sigma_1\bar\sigma_3 - \tfrac{1}{2}\sigma_3\bar\sigma_1 - \tfrac{1}{2}\bar\sigma_1\bar\sigma_3,\\
\mathbf{L}_y &= \tfrac{1}{2}\sigma_2\sigma_3 - \tfrac{1}{2}\sigma_2\bar\sigma_3 - \tfrac{1}{2}\sigma_3\bar\sigma_2 - \tfrac{1}{2}\bar\sigma_2\bar\sigma_3,\\
\mathbf{L}_s &= -\sigma_1\bar\sigma_1 - \sigma_2\bar\sigma_2,\qquad \mathbf{L}_r = -\sigma_1\sigma_2 + \bar\sigma_1\bar\sigma_2,\\
\mathbf{L}_b &= -\sigma_1\bar\sigma_1 + \sigma_2\bar\sigma_2,\qquad \mathbf{L}_B = -\sigma_1\bar\sigma_2 - \sigma_2\bar\sigma_1.
\end{aligned} \tag{5.78}$$
Let us see how the Lie algebra of the affine plane can be represented as a Lie algebra of vector fields over the null cone $N^3$. The vector derivative or gradient $\partial_x = \frac{\partial}{\partial x}$ at the point $x = xe_1 + ye_2 + ze_3 \in N^3$ is defined by requiring $a\cdot\partial_x$ to be the directional derivative in the direction of $a$. It follows that $a\cdot\partial_x\, x = a$. We also have
$$\partial_x x = \partial_x\cdot x + \partial_x\wedge x = 3 + \sum_{i=1}^{3}\bar e_i\wedge e_i,$$
where $\{e\}$ and $\{\bar e\}$ are reciprocal bases for the reciprocal null cones $N^3$ and $\bar N^3$.
Now, let $a = a(x)$ and $b = b(x)$ be vector fields in $N^3$. The Lie bracket $[a,b]$ is defined by
$$[a,b] = a\cdot\partial_x\, b - b\cdot\partial_x\, a.$$
Since in $N^3$ we have $\partial_x\wedge\partial_x = 0$, we obtain the important integrability condition
$$(a\wedge b)\cdot(\partial_x\wedge\partial_x) = [a,b]\cdot\partial_x - [a\cdot\partial_x,\, b\cdot\partial_x] = 0,$$
where $[a\cdot\partial_x, b\cdot\partial_x] = a\cdot\partial_x\, b\cdot\partial_x - b\cdot\partial_x\, a\cdot\partial_x$ is the Lie bracket or commutator product of the partial derivatives $a\cdot\partial_x$ and $b\cdot\partial_x$. It follows from this identity that
$$[a,b]\cdot\partial_x = [a\cdot\partial_x,\, b\cdot\partial_x],$$
which relates the Lie bracket of the vector fields $[a,b]$ to the standard Lie bracket of the partial derivatives $[a\cdot\partial_x, b\cdot\partial_x]$. Let us consider in detail the translation of the Lie algebra of the affine plane to the null-vector formulation in the null cone $N^2$. Recall that the two-dimensional affine plane $A_e(N^2)$ in $N^3$ is defined by
$$A_e(N^2) = \{x \in N^3\ |\ x = xe_1 + ye_2 + e_3\}. \tag{5.79}$$
We have already seen that the Lie algebra of the affine plane can be defined by a Lie algebra of matrices, or by an equivalent Lie algebra of bivectors. We now define this same Lie algebra as a Lie algebra of partial derivatives or as a Lie algebra of vector fields. We have the following correspondences:
$$L_x = \frac{\partial}{\partial x} = e_1\cdot\partial_x = \mathbf{L}_x\cdot(x\wedge\partial_x)\ \longleftrightarrow\ L_x x = \mathbf{L}_x\cdot x = e_1 = \mathcal{L}_x, \tag{5.80}$$
where $\mathbf{L}_x = e_1\wedge\bar e_3$;
$$L_y = \frac{\partial}{\partial y} = e_2\cdot\partial_x = \mathbf{L}_y\cdot(x\wedge\partial_x)\ \longleftrightarrow\ L_y x = \mathbf{L}_y\cdot x = e_2 = \mathcal{L}_y, \tag{5.81}$$
where $\mathbf{L}_y = e_2\wedge\bar e_3$;
$$L_s = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y} = (x - e_3)\cdot\partial_x = \mathbf{L}_s\cdot(x\wedge\partial_x)\ \longleftrightarrow\ L_s x = \mathbf{L}_s\cdot x = xe_1 + ye_2 = x - e_3 = \mathcal{L}_s, \tag{5.82}$$
where $\mathbf{L}_s = e_1\wedge\bar e_1 + e_2\wedge\bar e_2$;
$$L_r = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y} = \mathbf{L}_r\cdot(x\wedge\partial_x)\ \longleftrightarrow\ L_r x = \mathbf{L}_r\cdot x = \mathcal{L}_r, \tag{5.83}$$
where $\mathbf{L}_r = e_2\wedge\bar e_1 - e_1\wedge\bar e_2$;
$$L_b = x\frac{\partial}{\partial x} - y\frac{\partial}{\partial y} = \mathbf{L}_b\cdot(x\wedge\partial_x)\ \longleftrightarrow\ L_b x = \mathbf{L}_b\cdot x = \mathcal{L}_b, \tag{5.84}$$
where $\mathbf{L}_b = e_1\wedge\bar e_1 - e_2\wedge\bar e_2$; and
$$L_B = y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y} = \mathbf{L}_B\cdot(x\wedge\partial_x)\ \longleftrightarrow\ L_B x = \mathbf{L}_B\cdot x = \mathcal{L}_B, \tag{5.85}$$
where $\mathbf{L}_B = e_1\wedge\bar e_2 + e_2\wedge\bar e_1$. Thus, the Lie algebra of the affine plane is generated by the bivectors
$$M_{\mathrm{bivectors}} = \{\mathbf{L}_x, \mathbf{L}_y, \mathbf{L}_s, \mathbf{L}_r, \mathbf{L}_b, \mathbf{L}_B\}, \tag{5.86}$$
or, equivalently, by the vector fields of the form
$$M_{\mathrm{vector\ fields}} = \{\mathbf{L}_x\cdot x, \mathbf{L}_y\cdot x, \mathbf{L}_s\cdot x, \mathbf{L}_r\cdot x, \mathbf{L}_b\cdot x, \mathbf{L}_B\cdot x\} = \{\mathcal{L}_x, \mathcal{L}_y, \mathcal{L}_s, \mathcal{L}_r, \mathcal{L}_b, \mathcal{L}_B\}, \tag{5.87}$$
where $\mathcal{L} = \mathbf{L}\cdot x$ for $\mathbf{L} \in M_{\mathrm{bivectors}}$. The Lie bracket $[\mathbf{L}_1\cdot x, \mathbf{L}_2\cdot x]$ is given by
$$[\mathbf{L}_1\cdot x,\, \mathbf{L}_2\cdot x] = \mathbf{L}_2\cdot(\mathbf{L}_1\cdot x) - \mathbf{L}_1\cdot(\mathbf{L}_2\cdot x) = (\mathbf{L}_2\times\mathbf{L}_1)\cdot x,$$
where $\mathbf{L}_1\times\mathbf{L}_2 = \frac{1}{2}(\mathbf{L}_1\mathbf{L}_2 - \mathbf{L}_2\mathbf{L}_1)$ is the commutator product of the bivectors $\mathbf{L}_1, \mathbf{L}_2 \in M$. The Lie algebra of the affine plane is useful for the analysis of motion in the image plane [98]. The vector fields of this Lie algebra are tangent to the flows or integral curves of their group action on the manifold and are presented in Fig. 5.2 as real images. We have found the generators
$$L_x = \frac{\partial}{\partial x},\qquad L_y = \frac{\partial}{\partial y},\qquad L_s = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y},$$
$$L_r = -y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y},\qquad L_b = x\frac{\partial}{\partial x} - y\frac{\partial}{\partial y},\qquad L_B = y\frac{\partial}{\partial x} + x\frac{\partial}{\partial y} \tag{5.88}$$
of the Lie algebra of the affine plane $A^3_e(N^2)$ as vector fields along integral curves. Taking the commutator products of these infinitesimal differential generators gives the multiplication Table 5.1 for this Lie algebra. Using Table 5.1, we can verify the Jacobi identity for $L_x$, $L_s$, and $L_b$, getting
$$[L_x,[L_s,L_b]] + [L_s,[L_b,L_x]] + [L_b,[L_x,L_s]] = [L_x,0] + [L_s,L_x] - [L_b,L_x] = 0 + L_x - L_x = 0. \tag{5.89}$$
Fig. 5.2 Lie algebra basis in the form of real images (the six panels show the flows of $L_x$, $L_r$, $L_B$, $L_y$, $L_s$, and $L_b$)

Table 5.1 Lie algebra of the affine plane

    [ , ]   L_x    L_y    L_s    L_r     L_b     L_B
    L_x     0      0      -L_x   -L_y    -L_x    -L_y
    L_y     0      0      -L_y    L_x     L_y    -L_x
    L_s     L_x    L_y     0      0       0       0
    L_r     L_y   -L_x     0      0       2L_B   -2L_b
    L_b     L_x   -L_y     0     -2L_B    0      -2L_r
    L_B     L_y    L_x     0      2L_b    2L_r    0
Or, equivalently, using CLICAL and the bivector representation for $\mathbf{L}_x$, $\mathbf{L}_r$, and $\mathbf{L}_b$, we calculate
$$[\mathbf{L}_x,[\mathbf{L}_r,\mathbf{L}_b]] + [\mathbf{L}_r,[\mathbf{L}_b,\mathbf{L}_x]] + [\mathbf{L}_b,[\mathbf{L}_x,\mathbf{L}_r]] = 2[\mathbf{L}_x,\mathbf{L}_B] + [\mathbf{L}_r,\mathbf{L}_x] - [\mathbf{L}_b,\mathbf{L}_y] = -2\mathbf{L}_y + \mathbf{L}_y + \mathbf{L}_y = 0. \tag{5.90}$$
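The entries of Table 5.1 and the Jacobi identity can be machine-checked with the matrix generators (5.76); the following NumPy sketch spot-checks a few table entries and then tests the Jacobi identity for every triple:

```python
import numpy as np
from itertools import combinations

Lx = np.array([[0., 0, 1], [0, 0, 0], [0, 0, 0]])
Ly = np.array([[0., 0, 0], [0, 0, 1], [0, 0, 0]])
Ls = np.diag([1., 1, 0])
Lr = np.array([[0., -1, 0], [1, 0, 0], [0, 0, 0]])
Lb = np.diag([1., -1, 0])
LB = np.array([[0., 1, 0], [1, 0, 0], [0, 0, 0]])

def br(A, B):
    return A @ B - B @ A

# spot-check a few entries of Table 5.1
assert np.allclose(br(Ls, Lx), Lx)
assert np.allclose(br(Lr, Lb), 2 * LB)
assert np.allclose(br(Lb, LB), -2 * Lr)

# Jacobi identity for every triple of generators
gens = [Lx, Ly, Ls, Lr, Lb, LB]
for X, Y, Z in combinations(gens, 3):
    jac = br(X, br(Y, Z)) + br(Y, br(Z, X)) + br(Z, br(X, Y))
    assert np.allclose(jac, 0)
```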
5.7 The Algebra of Incidence

In various applications in robotics, image analysis, and computer vision, the use of projective geometry and the algebra of incidence is extremely useful. Fortunately, these mathematical systems can be handled efficiently within the geometric algebra framework.
In projective geometry, points are represented using homogeneous coordinates of nonzero vectors in the $(n+1)$-dimensional Euclidean space $\mathcal{R}^{n+1}$. These can be seen as projective rays identified as points in the $n$-dimensional projective plane $\Pi^n$ of $\mathcal{R}^{n+1}$. Furthermore, points, lines, planes, and higher-dimensional $k$-planes in $\Pi^n$ are related to 1-, 2-, 3-, and $(k+1)$-dimensional subspaces of $\mathcal{R}^{n+1}$, where $k \leq n$. Since each $k$-subspace can be associated with a nonzero $k$-blade $A_k$ of the geometric algebra $G(\mathcal{R}^{n+1})$, it follows that the corresponding $(k-1)$-plane in $\Pi^n$ can be named by the $k$-direction of the $k$-blade $A_k$. The meet and join in $\Pi^n$ are the principal operations of the algebra of incidence for computing the intersection and union of the $k$-planes. Suppose that the set of $r$ points $a_1, a_2, \ldots, a_r \in \Pi^n$ and the set of $s$ points $b_1, b_2, \ldots, b_s \in \Pi^n$ are both in general position (linearly independent vectors in $\mathcal{R}^{n+1}$); then the $(r-1)$-plane in $\Pi^n$ is specified by the $r$-blade
$$A_r = a_1\wedge a_2\wedge\cdots\wedge a_r \neq 0, \tag{5.91}$$
and the $(s-1)$-plane by the $s$-blade
$$B_s = b_1\wedge b_2\wedge\cdots\wedge b_s \neq 0. \tag{5.92}$$
Considering the $a$'s and $b$'s to be the basis elements of respective subspaces $\mathcal{A}^r$ and $\mathcal{B}^s$, they can be sorted in such a way that
$$\mathcal{A}^r \cup \mathcal{B}^s = \mathrm{span}\{a_1, a_2, \ldots, a_r, b_1, \ldots, b_k\}. \tag{5.93}$$
Supposing that
$$B_s = b_1\wedge\cdots\wedge b_k\wedge b_{\alpha_1}\wedge\cdots\wedge b_{\alpha_{s-k}}, \tag{5.94}$$
it follows that the "join" and "meet" of the $r$-blade $A_r$ and the $s$-blade $B_s$ are respectively given by
$$A_r \cup B_s = A_r\wedge b_1\wedge\cdots\wedge b_k, \tag{5.95}$$
$$A_r \cap B_s = \mathrm{span}\{b_{\alpha_1}, \ldots, b_{\alpha_{s-k}}\}. \tag{5.96}$$
Note that if the meet of $A_r$ and $B_s$ is zero, their join equals the wedge of the blades, $A_r \cup B_s = A_r\wedge B_s$. After the join of $A_r$ and $B_s$ has been computed, the $(r+k)$-blade
$$I_{A_r\cup B_s} \equiv A_r \cup B_s \tag{5.97}$$
can be used for computing the meet of the $r$- and $s$-blades $A_r$ and $B_s$:
$$A_r \cap B_s = A_r\cdot(B_s\cdot I_{A_r\cup B_s}) = (I_{A_r\cup B_s}\cdot A_r)\cdot B_s. \tag{5.98}$$
This expression holds for the positive definite metric of $\mathcal{R}^{n+1}$. If we use any nondegenerate pseudo-Euclidean space $\mathcal{R}^{p,q}$, where $p + q = n + 1$, we must instead use the reciprocal $(r+k)$-blade $\bar I^{A_r\cup B_s}$, for which the property $I_{A_r\cup B_s}\cdot \bar I^{A_r\cup B_s} \neq 0$ is satisfied. For this case, the meet equation reads
$$A_r \cap B_s = A_r\cdot(B_s\cdot \bar I^{A_r\cup B_s}) = (\bar I^{A_r\cup B_s}\cdot A_r)\cdot B_s. \tag{5.99}$$
Note that if the grade of the blade $A_r \cup B_s$ equals $n + 1 = p + q$, we can simply use the inverse of the pseudoscalar, so that $I I^{-1} = 1$. In the case of the geometric algebra of the null cone $G(N^{n+1})$, we define the following reciprocal $(r+k)$-blade for the meet Eq. 5.99:
$$\bar I^{A_r\cup B_s} = \bar a_1\wedge\bar a_2\wedge\cdots\wedge\bar a_r\wedge\bar b_1\wedge\cdots\wedge\bar b_k. \tag{5.100}$$
A more complete discussion of these ideas can be found in [157, 179].
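In the lowest-dimensional case $\Pi^2$, the join of two points and the meet of two lines both reduce, in homogeneous coordinates, to the familiar vector cross product (the 3D shorthand for the dualized outer product). A NumPy sketch with example points of our own choosing:

```python
import numpy as np

a = np.array([1., 2, 1])      # homogeneous representant of the point (1, 2)
b = np.array([3., 0, 1])      # point (3, 0)
c = np.array([0., 0, 1])      # point (0, 0)
d = np.array([1., 1, 1])      # point (1, 1)

line_ab = np.cross(a, b)      # join a ^ b, as a dual (line) vector
line_cd = np.cross(c, d)      # join c ^ d
p = np.cross(line_ab, line_cd)    # meet of the two lines

p = p / p[2]                  # normalize the homogeneous representant
# supporting lines y = -x + 3 and y = x intersect at (1.5, 1.5)
assert np.allclose(p, [1.5, 1.5, 1])
```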
5.7.1 Incidence Relations in the Affine n-Plane

This section presents incidence relations between points, lines, planes, and higher-dimensional $k$-planes using the useful computational framework of the affine $n$-plane. Let us rewrite Eq. 5.44 in the larger pseudo-Euclidean space $\mathcal{R}^{n+1,1} = \mathcal{R}^n \oplus \mathcal{R}^{1,1}$, where $\mathcal{R}^{1,1} = \mathrm{span}\{\sigma_{n+1}, \bar\sigma_{n+1}\}$:
$$A_e(\mathcal{R}^n) = \{x_h = x + e\ |\ x \in \mathcal{R}^n\} \subset \mathcal{R}^{n+1,1}. \tag{5.101}$$
The null vector $e \in \mathcal{R}^{1,1}$ is given by $e = \frac{1}{2}(\sigma_{n+1} + \bar\sigma_{n+1})$, and the reciprocal null vector $\bar e = \sigma_{n+1} - \bar\sigma_{n+1}$ fulfills the condition $\bar e\cdot e = 1$. Now, if we merge the affine $n$-plane $A_e(\mathcal{R}^n)$ together with the plane at infinity, we obtain the projective plane $\Pi^n$. Each point $x \in A_e(\mathcal{R}^n)$ is called a homogeneous representant of the corresponding point in $\Pi^n$. Points in the affine plane can now be represented as rays in the projective space:
$$A^{\mathrm{rays}}_e(\mathcal{R}^n) = \{y\ |\ y \in \mathcal{R}^{n+1}\ \text{and}\ y\cdot\bar e \neq 0\} \subset \mathcal{R}^{n+1}. \tag{5.102}$$
Note that in this definition we require $y\cdot\bar e \neq 0$, because rays are directions and they remain the same if we multiply by a scalar. Accordingly, a homogeneous point of the affine $n$-plane can be uniquely computed from a ray as follows:
$$\frac{y}{y\cdot\bar e} \in A_e(\mathcal{R}^n). \tag{5.103}$$
Now let us formulate useful incidence relations. If we consider $k$ points $a_{1h}, a_{2h}, \ldots, a_{kh} \in A^n_e$, where each $a_{ih} = a_i + e$ for $a_i \in \mathcal{R}^n$, and then compute their
outer product, we get the $(k-1)$-plane $A_h$ in $\Pi^n$:
$$\begin{aligned}
A_h &= a_{1h}\wedge a_{2h}\wedge\cdots\wedge a_{kh} = a_{1h}\wedge(a_{2h} - a_{1h})\wedge a_{3h}\wedge\cdots\wedge a_{kh} = \cdots\\
&= a_{1h}\wedge(a_{2h} - a_{1h})\wedge(a_{3h} - a_{2h})\wedge\cdots\wedge(a_{kh} - a_{(k-1)h})\\
&= a_{1h}\wedge(a_2 - a_1)\wedge(a_3 - a_2)\wedge\cdots\wedge(a_k - a_{k-1})\\
&= (a_1 + e)\wedge(a_2 - a_1)\wedge(a_3 - a_2)\wedge\cdots\wedge(a_k - a_{k-1})\\
&= a_1\wedge a_2\wedge\cdots\wedge a_k + e\wedge(a_2 - a_1)\wedge(a_3 - a_2)\wedge\cdots\wedge(a_k - a_{k-1}).
\end{aligned} \tag{5.104}$$
This equation represents a $(k-1)$-plane in $\Pi^n$, but it also belongs to the affine $n$-plane $A^n_e$ and thus contains important metrical information, which can be extracted by taking the dot product from the left with $\bar e$:
$$\bar e\cdot A_h = \bar e\cdot(a_{1h}\wedge a_{2h}\wedge\cdots\wedge a_{kh}) = (a_2 - a_1)\wedge(a_3 - a_2)\wedge\cdots\wedge(a_k - a_{k-1}). \tag{5.105}$$
Interestingly enough, this result, with a little modification, turns out to be the directed content of the $(k-1)$-simplex $A_h = a_{1h}\wedge a_{2h}\wedge\cdots\wedge a_{kh}$ in the affine $n$-plane:
$$\frac{\bar e\cdot A_h}{(k-1)!} = \frac{\bar e\cdot(a_{1h}\wedge a_{2h}\wedge\cdots\wedge a_{kh})}{(k-1)!} = \frac{(a_2 - a_1)\wedge(a_3 - a_2)\wedge\cdots\wedge(a_k - a_{k-1})}{(k-1)!}. \tag{5.106}$$
5.7.2 Directed Distances

Using our previous results, we can propose useful equations in the affine plane to relate points, lines, and planes metrically. The directed distance or foot from the $(k-1)$-plane $a_{1h}\wedge\cdots\wedge a_{kh}$ to the point $b_h$ is given by
$$\begin{aligned}
d[a_{1h}\wedge\cdots\wedge a_{kh},\, b_h] &\equiv [\{\bar e\cdot(a_{1h}\wedge\cdots\wedge a_{kh})\}\wedge(\bar e\cdot b_h)]^{-1}\,[\bar e\cdot(a_{1h}\wedge\cdots\wedge a_{kh}\wedge b_h)]\\
&= [(a_2 - a_1)\wedge\cdots\wedge(a_k - a_{k-1})]^{-1}\,[(a_2 - a_1)\wedge\cdots\wedge(a_k - a_{k-1})\wedge(b - a_k)].
\end{aligned} \tag{5.107}$$
In the same sense, the equation of the directed distance between the two lines $a_{1h}\wedge a_{2h}$ and $b_{1h}\wedge b_{2h}$ in the affine $n$-plane reads
$$\begin{aligned}
d[a_{1h}\wedge a_{2h},\, b_{1h}\wedge b_{2h}] &\equiv [\{\bar e\cdot(a_{1h}\wedge a_{2h})\}\wedge\{\bar e\cdot(b_{1h}\wedge b_{2h})\}]^{-1}\,[\bar e\cdot(a_{1h}\wedge a_{2h}\wedge b_{1h}\wedge b_{2h})]\\
&= [(a_2 - a_1)\wedge(b_2 - b_1)]^{-1}\,[(a_2 - a_1)\wedge(b_1 - a_2)\wedge(b_2 - b_1)].
\end{aligned} \tag{5.108}$$
A general equation of the directed distance between the $(r-1)$-plane $A_h = a_{1h}\wedge\cdots\wedge a_{rh}$ and the $(s-1)$-plane $B_h = b_{1h}\wedge\cdots\wedge b_{sh}$ in the affine $n$-plane is similarly given by
$$\begin{aligned}
d[a_{1h}\wedge\cdots\wedge a_{rh},\, b_{1h}&\wedge\cdots\wedge b_{sh}]\\
&\equiv [\{\bar e\cdot(a_{1h}\wedge\cdots\wedge a_{rh})\}\wedge\{\bar e\cdot(b_{1h}\wedge\cdots\wedge b_{sh})\}]^{-1}\,[\bar e\cdot(a_{1h}\wedge\cdots\wedge a_{rh}\wedge b_{1h}\wedge\cdots\wedge b_{sh})]\\
&= [(a_2 - a_1)\wedge\cdots\wedge(a_r - a_{r-1})\wedge(b_2 - b_1)\wedge\cdots\wedge(b_s - b_{s-1})]^{-1}\\
&\qquad [(a_2 - a_1)\wedge\cdots\wedge(a_r - a_{r-1})\wedge(b_1 - a_r)\wedge(b_2 - b_1)\wedge\cdots\wedge(b_s - b_{s-1})].
\end{aligned} \tag{5.109}$$
We have to be careful, because if $A_h\wedge B_h = 0$, the directed distance may or may not be equal to zero. If $(a_{1h}\wedge\cdots\wedge a_{rh})\wedge(b_{1h}\wedge\cdots\wedge b_{(s-1)h}) \neq 0$, we can calculate the meet between the $(r-1)$-plane $A_h$ and the $(s-1)$-plane $B_h$,
$$p = (a_{1h}\wedge\cdots\wedge a_{rh}) \cap (b_{1h}\wedge\cdots\wedge b_{sh}) = (a_{1h}\wedge\cdots\wedge a_{rh})\cdot[(b_{1h}\wedge\cdots\wedge b_{sh})\cdot \bar I^{A\cup B}], \tag{5.110}$$
where
$$\bar I^{A\cup B} = \{\bar e\cdot[(a_{1h}\wedge\cdots\wedge a_{rh})\wedge(b_{1h}\wedge\cdots\wedge b_{(s-1)h})]\}\wedge\bar e.$$
It can happen that the point $p = A_h \cap B_h$ is not in the affine $n$-plane, but the normalized point $p_h = \frac{p}{\bar e\cdot p}$ will either be in the affine plane or will be undefined. Finding the "normalized point" is not necessary in many calculations, but it is required when the metric plays an important role, or in the case of parallel hyperplanes, when it is used as an indicator.
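For two skew lines in $\mathcal{R}^3$, the magnitude of the directed distance (5.108) agrees with the classical cross-product formula $|(b_1 - a_2)\cdot n|/|n|$ with $n = (a_2 - a_1)\times(b_2 - b_1)$. A NumPy sketch with sample lines of our own choosing:

```python
import numpy as np

a1, a2 = np.array([0., 0, 0]), np.array([1., 0, 0])   # line along the x-axis
b1, b2 = np.array([0., 0, 3]), np.array([0., 1, 3])   # line along y at height 3

d1, d2 = a2 - a1, b2 - b1
n = np.cross(d1, d2)
dist = abs(np.dot(b1 - a2, n)) / np.linalg.norm(n)
assert np.isclose(dist, 3.0)      # the lines are separated along the z-axis
```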
5.7.3 Incidence Relations in the Affine 3-Plane

This section presents some algebra-of-incidence relations for 3D Euclidean space represented in the affine 3-plane $A^3_e$, with the pseudoscalar $I = \sigma_1\wedge\sigma_2\wedge\sigma_3\wedge e$ and the reciprocal pseudoscalar $\bar I = \bar e\wedge\sigma_3\wedge\sigma_2\wedge\sigma_1$ satisfying the condition $\bar I\cdot I = 1$. Similar incidence relations were given by Blaschke [25] using dual quaternions, and later by Selig using the 4D degenerate geometric algebra $G_{3,0,1}$ [170]. Unlike the formulas given by these authors, our formulas are valid in any dimension and are expressed completely in terms of the meet and join operations in the affine plane. Blaschke and Selig could not exploit the meet and join operations because they were using a geometric algebra with a degenerate metric. The distance of a point $b_h$ to the line $L_h = a_{1h}\wedge a_{2h}$ is the magnitude or norm of the directed distance:
$$|d| = \bigl|\,[\{\bar e\cdot(a_{1h}\wedge a_{2h})\}\wedge\{\bar e\cdot b_h\}]^{-1}\,[\bar e\cdot(a_{1h}\wedge a_{2h}\wedge b_h)]\,\bigr|. \tag{5.111}$$
The distance of a point $b_h$ to the plane $A_h = a_{1h}\wedge a_{2h}\wedge a_{3h}$ is
$$|d| = \bigl|\,[\{\bar e\cdot(a_{1h}\wedge a_{2h}\wedge a_{3h})\}\wedge\{\bar e\cdot b_h\}]^{-1}\,[\bar e\cdot(a_{1h}\wedge a_{2h}\wedge a_{3h}\wedge b_h)]\,\bigr|. \tag{5.112}$$
Let us analyze carefully the incidence relation between the lines $L_{1h} = a_{1h}\wedge a_{2h}$ and $L_{2h} = b_{1h}\wedge b_{2h}$, which are completely determined by their join, $I_{L_{1h}\cup L_{2h}} = L_{1h} \cup L_{2h}$. The following cases help to test the incidence relations of the lines.
– If $I_{L_{1h}\cup L_{2h}}$ is a bivector, the lines coincide and $L_{1h} = tL_{2h}$ for some $t \in \mathcal{R}$.
– If $I_{L_{1h}\cup L_{2h}}$ is a 3-vector, the lines are either parallel or intersect in a common point. In this case,
$$p = L_{1h} \cap L_{2h} = L_{1h}\cdot\bigl(L_{2h}\cdot \bar I^{L_{1h}\cup L_{2h}}\bigr), \tag{5.113}$$
where $p$ is the result of the meet. If $\bar e\cdot p = 0$, the lines are parallel; otherwise, they intersect at the point $p_h = \frac{p}{\bar e\cdot p}$ in the affine 3-space $A^3_e$.
– If $I_{L_{1h}\cup L_{2h}}$ is a 4-vector, the lines are skew. In this case, the distance between them is given by Eq. 5.108.
The incidence relation between a line $L_h = a_{1h}\wedge a_{2h}$ and a plane $B_h = b_{1h}\wedge b_{2h}\wedge b_{3h}$ is also determined by their join, $L_h \cup B_h$. Clearly, if the join is a trivector, the line $L_h$ lies in the plane $B_h$. The only other possibility is that their join is the pseudoscalar $I = \sigma_1\wedge\sigma_2\wedge\sigma_3\wedge e$. In this case,
$$p = L_h \cap B_h = L_h\cdot(B_h\cdot \bar I). \tag{5.114}$$
If $\bar e\cdot p = 0$, the line is parallel to the plane, with the directed distance determined by Eq. 5.109. Otherwise, their point of intersection in the affine plane is $p_h = \frac{p}{\bar e\cdot p}$. Two planes, $A_h = a_{1h}\wedge a_{2h}\wedge a_{3h}$ and $B_h = b_{1h}\wedge b_{2h}\wedge b_{3h}$, in the affine plane $A^3_e$ either are parallel, intersect in a line, or coincide. If their join is a trivector, that is, if $A_h = tB_h$ for some $t \in \mathcal{R}$, they obviously coincide. If they do not coincide, then their join is the pseudoscalar $I = \sigma_1\wedge\sigma_2\wedge\sigma_3\wedge e$. In this case, we calculate the meet as
$$L = A_h \cap B_h = (\bar I\cdot A_h)\cdot B_h. \tag{5.115}$$
If $\bar e\cdot L = 0$, the planes are parallel, with the directed distance determined by Eq. 5.109. Otherwise, $L$ represents the line of intersection in the affine plane, having the direction $\bar e\cdot L$. The equivalent of the above incidence relations was given by Blaschke [25] using dual quaternions, and by Selig [169] utilizing a special or degenerate four-dimensional Clifford algebra. Whereas Blaschke uses only pure quaternions (bivectors) for his representation, Selig uses trivectors for points and vectors for planes. In contrast, in the affine 3-plane, points are always represented by vectors, lines by bivectors, and planes by trivectors. This offers a comprehensive and consistent
interpretation that greatly simplifies the underlying conceptual framework. The following equations compare our equations (left side) with those of Blaschke and Selig (right side), in which $p$, $l$, and $\pi$ denote their point, line, and plane representations:
$$\text{Equation } (5.111) \longleftrightarrow \frac{1}{2}(p\tilde l + l\tilde p), \tag{5.116}$$
$$\text{Equation } (5.112) \longleftrightarrow \frac{1}{2}(p\tilde\pi + \pi\tilde p), \tag{5.117}$$
$$\text{Equation } (5.114) \longleftrightarrow \frac{1}{2}(l\tilde\pi + \pi\tilde l). \tag{5.118}$$
5.7.4 Geometric Constraints as Flags

It is often necessary to check a geometric configuration during a rigid motion in Euclidean space, and simple geometric incidence relations can be used for this purpose. For example, a point $p$ is on a line $L$ if and only if
$$p\wedge L = 0. \tag{5.119}$$
Similarly, a point $p$ is on a plane $A$ if
$$p\wedge A = 0. \tag{5.120}$$
A line $L$ will lie in the plane $A$ if
$$L \cap A = L. \tag{5.121}$$
Alternatively, the line $L$ can meet the plane $A$ in a single point $p$, in which case $L \cap A = p$; or, if the line $L$ is parallel to the plane $A$,
$$L \cap A = 0. \tag{5.122}$$
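The flag condition (5.119) can be tested in coordinates: $p\wedge a_1\wedge\cdots\wedge a_k = 0$ exactly when the stacked homogeneous vectors are linearly dependent, so a rank test implements the incidence check. A NumPy sketch (the helper name and sample points are our own):

```python
import numpy as np

def wedge_is_zero(*vectors):
    # The wedge of r vectors vanishes iff they are linearly dependent,
    # i.e. the stacked matrix has rank < r.
    M = np.stack(vectors)
    return np.linalg.matrix_rank(M) < len(vectors)

a1 = np.array([0., 0, 0, 1])          # homogeneous points in the affine 3-plane
a2 = np.array([1., 1, 0, 1])
p_on  = np.array([2., 2, 0, 1])       # lies on the line a1 ^ a2
p_off = np.array([0., 1, 0, 1])

assert wedge_is_zero(a1, a2, p_on)        # p ^ L = 0: incident
assert not wedge_is_zero(a1, a2, p_off)   # p ^ L != 0: not incident
```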
5.8 Conclusion

We have shown how geometric algebra can be used effectively to carry out analysis on a manifold, which is useful in robotics and image analysis. Geometric algebra offers a clear and concise geometric framework of multivectors in which
calculations can be carried out. Since the elements and operations in geometric algebra are basis-free, computations are simpler and geometrically more transparent than in more traditional approaches. Stereographic projection and its generalization to the conformal group and projective geometry have direct applications in image analysis from one or more viewpoints. The key idea is that an image is first represented on the null cone, and then projected onto affine geometries or onto an n-dimensional affine plane, where the image analysis takes place. Since every Lie algebra can be represented by an appropriate bivector algebra in an affine geometry, it follows that a complete motion analysis should be possible using its bivector representation in geometric algebra. In Chap. 11, we explore applications in robotics of the n-dimensional affine plane as a computing framework both to analyze rigid motion and to apply the algebra of incidence. In Chap. 8, we employ Lie operators expressed in terms of bivectors to detect visual invariants.
5.9 Exercises

5.1. Prove that $x_c = \frac{1}{2}x_h\bar e x_h = \exp(\frac{1}{2}x\bar e)\, e\, \exp(-\frac{1}{2}x\bar e)$, where $x_c \in H^{p,q}_e$, $x_h \in A^{p,q}_e$, and $x \in \mathcal{R}^n$.

5.2. The bases of the reciprocal null cones, $\{e\} \in N$ and $\{\bar e\} \in \bar N$, are called reciprocal or dual bases because they fulfill the relationship $\{e\}\cdot\{\bar e\} = \mathrm{id}$, where $\mathrm{id}$ is an $n\times n$ identity matrix. The pseudoscalar of $G(N)$ is $I = e_1\wedge e_2\wedge e_3\wedge\cdots\wedge e_n$, and that of $G(\bar N)$ is $\bar I = \bar e_1\wedge\bar e_2\wedge\bar e_3\wedge\cdots\wedge\bar e_n$, both of which satisfy the condition $I\cdot\bar I = 1$. According to Eq. 5.34, we can express a second basis $\{a\} = \{e\}A = \{e_1, e_2, e_3, \ldots, e_n\}A \in N$, where the matrix $A$ is responsible for this change of basis. The hypervolume spanned by the basis $\{a\}$ is $\bigwedge_{i=1}^n\{a\} = a_1\wedge a_2\wedge\cdots\wedge a_n = \det(A)\, e_1\wedge e_2\wedge\cdots\wedge e_n = \det(A)I$. The bracket of $A$ is simply computed by taking the dot product of this hypervolume with the reciprocal pseudoscalar, $\det(A) = \bigl(\bigwedge_{i=1}^n\{a\}\bigr)\cdot\bar I$. Similar to the standard approach for obtaining a reciprocal basis, it is easy to see that the new reciprocal basis $\{\bar a\}$ can be computed by means of the equation
$$\bar a_i = (-1)^{i+1}\,\frac{(a_1\wedge\cdots\wedge a_{i-1}\wedge a_{i+1}\wedge\cdots\wedge a_n)\cdot\bar I}{[a_1, a_2, \ldots, a_n]},$$
where $a_i$ is left out of the wedge operation in position $i$. This expression guarantees that $\{\bar a\}\cdot\{a\} = \mathrm{id}$. Find an expression for computing the inverse of the matrix $A$. (Hint: Use $\{\bar a\} = B\{\bar e\}$.)

5.3. Projections from $G_{3,3}$ to $G_{3,0}$: Consider the Lie algebra $so(3)$ in $G_{3,3}$:
$$L_x = \begin{pmatrix}0 & -1 & 0\\ 1 & 0 & 0\\ 0 & 0 & 0\end{pmatrix},\qquad L_y = \begin{pmatrix}0 & 0 & 1\\ 0 & 0 & 0\\ -1 & 0 & 0\end{pmatrix},\qquad L_z = \begin{pmatrix}0 & 0 & 0\\ 0 & 0 & -1\\ 0 & 1 & 0\end{pmatrix}.$$
Using CLICAL, represent this Lie algebra in $G_{3,3}$ using the bivector matrices $\mathbf{L}_x$, $\mathbf{L}_y$, and $\mathbf{L}_z$. Take their projections $P(\mathbf{L}_x) = I^{-1}(I\cdot\mathbf{L}_x)$, $P(\mathbf{L}_y) = I^{-1}(I\cdot\mathbf{L}_y)$, and $P(\mathbf{L}_z) = I^{-1}(I\cdot\mathbf{L}_z)$, using the Euclidean pseudoscalar $I = e_1e_2e_3$ and also using the reciprocal pseudoscalar $\bar I = e_4e_5e_6$. Explain the dual relation of the results.

5.4. Using CLICAL, compute in the 2D affine plane $A^2_e$ the new position of the point $x = 4\sigma_1 + 2\sigma_2 \in A(\mathcal{R}^2)$ after the translation $t = 6\sigma_1 + 5\sigma_2 \in A(\mathcal{R}^2)$.

5.5. Using CLICAL, compute in the 2D affine plane $A^2_e$ the dilation of the point $x = 3\sigma_1 + 5\sigma_2 \in A(\mathcal{R}^2)$ for
$$D_u = \begin{pmatrix}e^u & 0 & 0\\ 0 & e^u & 0\\ 0 & 0 & 1\end{pmatrix} = \begin{pmatrix}1.75 & 0 & 0\\ 0 & 1.75 & 0\\ 0 & 0 & 1\end{pmatrix}.$$
5.6. Using CLICAL, compute in the geometric algebra of the null cone $G(N^2)$ the new position of the point $x_0 = 2\sigma_1 + 3\sigma_2 \in A(\mathcal{R}^2)$ after a rotation of $\theta = \frac{\pi}{6}$. Use Eq. 5.68 with the bivector of the spinor group $Spin(2)$, $B = -e_1\wedge\bar e_2 + e_2\wedge\bar e_1$. Note that the rotation is not computed with the exponential function, but rather with a function depending on the mutually annihilating idempotents.

5.7. Compute in the affine 2-plane $A^2_e$, with the pseudoscalar $I = \sigma_1\wedge\sigma_2\wedge e$ and the reciprocal pseudoscalar $\bar I = \bar e\wedge\sigma_2\wedge\sigma_1$, the meet of the lines $L_{1h} = a_{1h}\wedge a_{2h}$ and $L_{2h} = b_{1h}\wedge b_{2h}$, where $a_{1h} = 4\sigma_1 + e$, $a_{2h} = 2\sigma_2 + e$, $b_{1h} = e$, and $b_{2h} = 2\sigma_1 + 3\sigma_2 + e$.

5.8. In the affine 2-plane $A^2_e$, compute the intersecting point $p_h$ of the lines $L_{1h}$ and $L_{3h}$, where $L_{1h}$ is the line determined in problem 5.7 and $L_{3h}$ passes through the point $c_{2h} = 4\sigma_1 + 3\sigma_2 + e$ and is orthogonal to the line $L_{1h}$. (Hint: Consider the line $L_{3h} = p_h\wedge c_{2h}$ with the point $p_h = c_{2h} + s\, i(a_1 - a_2)$ for $s \in \mathcal{R}$, where $i = \sigma_1\sigma_2$.) Note that $s \neq 0$ can be overlooked, because the line is uniquely defined by the 2-direction of the bivector $p_h\wedge c_{2h}$ and not by its magnitude.

5.9. Theorem proving: Let a circle be centered at the origin and let $a$ and $b$ be the end points of a diameter. Take any point $c$ on the circle and show in the affine 2-plane $A^2_e$ that the lines $l_{ac}$ and $l_{cb}$ are perpendicular.

5.10. Theorem proving: Prove the theorem of the Desargues configuration in the 3D projective plane $\Pi^3$. Consider that $x_1, x_2, x_3$ and $y_1, y_2, y_3$ are the vertices of two triangles in $\Pi^3$, and suppose that $(x_1\wedge x_2)\cap(y_1\wedge y_2) = z_3$, $(x_2\wedge x_3)\cap(y_2\wedge y_3) = z_1$, and $(x_3\wedge x_1)\cap(y_3\wedge y_1) = z_2$. You can claim that $z_1\wedge z_2\wedge z_3 = 0$ if and only if there
Fig. 5.3 Simpson's theorem
is a point $p$ such that $x_1\wedge y_1\wedge p = 0 = x_2\wedge y_2\wedge p = x_3\wedge y_3\wedge p$. (Hint: Express the point as linear combinations of $x_1, y_1$; $x_2, y_2$; and $x_3, y_3$. The other half of the proof follows by the duality of classical projective geometry.)

5.11. Theorem proving: Consider an arbitrary circumcircled triangle; see Fig. 5.3. From a point $d$ on the circumcircle, draw three perpendiculars to the triangle sides $bc$, $ca$, and $ab$ to meet the circle at the points $a_1$, $b_1$, and $c_1$, respectively. Prove that the lines $l_{aa_1}$, $l_{bb_1}$, and $l_{cc_1}$ are parallel. (Hint: In the affine 2-plane $A^2_e$, interpret the geometry of your results according to the grade and the absolute value of the directed distances between the lines.)

5.12. Consider in the affine 3-plane $A^3_e$ the points $a_{1h} = 3\sigma_1 + 4\sigma_2 + 5\sigma_3 + e$, $a_{2h} = 2\sigma_1 - 5\sigma_2 + 2\sigma_3 + e$, and $a_{3h} = \sigma_1 + 6\sigma_2 + 4\sigma_3 + e$; the line $L_{1h} = a_{1h}\wedge a_{2h}$; and the plane $\pi_{1h} = a_{1h}\wedge a_{2h}\wedge a_{3h}$. Compute, using the MAPLE package CLIFFORD 4.0, for (a) one of the points, (b) the line, and (c) the plane, their new positions after a rigid motion. This motion is defined by a translation of $t_{1h} = \sigma_1 + 2\sigma_2 + \sigma_3 + e$ and rotations about the three axes of $\theta_x = \frac{\pi}{5}$, $\theta_y = \frac{\pi}{3}$, and $\theta_z = \frac{\pi}{6}$. Recall that you compute the translation using the horosphere as an intermediate framework.

5.13. Consider in the affine 3-plane $A^3_e$ the points $p_{0h} = e$, $p_{1h} = \sigma_2 + e$, $p_{2h} = \sigma_1 + \sigma_2 + e$, $p_{3h} = \sigma_1 + e$, $p_{4h} = \sigma_1 + \sigma_3 + e$, $p_{5h} = \sigma_1 + \sigma_2 + \sigma_3 + e$, $p_{6h} = \sigma_2 + \sigma_3 + e$, and $p_{7h} = \sigma_3 + e$; the lines $L_{01} = p_{0h}\wedge p_{1h}$, $L_{36} = p_{3h}\wedge p_{6h}$, and $L_{76} = p_{7h}\wedge p_{6h}$; and the planes $\pi_f = p_{0h}\wedge p_{1h}\wedge p_{3h}$, $\pi_t = p_{5h}\wedge p_{6h}\wedge p_{7h}$, and $\pi_r = p_{2h}\wedge p_{5h}\wedge p_{6h}$. Compute, using CLIFFORD 4.0, the directed distances between $p_{7h}$ and $\pi_f$; $p_{5h}$ and $L_{36}$; $L_{01}$ and $L_{36}$; $L_{36}$ and $L_{76}$; $L_{01}$ and $\pi_t$; $L_{36}$ and $\pi_t$; $\pi_f$ and $\pi_t$; and $\pi_f$ and $\pi_r$. Interpret the geometry of your results according to the grade and the absolute value of the directed distances.
Chapter 6
Conformal Geometric Algebra
6.1 Introduction

The geometric algebra of a 3D Euclidean space, $G_{3,0,0}$, has a point basis, and the motor algebra $G_{3,0,1}$ a line basis. In the latter geometric algebra, the lines, expressed in terms of Plücker coordinates, can be used to represent points and planes as well. The reader can find a comparison of the representations of points, lines, and planes using $G_{3,0,0}$ and $G_{3,0,1}$ in Chap. 4. Interestingly enough, in the case of the conformal geometric algebra we find that the unit element is the sphere, which allows us to represent the other geometric primitives in its terms. To see how this is possible, we begin by giving an introduction to conformal geometric algebra following the same formulation presented in [9, 119], and show how the Euclidean vector space $\mathcal{R}^n$ is represented in $\mathcal{R}^{n+1,1}$. Let $\{e_1, \ldots, e_n, e_+, e_-\}$ be a vector basis with the following properties:
$$e_i^2 = 1,\qquad i = 1, \ldots, n; \tag{6.1}$$
$$e_\pm^2 = \pm 1; \tag{6.2}$$
$$e_i\cdot e_+ = e_i\cdot e_- = e_+\cdot e_- = 0,\qquad i = 1, \ldots, n. \tag{6.3}$$
Note that this basis is not written in bold. A null basis $\{e_0, e_\infty\}$ can be introduced by
$$e_0 = \frac{e_- - e_+}{2}, \tag{6.4}$$
$$e_\infty = e_- + e_+, \tag{6.5}$$
with the properties
$$e_0^2 = e_\infty^2 = 0,\qquad e_\infty\cdot e_0 = -1. \tag{6.6}$$
A unit pseudoscalar $E \in \mathcal{R}^{1,1}$ that represents the so-called Minkowski plane is defined by
$$E = e_\infty\wedge e_0 = e_+\wedge e_- = e_+e_-, \tag{6.7}$$
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9 6, c Springer-Verlag London Limited 2010
having the properties

E² = 1, (6.8)
Ẽ = −E, (6.9)
E e_± = −e_∓ = −e_± E, (6.10)
E e∞ = −e∞ E = −e∞, E e0 = −e0 E = e0 (absorption), (6.11)
1 − E = −e∞ e0, 1 + E = −e0 e∞. (6.12)

The dual of E is given by

E* = E I⁻¹, (6.13)

where I is the pseudoscalar for R^{n+1,1}.
6.1.1 Conformal Split

Euclidean points xe ∈ Rⁿ can be represented in R^{n+1,1} in a general way as

x_c = xe + α e0 + β e∞, (6.14)

where α and β are arbitrary scalars. A conformal point x_c ∈ R^{n+1,1} can be divided into its Euclidean and conformal parts by an operation called the additive split, R^{n+1,1} = Rⁿ ⊕ R^{1,1} [119]. This split is defined by the projection operators P_E (projection) and P_E⊥ (rejection) as follows:

P_E(x_c) = (x_c · E)E = α e0 + β e∞ ∈ R^{1,1}, (6.15)
P_E⊥(x_c) = (x_c ∧ E)E = xe ∈ Rⁿ, (6.16)
x_c = P_E(x_c) + P_E⊥(x_c). (6.17)
The names "projection" and "rejection" stem from the geometrical meaning of these operators. The first returns the component of x_c that is parallel to E via a projection (dot product). The latter produces the component of x_c that is orthogonal to E, hence the name (see Fig. 6.1). Hestenes introduced earlier the multiplicative split, R^{n+1,1} = Rⁿ ⊗ R^{1,1}, as well as the additive split [119], which relate the conformal and vector space models (covariant Euclidean geometry) [91, 93]. For the case of G_{4,1}, the null vector e0 is assigned to the origin, so that each point lies on the bundle of all lines crossing through the origin. G_3 is the geometry of that bundle, and each point is represented by the vector x ∈ R³. The generating basis vectors for G_3 are trivectors in G_{4,1}:

σ_i = e_i ∧ e∞ ∧ e0 = e_i(e∞ ∧ e0) = e_i E = E e_i, (6.18)
i = σ_1 σ_2 σ_3 = (e_1 E)(e_2 E)(e_3 E) = (e_1 e_2 e_3)E = I_c, (6.19)
Fig. 6.1 The projection and rejection of a vector x_c ∈ R^{2,1} from the E plane. The operators are illustrated for the 1D case
Note that the pseudoscalar is invariant, that is, i = I_c, where i² = −1. In the case of the additive split, the vector basis e1, e2, e3 ∈ G_3 is not associated with lines or with the origin e0, and its pseudoscalar I_3 = e1 e2 e3 is not invariant.
6.1.2 Conformal Splits for Points and Simplexes

We will next give the expressions, using conformal splits, for the basic geometric entities depicted in Fig. 6.2. The point is given by

x = (x + ½x² e∞ + e0)E = E(x − ½x² e∞ − e0) = xE + ½x² e∞ − e0; (6.20)

this fulfills

x = x∧e0∧e∞ = x∧E. (6.21)

The line, or spear, is given by

L = x∧p∧e∞ = x∧p e∞ + (p − x) = (d e∞ + 1)n, (6.22)

where the Plücker coordinates are given by n = p − x (tangent), x∧p = x∧(p − x) = dn (moment), and d = (x∧p)n⁻¹ = x − (x·n⁻¹)n (directance). The plane is given by

P = x∧p∧q∧e∞ = x∧p∧q e∞ + (p − x)∧(q − x)E, (6.23)
Fig. 6.2 Conformal multiplicative splits: (a) point; (b) line; (c) plane
where the tangent is (p − x)∧(q − x) = x∧p + p∧q + q∧x = i n and the moment is x∧p∧q = x∧[(p − x)∧(q − x)] = x∧(i n) = i(x · n). The dual form of the plane is given by

P* = i(x·n e∞ + nE) = iν, (6.24)

where

ν = (x_2 − x_1)E = (x2 − x1)E + ½(x2² − x1²)e∞ = (x2 − x1)E + ½(x2 + x1)·(x2 − x1)e∞ = nE + c·n e∞. (6.25)
In this book, we have used the additive split more often than the multiplicative split; the use of the multiplicative split and its representations is a matter for future work.
6.1.3 Euclidean and Conformal Spaces

One of the results of the non-Euclidean geometry demonstrated by Nikolai Lobachevsky in the nineteenth century is that in spaces with hyperbolic structure we can find subsets that are isomorphic to a Euclidean space. To achieve this, Lobachevsky introduced two constraints on what we now call a conformal point x_c ∈ R^{n+1,1}; see Fig. 6.3. The first constraint is the homogeneous representation, normalizing the vector x_c such that

x_c · e∞ = −1, (6.26)
Fig. 6.3 (a) The null cone (dotted lines), the hyperplane P(e∞, e0), and the horosphere for 1D. Note that even though the normal of the hyperplane is e∞, the plane is actually geometrically parallel to this vector. (b) Surface levels A, B, and C denoting spheres with positive radius, zero radius (null cone), and negative radius, respectively
and the second constraint is that the vector must be a null vector, that is,

x_c² = 0. (6.27)

Now, recall that a hyperplane P(n, a) ⊂ R^{n+1,1} with normal n and passing through the point a is the solution set of the equation

n · (x − a) = 0, x ∈ R^{n+1,1}. (6.28)

The normalization condition x_c · e∞ = e∞ · e0 = −1 is equivalent to the equation

e∞ · (x_c − e0) = 0, (6.29)

which is the equation of a hyperplane P(e∞, e0). Thus, the normalization condition of Eq. 6.26 constrains the points x_c to lie in a hyperplane passing through e0 with normal e∞. Equation 6.26 fixes the scale; for the conformal model, however, another constraint is needed to fix x_c as a unique representation of xe ∈ Rⁿ. Note that the inner product of two conformal points,

x_c · y_c = (x + ½x² e∞ + e0) · (y + ½y² e∞ + e0) = x·y − ½x² − ½y² = −½(x − y)² = −½‖x − y‖², (6.30)

yields a quadratic representation of the Euclidean distance between the two Euclidean points x and y.
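This distance formula can be checked directly with a coordinate representation of R^{4,1} (a NumPy sketch under our assumed basis ordering e1, e2, e3, e+, e−; not the book's CLIFFORD-package computation):

```python
import numpy as np

g = np.diag([1.0, 1.0, 1.0, 1.0, -1.0])          # metric of R^{4,1}
e_plus, e_minus = np.eye(5)[3], np.eye(5)[4]
e0, einf = 0.5 * (e_minus - e_plus), e_minus + e_plus

def up(x):
    """Conformal embedding x_c = x + (1/2)x^2 e_inf + e_0 of Eq. (6.36)."""
    xc = np.zeros(5)
    xc[:3] = x
    return xc + 0.5 * (x @ x) * einf + e0

x = np.array([1.0, 2.0, 3.0])
y = np.array([4.0, 6.0, 3.0])
# Eq. (6.30): x_c . y_c = -1/2 ||x - y||^2  (here -12.5)
assert np.isclose(up(x) @ g @ up(y), -0.5 * np.sum((x - y) ** 2))
```

The Minkowski inner product of two embedded points thus encodes the squared Euclidean distance, which is the key computational property of the conformal model.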
To complete the definition of generalized homogeneous coordinates for points in R^{n+1,1}, we therefore resort to the second constraint, Eq. 6.27: x_c² = 0. The set N^{n+1} of vectors that square to zero is called the null cone. Conformal points are thus required to lie in the intersection of the null cone N^{n+1} with the hyperplane P(e∞, e0). The resulting surface N_eⁿ is called the horosphere:

N_eⁿ = N^{n+1} ∩ P(e∞, e0) = {x_c ∈ R^{n+1,1} | x_c² = 0, x_c · e∞ = −1}. (6.31)

The homogeneous model horosphere has its origins in the work of F.A. Wachter (1792–1817), a student of Gauss [54]. An illustration of the null cone, the hyperplane, and the horosphere can be seen in Fig. 6.3a. For values of x_c² that are positive, zero (null cone), and negative, three families of surfaces are obtained, as shown in Fig. 6.3b. The constraints (6.27) and (6.26), now combined in (6.31), define an isomorphic mapping between the Euclidean space and the conformal space. Thus, for each conformal point x_c ∈ R^{n+1,1} there is a unique Euclidean point xe ∈ Rⁿ and unique scalars α, β such that the following mapping is bijective:

xe ↦ x_c = xe + α e0 + β e∞. (6.32)

To see how this mapping is obtained, first note that any point x_c = xe + αe0 + βe∞ ∈ N_eⁿ can be expressed as x_c = xe + k₁e_+ + k₂e_−, for some scalars k₁, k₂, since e0 and e∞ are linear combinations of the basis vectors e_+ and e_−. Since E² = 1, we can apply the conformal split to x_c to get

x_c = x_c E² = (x_c ∧ E + x_c · E)E = (x_c ∧ E)E + (x_c · E)E; (6.33)

see Fig. 6.3. Now recall that (x_c ∧ E)E = xe is the rejection (see Eq. 6.17). The expression (x_c · E)E can be expanded as

(x_c · E)E = (x_c · (e∞ ∧ e0))E = e0 + ½(k₁ + k₂)e∞. (6.34)

Now, applying the condition x_c² = 0, we find from Eq. 6.33 that

x_c² = ((x_c ∧ E)E + (x_c · E)E)², 0 = (xe + e0 + ½(k₁ + k₂)e∞)² = xe² − (k₁ + k₂), xe² = k₁ + k₂. (6.35)

Finally, using Eq. 6.33 and substituting Eq. 6.35 into Eq. 6.34, we get

x_c = (x_c ∧ E)E + (x_c · E)E = xe + e0 + ½(k₁ + k₂)e∞ = xe + ½xe² e∞ + e0. (6.36)

We can gain further insight into the geometrical meaning of the null vectors by analyzing Eq. 6.36. For instance, by setting xe = 0, we find that e0 represents the
origin of Rⁿ (hence the name). Similarly, dividing this equation by −x_c · e0 = ½xe² gives

x_c/(−x_c · e0) = (2/xe²)(xe + ½xe² e∞ + e0) = 2xe/xe² + e∞ + 2e0/xe² → e∞ as xe → ∞. (6.37)

Thus, we conclude that e∞ represents the point at infinity.
6.1.4 Stereographic Projection

Conformal geometry is equivalent to a stereographic projection in Euclidean space. Generally speaking, a stereographic projection is a mapping taking points lying on a hypersphere to points lying on a hyperplane, following a simple geometric construction. It is well known that this projection is used in cartography to make maps of the earth; see Fig. 6.4. In this case, the projection plane passes through the equator and the sphere is centered at the origin. To make a projection, a line is drawn from the north pole to each point on the sphere, and the intersection of this line with the projection plane constitutes the stereographic projection.

Next, we illustrate the equivalence between stereographic projection and conformal geometric algebra in R¹. We will be working in R^{2,1}, with the basis vectors {e1, e_+, e_−} having the usual properties. The projection plane will be the x-axis, and the sphere will be a circle centered at the origin with unit radius. Given a scalar xe representing a point on the x-axis, we wish to find the point x_c lying on the circle that projects to it (see Fig. 6.4b). The equation of the line passing through the north pole and xe is given by f(x) = −x/xe + 1; the equation of the circle is x² + f(x)² = 1. Substituting the equation of the line into that of the circle, we get xe²x² − 2x xe + x² = 0, which has the two solutions x = 0 and x = 2xe/(xe² + 1). Only the latter solution is meaningful. Substituting it in the equation of the line, we get f(x) = (xe² − 1)/(xe² + 1). Hence, x_c has coordinates (2xe/(xe² + 1), (xe² − 1)/(xe² + 1)), which can be represented in homogeneous coordinates as the vector

x_c = 2xe/(xe² + 1) e1 + (xe² − 1)/(xe² + 1) e_+ + e_−. (6.38)

If we take limits of Eq. 6.38, we get the expected point at infinity and the origin of Rⁿ:

lim_{xe→∞} x_c = e_+ + e_− = e∞, lim_{xe→0} x_c/2 = (e_− − e_+)/2 = e0. (6.39)
Fig. 6.4 (a) Null cone, hyperplane, horosphere, and affine plane for the 1D case. (b) Stereographic projection for the 1D case
This result is a first confirmation that the stereographic projection is equivalent to the conformal mapping given by Eq. 6.36. For a second proof, note that Eq. 6.38 can be rewritten as

x_c = 2xe/(xe² + 1) e1 + (xe² − 1)/(xe² + 1) e_+ + e_− = 2/(xe² + 1) [ xe e1 + ½(xe² − 1)e_+ + ½(xe² + 1)e_− ]. (6.40)

Dividing by the scale factor 2/(xe² + 1), in order to achieve the constraint imposed by Eq. 6.26, x_c · e∞ = −1, we arrive at

x_c = xe e1 + ½(xe² − 1)e_+ + ½(xe² + 1)e_− = xe e1 + ½xe² e∞ + e0 = xe + ½xe² e∞ + e0, (6.41)

where xe = xe e1, which is precisely Eq. 6.36. Hence, we have demonstrated that conformal geometric algebra is projectively equivalent to a stereographic projection (i.e., up to a scale factor).
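This projective equivalence can be confirmed numerically: the stereographically lifted point of Eq. 6.38 and the conformal point of Eq. 6.41 differ only by the scale factor 2/(xe² + 1). A hedged sketch (coordinates in the assumed order e1, e+, e−):

```python
import numpy as np

def stereo_lift(xe):
    """Point of Eq. (6.38): stereographic pre-image of xe on the unit circle,
    written in homogeneous coordinates (e1, e+, e-)."""
    d = xe ** 2 + 1.0
    return np.array([2.0 * xe / d, (xe ** 2 - 1.0) / d, 1.0])

def conformal_1d(xe):
    """x_c = xe e1 + (1/2)(xe^2 - 1)e+ + (1/2)(xe^2 + 1)e- of Eq. (6.41)."""
    return np.array([xe, 0.5 * (xe ** 2 - 1.0), 0.5 * (xe ** 2 + 1.0)])

xe = 0.75
# The two representations agree up to the scale factor 2/(xe^2 + 1):
assert np.allclose(stereo_lift(xe) * (xe ** 2 + 1.0) / 2.0, conformal_1d(xe))
```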
6.1.5 Inner- and Outer-Product Null Spaces

The inner-product null space (IPNS) of a blade X_k ∈ G^k_{p,q}, denoted by NI(X_k), is defined as

NI(X_k) := {x ∈ G¹_{p,q} : x · X_k = 0}, (6.42)

regardless of whether the blade is a null blade or not. Thinking in the dual space, we can define the outer-product null space (OPNS) of a blade X_k ∈ G^k_{p,q}, denoted by NO(X_k), as follows:

NO(X_k) := {x ∈ G¹_{p,q} : x ∧ X_k = 0}, (6.43)

again regardless of whether the blade is a null blade or not. The dual operation on blades is best seen by considering the dual relation between the IPNS and the OPNS of a blade. Let x ∈ G¹_{p,q} and X_k ∈ G^k_{p,q} with k ≥ 1, and let I be the pseudoscalar of G_{p,q}; according to the equations given in Sect. 1.2.7, we can formulate the dual relationship as follows:

(x ∧ X_k)* = (x ∧ X_k)I⁻¹ = x · X_k*, (6.44)

hence

x ∧ X_k = 0 ⟺ x · X_k* = 0, (6.45)

and thus

NO(X_k) = NI(X_k*). (6.46)
6.1.6 Spheres and Planes

The equation of a sphere of radius ρ centered at a point c_e ∈ Rⁿ can be written as

(x_e − c_e)² = ρ². (6.47)

Since x_c · y_c = −½(x_e − y_e)², we can rewrite the above formula in terms of homogeneous coordinates as

x_c · c_c = −½ρ². (6.48)

Since x_c · e∞ = −1, we can factor the above expression to obtain

x_c · (c_c − ½ρ² e∞) = 0; (6.49)

this equation corresponds to the IPNS representation and finally yields the simplified equation for the sphere,

x_c · s = 0, (6.50)

where

s = c_c − ½ρ² e∞ = c_e + e0 + ½(c_e² − ρ²)e∞ (6.51)

is the vector representing the sphere. From this equation and (6.36), we can see that a conformal point is just a sphere with zero radius. The vector s has the properties

s² = ρ² > 0, (6.52)
e∞ · s = −1. (6.53)

From these properties, we conclude that the sphere s is a point lying on the hyperplane x_c · e∞ = −1 but outside the null cone x_c² = 0. In particular, all points on the hyperplane outside the horosphere determine spheres with positive radius, points
lying on the horosphere define spheres of zero radius (i.e., points), and points lying inside the horosphere give spheres of imaginary radius. Finally, note that spheres of the same radius form a surface that is parallel to the horosphere. Alternatively, spheres can be dualized and represented as (n+1)-vectors, s* = sI⁻¹. Then, using the main conjugation Ĩ of I, defined as

Ĩ = (−1)^{(n+2)(n+1)/2} I = I⁻¹, (6.54)

we can express the constraints of Eqs. 6.52 and 6.53 as

s*² = s̃* s* = ρ², e∞ · s* = e∞ · (sI⁻¹) = (e∞ ∧ s)I⁻¹ = −1. (6.55)
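The IPNS sphere equations lend themselves to a direct numerical check. The sketch below (NumPy, the same assumed coordinate convention used earlier, not the book's symbolic machinery) builds s from Eq. 6.51 and verifies Eqs. 6.50, 6.52, and 6.53:

```python
import numpy as np

g = np.diag([1.0, 1.0, 1.0, 1.0, -1.0])
e_plus, e_minus = np.eye(5)[3], np.eye(5)[4]
e0, einf = 0.5 * (e_minus - e_plus), e_minus + e_plus

def up(x):
    """Conformal point x_c = x + (1/2)x^2 e_inf + e_0, Eq. (6.36)."""
    xc = np.zeros(5)
    xc[:3] = x
    return xc + 0.5 * (x @ x) * einf + e0

def sphere(c, rho):
    """IPNS sphere s = c_e + e_0 + (1/2)(c_e^2 - rho^2) e_inf, Eq. (6.51)."""
    s = np.zeros(5)
    s[:3] = c
    return s + e0 + 0.5 * (c @ c - rho ** 2) * einf

c, rho = np.array([1.0, 0.0, 2.0]), 3.0
s = sphere(c, rho)
x_on = c + np.array([rho, 0.0, 0.0])        # a point on the sphere
assert np.isclose(up(x_on) @ g @ s, 0.0)    # x_c . s = 0,      Eq. (6.50)
assert np.isclose(s @ g @ s, rho ** 2)      # s^2 = rho^2,      Eq. (6.52)
assert np.isclose(einf @ g @ s, -1.0)       # e_inf . s = -1,   Eq. (6.53)
```

Shrinking rho to 0 turns s into the conformal point up(c), illustrating that a point is a zero-radius sphere.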
Similar to Eq. 6.50, the equation involving a dual sphere reads

x_c ∧ s* = 0. (6.56)

The advantage of the dual form is that the sphere can be computed directly from four points (in 3D), yielding the OPNS representation of the sphere:

s* = x_c1 ∧ x_c2 ∧ x_c3 ∧ x_c4. (6.57)

If we replace one of these points by the point at infinity, we get the plane

π* = x_c1 ∧ x_c2 ∧ x_c3 ∧ e∞. (6.58)

In the standard IPNS form, π is given by

π = π*I = n + d e∞, (6.59)

where n is the normal vector and d represents the Hesse distance for the 3D space. Developing the products, we get the OPNS representation of the plane

π* = x_c3 ∧ x_c1 ∧ x_c2 ∧ e∞ = x_e3 ∧ x_e1 ∧ x_e2 ∧ e∞ + ((x_e3 − x_e1) ∧ (x_e2 − x_e1))E, (6.60)

which is the equation of the plane passing through the points x_e1, x_e2, and x_e3. We can easily see that x_e1 ∧ x_e2 ∧ x_e3 is a pseudoscalar representing the volume of the parallelepiped with sides x_e1, x_e2, and x_e3. Also, since (x_e1 − x_e2) and (x_e3 − x_e2) are two vectors on the plane, the expression ((x_e1 − x_e2)∧(x_e3 − x_e2))* is the normal to the plane. Therefore, planes are spheres passing through the point at infinity.
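One practical consequence of the OPNS form of Eq. 6.57 is that a sphere is fully determined by four points. The book obtains it with a wedge product; as an illustrative alternative (our sketch, not the book's method), the same sphere can be recovered by solving the linear system hidden in x_c · s = 0:

```python
import numpy as np

def sphere_through(points):
    """Center and radius of the sphere through four 3D points.
    Expanding x_c . s = 0 with s from Eq. (6.51) gives the linear system
    2 x.c - k = x.x, with k = c^2 - rho^2, in the unknowns (c, k)."""
    P = np.asarray(points, dtype=float)
    A = np.hstack([2.0 * P, -np.ones((4, 1))])
    b = np.sum(P ** 2, axis=1)
    sol = np.linalg.solve(A, b)
    c, k = sol[:3], sol[3]
    return c, np.sqrt(c @ c - k)

pts = [(3.0, 0.0, 2.0), (-3.0, 0.0, 2.0), (0.0, 3.0, 2.0), (0.0, 0.0, 5.0)]
c, rho = sphere_through(pts)
assert np.allclose(c, [0.0, 0.0, 2.0]) and np.isclose(rho, 3.0)
```

The system is nonsingular exactly when the four points are not coplanar; in the coplanar case the sphere degenerates into a plane, matching the statement that planes are spheres through the point at infinity.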
6.1.7 Geometric Identities, Meet and Join Operations, Duals, and Flats

A circle z can be regarded as the intersection of two spheres s1 and s2. This means that each point x_c ∈ z on the circle lies on both spheres, that is, x_c ∈ s1 and x_c ∈ s2. Assuming that s1 and s2 are linearly independent, we can write for x_c ∈ z

(x_c · s1)s2 − (x_c · s2)s1 = x_c · (s1 ∧ s2) = x_c · z = 0. (6.61)

This result tells us that, since x_c lies on both spheres, z = s1 ∧ s2 is the intersection of the spheres, that is, a circle. It is easy to see that the intersection with a third sphere leads to a point pair. We have derived algebraically that the wedge of two linearly independent spheres yields their intersecting circle (see Fig. 6.5b). This topological relationship between two spheres can also be conveniently described using the dual of the meet operation, namely,

z = (z*)* = (s1* ∨ s2*)* = s1 ∧ s2. (6.62)

This new equation says that the dual of a circle can be computed via the meet of two spheres in their dual form. This equation confirms geometrically our previous algebraic computation of Eq. 6.61. The standard OPNS (dual) form of the circle (in 3D) can be expressed by three points lying on it as

z* = x_c1 ∧ x_c2 ∧ x_c3; (6.63)

see Fig. 6.5a. Similar to the case of the planes shown in Eq. 6.58, lines can be defined as circles passing through the point at infinity:

l* = x_c1 ∧ x_c2 ∧ e∞. (6.64)
Fig. 6.5 (a) Circle computed using three points; note its stereographic projection. (b) Circle computed using the meet of two spheres
This can be demonstrated by developing the wedge products, as in the case of the planes, to yield the OPNS representation of the line:

x_c1 ∧ x_c2 ∧ e∞ = x_e1 ∧ x_e2 ∧ e∞ + (x_e2 − x_e1) ∧ E, (6.65)

from which it is evident that the expression x_e1 ∧ x_e2 is a bivector representing the plane in which the line is contained, and (x_e2 − x_e1) is the direction of the line. The standard IPNS form of the line can be expressed as

L = nI_E − e∞ mI_E, (6.66)

where n and m stand for the line orientation and moment, respectively. The line in the IPNS standard form is a bivector representing the six Plücker coordinates. The dual of a point x is a sphere s: the intersection of four spheres yields the OPNS equation for the point (see Fig. 6.6b). The dual relationships between a point and its dual, the sphere, are

s* = x_c1 ∧ x_c2 ∧ x_c3 ∧ x_c4 ⟷ x* = s1 ∧ s2 ∧ s3 ∧ s4, (6.67)

where the points are denoted x_i and the spheres s_i for i = 1, 2, 3, 4. There is another very useful relationship between an (r−2)-dimensional sphere A_r and the sphere s* (computed as the dual of a point s). If from the sphere A_r we can compute the hyperplane A_{r+1} = e∞ ∧ A_r ≠ 0, we can express the meet between the dual of the point s* (a sphere) and the hyperplane A_{r+1}, obtaining the sphere A_r of one dimension lower:

s* ∩ A_{r+1} = (s*I) · A_{r+1} = s · A_{r+1} = A_r. (6.68)
Fig. 6.6 (a) Conformal point generated by projecting a point of the affine plane to the unit sphere. Note that we use only the Riemann sphere and the 2D plane. (b) Point generated by the meet of four spheres
This result reveals an interesting relationship: the sphere A_r and the hyperplane A_{r+1} are related via the point s (the dual of the sphere s*); thus, we can rewrite Eq. 6.68 as

s = A_r A_{r+1}⁻¹. (6.69)

Using Eq. 6.69, given the plane π (A_{r+1}) and the circle z (A_r), we can compute the sphere

s = zπ⁻¹. (6.70)

Similarly, we can compute another important geometric relationship, the pair of points, using Eq. 6.69 directly:

s = PP L⁻¹. (6.71)

Now, using this result, given the line L and the sphere s, we can compute the IPNS form of the pair of points PP (see Fig. 6.7b):

PP = sL = s ∧ L. (6.72)

The OPNS form of the pair of points is given by

PP* = x_c1 ∧ x_c2. (6.73)

Using Eq. 6.69 in a similar way, one can compute the projection of a blade X onto a blade Y, which yields another blade:

X → (X · Y)Y⁻¹. (6.74)

Using Eq. 6.74, orthogonal projections of Euclidean geometry are computed in the conformal geometric algebra framework; see Fig. 6.8, and note that L∧π* is itself a flat.
Fig. 6.7 (a) The meet of a sphere and a plane. (b) Pair of points resulting from the meet between a line and a sphere. (c) Center of a circumscribed triangle
Fig. 6.8 Orthogonal projections of Euclidean geometry using conformal geometric algebra operations

Table 6.1 Representation of entities in conformal geometric algebra, G = grade

Entity      IPNS                                         G    OPNS (dual)                         G
Sphere      s = c + ½(c² − ρ²)e∞ + e0                    1    s* = x_c1∧x_c2∧x_c3∧x_c4            4
Point       x_c = x + ½x²e∞ + e0                         1    x* = s1∧s2∧s3∧s4                    4
Plane       π = nI_E − d e∞,                             1    π* = e∞∧x_c1∧x_c2∧x_c3              4
            n = (x_e1 − x_e2)∧(x_e1 − x_e3),
            d = (x_e1∧x_e2∧x_e3)I_E
Line        L = π1∧π2 = nI_E − e∞ mI_E,                  2    L* = e∞∧x_c1∧x_c2                   3
            n = x_e1 − x_e2, m = x_e1∧x_e2
Circle      z = s1∧s2 = s1∧π2                            2    z* = x_c1∧x_c2∧x_c3                 3
Point pair  PP = s1∧s2∧s3, PP = s∧L                      3, 2  PP* = x_c1∧x_c2                    2
A summary of the standard IPNS and OPNS forms of the basic geometric entities is presented in Table 6.1. Now consider the following element in the conformal model:

X = α(e0 ∧ x_1 ∧ ⋯ ∧ x_k ∧ e∞); (6.75)

since x_i ∧ x_j = x_i ∧ (x_j − x_i), the previous equation can be rewritten as follows:

X = α(e0 ∧ (x_1 − e0) ∧ ⋯ ∧ (x_k − e0) ∧ e∞). (6.76)

Now, substituting the conformal representation of the Euclidean points,

x_i = x_i + ½x_i² e∞ + e0, (6.77)

we obtain

X = α(e0 ∧ (x_1 + ½x_1² e∞) ∧ ⋯ ∧ (x_k + ½x_k² e∞) ∧ e∞). (6.78)
Since the wedge product eliminates the extra terms ½x_i² e∞, we get

X = α(e0 ∧ x_1 ∧ ⋯ ∧ x_k ∧ e∞). (6.79)

Thus, the part involving the vectors x_i and the weight α can be seen as a purely Euclidean k-blade X_k = αx_1∧⋯∧x_k; this class of blade in the conformal model is therefore equivalent to

X = e0 ∧ X_k ∧ e∞. (6.80)

Finally, we can claim that the general form of a flat k-dimensional offset subspace through the point p is given by

p ∧ X_k ∧ e∞. (6.81)

Figure 6.9 depicts elements of this kind, namely the plane x_1∧x_2∧x_3∧e∞ (with its orientation denoted by a circle), the line x_4∧x_5∧e∞, and, at their intersection, the point x_6∧e∞. When you need to find the tangent to a flat, or to a round s (a sphere through the origin), passing through one of its points p, surprisingly this can be computed without differentiation. Calling the element in question X, where p fulfills the constraint p∧X = 0, we compute the tangent as

p · X̂, (6.82)

where X̂ stands for the grade involution. See in Fig. 6.10 the tangents of flats and rounds at one of their points. The carrier of an element is the smallest-grade flat that contains it; thus, a flat corresponds to its own carrier. The carrier of a round s is computed as follows:
Fig. 6.9 Flat: 2D offset subspace through the point x 6
Fig. 6.10 Contracting geometric entities: tangents of flats and rounds at one of their points (no differentiation required)
Fig. 6.11 Factorization of carriers and surrounds. (Top figure) A point pair P 1 P 2 is factorized as its surrounding sphere S and the carrier line L. (Bottom figure) A circle z is factorized as the intersection of its surrounding sphere S and the carrier plane
s ∧ e∞. (6.83)

This equation can be used to compute the tangent flat to an element X at one of its points p as follows:

(p · X̂) ∧ e∞. (6.84)

We can conclude that this equation applies to rounds and flats alike; note, however, that for a flat the equation is simply the identity. Moreover, tangents do not have tangent flats, though they do have a carrier. Making use of Eq. 6.69, we can compute surrounds (contours of spheres),

s(s ∧ e∞)⁻¹, (6.85)

and the factorization of spheres,

s = [s(s ∧ e∞)⁻¹](s ∧ e∞). (6.86)

Figure 6.11 shows the factorization of rounds involving carriers.
6.1.8 Meet, Pair of Points, and Plunge

A plunge is formulated in terms of the meet operation and the pair of points PP. In general, the plunge of three spheres is given by

Pl = s1 ∧ s2 ∧ s3, (6.87)

which is depicted in Fig. 6.12a. In Fig. 6.12b, the three spheres instead intersect in a real point pair (in magenta). Now, shrinking the three dual spheres to zero radius, the plunge Pl becomes simply a dual circle z* = x_c1∧x_c2∧x_c3 passing through the conformal representations of the three points. Note that, on the one hand, to contain a point x_c1 is equivalent to plunging into the zero-radius sphere s1 at that point. Letting the radius of a dual sphere go to infinity, on the other hand, gives a dual plane. As shown in Fig. 6.13, the plunge of such diverse geometric entities can be mixed straightforwardly. Let us explain this further: the circle at the equator of a sphere has as its dual the pair of points corresponding to the north and south poles; vice versa, the dual of a real point pair is the imaginary equator of the sphere that has the point pair as its poles. In fact, two real dual spheres that do not intersect have an imaginary meet and a real plunge; conversely, if they intersect, the resulting circle is the real meet and the plunge is imaginary. Figure 6.14 depicts the relationship between the meet and plunge of two spheres. The plunge and the center of the meet lie on the line through the centers of the spheres. The general flat x∧E∧e∞ can also be seen as a plunge. Consider the line x∧n∧e∞. Bearing in mind the plunge construction, this expression should perpendicularly intersect the points x and e∞, and it should
Fig. 6.12 (a) The plunge of three nonintersecting spheres is the blue circle; two imaginary meets in speckled red; the pairwise meets of the spheres, one real and two imaginary (all in green). (b) The meet of three spheres is a real point pair (in magenta). The plunge is imaginary (dashed circle). The pairwise meets of the spheres are all in green
Fig. 6.13 The plunge of the dual plane π, the dual sphere s, and the point x
Fig. 6.14 Meet and plunge of two spheres at increasing distances: (left) spheres are separated, real plunge and imaginary meet; (middle) real meet and imaginary plunge; (right) one sphere is inside the other, real plunge and imaginary meet
Fig. 6.15 Plunge construction of the line x∧n∧e∞
moreover be perpendicular to nI (a bivector, the dual of the vector n), which corresponds to the plane through the origin with normal vector n. This plunge must therefore be the direct representation of the line through the point x in the direction n (see Fig. 6.15). Taking another Euclidean vector factor m gives
an element that should meet the dual line n∧m perpendicularly, which is in fact the direct representation of a plane. Removing the Euclidean factor by setting it equal to the 0-blade 1, one gets the representation of a flat of dimension zero, that is, the direct flat point x ∧ e∞, which is nothing but the element that perpendicularly connects the zero-radius dual sphere x with e∞.
6.1.9 Simplexes and Spheres

In CGA, geometric objects can be computed as the wedge of linearly independent homogeneous points a_0, a_1, …, a_r, with r ≤ n, so that a_0∧a_1∧⋯∧a_r ≠ 0. This multivector can be expressed in an expanded form as follows:

a_0∧a_1∧⋯∧a_r = A_r + e0 A_r⁺ + ½e∞ A_r⁻ − ½E A_r^±, (6.88)

where

A_r = a_0∧a_1∧⋯∧a_r,
A_r⁺ = Σ_{i=0}^{r} (−1)^i a_0∧⋯∧ǎ_i∧⋯∧a_r = (a_1 − a_0)∧⋯∧(a_r − a_0),
A_r⁻ = Σ_{i=0}^{r} (−1)^i a_i² a_0∧⋯∧ǎ_i∧⋯∧a_r,
A_r^± = Σ_{i=0}^{r} Σ_{j=i+1}^{r} (−1)^{i+j} (a_i² − a_j²) a_0∧⋯∧ǎ_i∧⋯∧ǎ_j∧⋯∧a_r, (6.89)

where ǎ_i denotes omission of the factor a_i; the points on the left-hand side of Eq. 6.88 are homogeneous, while those on the right are Euclidean. The expanded form of Eq. 6.88 gives the following geometric information. The multivector
– determines an r-simplex if A_r ≠ 0,
– represents an (r−1)-simplex in a plane which passes through the origin if A_r⁺ = A_r⁻ = 0,
– represents an (r−1)-sphere if and only if A_r⁺ ≠ 0.
See [119] for a more detailed study of the expanded form. In Eq. 6.88, A_r is the moment of the simplex with boundary (or tangent) A_r⁺; thus the corresponding r-simplex can be formulated as follows:

e ∧ a_0∧a_1∧⋯∧a_r = eA_r + E A_r⁺. (6.90)
On the other hand, the volume (or content) of the simplex is |A_r⁺|/r!, where

|A_r⁺|² = Ã_r⁺ A_r⁺ = (a_r∧⋯∧a_0∧ē)·(e∧a_0∧⋯∧a_r) = (−1)^{r+1} (1/2^r) det D, (6.91)

where D is the (r+2)×(r+2) bordered matrix with a 0 in the corner, 1s along its first row and column, and entries d_ij², where d_ij = |a_i − a_j| is the distance between a pair of points. This determinant is called the Cayley–Menger determinant. The directed distance from the origin in Rⁿ to the plane of the simplex, in terms of the points, is given by

d = A_r (A_r⁺)⁻¹; (6.92)

thus the square of its absolute value is

|d|² = |A_r|²/|A_r⁺|² = [(a_r∧⋯∧a_0)·(a_0∧⋯∧a_r)] / [(ā_r∧⋯∧ā_1)·(ā_1∧⋯∧ā_r)], (6.93)

where ā_i = a_i − a_0 for i = 1, …, r. In the next section, we present the computation of the directed distance in the 3D affine plane.
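The Cayley–Menger determinant of Eq. 6.91 is straightforward to evaluate numerically. The sketch below uses the convention reconstructed above, |A_r⁺|² = 2^{−r}|det D| (sign conventions for this determinant vary in the literature), to recover the area of a 3-4-5 right triangle:

```python
import numpy as np

def cayley_menger(points):
    """Bordered determinant of squared pairwise distances, cf. Eq. (6.91)."""
    P = np.asarray(points, dtype=float)
    m = len(P)
    D = np.zeros((m + 1, m + 1))
    D[0, 1:] = D[1:, 0] = 1.0
    for i in range(m):
        for j in range(m):
            D[i + 1, j + 1] = np.sum((P[i] - P[j]) ** 2)
    return np.linalg.det(D)

# r = 2 (triangle): |A_2^+| = 2 * area, and |A_2^+|^2 = |det| / 4.
tri = [(0.0, 0.0), (3.0, 0.0), (0.0, 4.0)]           # 3-4-5 right triangle
area = 0.5 * np.sqrt(abs(cayley_menger(tri)) / 4.0)
assert np.isclose(area, 6.0)
```

The same routine works for any simplex dimension, which is what makes the Cayley–Menger form attractive: it needs only pairwise distances, not coordinates.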
6.2 The 3D Affine Plane

In the previous section, we described the general properties of the conformal framework. Sometimes, however, we would like to use only the projective plane of the conformal framework and not the null cone of this space. This is the case when we use only rigid transformations; we then limit ourselves to the affine plane, which is an (n+1)-dimensional subspace of the hyperplane of reference P(e∞, e0). We have chosen to work in the algebra G_{4,1}. Since we deal with homogeneous points, the particular choice of null vectors does not affect the properties of the conformal geometry. Points in the affine plane x_a ∈ R^{4,1} are formed as follows:

x_a = xe + e0, (6.94)

where xe ∈ R³. From this equation, we note that e0 represents the origin (by setting xe = 0); similarly, e∞ represents the point at infinity. The normalization property is then expressed as

e∞ · x_a = −1. (6.95)
In this framework, the conformal mapping equation is expressed as

x_c = xe + ½xe² e∞ + e0 = x_a + ½xe² e∞. (6.96)

When working on the affine plane exclusively, we are mainly concerned with a simplified version of the rejection. Noting that E = e∞ ∧ e0, we write the rejection as

P_E⊥(x_c) = (x_c ∧ E)E = −(e∞ ∧ e0)·e0 − (x_c ∧ e∞)·e0, xe = −e0 − (x_c ∧ e∞)·e0. (6.97)

Now, since the points in the affine plane have the form x_a = xe + e0, we conclude that

x_a = −(x_c ∧ e∞)·e0 (6.98)

is the mapping from the horosphere to the affine plane.
6.2.1 Lines and Planes

Lines and planes in the affine plane are expressed, in a fashion similar to their conformal counterparts, as the join of two and three points, respectively:

L^a = x_a1 ∧ x_a2, (6.99)
Π^a = x_a1 ∧ x_a2 ∧ x_a3. (6.100)

Note that, unlike their conformal counterparts, the line is a bivector and the plane is a trivector. As seen earlier, these equations produce a moment–direction representation; thus,

L^a = e0 ∧ d + B, (6.101)

where d is a vector representing the direction of the line and B is a bivector representing the moment of the line. Similarly, we have

Π^a = e0 ∧ N + δ e123, (6.102)

where the bivector N is the dual of the normal vector n of the plane and δ is a scalar representing the distance from the plane to the origin. Note that in either case the direction and the normal data can be retrieved as d = −e∞ · L^a and N = −e∞ · Π^a, respectively. In this framework, the intersection, or meet, has a simple expression too. Let A^a = a_a1 ∧ ⋯ ∧ a_ar and B^a = b_a1 ∧ ⋯ ∧ b_as. Then the meet is defined as

A^a ∩ B^a = A^a · (B^a · I_{N(A^a ∪ B^a)}), (6.103)

where I_{N(A^a ∪ B^a)} is either e12 e∞, e23 e∞, e31 e∞, or e123 e∞, according to which basis vectors span the largest common space of A^a and B^a.
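Since the affine line carries the six Plücker coordinates (direction and moment), a small vector-algebra sketch can illustrate the moment–direction representation; here the bivector moment is replaced by its cross-product dual, an illustrative shortcut rather than the book's bivector formulation:

```python
import numpy as np

def plucker_line(a, b):
    """Moment-direction data of the line through 3D points a and b:
    direction d = b - a and moment m = a x b (the cross-product dual
    of the bivector a ^ b)."""
    a, b = np.asarray(a, dtype=float), np.asarray(b, dtype=float)
    return b - a, np.cross(a, b)

a, b = np.array([1.0, 0.0, 2.0]), np.array([2.0, 1.0, 3.0])
d, m = plucker_line(a, b)
p = a + 2.5 * d                          # any point on the line
assert np.allclose(np.cross(p, d), m)    # p on the line  <=>  p x d = m
assert np.isclose(d @ m, 0.0)            # Plucker incidence condition d . m = 0
```

The invariance of p × d along the line is exactly why the pair (d, m) characterizes the line independently of the two points chosen to build it.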
6.2.2 Directed Distance

The so-called Hessian normal form is well known from vector analysis as a convenient representation for specifying lines and planes by their distance from the origin (the Hesse distance, or directed distance). In this section, we show how CGA can help us obtain the Hesse distance not only for lines and planes but also for more general simplexes. Figure 6.16a,b depicts a line and a plane, respectively, that will help us develop our equations. Let A^k be a k-line (or k-plane); it consists of a momentum M^k of degree k and a direction D^{k−1} of degree k−1. For instance, given three Euclidean points a1, a2, a3, their 2-simplex defines a dual 3-plane in CGA that can be expressed as

A^k = Φ = M³ + D² e0 = a1∧a2∧a3 + (a2 − a1)∧(a3 − a1)e0. (6.104)

Then the directed distance of this plane, denoted p^k, can be obtained by taking the inner product between the unit direction D_u^{k−1} and the moment M^k. Indeed, from (6.104) and using the expressions in (6.6), we get the direction from −Φ · e∞ = D^{k−1}, and then its unit version D_u^{k−1} by dividing D^{k−1} by its magnitude. Schematically,

A^k → −A^k · e∞ = D^{k−1} → D_u^{k−1} = D^{k−1}/|D^{k−1}|. (6.105)

Finally, the directed distance p^k of A^k is

p^k = D_u^{k−1} · A^k, (6.106)

where the dot operation effectively takes place between the direction D_u^{k−1} and the momentum of A^k. Obviously, the directed-distance vector p^k touches the k-plane A^k orthogonally, and, as mentioned at the beginning of this subsection, its
Fig. 6.16 (a) Line in the 2D affine space. (b) Plane in the 3D affine space (note that the 3D space is "lifted" by a null vector)
magnitude |p^k| equals the Hesse distance. For the sake of simplicity, only D^{k−1} · L^k and D^{k−1} · Φ^k are shown in Fig. 6.16a,b, respectively. Now, having this point on the first object, we can use it to compute the directed distance from the k-plane A^k to the parallel object B^k as follows:

d[A^k, B^k] = d[(D^{k−1} · A^k), B^k] = d[((−e∞ · A^k) · A^k), B^k]. (6.107)
6.3 The Lie Algebra

As in Sects. 3.5.3 and 5.2.3, we should introduce the analog of the complex "doubling" bivector for conformal geometry. This corresponds to the Minkowski plane of Eq. 6.7, namely,

E = e_+ ∧ e_− = e_+ e_−. (6.108)

Thus, the bivector generators are the set of bivectors that commute with E. We proceed similarly as with the unitary group and formulate an algebraic constraint: for any x, y ∈ G_{n+1,1},

[(x · E) ∧ (y · E)] · E = x ∧ (y · E) + (x · E) ∧ y = (x ∧ y) · E, (6.109)

and after a simple algebraic manipulation,

[x ∧ y − (x · E)(y · E)] · E = 0. (6.110)

Now, using this constraint, we can again try all combinations of {e_i, ē_i} to produce the bivector basis for the Lie algebra of the conformal group:

E = e_+ e_−, (6.111)
B_ij = e_i e_j (i < j = 1, …, n), (6.112)
N_i^± = e_i e_± (i = 1, …, n). (6.113)

In the next section, we will use this set of Lie algebra operators to define the Lie groups of the conformal geometric algebra.
6.4 Conformal Transformations

In the middle of the nineteenth century, J. Liouville proved, for the three-dimensional case, that any conformal mapping on the whole of R^n can be expressed as a composition of inversions in spheres and reflections in hyperplanes [142].
In particular, the rotation, translation, dilation, and inversion mappings are obtained from these two mappings. In conformal geometric algebra, these concepts are simplified due to the isomorphism between the conformal group on R^n and the Lorentz group on R^{n+1}, which allows us to express a nonlinear conformal transformation as a linear Lorentz transformation, and then to use the versor representation to reduce the composition of transformations to the multiplication of vectors [119]. Thus, using conformal geometric algebra, it is computationally more efficient and simpler to interpret the geometry of conformal mappings than with matrix algebra.

A transformation of geometric figures is said to be conformal if it preserves the shapes of the figures, that is, if it preserves the angles and hence the shapes of straight lines and circles. In particular, rotation and translation mappings are conformal and are also called direct-motion transformations. Inversion and reflection mappings preserve the magnitude of the angle but reverse its direction; they are also called opposite-motion transformations. Any conformal transformation in R^n,

x ↦ x (1 + ax)^{-1},   (6.114)

can be expressed as a composite of inversions and a translation:

x ↦ x/x²   (inversion),   (6.115)
  ↦ x/x² + a   (translation),   (6.116)
  ↦ (x/x² + a) / (x/x² + a)² = x (1 + ax)^{-1}   (inversion).   (6.117)

The conformal transformation in conformal geometric algebra uses a versor representation,

g(x_c) = G x_c G^{-1} = σ x'_c,   (6.118)

where x_c ∈ R^{n+1,1}, G is a versor, and σ is a scalar. G can be expressed in CGA as a composite of versors for transversion, translation, and rotation as follows:

G = K_b T_a R_α.   (6.119)
These individual versors will be explained next.
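The decomposition (6.115)–(6.117) can be checked numerically for ordinary vectors. The closed form below is the vector evaluation of x(1 + ax)^{-1}, namely (x + x²a)/(1 + 2a·x + a²x²); this is an illustrative numpy sketch, not CGA code:

```python
import numpy as np

def invert(v):
    """Inversion in the unit sphere: v -> v / v^2."""
    return v / (v @ v)

def conformal_map(x, a):
    """Closed form of Eq. 6.114 evaluated on vectors:
    x (1 + a x)^{-1} = (x + x^2 a) / (1 + 2 a.x + a^2 x^2)."""
    x2, a2 = x @ x, a @ a
    return (x + x2 * a) / (1.0 + 2.0 * (a @ x) + a2 * x2)

x = np.array([0.5, -1.2, 2.0])
a = np.array([0.3, 0.1, -0.4])
# inversion, then translation by a, then inversion (Eqs. 6.115-6.117)
composite = invert(invert(x) + a)
```

The composite of the three elementary maps agrees with the closed form to machine precision.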
6.4.1 Inversion

By the classical definition, an inversion TS with respect to a sphere S (of radius ρ and center c) is such that, for any point q at a distance d from c, TS(q) lies on the same ray from c through q, at a distance ρ²/d from c; see Fig. 6.17a. We
Fig. 6.17 (a) The point TS(q) is the inverse point of q with respect to the circle S, and vice versa. (b) The inversion with respect to a sphere T maps spheres to spheres and circles to circles if they do not pass through the center of T
comment on some of the main properties of this transformation: (a) the inverse of a plane through the center of inversion is the plane itself; (b) the inverse of a plane not passing through the center of inversion is a sphere passing through the center of inversion; (c) the inverse of a sphere through the center of inversion is a plane not passing through the center of inversion; (d) the inverse of a sphere not passing through the center of inversion is a sphere not passing through the center of inversion, see Fig. 6.17b; (e) inversion in a sphere maps lines and circles to lines and circles. In the context of conformal geometry, the general form of a reflection about a vector s is

s(x_c) = −s x_c s^{-1} = x_c − 2 (s · x_c) s^{-1} = σ x'_c,   (6.120)

where sx + xs = 2 (s · x), from the definition of the Clifford product of two vectors. We will now analyze what happens when s represents a sphere. Recall that the equation of a sphere of radius ρ centered at the point c_c is the vector

s = c_c − ½ ρ² e∞.   (6.121)

If s represents the unit sphere centered at the origin, then s and s^{-1} reduce to e0 − ½ e∞. Hence, 2 (s · x_c) = 1 − x_e², and Eq. 6.120 becomes

σ x'_c = x_e + ½ x_e² e∞ + e0 + (x_e² − 1)(e0 − ½ e∞) = x_e² (x_e^{-1} + ½ x_e^{-2} e∞ + e0),   (6.122)

which is, up to the scale σ = x_e², the conformal mapping of x_e^{-1}. To see how a general sphere inverts a point, we return to Eq. 6.121 to get

s · x_c = c_c · x_c − ½ ρ² e∞ · x_c = −½ [(x_e − c_e)² − ρ²].   (6.123)
Inserting the above into Eq. 6.120, a little algebra gives

x'_c = g(x_e) + ½ g²(x_e) e∞ + e0,   (6.124)

where

g(x_e) = ρ² (x_e − c_e) / (x_e − c_e)² + c_e = ρ² (x_e − c_e)^{-1} + c_e   (6.125)

is the inversion in R^n.
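A small numpy sketch of the Euclidean inversion (6.125); it verifies the classical property that a point at distance d from the center c is sent to distance ρ²/d on the same ray, and that the map is an involution:

```python
import numpy as np

def sphere_inversion(x, c, rho):
    """Inversion in the sphere of center c and radius rho (Eq. 6.125)."""
    d = x - c
    return rho**2 * d / (d @ d) + c

c, rho = np.array([1.0, 2.0, 3.0]), 2.0
q = c + np.array([3.0, 0.0, 0.0])        # point at distance d = 3 from c
q_inv = sphere_inversion(q, c, rho)      # lands at distance rho^2 / d = 4/3
```

Applying the inversion twice returns the original point, as expected for an opposite-motion transformation.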
6.4.2 Reflection

The reflection of conformal geometric entities helps us to carry out every other transformation. The reflection of a point x with respect to the plane π equals x minus twice the directed distance between the point and the plane (see Fig. 6.18), that is, x' = x − 2 (π · x) π^{-1}. To simplify this expression, recall the property of the Clifford product of vectors, 2 (b · a) = ab + ba. The reflection can thus be written as

x' = x − (π x + x π) π^{-1},   (6.126)
x' = x − π x π^{-1} − x π π^{-1},   (6.127)
x' = −π x π^{-1}.   (6.128)

For any geometric entity Q, the reflection with respect to the plane π is given by

Q' = π Q π^{-1}.   (6.129)

Fig. 6.18 Reflection of a point x with respect to the plane π
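The Euclidean counterpart of (6.128) for a point and a plane with unit normal n at offset d (an illustrative sketch; the CGA versor form above applies to any entity Q):

```python
import numpy as np

def reflect(x, n, d=0.0):
    """Reflect x in the plane n.x = d (n a unit normal);
    Euclidean counterpart of x' = -pi x pi^{-1} (Eq. 6.128)."""
    return x - 2.0 * (n @ x - d) * n

n = np.array([0.0, 0.0, 1.0])
x = np.array([1.0, 2.0, 5.0])
x_ref = reflect(x, n, d=1.0)    # mirror in the plane z = 1
```

Reflecting twice in the same plane recovers the original point, since a reflection is an involution.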
Fig. 6.19 Reflection about parallel planes
6.4.3 Translation

The translation of conformal entities can be obtained by carrying out two reflections in the parallel planes π1 and π2 (see Fig. 6.19):

Q' = (π2 π1) Q (π1^{-1} π2^{-1}) = T_a Q T̃_a,   (6.130)

T_a = (n + d e∞) n = 1 + ½ a e∞ = e^{½ a e∞},   (6.131)

where a = 2dn. Since a e∞ = a^i e_i (e+ + e−), the Lie algebra generators, or bivectors, for the translator T_a are e_i e+ and e_i e− (see Eq. 6.113).
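The claim that two reflections in parallel planes produce the translation a = 2dn can be checked with a short numpy sketch (the plane offsets d1 and d2 are assumed along a common unit normal n):

```python
import numpy as np

def reflect(x, n, d):
    """Reflect x in the plane n.x = d (n a unit normal)."""
    return x - 2.0 * (n @ x - d) * n

n = np.array([0.0, 1.0, 0.0])    # common unit normal of both planes
d1, d2 = 0.5, 2.0                # pi1: n.x = d1, then pi2: n.x = d2
x = np.array([3.0, -1.0, 4.0])
x2 = reflect(reflect(x, n, d1), n, d2)
# the composite is the translation by a = 2 (d2 - d1) n, cf. a = 2 d n
```

The distance between the planes, d2 − d1, plays the role of d in Eq. 6.131.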
6.4.4 Transversion

A transversion can be generated from two inversions and a translation. A transversor has the form

K_b = e+ T_b e+ = (½ e∞ − e0)(1 + ½ b e∞)(½ e∞ − e0) = 1 + b e0.   (6.132)

The transversion generated by K_b can be expressed in various forms:

g(x_e) = (x_e − x_e² b)(1 − 2 b · x_e + x_e² b²)^{-1} = x_e (1 − b x_e)^{-1} = (x_e^{-1} − b)^{-1}.   (6.133)

The last form can be read off directly as an inversion followed by a translation and another inversion. Note that a transversion uses the origin e0 as its null vector, whereas the translator uses the point at infinity e∞. Since b e0 = b^i e_i ½ (e− − e+), the Lie algebra generators, or bivectors, for the transversion K_b are e_i e+ and e_i e− (see Eq. 6.113). In fact, the bivector basis is the same for the translator and for the transversion.
Fig. 6.20 Reflection about nonparallel planes
6.4.5 Rotation

A rotation is the product of two reflections with respect to two nonparallel planes (see Fig. 6.20):

Q' = (π2 π1) Q (π1^{-1} π2^{-1}) = R Q R̃,   (6.134)

or, computing the geometric product of the normals of the planes,

R = n2 n1 = cos(θ/2) − sin(θ/2) l = e^{-(θ/2) l},   (6.135)

with l = n2 ∧ n1 and θ twice the angle between the planes π2 and π1. There are three Lie algebra generators, or bivectors, for the rotors; they are of the type B_ij = e_i e_j (see Eq. 6.112). The screw motion called a motor, related to an arbitrary axis L, is M = T R T̃:

Q' = M Q M̃ = (T R T̃) Q (T R̃ T̃),   (6.136)

M = T R T̃ = cos(θ/2) − sin(θ/2) L = e^{-(θ/2) L}.   (6.137)

The direct kinematics of serial robot arms can be expressed as a succession of motors and, remarkably, this is valid for points, lines, planes, circles, and spheres:

Q' = ( ∏_{i=1}^{n} M_i ) Q ( ∏_{i=1}^{n} M̃_{n−i+1} ).   (6.138)
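Likewise, two reflections in nonparallel planes through the origin compose to a rotation by twice the angle between their normals, here checked in numpy against an explicit rotation matrix about the planes' intersection line (the z-axis in this sketch):

```python
import numpy as np

def reflect(x, n):
    """Reflection in a plane through the origin with unit normal n."""
    return x - 2.0 * (n @ x) * n

phi = 0.3                                   # angle between the two plane normals
n1 = np.array([1.0, 0.0, 0.0])
n2 = np.array([np.cos(phi), np.sin(phi), 0.0])
x = np.array([0.7, -0.2, 1.5])
x_rot = reflect(reflect(x, n1), n2)         # reflect in pi1, then in pi2

# expected: rotation by 2*phi about the intersection line of the planes
c, s = np.cos(2 * phi), np.sin(2 * phi)
R = np.array([[c, -s, 0.0],
              [s,  c, 0.0],
              [0.0, 0.0, 1.0]])
```

This is the classical-geometry face of the rotor R = n2 n1 of Eq. 6.135.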
6.4.6 Rigid Motion Using Flags

In Sect. 4.9.1, the concept of a flag, or soma, is explained. It helps to embed a frame into a single algebraic entity, as depicted in Fig. 6.21. The flag representation involves
Fig. 6.21 (a) Body frame; (b) flag: point, line, and plane with a common point
a point, a line, and a plane, all entities sharing a common reference point:

F = x + L + Π = x + I_c Q,   (6.139)

where

L = x ∧ x1 ∧ e∞ = x ∧ (x1 − x) ∧ e∞ = x ∧ e1 ∧ e∞ = I_c n2 n3,
Π = x ∧ x1 ∧ x2 ∧ e∞ = x ∧ e1 ∧ e2 ∧ e∞ = e2 ∧ L = I_c n3.

Combining these results, one gets Q = n3 + n2 n3 = (1 + n2) n3, which is a conic; namely, it fulfills

x ∧ F = x ∧ (L + Π) = x ∧ (I_c Q) = 0,   (6.140)

and its dual expression is

x · Q = 0,   (6.141)

where

Q² = (n3 + n2 n3)(n3 + n2 n3) = 0.   (6.142)

The previous equations describe a point lying on an absolute conic; recall, from projective geometry, the equation of homogeneous points lying on the absolute conic. Finally, a rigid body motion can be nicely modeled using the flag F: given a motor M = T R,

F' = M F M̃ = M x M̃ + M L M̃ + M Π M̃.   (6.143)

Here this linearity is due to the congruence theorem.
6.4.7 Dilation

A dilation is the composite of two successive inversions centered at the origin. Using the unit sphere s1 = e0 − ½ e∞ and another sphere of arbitrary radius ρ, s2 = e0 − ½ ρ² e∞, as inversors, we get

s1 s2 = (e0 − ½ e∞)(e0 − ½ ρ² e∞) = ½ [(1 − E) + (1 + E) ρ²].   (6.144)

Normalizing to unity, we have

D_ρ = ½ [(1 + E) ρ + (1 − E) ρ^{-1}] = e^{λ E},   (6.145)

where λ = ln ρ. To prove that this is indeed a dilation, we note that

D_ρ e∞ D_ρ^{-1} = ρ^{-2} e∞,   (6.146)

and similarly,

D_ρ e0 D_ρ^{-1} = ρ² e0.   (6.147)

Therefore,

D_ρ (x_e + ½ x_e² e∞ + e0) D_ρ^{-1} = ρ² [ρ^{-2} x_e + ½ (ρ^{-2} x_e)² e∞ + e0],   (6.148)

which is σ times the conformal mapping g(x_e) = x'_e, with x'_c = x'_e + ½ x'_e² e∞ + e0 and x'_e = ρ^{-2} x_e. The Lie algebra generator, or bivector, for dilations is E = e+ e−, the Minkowski plane; see Eq. 6.111.
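The classical picture behind (6.144)–(6.148) can be sketched in numpy: composing the inversion in the unit sphere with the inversion in a sphere of radius ρ (both centered at the origin) scales every point by ρ²; the reciprocal factor ρ^{-2} appears if the order of the two inversions is exchanged:

```python
import numpy as np

def sphere_inversion(x, rho):
    """Inversion in the sphere of radius rho centered at the origin."""
    return rho**2 * x / (x @ x)

rho = 1.7
x = np.array([0.4, -1.1, 2.3])
# unit inversion followed by the rho-inversion: a dilation by rho^2
x_dil = sphere_inversion(sphere_inversion(x, 1.0), rho)
```

The composite is linear even though each inversion is nonlinear — the point of the versor representation.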
6.4.8 Involution

The bivector E (the Minkowski plane of R^{1,1}) represents an operation that corresponds to the main involution: for an r-blade A_r, Ā_r = (−1)^r A_r. In particular, for vectors, x̄_e = −x_e, which can easily be obtained by applying the versor E:

−E (x_e + ½ x_e² e∞ + e0) E = −x_e + ½ x_e² e∞ + e0.   (6.149)

This expression corresponds to the conformal mapping of −x_e, thus confirming that the versor E represents the main involution of R^n. This means that the main involution is a reflection via the Minkowski plane R^{1,1}.
6.4.9 Conformal Transformation

Finally, using the previous results, we can now write a canonical decomposition of a conformal transformation in terms of individual versors:

G = K_b T_a R_α,   (6.150)

where the versors are as follows: for the transversion, K_b = e+ T_b e+ = 1 + b e0; for the translation, T_a = 1 + ½ a e∞; and for the rotation, R_α. This decomposition reveals the structure of the three-parameter group {G ∈ G_{1,1} | G G̃ = 1} ≅ GL_2(R). Note that in conformal geometric algebra the conformal transformation is built by successive multiplicative versors; in contrast, in R^n, the conformal transformation of Eq. 6.117 is highly nonlinear, and its application requires complicated, slow, nonlinear algorithms.
6.5 Ruled Surfaces

Conics, ellipsoids, helicoids, and hyperboloids of one sheet are entities that cannot be directly described in CGA, but they can be modeled with its multivectors. In particular, a ruled surface is a surface generated by the displacement of a straight line (called the generatrix) along a directing curve or curves (called directrices). The plane is the simplest ruled surface, but here we are interested in nonlinear surfaces generated as ruled surfaces. For example, a circular cone is a surface generated by a straight line through a fixed point and a point on a circle. It is well known that the intersection of a plane with a cone can generate the conics (see Fig. 6.22). In [165], the cycloidal curves are generated by two coupled twists. In this section, we are going to see how these and other curves and surfaces can be obtained using only the multivectors of CGA.
6.5.1 Cone and Conics

A circular cone is described by a fixed point v0 (vertex), a dual circle z0 = a0 ∧ a1 ∧ a2 (directrix), and a rotor R(θ, l), θ ∈ [0, 2π), rotating the straight line L(v0, a0) = v0 ∧ a0 ∧ e∞ (generatrix) around the axis of the cone, l0 = z0 · e∞. Then the cone w is generated as

w = R(θ, l0) L(v0, a0) R̃(θ, l0),   θ ∈ [0, 2π).   (6.151)

A conic curve can be obtained as the meet of a cone and a plane; see Fig. 6.22.
Fig. 6.22 Hyperbola as the meet of a cone and a plane
Fig. 6.23 A laser welding following a 3D curve: the projection of a cycloidal curve over a sphere
6.5.2 Cycloidal Curves

The family of cycloidal curves can be generated by the rotation and translation of one or two circles. For example (see Fig. 6.23), the cycloidal family of curves generated by two circles of radii r0 and r1 is expressed by the following motor:

M = T R1 T̃ R2,   (6.152)

where

T = T((r0 + r1)(sin(θ) e1 + cos(θ) e2)),   (6.153)
R1 = R1((r0/r1) θ),   (6.154)
R2 = R2(θ).   (6.155)

Then each conformal point x is transformed as M x M̃.
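As an assumed Euclidean counterpart of the coupled-twist motor (6.152), the epicycloid traced by a circle of radius r1 rolling on a fixed circle of radius r0 can be sampled directly (a parametric sketch, not the CGA construction itself):

```python
import numpy as np

def epicycloid(theta, r0, r1):
    """Point traced by a circle of radius r1 rolling outside a circle of
    radius r0 - the Euclidean face of the two coupled twists in (6.152)."""
    k = (r0 + r1) / r1
    x = (r0 + r1) * np.cos(theta) - r1 * np.cos(k * theta)
    y = (r0 + r1) * np.sin(theta) - r1 * np.sin(k * theta)
    return np.stack([x, y], axis=-1)

theta = np.linspace(0.0, 2 * np.pi, 200)
curve = epicycloid(theta, r0=3.0, r1=1.0)   # an epicycloid with 3 cusps
```

At θ = 0 the traced point sits on the fixed circle (a cusp), and the whole curve stays inside the radius r0 + 2·r1.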
Fig. 6.24 (a) The helicoid is generated by the rotation and translation of a line segment. In CGA, the motor is the desired multivector. (b) Intersection as the meet of a sphere and a cone
6.5.3 Helicoid

We can obtain the ruled surface called a helicoid by rotating a ray segment in a similar way as for the spiral of Archimedes. If the axis e3 is the directrix of the rays and is orthogonal to them, then the translator that we need to apply along e3 is a multiple of θ, the angle of rotation (see Fig. 6.24a).
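A minimal parametric sketch of this construction (the pitch constant c is an assumed parameter): the ray through the e3 axis is rotated by θ while being translated along e3 proportionally to θ:

```python
import numpy as np

def helicoid(s, theta, c=0.5):
    """Ruled-surface chart of the helicoid: for fixed theta the ruling is a
    straight ray; the translation along e3 is c * theta."""
    return np.array([s * np.cos(theta), s * np.sin(theta), c * theta])

p = helicoid(2.0, np.pi / 2, c=0.5)
```

Sampling s along each fixed-θ ruling reproduces the straight generatrix; sampling θ sweeps it helically.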
6.5.4 Sphere and Cone

Let us see an example of how the use of the algebra of incidence in CGA simplifies the algebraic formulation. The intersection of a cone and a sphere in general position, that is, with the axis of the cone not passing through the center of the sphere, is the three-dimensional curve of all Euclidean points (x, y, z) such that x and y satisfy the quartic equation

[x² (1 + 1/c²) − 2 x0 x + y² (1 + 1/c²) − 2 y0 y + x0² + y0² + z0² − r²]² = 4 z0² (x² + y²) / c²,   (6.156)

and x, y, and z the quadratic equation
Fig. 6.25 (a) Hyperboloid as the rotor of a line. (b) The Plücker conoid as a ruled surface
(x − x0)² + (y − y0)² + (z − z0)² = r²   (6.157)

(see Fig. 6.24b). In CGA, the set of points q of the intersection can be expressed as the meet of the dual sphere s and the cone w (Eq. 6.151), defined in terms of its generatrix L, that is,

q = s ∩ [R(θ, l0) L(v0, a0) R̃(θ, l0)],   θ ∈ [0, 2π).   (6.158)

Thus, in CGA, we only need (6.158) to express the intersection of a sphere and a cone, whereas in Euclidean geometry it is necessary to use the complicated equations (6.156) and (6.157).
6.5.5 Hyperboloid, Ellipsoids, and Conoid

The rotation of a line over a circle can generate a hyperboloid of one sheet (see Fig. 6.25a). The ellipse is a curve of the cycloidal family; we can obtain an ellipsoid with a translator and a dilator. The cylindroid, or Plücker conoid, is a ruled surface (see Fig. 6.25b). This ruled surface is like the helicoid, except that the magnitude of the translator parallel to the e3 axis is a multiple of cos(θ) sin(θ).
6.6 Exercises 6.1 Formulate the equation of a sphere using conformal geometric algebra.
6.2 Derive the representation of the point pair (0-sphere), PP, spanned by the points p1 and p2 at locations e1 and e2 with weights 3 and 2.
6.3 Compute the center and radius of PP.
6.4 Give the dual representation of PP and compute its center and radius.
6.5 Give the IPNS representation of the circle z through the points p1 and p2 and the unit point at e1. Compute its squared radius and center, and the orientation vector of the circle z passing through its center.
6.6 Compute the IPNS representation of the sphere s through z and the origin.
6.7 Compute the OPNS representation of the sphere s and read off its center and squared radius directly from that dual representation.
6.8 In conformal geometric algebra, write the equations of the direct distances between a point and a line, a point and a plane, and a line and a plane.
6.9 Given a pure translation versor T, show that the logarithm is given by

log(T) = ½ (T − T̃).   (6.159)
6.10 Explain the geometric meaning of the element x ∧ v, where x is a point and v a Euclidean vector.
6.11 Prove that the ratio of two flat points, x ∧ e∞ and y ∧ e∞, is a translation versor, or translator. What is its corresponding translation vector?
6.12 Prove that the tangent to a tangent is zero. Note that a tangent is also a round (a sphere passing through the origin).
6.13 Prove that the ratio of two general planes passing through a common point p is a rotation versor, or rotor. What is the bivector angle of the rotation?
6.14 Show that the cosine of the angle θ between two lines L1 and L2 through a common point p can be computed using the formula cos(θ) = (X · Ỹ) / (‖X‖ ‖Y‖) (for blades of the same grade), applied directly to the lines themselves rather than to their directions.
6.15 Give the equation of the circle z passing through a point pair x ∧ y and intersecting the dual plane π perpendicularly.
6.16 Give the equation of the circle z having a tangent vector in the direction n at the point x and plunging into the plane π.
6.17 The dual circle is the intersection of a plane with a sphere, namely s ∧ π. Recall that the dual sphere at the origin is s = e0 − ½ ρ² e∞. Use CLUCAL to compute the meet of two intersecting dual circles, z1 = T_{e1} [(e0 − ½ ρ² e∞) ∧ e3] and z2 = T_{e2} [(e0 − ½ ρ² e∞) ∧ e3], lying on the e1 ∧ e2-plane. Here, T_{e1} and T_{e2} are the translators of the circles along e1 and e2, respectively.
6.18 Construct the contour of a sphere s as seen from a point p; that is, the circle z is the geometric locus of points where the invisible part of the sphere borders the visible part. (Hint: First express the sphere making use of the plunge; then construct the circle as a meet.)
6.19 Prove that the tangent with a directional element Ê at the point x can be written using two equivalent equations:

x ∧ (x · (Ê e∞)),   x · (x ∧ Ê ∧ e∞).

6.20 Prove that the ratio of two lines x ∧ n ∧ e∞ and y ∧ m ∧ e∞ is a versor representing a general rigid body motion. What are the corresponding screw parameters?
6.21 In conformal geometric algebra, write the equations of the direct distances between a sphere and a point, a sphere and a line, and a sphere and a plane.
6.22 In conformal geometric algebra, write the equations of the direct distances between a circle and a point, a circle and a line, a circle and a plane, and a circle and a sphere.
6.23 In conformal geometric algebra, write the equations of the direct distances between a circle and a point, a circle and a line, a circle and a plane, and a circle and a sphere.
6.24 In conformal geometric algebra, given three points lying on a circle, using the conformal split compute the radius and center of the circle.
6.25 In conformal geometric algebra, given three points a1, a2, and a3 in general position, using CLICAL or CLUCAL compute the expanded form of the plane passing through these three points.
6.26 Theorem proving: Prove in conformal geometric algebra the theorem of the Desargues configuration. Recall the 3D projective space P³. Consider that x1, x2, x3 and y1, y2, y3 are the vertices of two triangles in P³, and suppose that
Fig. 6.26 Simpson's theorem
Fig. 6.27 Apollonius's theorem: b² + c² = 2m² + a²/2
(x1 ∧ x2) ∩ (y1 ∧ y2) = z3, (x2 ∧ x3) ∩ (y2 ∧ y3) = z1, and (x3 ∧ x1) ∩ (y3 ∧ y1) = z2. You can claim that z1 ∧ z2 ∧ z3 = 0 if and only if there is a point p such that x1 ∧ y1 ∧ p = 0 = x2 ∧ y2 ∧ p = x3 ∧ y3 ∧ p. (Hint: Express the point as linear combinations of x1, y1; x2, y2; and x3, y3. The other half of the proof follows by duality of classical projective geometry.)
6.27 Theorem proving: Using conformal geometric algebra, prove Simpson's rule using the join of three projected points; see Fig. 6.26. If the projecting point D lies on the circumference, the join of the projected points is zero. Take three arbitrary points lying on a unit circumference and a fourth one nearby. Use CLUCAL for your computations. (Hint: First compute the projected points as the meets of three lines passing through the point D orthogonal to the triangle sides. The triangle is formed by three arbitrary points A, B, C lying on the circumference.)
6.28 Prove Apollonius's theorem using incidence algebra in conformal geometric algebra; see Fig. 6.27.
6.29 Prove Pascal’s theorem using the incidence algebra in conformal geometric algebra. Given six points lying in a conic, compute six intersecting lines using the join operation. The meet of these lines should give three intersecting points, and the join of these three intersecting points should be zero. Use CLUCAL and any six points belonging to a conic of your choice. 6.30 Prove Desargues’ theorem using the algebra of incidence in conformal geometric algebra. Let x 1 , x 2 , x 3 and y 1 , y 2 , y 3 be the vertices of two triangles in P 2 , and suppose that the meets of the lines fulfill the equations .x 1 ^x 2 / \ .y 1 ^y 2 / D z3 , .x 2 ^ x 3 / \ .y 2 ^ y 3 / D z1 , and .x 3 ^ x 1 / \ .y 3 ^ y 1 / D z2 . Then the join of these points z1 ^ z2 ^ z3 D 0 if and only if a point p exists such that x 1 ^y 1 ^p D x 2 ^y 2 ^p D x 3 ^y 3 ^p D 0. In this problem, use CLUCAL and define two triangles of your choice. 6.31 A point p lies on the circumcircle of an arbitrary triangle with vertices x 1 , y 1 , and z1 . From the point p, draw three perpendiculars to the three sides of the triangle to meet the circle at points x 2 , y 2 , and z2 , respectively. Using incidence algebra in conformal geometric algebra, show that the lines x 1 ^x 2 , y 1 ^y 2 , and z1 ^z2 are parallel. 6.32 In exercise 4.20 of Chap. 4, we introduced the barycentric coordinates using a homogeneous model. Now, formulate the expressions for the barycentric coordinates in terms of conformal points. 6.33 Simplex: Show that the weight of x^y ^z=2 is the area of a triangle xyz, and that the weight of x^y ^z^w=3 is the volume of the tetrahedron xyzw. 6.34 The weight of a dual sphere s is the weight of its center and equals to e1 s . Take the dual of this expression to find out when a sphere passing through the points x;y;z; z becomes zero. 
6.35 Prove that the distance measure between a point p_i and the sphere/plane s = s_e + s4 e∞ + s5 e0 can be computed in conformal geometry as p_i · s = p_i · s_e − s4 − ½ s5 p_i², or p_i · s = Σ_{j=1}^{5} w_{i,j} s_j, with w_{i,j} = x_{i,j} for j ∈ {1, 2, 3}, w_{i,4} = −1, and w_{i,5} = −½ p_i².
6.36 Use the results of 6.35 for the sphere to propose the structure of a conformal radial basis function network for data clustering.
6.37 Compute the reflection of the line L through the locations e1 and e2 with respect to the unit sphere at the origin.
6.38 Following exercise 6.37, factorize the result to determine its center and squared radius.
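The distance measure of exercise 6.35 is easy to check numerically. The sketch below encodes a conformal point by its weight vector w = (p, −1, −½p²) and a dual sphere of center c and radius r by (c, ½(c² − r²), 1) — an assumed coordinate convention for illustration — so that the conformal inner product reduces to an ordinary dot product:

```python
import numpy as np

def conformal_point(p):
    """p -> (p, -1, -1/2 p^2): the weight vector w of exercise 6.35."""
    return np.concatenate([p, [-1.0, -0.5 * (p @ p)]])

def dual_sphere(c, r):
    """s = (c, s4, s5) with s4 = 1/2 (c^2 - r^2) and s5 = 1."""
    return np.concatenate([c, [0.5 * (c @ c - r**2), 1.0]])

c, r = np.array([1.0, 2.0, 3.0]), 2.0
s = dual_sphere(c, r)
p_on = c + np.array([r, 0.0, 0.0])        # a point on the sphere
p_out = c + np.array([2 * r, 0.0, 0.0])   # a point outside the sphere
d_on = conformal_point(p_on) @ s          # p_i . s = p_i.s_e - s4 - 1/2 s5 p_i^2
d_out = conformal_point(p_out) @ s
```

Points on the sphere give a zero inner product; for other points, p_i · s = −½(|p_i − c|² − r²).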
6.39 Compute the reflection of the vector e1 + e2 in the direction of 3e3 with respect to the unit sphere at the origin. Notice the weight in your result.
6.40 Compute the reflection of the line L with respect to the origin.
6.41 Compute the reflection of the line L with respect to the point e2.
6.42 In the least-squares sense, one considers the minimum of the sum of the squares of the distances between all the points and the sphere/plane: min Σ_{i=1}^{n} (p_i · s)². In order to obtain the minimum, this can be rewritten in bilinear form as min(sᵀ B s), where s = (s1, s2, s3, s4, s5)ᵀ and the entries of the matrix B are b_{j,k} = Σ_{i=1}^{n} w_{i,j} w_{i,k}. This matrix is symmetric, since b_{j,k} = b_{k,j}. Here, we consider normalized solutions, sᵀ s = 1. For such a constrained optimization problem, one introduces the Lagrangian L = sᵀ B s − λ sᵀ s, with sᵀ s = 1 and Bᵀ = B. The necessary condition for a minimum is ∇L = 2 (B s − λ s) = 0, or B s = λ s. As you can see, the solution of the minimization problem is given by the eigenvector of B that corresponds to the smallest eigenvalue. Extend this idea and write a computer program to find s for points in R⁵ lying on a hypersphere and for points lying on a hyperplane, where B is a 7 × 7 matrix.
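A sketch of the program requested in exercise 6.42, for the ordinary sphere in R³ (the R⁵ hypersphere case is identical in form, with a 7 × 7 matrix B): build B from the weight vectors of exercise 6.35 and take the eigenvector of the smallest eigenvalue:

```python
import numpy as np

rng = np.random.default_rng(0)
c_true, r_true = np.array([1.0, -2.0, 0.5]), 1.5

# sample noise-free points on the sphere
u = rng.normal(size=(100, 3))
pts = c_true + r_true * u / np.linalg.norm(u, axis=1, keepdims=True)

# rows w_i = (p_i, -1, -1/2 p_i^2), as in exercise 6.35
W = np.hstack([pts,
               -np.ones((100, 1)),
               -0.5 * np.sum(pts**2, axis=1, keepdims=True)])
B = W.T @ W                               # b_{j,k} = sum_i w_{i,j} w_{i,k}
eigval, eigvec = np.linalg.eigh(B)        # ascending eigenvalues
s = eigvec[:, 0]                          # eigenvector of the smallest one
s = s / s[4]                              # fix the scale so that s5 = 1
c_est = s[:3]                             # sphere center
r_est = np.sqrt(c_est @ c_est - 2.0 * s[3])   # from s4 = 1/2 (c^2 - r^2)
```

With noise-free data the smallest eigenvalue is numerically zero and the sphere is recovered exactly; with noisy data the same eigenvector gives the least-squares fit.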
Chapter 7
Programming Issues
In this chapter, we discuss programming issues that arise when computing in the geometric algebra framework. We explain the technicalities of programming that must be taken into account in order to generate sound source code. We also comment on some alternatives for improving the efficiency of the code for real-time applications.
7.1 Main Issues for an Efficient Implementation

The last decade saw many attempts to develop software for Clifford algebra and geometric algebra computing, some for computer science and numerical computing [51, 125, 151] and others for symbolic computing [1, 3, 53]. Efficient implementation is still an issue that depends greatly on the algorithmic complexity of the required algorithms and also on the kind of accelerating cards being used. Fast hardware for running geometric algebra algorithms was recently proposed in [67, 138, 152]. Nevertheless, there is still a long way to go to have efficient and fast software to run geometric algorithms for theorem proving and real-time applications. We have to admit, however, that the current developments are good contributions toward achieving this main goal. According to the majority of Clifford or geometric algebra software developers, there are three well-established issues for efficient implementation: the basic computing entities, or multivectors; the role of metrics; and the computational burden caused by the number of basic operations involved. We next discuss these issues from our point of view, including useful suggestions from other authors [21, 52].

– 1. Multivectors: The geometric entities of a geometric algebra G_n are represented as multivectors that are expanded in a multivector basis of length 2^n. It is not prudent to represent all multivectors in a frame that includes many zeros; rather, specialize the representation, fixing the length to accomplish the needed representation and geometric products while ignoring all the unnecessary multivector components. This implies a more limited use of grades or basis blades, which consequently reduces storage and processing time.
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9_7, © Springer-Verlag London Limited 2010
– 2. Metrics: When we formulate representations and algorithms for a certain problem, we need to precisely identify the appropriate geometric algebra with a suitable metric. It is not necessarily a trivial task to determine in which metric space, R^{p,q}, we should work: it depends on how well we are acquainted with the problem in question and on the manner in which we should tackle it. For instance, when we work on classification using neural computing, it is more advantageous to treat the feature space in higher dimensions in order to linearize the classification problem. Thus, we should not develop a program for a general geometric algebra and apply it to a particular problem. Everything indicates that we should have a specialized implementation for every metrically different geometric algebra.
– 3. Basic operations: The number of basic operations on multivectors, compared with those of vector calculus (linear transformations, inner or cross products, norms, etc.), is quite large. Usual multivector expressions demand multiple products and operations, whose number and complexity can be reduced by folding them into one calculation instead of executing them one by one through a series of generic function calls. This tells us that we should program things out explicitly for each operation, restricted to each multivector type. It is appropriate to avoid a combinatorial explosion due to the number of multivector blade components and variables.

The analysis of these three issues suggests not implementing a program for a general geometric algebra. Rather, in a metric-defined geometric algebra, we have to break the problem up and specialize both the representation and the operation computations. Bear in mind that general multivectors are unfortunately too big; the role of the metric indeed constrains the world representation but is not sufficient for efficient computation.
The operations on multivectors are, in general, simple and universal, but unfortunately too slow if they are not specialized, optimized, and formulated in such a way that one can also take advantage of cost-effective hardware to speed up the computations.
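Points 1 and 3 above can be sketched with a dict-based sparse multivector whose basis blades are integer bitmaps. This is an assumed minimal design for a Euclidean metric (every e_i² = +1), not the layout of any particular package; with bitmaps, the geometric product of basis blades reduces to an XOR plus a reordering sign:

```python
from collections import defaultdict

def blade_sign(a, b):
    """Sign produced by reordering the product of basis blades a and b
    (encoded as bitmaps) into canonical order."""
    a >>= 1
    swaps = 0
    while a:
        swaps += bin(a & b).count("1")
        a >>= 1
    return -1.0 if swaps & 1 else 1.0

def gp(A, B):
    """Geometric product of two sparse multivectors {bitmap: coeff};
    for a Euclidean metric (e_i^2 = +1) common bits simply cancel."""
    out = defaultdict(float)
    for a, ca in A.items():
        for b, cb in B.items():
            out[a ^ b] += blade_sign(a, b) * ca * cb
    return {k: v for k, v in out.items() if v != 0.0}  # keep it sparse

e1, e2 = {0b001: 1.0}, {0b010: 1.0}
e12 = gp(e1, e2)     # the bivector e1 e2, stored as {0b011: 1.0}
```

Only nonzero coordinates are ever stored or multiplied, which is exactly the specialization argued for above; a non-Euclidean signature would add a metric factor for the common bits.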
7.1.1 Specific Aspects for the Implementation

The specific considerations for an efficient implementation can be summarized as follows:
– Do not waste computer memory. Try to store each variable as compactly as possible. Keep intermediate results of functions that can be reused later.
– Implement the generic functions over the algebra, such as factorization, meet, etc., in an efficient manner. To do this, one should consider various practical details:
1. Reduce unpredictable memory access.
2. Similarly as with the FFT, find zero coordinates and avoid processing them.
3. Optimize nontrivial geometric algebra expressions, being particularly careful not to implement them as a series of function calls.
4. Avoid conditional branches, to prevent the loss of processor cycles due to incorrectly predicted branches.
5. Whenever possible, unroll loops; that is, let them run only over the dimension of the algebra or over the subspaces determined by the grade of the blades of the k-vectors.
7.2 Implementation Practicalities

As you may have noticed, we have not mentioned a particular programming language. You should consider as potential programming languages C++, Java, and MATLAB, and, for symbolic calculations, Mathematica and Maple. In the previous section, we commented on some existing multivector software packages. In general, you should consider the specification of the algebra to work in, the implementation of the general multivector class, and the generation of optimized functions. We next discuss these aspects in some detail.
7.2.1 Specification of the Geometric Algebra, G_{p,q}

The program first requires knowledge of
– the dimension n of the algebra G_n;
– the metric of the algebra G_{p,q}, which is the type of the quadratic form expressing the space signature;
– the definition of the specialized types, the list of basis blades, and the versors;
– the definition of the constants. This refers to constant multivector values, for example, the basis vectors, e0, and the pseudoscalar I_n. By including the constants in the algebra specification, they can easily be recognized and optimized by the code generator or compiler.
7.2.2 The General Multivector Class

For the program, the implementation of a general multivector class is the most appropriate. This is because many equations in geometric algebra are computed regardless of the multivector type of the input, and expansions and contractions vary the length of the output multivector as well. On the other hand, due to noisy inputs, the multivector type of variables may change unexpectedly at run time. We should take care with the administration of the coordinates of the general multivector because, during the computation, we could naively try to store all of the 2^n multivector components, some of which, in fact, could be zero or perhaps not all are
needed to accomplish a certain computational goal. Therefore, in order to increase computational performance, we should avoid storing coordinates that are zero. Paying attention to the grade of blades, we can select only the k-blades of a multivector that are involved in a certain computation, rejecting the irrelevant ones. Since versors are either even or odd, we can discard at least half of their grade counterparts. Note that for low-dimensional spaces, selecting the involved k-vectors reduces the dependency on slow conditional branching at the coordinate level. However, reduction by grade still misses some good opportunities at higher levels of complexity due to the inherent characteristics of the computation. This applies to entities that have common characteristics for intervening in the computation and depends heavily on the specific use of the algebra — for example, selecting only flat points in conformal geometric algebra. In general, making the program aware of proper constraints of the problem in question can help to group zero coordinates and blades of the same grade, alleviating even further the problem of unnecessary computations. Symbolic simplification and factorization of multivector functions will help to reduce their processing time, particularly if they are called in loops. For each multivector type used in the program, it is desirable to generate specialized multivector classes, bearing in mind that the class provides storage for the nonconstant coordinates and some functionality to convert back and forth between the general and specialized multivector classes. The rest of the functionality will be provided by the functions over the entire algebra.
7.2.3 Optimization of Multivector Functions

The optimization of multivector functions over elements of the algebra is extremely important to speed up processing and to guarantee high performance of the computer program. To start, functions are formulated based on their high-level definitions. However, the precise syntax of a definition depends greatly on the generative programming method used. Using the programming language C++, in a kind of meta-programming approach, the functions are formulated as C++ template functions, which are then instantiated with a specific type of multivector. A reasonable procedure for generating code, possibly optimized, from a high-level definition should take into consideration the following common-sense advice:
– The types of the function arguments are substituted with efficient specializations.
– The expressions in the function are formulated explicitly on a basis.
– The expressions are simplified symbolically, so that most operations and products are executed at the basis level: unnecessary computations are removed, identical terms are fused, and lengths are reduced by looking at the coordinate level, at the grade of k-vectors, and at the group level of complex geometric entities.
– Unnecessary branching, conditionals, and loops are avoided. For multivector classification and factorization, it is advisable to unroll loops in order to ensure that each variable has a fixed multivector type. Here, via blade factorization after the loops are unrolled, all variables in the algorithm get a fixed type, and consequently most conditional branches can be removed.
– The algorithms for the meet and join of blades cannot be optimized by using specialized multivectors. This is because, in these algorithms, the multivector type of several variables is normally not fixed, and there are several other unavoidable conditional branches.
– The return type of the function is specified.
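What "executing products at the basis level" means can be made concrete. The sketch below is our own illustration (real generators such as Gaigen emit fully unrolled, loop-free code instead): it multiplies two sparse multivectors of a Euclidean algebra G(n,0) by combining basis blades with a bitmask XOR and a reordering sign, fusing identical terms and dropping zero coordinates as the advice above recommends:

```python
def blade_gmul(a, b):
    """Geometric product of two basis blades of G(n,0), encoded as
    bitmasks. Returns (sign, blade): e_i e_i contracts to +1 (the XOR),
    and the sign counts the transpositions needed to reorder factors."""
    sign, t = 1, a >> 1
    while t:
        if bin(t & b).count('1') % 2:
            sign = -sign
        t >>= 1
    return sign, a ^ b

def gmul(x, y):
    """Geometric product of sparse multivectors {blade: coeff}.
    Identical terms are fused and (numerically) zero coordinates dropped."""
    out = {}
    for a, ca in x.items():
        for b, cb in y.items():
            s, blade = blade_gmul(a, b)
            out[blade] = out.get(blade, 0.0) + s * ca * cb
    return {b: c for b, c in out.items() if abs(c) > 1e-14}
```

For example, in G(2,0) the rotor R = cos(θ/2) − sin(θ/2) e₁e₂ rotates e₁ into cos θ e₁ + sin θ e₂ via the sandwich product R e₁ R̃, and the sparse dictionaries keep only the surviving coordinates.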
7.2.4 Factorization

Many researchers have explored ways to factorize representations [21, 51, 62]. Common multiplicative representations in geometric algebra include blades as wedge products of vectors and versors as geometric products of vectors (reflections, rotations, reversions, translations, etc.). A k-blade corresponds to the wedge product of k vectors, and similarly a versor is the geometric product of a certain number of vectors. After factorization of a k-blade or a versor, the resulting vectors can in both cases be stored as a list of vectors. Note that the storage requirements of blades and versors become O(n^2) instead of O(2^n) when we factorize them, compared with the standard basis-of-blades method. Recall that general multivectors are not homogeneous; expanding k-blades or versors as a sum of multiple blades of different grades requires more storage. As an illustration, we include the following procedure proposed by [21, 51, 62] for factorizing a blade:
– Input: a nonzero blade X of grade r.
– Compute its norm α = |X|.
– For X, find the basis blade X̂ with the largest coordinate, and determine the r basis vectors e_i that span X̂.
– Normalize the input blade: U = X/α.
– For each of the first r − 1 basis vectors e_i of X̂:
  – Project e_i onto U, that is, g_i = (e_i ⌋ U) U^{-1}.
  – Normalize g_i and store it in the list of factors.
  – Update U ← g_i^{-1} U.
– Obtain the last factor g_r = U and normalize it.
– Output: the list of factors g_i and the scale α.
Further research has to be done on more efficient factorization methods, especially for high-dimensional (n > 10) algebras. New insights in this regard can be found in Fontijne's Ph.D. thesis [62].
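To make the projection idea tangible, here is a small numerical sketch of our own (in plain linear algebra, specialized to 2-blades of G(3,0) rather than a general multivector package): it factors a 2-blade, given by its coordinates on (e₁e₂, e₁e₃, e₂e₃), into a scale α and two orthonormal vector factors:

```python
import numpy as np

def wedge3(u, v):
    """Coordinates of u ^ v on the basis (e12, e13, e23)."""
    return np.array([u[0]*v[1] - u[1]*v[0],
                     u[0]*v[2] - u[2]*v[0],
                     u[1]*v[2] - u[2]*v[1]])

def factor_2blade(B):
    """Factor a nonzero 2-blade B of G(3,0) as alpha * g1 ^ g2 with g1, g2
    orthonormal, mirroring the procedure above: normalize, project a basis
    vector onto the blade, and take what remains as the last factor."""
    alpha = np.linalg.norm(B)               # the scale alpha = |B|
    n = np.array([B[2], -B[1], B[0]])       # normal of the plane of B
    n = n / np.linalg.norm(n)
    seed = np.eye(3)[np.argmin(np.abs(n))]  # basis vector most inside the plane
    g1 = seed - np.dot(seed, n) * n         # project it onto the plane of B
    g1 = g1 / np.linalg.norm(g1)
    g2 = np.cross(n, g1)                    # the remaining orthonormal factor
    if np.dot(wedge3(g1, g2), B) < 0:       # fix orientation so alpha > 0
        g2 = -g2
    return alpha, g1, g2
```

The list [g1, g2] plus the scale α stores the blade with O(n) numbers per factor, which is the storage argument made above.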
7.2.5 Speeding Up Geometric Algebra Expressions

To reduce the complexity of multivector expressions, we can resort to gradual symbolic simplification, so that the result can then be run on cost-effective hardware (see Fig. 7.1). The main idea is that, after formulating an algorithm for a certain task in geometric algebra terms, we reduce its complexity using symbolic simplification (Maple). This produces a generic intermediate representation (IR) that can be used to generate different output formats, such as C code, FPGA descriptions (in the Verilog language), or CLUCalc code for visualizing the results. For this purpose, an architecture called Gaalop was developed by Hildenbrand et al. [96]. Let us consider a problem of inverse kinematics. Figure 7.2 shows the rotation planes of a robot-arm elbow. We consider the circle that is the intersection of two reference spheres, z = s_1 ∧ s_2. We then compute the pair of points (PP) as the meet of the swivel plane and this circle, PP = z ∧ π_swivel, so that we can choose a realistic elbow position; this point lies on the meet of the two spheres. The position is given by the 3D coordinates p_ex, p_ey, and p_ez. After optimization of the inverse-kinematics equation, we obtain an efficient representation for computing this elbow position point; Fig. 7.3 shows the data flow and pipeline structure for p_ex. Using this implementation, the authors reported, for the whole inverse kinematics algorithm, a remarkable acceleration: more than 130 times faster than the unoptimized algorithm.
Fig. 7.1 Computing a pair of points as the meet of the swivel plane and the circle using Gaalop
Fig. 7.2 Computing a pair of points as the meet of the swivel plane and the circle
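The elbow computation just described can be reproduced numerically without conformal machinery. The following sketch is our own (with made-up geometry and plain vector algebra, not Gaalop output): it intersects the two reference spheres to obtain the circle z and then meets that circle with a swivel plane to obtain the pair of points:

```python
import numpy as np

def sphere_sphere_circle(c1, r1, c2, r2):
    """Intersection circle z = s1 ^ s2 of two intersecting spheres,
    returned as (center, radius, plane normal)."""
    d = c2 - c1
    dist = np.linalg.norm(d)
    a = d / dist                                   # normal of the circle's plane
    t = (dist**2 + r1**2 - r2**2) / (2 * dist**2)
    center = c1 + t * d
    rho = np.sqrt(r1**2 - (t * dist)**2)
    return center, rho, a

def circle_plane_points(center, rho, a, n, p0):
    """Pair of points: meet of the circle with the plane (normal n, through p0)."""
    u = np.cross(a, [1.0, 0.0, 0.0])               # frame (u, v) of the circle plane
    if np.linalg.norm(u) < 1e-9:
        u = np.cross(a, [0.0, 1.0, 0.0])
    u = u / np.linalg.norm(u)
    v = np.cross(a, u)
    A, B = rho * np.dot(n, u), rho * np.dot(n, v)  # A cos + B sin = C on the plane
    C = np.dot(n, p0) - np.dot(n, center)
    phi = np.arctan2(B, A)
    delta = np.arccos(np.clip(C / np.hypot(A, B), -1.0, 1.0))
    return [center + rho * (np.cos(phi + s * delta) * u + np.sin(phi + s * delta) * v)
            for s in (+1, -1)]
```

Choosing between the two returned points corresponds to choosing a realistic elbow position in the inverse-kinematics problem above.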
7.2.6 Multivector Software Packets

The existing software packets for multivector programming are the following. The Fortran-based CLICAL, developed by the group of Pertti Lounesto, is very useful for fast computation and theorem proving; it is available at http://users.tkk.fi/ppuska/mirror/Lounesto/CLICAL.htm. The Matlab-based geometric algebra tutorial GABLE supports n ≤ 3 and is available at http://staff.science.uva.nl/~leo/GABLE/index.html. The Maple-based CLIFFORD supports n ≤ 9; it is very useful for symbolic programming and theorem proving and is available at http://math.tntech.edu/rafal. The C++-based CLUCalc is very useful for practicing and learning geometric algebra computations in 2D and 3D, particularly for visualization, computer vision, and crystallography; it is available at http://www.perwass.de/cbup/clu.html. GAIGEN2 generates fast C++ or Java sources for low-dimensional geometric algebras; it is very useful for practicing and learning geometric algebra computing and for trying a variety of problems in computer science and graphics, and it is available at http://www.science.uva.nl/ga/gaigen/. The C++ MV 1.3.0 sources support n ≤ 63; the author, Ian Bell, has developed up to version 1.6 with significant functionality extensions and bug fixes. This is a very powerful multivector software package for applications in computer science and physics; it is available at http://www.iancgbell.clara.net/maths/index.htm. The C++ GEOMA v1.2, developed by Patrick Stein, contains C++ libraries for Clifford algebras with an orthonormal basis; it is available at http://nklein.com/software/geoma. The reader can also download our C++ programs, which are being routinely used for applications in robotics, image processing, wavelet transforms, computer vision, neural computing, and medical robotics. These are available at http://www.gdl.cinvestav.mx/edb/GAprogramming.

Readers who want to develop their own programs for Clifford or geometric algebra applications should consider the advice given in this section, learn from the above-cited multivector software packets, and adopt and integrate their new
Fig. 7.3 Data flow and pipeline structure for fast computation of the elbow position point
developments. CLUCalc and GAIGEN are highly recommended for learning geometric algebra. To write a C++ geometric algebra program, one should start by looking at GEOMA, MV 1.3.0, or the code generator of GAIGEN, or visit our homepage. For symbolic computing and theorem proving, CLICAL and CLIFFORD are extremely useful.
Part III
Geometric Computing for Image Processing, Computer Vision, and Neurocomputing
Chapter 8
Clifford–Fourier and Wavelet Transforms
8.1 Introduction

This chapter presents the theory and use of Clifford–Fourier transforms and Clifford wavelet transforms. We will show that, using the mathematical system of geometric algebra, it is possible to develop different kinds of Clifford–Fourier and wavelet transforms that are very useful for image filtering, pattern recognition, feature detection, image segmentation, texture analysis, and image analysis in the frequency and wavelet domains. These techniques are fundamental for automated visual inspection, robot guidance, medical image processing, and the analysis of image sequences, as well as for satellite and aerial photogrammetry. First, we review the traditional one- and two-dimensional Fourier transforms. Then the complex and quaternionic Fourier transforms are explained and, together with quaternionic Gabor filters, their role in phase analysis is clarified. The quaternionic phase concept helps us to disentangle possible symmetries of 2D signals. In Chap. 13, dedicated to the applications, various illustrations of the use of the quaternionic phase concept show the power of analysis in the quaternionic frequency domain. As an extension of these transforms, the space-and-time Fourier transform and the n-dimensional Clifford–Fourier transform are developed straightforwardly within the geometric algebra framework. Additionally, we present the extension of the real and complex wavelet transforms to the quaternion wavelet transform and discuss its applicability. Finally, as a natural consequence, we derive the n-dimensional Clifford wavelet transform as well.
8.2 Image Analysis in the Frequency Domain

A chapter devoted to image analysis would not be complete if it lacked an analysis in the frequency domain, and so here we present the quaternionic Fourier transform (QFT) and quaternionic Gabor filters. During the 1990s, a number of different attempts were made to use the algebra of quaternions for computations in the frequency domain. Chernov [37] used quaternions to speed up the evaluation of the 2D discrete, complex-valued Fourier transform. Ell [56] introduced the
QFT and applied it to the analysis of 2D linear, time-invariant, partial-differential systems. He used the phase component of the polar representation of the QFT (Eq. 8.27) as an indicator of the stability of the system. Later on, Bülow focused on the development of a quaternionic phase concept and clarified many theoretical and practical aspects of the QFT [29]. Sangwine [168] utilized the QFT for color image processing, assigning the individual RGB channels to the quaternion imaginary components. The images are thus transformed holistically, as opposed to the more simplified approach in which each color channel is transformed separately using the 2D Fourier transform. Using the Clifford algebra framework, we also show that the highest level of the hierarchy of harmonic transformations is occupied by the n-dimensional Clifford–Fourier transform. First, we briefly review the 1D and 2D Fourier transforms.
8.2.1 The One-Dimensional Fourier Transform

For a continuous and integrable function f \in L^2(\mathbb{R}), the Fourier transform of f is defined by the function \mathcal{F}\{f\}: \mathbb{R} \to \mathbb{C}, given by

  F(u) = \mathcal{F}[f(x)] \triangleq \int_{-\infty}^{\infty} f(x)\, e^{-i2\pi ux}\, dx,    (8.1)

where i^2 = -1 is the unit imaginary. The function F(u) can be expressed in terms of its complex parts or in polar form as follows:

  F(u) = F_r(u) + i F_i(u) = S(u)\, e^{i\varphi(u)},    (8.2)

where S(u) is the Fourier spectrum of f(x), S^2(u) is the power spectrum, and \varphi(u) is its phase angle. If F(u) \in L^2(\mathbb{R}) and f \in L^2(\mathbb{R}), the inverse Fourier transform is given by

  f(x) = \mathcal{F}^{-1}[F(u)] \triangleq \int_{-\infty}^{\infty} F(u)\, e^{i2\pi ux}\, du.    (8.3)

Table 8.1 presents a summary of some basic properties of the Fourier transform.

Table 8.1 Properties of the one-dimensional Fourier transform
  Property           Function                        Fourier transform
  Linearity          \alpha f(x) + \beta g(x)        \alpha F(u) + \beta G(u)
  Delay              f(x - t)                        e^{-i2\pi ut} F(u)
  Shifting           e^{i2\pi u_0 x} f(x)            F(u - u_0)
  Scaling            f(\alpha x)                     \frac{1}{|\alpha|} F(u/\alpha)
  Convolution        (f \star g)(x)                  F(u) G(u)
  Derivative         f^{(n)}(x)                      (i2\pi u)^n F(u)
  Parseval theorem   \int_{-\infty}^{\infty} |f(x)|^2 dx = \int_{-\infty}^{\infty} |F(u)|^2 du
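The table's entries have exact discrete analogues that can be checked with an FFT. In the sketch below (our illustration, using numpy's unnormalized DFT), a circular shift by k samples multiplies the DFT by the phase ramp e^{-i2\pi uk/N}, and the discrete Parseval relation holds with the usual 1/N factor:

```python
import numpy as np

rng = np.random.default_rng(0)
N, k = 64, 5
f = rng.standard_normal(N)
F = np.fft.fft(f)

# Delay property: DFT of f delayed (circularly) by k samples
F_shifted = np.fft.fft(np.roll(f, k))
u = np.arange(N)
phase_ramp = np.exp(-2j * np.pi * u * k / N)
```

Here F_shifted equals phase_ramp * F elementwise, the discrete version of the delay row of Table 8.1.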
8.2.2 The Two-Dimensional Fourier Transform

The one-dimensional Fourier transform (Eq. 8.1) is easily extended to a function f(x, y) of two variables: for a continuous and integrable function f \in L^2(\mathbb{R}^2), the Fourier transform of f is defined by the function \mathcal{F}\{f\}: \mathbb{R}^2 \to \mathbb{C}, given by

  F(u, v) = \mathcal{F}[f(x, y)] \triangleq \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} f(x, y)\, e^{-i2\pi(xu + yv)}\, dx\, dy.    (8.4)

The two-dimensional inverse Fourier transform is given by

  f(x, y) = \mathcal{F}^{-1}[F(u, v)] \triangleq \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} F(u, v)\, e^{i2\pi(ux + vy)}\, du\, dv.    (8.5)

Similar to the one-dimensional case, the Fourier spectrum, phase, and power spectrum are given by the following relations, respectively:

  S(u, v) = |F(u, v)| = \left[ F_r^2(u, v) + F_i^2(u, v) \right]^{1/2},    (8.6)
  \varphi(u, v) = \tan^{-1}\left[ F_i(u, v) / F_r(u, v) \right],    (8.7)
  P(u, v) = S^2(u, v) = F_r^2(u, v) + F_i^2(u, v).    (8.8)
Table 8.2 presents a summary of some basic properties of the two-dimensional Fourier transform.

Table 8.2 Properties of the two-dimensional Fourier transform
  Property             Function                                 Fourier transform
  Rotation             f(\pm x, \pm y)                          F(\pm u, \pm v)
  Linearity            \alpha f(x) + \beta g(x)                 \alpha F(u, v) + \beta G(u, v)
  Conjugation          f^*(x, y)                                F^*(-u, -v)
  Separability         f_1(x) f_2(y)                            F_1(u) F_2(v)
  Scaling              f(\alpha x, \beta y)                     \frac{1}{|\alpha\beta|} F(u/\alpha, v/\beta)
  Shifting             f(x \pm x_0, y \pm y_0)                  e^{\pm i2\pi(x_0 u + y_0 v)} F(u, v)
  Modulation           e^{\pm i2\pi(u_0 x + v_0 y)} f(x, y)     F(u \mp u_0, v \mp v_0)
  Convolution          h(x, y) = f(x, y) \star g(x, y)          H(u, v) = F(u, v) G(u, v)
  Multiplication       h(x, y) = f(x, y) g(x, y)                H(u, v) = F(u, v) \star G(u, v)
  Spatial correlation  h(x, y) = f(x, y) \odot g(x, y)          H(u, v) = F(u, v) G^*(u, v)
  Inner product        I = \int\int f(x, y) g^*(x, y)\, dx\, dy I = \int\int F(u, v) G^*(u, v)\, du\, dv

8.2.3 Quaternionic Fourier Transform

Like the 2D Fourier transform and the Hartley transform, the QFT is a linear and invertible transformation; the QFT is, however, limited to 2D signals. Clifford algebra Fourier
transforms (CFTs) can deal with transformations of higher-dimensional signals; the complex Fourier transform and the QFT can be seen as special cases of CFTs. The 2D Fourier transform (FT) is given by

  F_c(u) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} T(i2\pi ux)\, f(x)\, T(i2\pi vy)\, d^2x
         = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i2\pi ux}\, f(x)\, e^{-i2\pi vy}\, d^2x,    (8.9)
where x = (x, y), u = (u, v), and f(x) is a 2D real-valued function. T(i2\pi ux) and T(i2\pi vy) are Fourier kernels applied to the two axes of the 2D signal. If we assign, instead, Fourier kernels that depend upon the quaternion bases i = e_1 e_2 and j = e_2 e_3 (ij = k = e_1 e_3), we get a straightforward expression for the QFT:

  F_q(u) = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} T(i2\pi ux)\, f(x)\, T(j2\pi vy)\, d^2x
         = \int_{-\infty}^{\infty}\int_{-\infty}^{\infty} e^{-i2\pi ux}\, f(x)\, e^{-j2\pi vy}\, d^2x,    (8.10)
1
where x D .x; y/, u D .u; v/, and f .x/ is a real-, complex-, or quaternion-valued 2D function. The inverse quaternionic Fourier transform (IQFT) is given by f .x/ D D
Z
1
Z
1
Z1 1
Z1 1
1
1
TQ .i 2 ux/Fq .u/TQ .j 2 vy/d2 u e .i 2ux/ Fq .u/e .j 2vy/ d2 u;
(8.11)
where T(i2\pi ux)\tilde{T}(i2\pi ux) = 1 and T(j2\pi vy)\tilde{T}(j2\pi vy) = 1. Let us now give the concept of a Hermitian function in the quaternionic domain. As an extension of a Hermitian function f: \mathbb{R} \to \mathbb{C} with f(-x) = f^*(x) for every x \in \mathbb{R}, we regard f: \mathbb{R}^2 \to \mathbb{H} as a quaternionic Hermitian function if it fulfills the following nontrivial involution rules [37]:

  f(-x, y) = -j f(x, y) j = T_j(f(x, y)),
  f(x, -y) = -i f(x, y) i = T_i(f(x, y)),
  f(-x, -y) = -i f(-x, y) i = -i(-j f(x, y) j) i = (ij) f(x, y) (ji) = -k f(x, y) k = T_k(f(x, y)).    (8.12)

The concept of the quaternionic Hermitian function is very useful for the computation of the inverse QFT using the quaternionic analytic signal.
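The three involutions T_i, T_j, T_k are easy to check with plain quaternion arithmetic. The helper below is our own sketch (quaternions as 4-tuples (a, b, c, d) ~ a + bi + cj + dk); it shows that each involution preserves exactly two of the four components, which is what makes the symmetry rules above nontrivial:

```python
def qmul(p, q):
    """Hamilton product of quaternions given as tuples (a, b, c, d)."""
    a1, b1, c1, d1 = p
    a2, b2, c2, d2 = q
    return (a1*a2 - b1*b2 - c1*c2 - d1*d2,
            a1*b2 + b1*a2 + c1*d2 - d1*c2,
            a1*c2 - b1*d2 + c1*a2 + d1*b2,
            a1*d2 + b1*c2 - c1*b2 + d1*a2)

I, J, K = (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)

def Ti(q):  # T_i(q) = -i q i : keeps the 1- and i-components
    return qmul(qmul((0, -1, 0, 0), q), I)

def Tj(q):  # T_j(q) = -j q j : keeps the 1- and j-components
    return qmul(qmul((0, 0, -1, 0), q), J)

def Tk(q):  # T_k(q) = -k q k : keeps the 1- and k-components
    return qmul(qmul((0, 0, 0, -1), q), K)
```

Each T is an involution (applying it twice returns the original quaternion), as the name suggests.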
8.2.4 2D Analytic Signals

This section discusses different points of view found in the literature concerning the fundamental concept of the analytic signal of a 2D signal. The reader can find a detailed analysis of this issue in [29, 72]. The particular formulation of the 2D analytic signal influences the final representation of the signal in the four quadrants of the frequency domain. Thus, in order to recover the entire real 2D signal using the inverse Fourier transform, one must take this final representation into account. The same reasoning applies when one computes the magnitude and phase of the 2D signal. The analytic signal of a 2D signal is given by

  f_A(x) = f(x) \star \left( \delta^2(x) + \frac{i}{\pi^2 xy} \right) = f(x) + i f(x) \star \frac{1}{\pi^2 xy} = f(x) + i f_{Hi}(x),    (8.13)

where f_{Hi}(x) is called the total Hilbert transform [182] and the symbol \star stands for 2D convolution. In the frequency domain, the signal is split among the four quadrants according to the following equation (see Fig. 8.1a):

  F_A(u) = F(u) \left( 1 - i\,\mathrm{sign}(u)\,\mathrm{sign}(v) \right).    (8.14)

The partial analytic signal locates the signal energy in the frequency domain on one side of a reference line indicated by the unit normal vector of the line, m = (\cos\theta, \sin\theta), which is perpendicular to the line direction n. The partial analytic signal is given in the spatial domain by

  f_{Ap}(x) = f(x) \star \left( \delta(x \cdot m) + \frac{i}{\pi\, x \cdot m} \right) \delta(x \cdot n),    (8.15)

and in the frequency domain by (see Fig. 8.1b)

  F_{Ap}(u) = F(u) \left( 1 + \mathrm{sign}(u \cdot m) \right).    (8.16)

In another approach, Hahn [76, 77] introduced the following notion of the analytic signal:

  f_{Ah1}(x) = f(x) \star \left( \delta(x) + \frac{i}{\pi x} \right)\left( \delta(y) + \frac{i}{\pi y} \right)
            = f(x) - f(x) \star \frac{1}{\pi^2 xy} + i f(x) \star \frac{\delta(y)}{\pi x} + i f(x) \star \frac{\delta(x)}{\pi y}
            = f(x) - f_{Hi}(x) + i \left( f_{Hia}(x) + f_{Hib}(x) \right),    (8.17)
Fig. 8.1 2D analytic signals in the frequency domain: (a) standard 2D analytic signal F_A(u); (b) partial analytic signal F_{Ap}(u); (c) Hahn's 2D analytic signal F_{Ah1}(u); and (d) quaternionic 2D analytic signal F_{Aq}(u)
where f_{Hi}(x) is the total Hilbert transform and f_{Hia}(x), f_{Hib}(x) are partial Hilbert transforms that reference only the x- and y-axes, respectively. In the frequency domain, this analytic signal is localized only in the first quadrant and is amplified by a factor of four, according to

  F_{Ah1}(u) = \left( 1 + \mathrm{sign}(u) \right)\left( 1 + \mathrm{sign}(v) \right) F(u);    (8.18)

see Fig. 8.1c. In this approach, owing to the Hermitian symmetry, one-half of the plane of the frequency spectrum is redundant, so a second analytic signal, with its spectrum located in the second quadrant, is required:

  f_{Ah2}(x) = f(x) + f_{Hi}(x) + i \left( f_{Hib}(x) - f_{Hia}(x) \right),
  F_{Ah2}(u) = \left( 1 - \mathrm{sign}(u) \right)\left( 1 + \mathrm{sign}(v) \right) F(u).    (8.19)
The entire 2D real signal can only be recovered by taking into account both analytic signals, F_{Ah1} and F_{Ah2} (for more details, see [77]). As Fig. 8.1d shows, the QFT of a 2D real signal is quaternionic Hermitian, and thus the total information of the signal is not lost. This suggests that the quaternionic analytic signal may be defined by modifying Hahn's equation (8.17), utilizing instead the quaternion bases:

  f_{Aq}(x) = f(x) \star \left( \delta(x) + \frac{i}{\pi x} \right)\left( \delta(y) + \frac{j}{\pi y} \right)
           = f(x) + i f(x) \star \frac{\delta(y)}{\pi x} + j f(x) \star \frac{\delta(x)}{\pi y} + ij\, f(x) \star \frac{1}{\pi^2 xy}
           = f(x) + i f_{Hia}(x) + j f_{Hib}(x) + k f_{Hi}(x),    (8.20)

where f_{Hia}(x) and f_{Hib}(x) are the partial Hilbert transforms and f_{Hi}(x) is the total Hilbert transform. In the frequency domain, the quaternionic analytic signal is given by (see Fig. 8.1d)

  F_{Aq}(u) = \left( 1 + \mathrm{sign}(u) \right)\left( 1 + \mathrm{sign}(v) \right) F_q(u).    (8.21)
We will now show that we can indeed obtain f(x, y) by utilizing the first quadrant of the frequency spectrum F_q(u, v) four times. To do so, we employ the simple property of quaternions Re(q) = Re(-iqi) = Re(-jqj) = Re(-kqk), such that

  f(x, y) = 4\,\mathrm{Re} \int_0^{\infty}\!\!\int_0^{\infty} e^{i2\pi ux}\, F_q(u)\, e^{j2\pi vy}\, d^2u
  = \mathrm{Re} \int_0^{\infty}\!\!\int_0^{\infty} e^{i2\pi ux}\, F_q(u)\, e^{j2\pi vy}\, d^2u
    + \mathrm{Re} \int_0^{\infty}\!\!\int_0^{\infty} e^{i2\pi ux} \left( -i F_q(u) i \right) e^{j2\pi vy}\, d^2u
    + \mathrm{Re} \int_0^{\infty}\!\!\int_0^{\infty} e^{i2\pi ux} \left( -j F_q(u) j \right) e^{j2\pi vy}\, d^2u
    + \mathrm{Re} \int_0^{\infty}\!\!\int_0^{\infty} e^{i2\pi ux} \left( -k F_q(u) k \right) e^{j2\pi vy}\, d^2u.

Moving the involution factors into the kernels and using the quaternionic Hermitian symmetry of F_q (Eq. 8.12), the second, third, and fourth terms become the corresponding integrals over the remaining three quadrants of the frequency plane, so that

  f(x, y) = \mathrm{Re} \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{i2\pi ux}\, F_q(u)\, e^{j2\pi vy}\, du\, dv.    (8.22)
Note that Hahn's approach to the analytic signal fails to recover a 2D complex Hermitian signal from the first quadrant of the frequency spectrum alone; that is why Hahn introduced two complex signals for each real 2D signal (Eqs. 8.18 and 8.19). By contrast, using the quaternionic analytic signal, the entire real 2D signal can be recovered from the first quadrant of the frequency spectrum. This is due to the Hermitian symmetry properties of the quaternionic analytic signal. Figure 8.2 shows the complex and quaternionic Fourier transforms of a 2D signal.
8.2.5 Properties of the QFT

Next, we give some relevant properties of the QFT. We start with a treatment of the symmetries of 2D signals under the Fourier and Hartley transforms.

Symmetry Properties. A 2D signal can be split into even and odd parts with respect to the two axes:

  f(x) = f_{ee}(x) + f_{oo}(x) + f_{eo}(x) + f_{oe}(x),    (8.23)

where the subindices e (for even) and o (for odd) refer to the x-axis and y-axis, respectively. This split depends on the selected origin and image orientation. Since the 2D Fourier transform has only real and imaginary parts, it cannot disentangle the four components of Eq. 8.23, as the following computation shows:

  F(u, v) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} e^{-i2\pi ux} f(x, y)\, dx \right] e^{-i2\pi vy}\, dy
  = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} \left( \cos(2\pi ux) - i\sin(2\pi ux) \right) f(x, y)\, dx \right] e^{-i2\pi vy}\, dy
  = \int_{-\infty}^{\infty} (H_e - iH_o) \left( \cos(2\pi vy) - i\sin(2\pi vy) \right) dy
  = (H_{ee} - H_{oo}) - i(H_{oe} + H_{eo}) \in \mathbb{C}.    (8.24)
Fig. 8.2 Analysis in the frequency domain: (upper row) (a) 2D signal; (middle row) (b) real and imaginary parts of the FT, (c) magnitude and phase of the FT; (lower row) (d) real and imaginary parts of the QFT, (e) magnitude and phases of the QFT
The elements of Eq. 8.23 are intermixed within the real and imaginary parts of this equation. In contrast, by computing the QFT following the quaternion multiplication ij = k, we obtain

  F_q(u, v) = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} e^{-i2\pi ux} f(x, y)\, dx \right] e^{-j2\pi vy}\, dy
  = \int_{-\infty}^{\infty} \left[ \int_{-\infty}^{\infty} \left( \cos(2\pi ux) - i\sin(2\pi ux) \right) f(x, y)\, dx \right] e^{-j2\pi vy}\, dy
  = \int_{-\infty}^{\infty} (H_e - iH_o)\, e^{-j2\pi vy}\, dy
  = \int_{-\infty}^{\infty} (H_e - iH_o) \left( \cos(2\pi vy) - j\sin(2\pi vy) \right) dy
  = \int_{-\infty}^{\infty} \left[ H_e \cos(2\pi vy) - iH_o \cos(2\pi vy) - jH_e \sin(2\pi vy) + (ij)H_o \sin(2\pi vy) \right] dy
  = H_{ee} - iH_{oe} - jH_{eo} + kH_{oo} \in \mathbb{H}.    (8.25)
Note that here the QFT separates the four components of Eq. 8.23 by projecting them onto the orthonormal quaternionic basis. By contrast, the Hartley transform (HT) cannot split these parts, as the following computation shows:

  H(u) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x) \left\{ \cos(2\pi u \cdot x) + \sin(2\pi u \cdot x) \right\} d^2x
  = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f(x) \left\{ \cos(2\pi ux)\cos(2\pi vy) - \sin(2\pi ux)\sin(2\pi vy) + \cos(2\pi ux)\sin(2\pi vy) + \sin(2\pi ux)\cos(2\pi vy) \right\} d^2x
  = H_{ee}(u) - H_{oo}(u) + H_{eo}(u) + H_{oe}(u) \in \mathbb{R}.    (8.26)
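A small numerical experiment (our illustration, on a discrete grid) confirms this separation: for a signal that is odd in x and even in y, only the projection onto the i-component of the quaternionic kernel survives, exactly as Eq. 8.25 predicts:

```python
import numpy as np

M = N = 16
m, n = np.arange(M), np.arange(N)
f_oe = np.outer(np.sin(2 * np.pi * m / M),   # odd factor in x
                np.cos(2 * np.pi * n / N))   # even factor in y

u = v = 1
ca, sa = np.cos(2 * np.pi * u * m / M), np.sin(2 * np.pi * u * m / M)
cb, sb = np.cos(2 * np.pi * v * n / N), np.sin(2 * np.pi * v * n / N)

# The four real projections of the quaternionic kernel, cf. Eq. 8.25
H_ee = ca @ f_oe @ cb   # contributes to the real part
H_oe = sa @ f_oe @ cb   # contributes to the i-part
H_eo = ca @ f_oe @ sb   # contributes to the j-part
H_oo = sa @ f_oe @ sb   # contributes to the k-part
```

Only H_oe is nonzero here; an even-even signal would instead appear only in H_ee, and so on for the other two symmetry classes.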
Polar Representation of the QFT. The polar representation of the QFT is given by

  F_q(u) = |F_q(u)|\, e^{i\varphi(u)}\, e^{k\psi(u)}\, e^{j\theta(u)},    (8.27)

and the evaluation of its phase follows the quaternion rules given in Sect. 3.4. This kind of representation is very helpful for image analysis if we use the phase concept.

Shift Property. With regard to the polar representation of the QFT, it is appropriate to mention the shift property of the QFT. For a shift d = (d_1, d_2),

  F_q^T(u) = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{-i2\pi ux}\, f(x - d)\, e^{-j2\pi vy}\, d^2x
  = e^{-i2\pi ud_1}\, F_q(u)\, e^{-j2\pi vd_2}
  = |F_q(u)|\, e^{i(\varphi(u) - 2\pi ud_1)}\, e^{k\psi(u)}\, e^{j(\theta(u) - 2\pi vd_2)}.    (8.28)
Modulation. Modulation of the 2D signal causes a shift in the 2D frequency space. It results from modulating the 2D signal in the space domain from both sides by two orthogonal carriers with frequencies u_0 and v_0:

  f_m(x, y) = e^{i2\pi u_0 x}\, f(x, y)\, e^{j2\pi v_0 y}.    (8.29)

By taking the QFT of f_m(x, y), we obtain

  \mathcal{F}_q\{ f_m(x, y) \}(u, v) = F_q(u - u_0, v - v_0).    (8.30)
Convolution. Consider the real 2D signals f_1 and f_2 with their QFTs F_{1q} and F_{2q}, respectively. The convolution of these 2D signals can be carried out in the frequency domain as a sort of multiplication of their QFTs. This can be proved by integrating with respect to the Fourier kernel e^{-j2\pi yv}:

  F_q = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{-i2\pi xu} \left( f_1 \star f_2 \right)(x)\, e^{-j2\pi yv}\, d^2x
      = \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} e^{-i2\pi xu} \left[ \int_{-\infty}^{\infty}\!\!\int_{-\infty}^{\infty} f_1(x')\, f_2(x - x')\, d^2x' \right] e^{-j2\pi yv}\, d^2x.

Splitting the factors into their even and odd parts with respect to the j-kernel argument and using the shift property, one obtains

  F_q = F_{1qe}(u)\, F_{2q}(u) + F_{1qo}(u) \left( -j F_{2q}(u) j \right) = F_{1qe}(u)\, F_{2q}(u) + F_{1qo}(u)\, T_j(F_{2q}(u)),    (8.31)

or, by integrating with respect to the Fourier kernel e^{-i2\pi xu} instead,

  F_q = F_{1q}(u)\, F_{2qe}(u) + \left( -i F_{1q}(u) i \right) F_{2qo}(u) = F_{1q}(u)\, F_{2qe}(u) + T_i(F_{1q}(u))\, F_{2qo}(u).    (8.32)

Note that if one of the functions is even with respect to at least one of the kernel arguments, the convolution equals the product of the individual spectra:

  F_q = F_{1q}(u)\, F_{2q}(u).    (8.33)
8.2.6 Discrete QFT

As with the discretization of the FT and its inverse, we can proceed similarly with the QFT and its inverse. Given a discrete two-dimensional signal of size M \times N with quaternionic components f_{mn} \in \mathbb{H}, the discrete quaternionic Fourier transform (DQFT) and the inverse discrete quaternionic Fourier transform (IDQFT) are given, respectively, by

  F_q^{uv} = \sum_{m=0}^{M-1} \sum_{n=0}^{N-1} e^{-i2\pi um/M}\, f_{mn}\, e^{-j2\pi vn/N},    (8.34)

  f_{mn} = \frac{1}{MN} \sum_{u=0}^{M-1} \sum_{v=0}^{N-1} e^{i2\pi um/M}\, F_q^{uv}\, e^{j2\pi vn/N}.    (8.35)
The reader can then easily implement a computer program to determine the QFT and IQFT. Using the fast Fourier transform (FFT), we first compute the one-dimensional DFT of f_{mn} in a row-wise sense, as follows:

  f_{un} = \sum_{m=0}^{M-1} e^{-i2\pi um/M}\, f_{mn} = \mathrm{Real}(f_{un}) + i\, \mathrm{Imag}(f_{un}).    (8.36)

This spectrum, f_{un}, is then split into its real and imaginary parts, which in turn are transformed in a column-wise sense, once again using a one-dimensional DFT, to give

  F_{uv}^r = \sum_{n=0}^{N-1} \mathrm{Real}(f_{un})\, e^{-j2\pi vn/N},    (8.37)

  F_{uv}^i = \sum_{n=0}^{N-1} \mathrm{Imag}(f_{un})\, e^{-j2\pi vn/N}.    (8.38)

Using these results, we can finally compose the DQFT:

  F_{uv}^q = \mathrm{Real}(F_{uv}^r) + i\, \mathrm{Real}(F_{uv}^i) + j\, \mathrm{Imag}(F_{uv}^r) + k\, \mathrm{Imag}(F_{uv}^i).    (8.39)

Note that the first, row-wise computation outputs complex numbers (Eq. 8.36), and that the second, column-wise computation rotates their real and imaginary parts spatially by 90°. In this way, the procedure yields the orthogonal bivector components of the DQFT of Eq. 8.39. This method is an implementation of the procedure to build quaternions from complex numbers, namely the doubling technique [196]. Current FFT routines and hardware can be reutilized with few changes to implement a fast quaternionic Fourier transform (FQFT). The reader can find more details on the implementation of an FQFT in [58].
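The doubling procedure of Eqs. 8.36–8.39 takes only a few lines with a standard complex FFT. The sketch below is our own (axis conventions assumed: m runs along axis 0 with the i-kernel, n along axis 1 with the j-kernel); it returns the four real component arrays of the DQFT of a real image:

```python
import numpy as np

def dqft(f):
    """DQFT of a real M x N signal via two complex FFTs (Eqs. 8.36-8.39).
    Returns the four real component arrays (r, i, j, k) of F_q."""
    fun = np.fft.fft(f, axis=0)          # Eq. 8.36: row-wise DFT (i-kernel)
    Fr = np.fft.fft(fun.real, axis=1)    # Eq. 8.37: column-wise DFT of Real part
    Fi = np.fft.fft(fun.imag, axis=1)    # Eq. 8.38: column-wise DFT of Imag part
    # Eq. 8.39: compose the quaternionic spectrum
    return Fr.real, Fi.real, Fr.imag, Fi.imag
```

A direct O(M²N²) evaluation of Eq. 8.34 with explicit quaternion arithmetic yields the same four arrays, which is a convenient correctness check for an implementation.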
8.3 Image Analysis Using the Phase Concept

In the field of image processing, Gabor filters have proved to be very useful bandpass filters, with the beneficial property that they are optimally localized in both the space and the frequency domain [66]. Applications of Gabor filters include pattern recognition, classification, texture analysis, local phase estimation, and frequency estimation. In the following sections, the 2D Gabor filter and the phase concept are explained in detail.
8.3.1 2D Gabor Filters

A two-dimensional complex Gabor filter is a linear shift-invariant filter whose impulse response is a complex carrier modulated by a Gauss function:

  h_c(x; u_0, \sigma, \alpha, \theta) = g(x', y') \exp\left( 2\pi i (u_0 x + v_0 y) \right).    (8.40)

Here, the Gauss function is given by

  g(x, y) = C \exp\left( -\frac{x^2 + (\alpha y)^2}{2\sigma^2} \right),    (8.41)

where \alpha is the aspect ratio and C = \frac{\alpha}{2\pi\sigma^2} is the normalizing factor, so that \int\int g(x, y)\, dx\, dy = 1. The coordinates of g(x', y') have been rotated about the origin:

  \begin{pmatrix} x' \\ y' \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix}.    (8.42)

In the frequency domain, the 2D Gabor filter has the transfer function

  H_c(u; u_0, \sigma, \alpha, \theta) = \exp\left( -2\pi^2\sigma^2 \left[ (u' - u_0')^2 + (v' - v_0')^2/\alpha^2 \right] \right).    (8.43)

Figure 8.3 shows a complex Gabor filter whose center frequency and orientation are given by f_0 = \sqrt{u_0^2 + v_0^2} and \theta_0 = \arctan(v_0/u_0), respectively. Choosing \theta = \theta_0, the principal axis of the Gauss function is aligned with this orientation. Now, by assigning two orthogonal complex carriers to the axes of the 2D signal, we can further extend the complex Gabor filter to the quaternionic Gabor filter:

  h_q(x; u_0, \sigma, \alpha, \theta = 0) = g(x; \sigma, \alpha) \exp(i2\pi u_0 x) \exp(j2\pi v_0 y).    (8.44)
Note that the complex carriers depend on the quaternion bases i and j and that, for simplicity, we do not use a rotated Gauss function. The transfer function of a quaternionic Gabor filter is a direct interpretation of the modulation theorem of the QFT explained in Sect. 8.2: it consists of a shifted Gaussian in the quaternionic frequency domain,

  H_q(u; u_0, \sigma, \alpha, \theta = 0) = \exp\left( -2\pi^2\sigma^2 \left[ (u - u_0)^2 + (v - v_0)^2/\alpha^2 \right] \right),    (8.45)

by which the greater part of the Gabor filter's energy is preserved for positive frequencies, u_0 and v_0, in the upper-right quadrant. In this regard, the convolution of a real image with a quaternionic Gabor filter approximates a quaternionic analytic signal. Figure 8.3 presents two examples of quaternionic Gabor filters. Figure 8.4 presents a medical image convolved with a 50 × 50-pixel Gabor filter with \lambda_1 = 7, \lambda_2 = 2, s_1 = 2, s_2 = 2, and \theta = 90°, and Fig. 8.5 a medical image convolved with a 50 × 50-pixel Gabor filter with \lambda_1 = 7, \lambda_2 = 2, s_1 = 2, s_2 = 2, and \theta = 67.5°.
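Sampling Eq. 8.44 on a grid gives the four real component masks of the filter directly. The helper below is our own sketch (parameter names and normalization are ours; it is not the qgabor() routine used to produce the figures):

```python
import numpy as np

def quat_gabor(size, u0, v0, sigma, alpha=1.0):
    """Quaternionic Gabor filter of Eq. 8.44 (theta = 0) on an odd
    size x size grid: Gaussian window times the two orthogonal carriers
    exp(i 2 pi u0 x) exp(j 2 pi v0 y). Returns the (1, i, j, k) parts."""
    half = size // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1].astype(float)
    C = alpha / (2 * np.pi * sigma**2)                 # Eq. 8.41 normalization
    g = C * np.exp(-(x**2 + (alpha * y)**2) / (2 * sigma**2))
    cx, sx = np.cos(2 * np.pi * u0 * x), np.sin(2 * np.pi * u0 * x)
    cy, sy = np.cos(2 * np.pi * v0 * y), np.sin(2 * np.pi * v0 * y)
    # (cx + i sx)(cy + j sy) = cx cy + i sx cy + j cx sy + k sx sy
    return g * cx * cy, g * sx * cy, g * cx * sy, g * sx * sy
```

The four parts inherit the even/odd symmetries that the QFT separates: the real part is even in both axes, the i-part is odd in x, the j-part is odd in y, and the k-part is odd in both.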
8.3.2 The Phase Concept

The local quaternionic phase of a two-dimensional signal can be measured using the angular phase of the filter response of a quaternionic Gabor filter. The evaluation of the angular phase is carried out according to the rules of the quaternion phase presented in Sect. 3.4. The phase concept can be used for 3D reconstruction using interferometry techniques. Figure 8.6 shows the measurement of the phase change of 3D objects: a cube and a bell-shaped form that were illuminated with a light grid. In order to complete the whole 3D object representation, we can use a stereoscopic system to obtain the 3D information of some key points of the illuminated object; together with the unwrapped phase, we can then compute, for each object point of the phase image, its corresponding 3D value.
8.4 Clifford–Fourier Transforms

The real Fourier transform given in Eq. 8.1 can be straightforwardly extended to Clifford–Fourier transforms for different dimensions and metrics, where we have to carefully consider the role of the involved pseudoscalar I_n \in \mathcal{G}_n = \mathcal{G}_{n,0}. In the next sections, we present the 3D CFT, the space-time CFT, and the n-dimensional Clifford–Fourier transform.
Fig. 8.3 Complex and quaternionic Gabor filters: (upper left) real and imaginary parts (r, i) of a complex 50 × 50-pixel Gabor filter with \lambda_1 = 8, \lambda_2 = 4, s = 2, and \theta = 30°; (upper right) magnitude and phase (\varphi) of this filter; (middle left) real and imaginary parts (i, j, k) of a quaternionic 50 × 50-pixel Gabor filter with \lambda_1 = 10, \lambda_2 = 10, s_1 = 4, s_2 = 4, and \theta = 0; (middle right) magnitude and phases (\varphi, \theta, and \psi) of this filter; (lower left) real and imaginary parts (i, j, k) of a quaternionic 50 × 50-pixel Gabor filter with \lambda_1 = 8, \lambda_2 = 4, s_1 = 2, s_2 = 4, and \theta = 30°; (lower right) magnitude and phases (\varphi, \theta, and \psi) of this filter
Fig. 8.4 Real and imaginary parts of a convolved quaternionic image using qgabor (50,7,2,2,2,90)
Fig. 8.5 Real and imaginary parts of a convolved quaternionic image using qgabor (50,7,2,2,2,67.5)
Fig. 8.6 Left images: original 150×150 images, which are convolved with qgabor(25,7,7,2,2,90). Middle images: one of the phase images. Right images: phase changes modulo 2π along row 64
8.4.1 Tri-Dimensional Clifford–Fourier Transform

The tri-dimensional Clifford–Fourier transform (3D CFT) for vector fields was first introduced by Jancewicz [105] and then used by Ebling and Scheuermann [55]; recently, the work on the Cl₃,₀ Clifford–Fourier transform by Hitzer and Mawardi [97, 132] has enriched this area of study even further. In the tri-dimensional geometric algebra, the pseudoscalar is a trivector with the signature I₃² = (e₁e₂e₃)² = −1. Similar to the case of the real Fourier transform given in Eq. 8.1, we can therefore use the trivector instead of the imaginary number i with i² = −1. The scalar and the pseudoscalar together build a complex number, which in turn constitutes the center of G₃. Thus, for any tri-dimensional multivector field f: R³ → G₃, the tri-dimensional Clifford–Fourier transform of f is defined by

    F(w)_{G₃} = F[f(x)] = ∫_{R³} f(x) e^{−I₃ w·x} d³x,   (8.46)

where w = w₁e₁ + w₂e₂ + w₃e₃, x = x₁e₁ + x₂e₂ + x₃e₃, and d³x = dx₁∧dx₂∧dx₃. Since the pseudoscalar I₃ commutes with any element of G₃, the Clifford–Fourier kernel e^{−I₃ w·x} commutes with every element of G₃ as well. The inverse 3D Clifford–Fourier transform is computed as follows:

    f(x) = F⁻¹[F(w)_{G₃}] = (1/(2π)³) ∫_{R³} F(w) e^{I₃ w·x} d³w.   (8.47)
Table 8.3 Properties of the 3D Clifford–Fourier transform

    Property            Function                    3D Clifford–Fourier transform
    Linearity           αf(x) + βg(x)               αF(w) + βG(w)
    Delay               f(x − t)                    e^{−I₃ w·t} F(w)
    Shifting            e^{I₃ w₀·x} f(x)            F(w − w₀)
    Scaling             f(αx)                       (1/|α|³) F(w/α)
    Convolution         (f ⋆ g)(x)                  F(w) G(w)
    Derivative          f^{(n)}(x)                  (I₃ w)ⁿ F(w)
    Parseval theorem    ∫_{R³} |f(x)|² d³x          (1/(2π)³) ∫_{R³} |F(w)|² d³w
    Gaussian            e^{−(αx)²/2}                ((2π)^{3/2}/|α|³) e^{−(w/α)²/2}
The vectors x and w represent position in the spatial and frequency domains, respectively. Table 8.3 presents a summary of some basic properties of the 3D Clifford–Fourier transform.

In order to reduce the computational complexity of the CFT, we can first group the multivector elements of the function f(x) ∈ G₃ conveniently and then apply the traditional fast Fourier transform (FFT). Let us first rewrite the tri-dimensional multivector-valued function f(x), x ∈ R³, in terms of four complex signals, using the trivector I₃ instead of the imaginary number i, as follows:

    f(x) = ⟨f(x)⟩₀ + ⟨f(x)⟩₁ + ⟨f(x)⟩₂ + ⟨f(x)⟩₃
         = f₀(x) + f₁(x)e₁ + f₂(x)e₂ + f₃(x)e₃ + f₂₃(x)e₂₃ + f₃₁(x)e₃₁ + f₁₂(x)e₁₂ + f_{I₃}(x)I₃
         = (f₀(x) + f_{I₃}(x)I₃) + (f₁(x) + f₂₃(x)I₃)e₁ + (f₂(x) + f₃₁(x)I₃)e₂ + (f₃(x) + f₁₂(x)I₃)e₃,   (8.48)

where bivectors are expressed as dual vectors, for example, I₃e₂ = e₃e₁. Now, taking into account the linearity property of the CFT, the 3D CFT can be written as follows:

    F(w) = F[f(x)] = F[f₀(x) + f_{I₃}(x)I₃] + F[f₁(x) + f₂₃(x)I₃]e₁
           + F[f₂(x) + f₃₁(x)I₃]e₂ + F[f₃(x) + f₁₂(x)I₃]e₃.   (8.49)
This kind of separation can be applied to multivector fields of arbitrary dimension; as a result, Clifford–Fourier transforms can be computed by carrying out several standard Fourier transforms. For the 2D and 3D cases, one requires two and four FFTs, respectively.
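Following the component grouping of Eqs. 8.48–8.49, a discrete 3D CFT can be computed with four standard complex FFTs, with I₃ playing the role of the imaginary unit in each (blade, dual-blade) pair. A minimal sketch (component names are illustrative):

```python
import numpy as np

def cft3d(f0, fI, f1, f23, f2, f31, f3, f12):
    """Discrete 3D Clifford-Fourier transform of a G3 multivector field.
    Inputs are the eight real component arrays of Eq. 8.48; each pair
    (scalar, pseudoscalar), (e1, e23), (e2, e31), (e3, e12) forms one
    complex array, since I3 squares to -1 and is central in G3."""
    pairs = [(f0, fI), (f1, f23), (f2, f31), (f3, f12)]
    out = []
    for re, im in pairs:
        F = np.fft.fftn(re + 1j * im)    # one standard complex 3D FFT
        out.extend([F.real, F.imag])     # back to (blade, blade*I3) parts
    return out  # [F0, FI3, F1, F23, F2, F31, F3, F12]
```

The discrete analogue of the Parseval row of Table 8.3 holds per complex pair, so the total signal energy is preserved up to the FFT normalization.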
8.4.2 Space and Time Geometric Algebra Fourier Transform

As we show in Chap. 9, in computer vision the 3D projective space is treated in the geometric algebra of the Minkowski metric G₃,₁ and is related to the projective plane (image) G₃ via the projective split. Since in Sect. 8.4.1 we studied the 3D Clifford–Fourier transform in G₃, it is now very interesting to formulate a Clifford–Fourier transform in the framework of the space–time geometric algebra G₃,₁. In G₃,₁, the space–time vectors are given by x = xe₁ + ye₂ + ze₃ + te₄ = x̄ + te₄ and the space–time frequency vectors by w = ue₁ + ve₂ + we₃ + se₄ = w̄ + se₄. Given a 16-dimensional space–time algebra function f: R^{3,1} → G₃,₁, the space–time geometric algebra Fourier transform (ST-GAFT) is computed as follows:

    F(w)_{ST} = ∫_{R^{3,1}} e^{−e₄ts} f(x) e^{−I₃ x̄·w̄} d⁴x,   (8.50)

where the space–time volume element is d⁴x = dt dx dy dz. The part ∫_{R³} f(x) e^{−I₃ x̄·w̄} d³x corresponds to the 3D CFT. The inverse space–time geometric algebra Fourier transform (IST-GAFT) is given by

    f(x) = (1/(2π)⁴) ∫_{R^{3,1}} e^{e₄ts} F(w)_{ST} e^{I₃ x̄·w̄} d⁴w,   (8.51)

where the space–time frequency volume element is d⁴w = ds du dv dw.
8.4.3 n-Dimensional Clifford–Fourier Transform

Equation 8.1 for the real Fourier transform can be straightforwardly extended to the n-dimensional geometric algebra Gₙ framework. Given a multivector-valued function f: Rⁿ → Gₙ, its n-dimensional Clifford–Fourier transform (nD CFT) is given by

    F(w)_{Gₙ} = ∫_{Rⁿ} f(x) e^{−Iₙ w·x} dⁿx,   (8.52)

where Iₙ stands for the pseudoscalar of Gₙ = G_{n,0} for the dimensions n = 2, 3 (mod 4), or alternatively of the geometric algebra G_{0,n} for n = 1, 2 (mod 4); in these cases, Iₙ² = −1.
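The condition Iₙ² = −1 can be checked concretely for n = 3 with the Pauli-matrix representation of G₃,₀ (e_k ↦ σ_k), in which the pseudoscalar I₃ = e₁e₂e₃ maps to i times the identity matrix:

```python
import numpy as np

# Pauli-matrix representation of the Euclidean geometric algebra G(3,0):
# the basis vectors e1, e2, e3 map to the Pauli matrices sigma_1..3.
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)

# Vectors square to +1 (Euclidean metric) ...
assert np.allclose(s1 @ s1, np.eye(2))
# ... while the pseudoscalar I3 = e1 e2 e3 squares to -1,
# so it can replace the imaginary unit in the Fourier kernel.
I3 = s1 @ s2 @ s3
assert np.allclose(I3 @ I3, -np.eye(2))
# I3 is central: it commutes with every basis vector.
for s in (s1, s2, s3):
    assert np.allclose(I3 @ s, s @ I3)
```

Indeed I₃ is represented by i·1, which makes explicit why the scalar-plus-pseudoscalar part of G₃ behaves like the complex numbers.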
8.5 From Real to Clifford Wavelet Transforms for Multiresolution Analysis

The word wavelet was used for the first time in Alfred Haar's thesis in 1909. Surprisingly, the wavelet transform (WT) has become a useful signal processing tool only in the last few decades, mainly due to contributions in the areas of applied mathematics and signal processing [107, 109, 127, 129, 139]. Generally speaking, the WT is an approach that overcomes the shortcomings of the windowed Fourier transform. Thanks to the development of the QFT [29], the generalization of the real and complex wavelet transforms to the hypercomplex wavelet transform was straightforward.
In the next sections, we review and explain the real, complex, and quaternion wavelet transforms.
8.5.1 Real Wavelet Transform

The real wavelet transform (RWT) implements a meaningful decomposition of a signal f(x) onto a family of functions. Such functions are dilations and translations of a unique function called a mother wavelet ψ, which fulfills

    ∫_{−∞}^{∞} ψ(t) dt = 0.   (8.53)

The mother wavelet is normalized, ‖ψ‖ = 1, and is centered with respect to a certain neighborhood of t = 0. In general, the wavelet family is generated by affine transformations. In the one-dimensional case, these wavelets are given by

    ψ_{s,t}(x) = √s ψ(s(x − t)),   (8.54)

where (s, t) ∈ R² represent the scale and translation, respectively. Considering L²(R), the vector space of measurable and square-integrable one-dimensional functions f, we can define a wavelet transform for functions f(x) ∈ L²(R) in terms of a certain wavelet ψ as follows:

    Wf(s, t) = ∫_{−∞}^{+∞} f(x) ψ_{s,t}(x) dx = ⟨f(x), ψ_{s,t}(x)⟩.   (8.55)
We should interpret a wavelet transform simply as a decomposition of a signal f(x) into a set of frequency channels that have the same bandwidth on a logarithmic scale. If certain requirements are fulfilled, the inverse of the wavelet transform exists [109, 127, 128], and thus f(x) can be reconstructed. In many image processing tasks, like feature extraction or the design of matching algorithms, it is highly desirable for the filters used to exhibit good locality in both space and frequency. The key advantage of the wavelet transform is the adaptability of its parameters s and t.
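A direct discretization of Eqs. 8.53–8.55 is easy to sketch; here a Mexican-hat mother wavelet (a common zero-mean choice, used purely for illustration) is dilated, translated, and correlated with a sampled signal:

```python
import numpy as np

def mexican_hat(t):
    """Mexican-hat mother wavelet (second derivative of a Gaussian);
    it integrates to zero, as required by Eq. 8.53."""
    return (1.0 - t ** 2) * np.exp(-t ** 2 / 2.0)

def cwt(f, x, scales, shifts):
    """Wavelet transform Wf(s, t) = <f, psi_{s,t}> of sampled f(x), Eq. 8.55,
    with psi_{s,t}(x) = sqrt(s) * psi(s * (x - t)) as in Eq. 8.54."""
    dx = x[1] - x[0]
    W = np.empty((len(scales), len(shifts)))
    for a, s in enumerate(scales):
        for b, t in enumerate(shifts):
            psi = np.sqrt(s) * mexican_hat(s * (x - t))
            W[a, b] = np.sum(f * psi) * dx   # discretized inner product
    return W
```

For a Gaussian bump centered at x = 2, the response |Wf(s, t)| peaks near t = 2, illustrating the locality of the transform in space.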
8.5.2 Discrete Wavelets

The discretization of the continuous wavelet transform helps to eliminate its redundancy. For this, the functions ψ_{s,t}(x) of Eq. 8.55 are discretized via the parameters s and t:

    s = 1/2^j,   t = k,   (8.56)

for the integers (j, k) ∈ Z². The resulting functions are called dyadic discrete wavelets:

    ψ_{j,k}(x) = (1/√(2^j)) ψ((x − k)/2^j),   (8.57)

for (j, k) ∈ Z². In the dyadic-scale space of Eq. 8.57, if we denote by A_j f the approximation of a given f(x) at the scale s = 1/2^j, we express the difference between two successive approximations as

    D_j f = A_{j−1} f − A_j f,   (8.58)

where j stands for the number of a certain level. In practice, we consider a limited number of levels j = 1, 2, ..., n. The coarsest level is n, and A₀ is an identity operator. In this regard, the function f(x) can be decomposed as

    f(x) = A₁f + D₁f
         = A₂f + D₂f + D₁f
         ⋮
         = A_n f + Σ_{j=1}^{n} D_j f.   (8.59)
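The successive approximations and differences of Eqs. 8.58–8.59 can be sketched with block means standing in for the scale-function projection (a minimal Haar-style illustration, not the book's filters):

```python
import numpy as np

def haar_approx(f, j):
    """Approximation A_j f at scale 1/2^j: replace each block of 2^j
    consecutive samples by its mean (a minimal Haar-style sketch)."""
    n = 2 ** j
    means = f.reshape(-1, n).mean(axis=1)
    return np.repeat(means, n)

def dyadic_decompose(f, levels):
    """Split f into A_n f and the details D_j f = A_{j-1} f - A_j f (Eq. 8.58);
    by construction f = A_n f + sum_j D_j f (Eq. 8.59, with A_0 the identity)."""
    approx = [f] + [haar_approx(f, j) for j in range(1, levels + 1)]
    details = [approx[j - 1] - approx[j] for j in range(1, levels + 1)]
    return approx[levels], details
```

The telescoping sum in Eq. 8.59 guarantees exact reconstruction: adding the coarsest approximation and all details returns the original samples.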
As has been shown [128], the multiresolution analysis of the one-dimensional function f(x) can be carried out in terms of a scaling function φ and its associated wavelet function ψ as follows:

    A_j f(x) = Σ_{k=−∞}^{+∞} ⟨f(u), φ_{j,k}(u)⟩ φ_{j,k}(x),
    D_j f(x) = Σ_{k=−∞}^{+∞} ⟨f(u), ψ_{j,k}(u)⟩ ψ_{j,k}(x),   (8.60)

where

    φ_{j,k}(x) = (1/√(2^j)) φ((x − k)/2^j),   ψ_{j,k}(x) = (1/√(2^j)) ψ((x − k)/2^j),   (8.61)

for (j, k) ∈ Z². The relation between these functions is clearly described in the frequency domain,

    ψ̂(2w) = e^{−iw} H̄(e^{iw}) φ̂(w),   (8.62)

where ψ̂ stands for the Fourier transform of ψ, H stands for the transfer function of φ, and H̄ is its complex conjugate.
The multiresolution analysis of two-dimensional functions f(x, y) can be formulated straightforwardly as an extension of the above equations:

    f(x, y) = A₁f + D₁,₁f + D₁,₂f + D₁,₃f
            = A₂f + D₂,₁f + D₂,₂f + D₂,₃f + D₁,₁f + D₁,₂f + D₁,₃f
            ⋮
            = A_n f + Σ_{j=1}^{n} [D_{j,1}f + D_{j,2}f + D_{j,3}f].   (8.63)

We can characterize each approximation function A_j f(x, y) and the difference components D_{j,p} f(x, y), for p = 1, 2, 3, by means of a 2D scaling function Φ(x, y) and its associated wavelet functions Ψ_p(x, y) as follows:

    A_j f(x, y) = Σ_{k=−∞}^{+∞} Σ_{l=−∞}^{+∞} a_{j,k,l} Φ_{j,k,l}(x, y),
    D_{j,p} f(x, y) = Σ_{k=−∞}^{+∞} Σ_{l=−∞}^{+∞} d_{j,p,k,l} Ψ_{j,p,k,l}(x, y),   (8.64)

where

    Φ_{j,k,l}(x, y) = (1/2^j) Φ((x − k)/2^j, (y − l)/2^j),
    Ψ_{j,p,k,l}(x, y) = (1/2^j) Ψ_p((x − k)/2^j, (y − l)/2^j),   (j, k, l) ∈ Z³,   (8.65)

and

    a_{j,k,l} = ⟨f(x, y), Φ_{j,k,l}(x, y)⟩,   (8.66)
    d_{j,p,k,l} = ⟨f(x, y), Ψ_{j,p,k,l}(x, y)⟩.   (8.67)

In order to carry out a separable multiresolution analysis, we decompose the scaling function Φ(x, y) and the wavelet functions Ψ_p(x, y) as follows:

    Φ(x, y) = φ(x)φ(y),
    Ψ₁(x, y) = φ(x)ψ(y),
    Ψ₂(x, y) = ψ(x)φ(y),
    Ψ₃(x, y) = ψ(x)ψ(y),   (8.68)

where φ is a 1D scale function and ψ is its associated wavelet function. The functions Ψ₁, Ψ₂, Ψ₃ extract the details along the y-axis, the x-axis, and the diagonal directions, respectively.
8.5.3 Wavelet Pyramid

Equation 8.63 represents in a compact manner a pyramid structure for the processing of an image f(x, y). Given a 2D image f(x, y), sampled at x = 1, 2, ..., m_x and y = 1, 2, ..., m_y, we establish a pyramidal processing scheme by computing the coefficients a_{j,k,l} and d_{j,p,k,l} for each level j = 1, 2, ..., n. For each level j of the pyramid, we group these coefficients into the matrices A_j and D_{j,p}, p = 1, 2, 3, as follows:

    A_j = (a_{j,k,l}),  k = 1, 2, ..., m_x/2^j;  l = 1, 2, ..., m_y/2^j,   (8.69)
    D_{j,p} = (d_{j,p,k,l}),  k = 1, 2, ..., m_x/2^j;  l = 1, 2, ..., m_y/2^j.   (8.70)

The coefficients a_{j,k,l} and d_{j,p,k,l} can be obtained simply via an iterative computation with the impulse responses h and g of the filters φ and ψ. This procedure is known as Mallat's algorithm [128].
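One level of Mallat's algorithm is only a few lines for the orthonormal Haar pair h = (1, 1)/√2, g = (1, −1)/√2, chosen here for brevity (the book's designed filter pairs are discussed in the following sections):

```python
import numpy as np

H = np.array([1.0, 1.0]) / np.sqrt(2.0)   # Haar low-pass impulse response h
G = np.array([1.0, -1.0]) / np.sqrt(2.0)  # Haar high-pass impulse response g

def analysis_1d(rows):
    """Filter each row with h and g and downsample by two (one Mallat step)."""
    lo = rows[:, 0::2] * H[0] + rows[:, 1::2] * H[1]
    hi = rows[:, 0::2] * G[0] + rows[:, 1::2] * G[1]
    return lo, hi

def mallat_level(image):
    """One level of the separable 2D pyramid: rows then columns, giving the
    approximation A1 and the details D11 (y), D12 (x), D13 (diagonal)."""
    L, Hd = analysis_1d(image)     # filter along rows (x direction)
    A,  D1 = analysis_1d(L.T)      # filter L along columns (y direction)
    D2, D3 = analysis_1d(Hd.T)
    return A.T, D1.T, D2.T, D3.T
```

Because the Haar pair is orthonormal, the energies of the four subbands sum exactly to the energy of the input image, and a constant image produces zero detail coefficients.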
8.5.4 Complex Wavelet Transform

Despite its advantage of locality in both the spatial and frequency domains, the wavelet pyramid of real-valued wavelets unfortunately has the drawback of being neither translation-invariant nor rotation-invariant. As a result, no procedure can yield phase information. This is one of the important reasons why researchers are interested in hypercomplex wavelet transforms like the complex or quaternion wavelet transforms. Since a translation in the spatial domain is represented as a rotation in the complex domain, complex-valued wavelets can be used for multiscale phase analysis of image signals. This allows interpolation of the wavelet transform at subpixel accuracy through the wavelet pyramid levels.

There are many kinds of complex wavelets. Lina [122] extended the Daubechies wavelets to complex ones. The complex wavelet briefly outlined here was developed by Magarey and Kingsbury [127] and used for motion estimation of video frames. It is worth pointing out that the efficiency of the matching strategies and the similarity distance measures is highly dependent on how well the wavelet is designed. By satisfying an image-matching principle, the wavelet filter pair (h, g) (the impulse responses of the filters φ and ψ) must be compactly supported in the spatial domain; that is, the filters should show regularity (differentiability to a high order) and symmetry (leading to a linear phase). In practice, orthogonality cannot be strictly fulfilled. As an illustration, we will describe the complex wavelet designed by Magarey and Kingsbury [109, 127], which was also utilized by Pan [144]. The impulse responses of the scale function, h, and the wavelet function, g, are a pair of even-length complex modulated windows

    h(k) = b̃₁ w̃₁(k + 0.5) e^{i w̃₁(k+0.5)},   g(k) = b₁ w₁(k + 0.5) e^{i w₁(k+0.5)},   (8.71)
for k = −n_w, −n_w + 1, ..., n_w − 1; b₁ and b̃₁ are complex constants, and w₁ and w̃₁ are a pair of real-valued windows of width 2n_w, symmetric with respect to k = 0, whose magnitudes decay to zero at both ends. The commonly used window is the Gauss window:

    w̃₁(k) = e^{−k²/(2σ̃₁²)},   w₁(k) = e^{−k²/(2σ₁²)}.   (8.72)

In order to satisfy the trade-off between good locality of matching and information sufficiency, the minimum width of the window function should be 4; consequently, n_w = 2. It is also required to fulfill the complementarity of the modulation frequencies w₁ and w̃₁ in the frequency range [0, π] as follows:

    w₁ + w̃₁ = π.   (8.73)
Figure 8.7 presents the complex wavelet designed by Magarey and Kingsbury [127], which was also used by Pan [144]. The complementarity constraint for the pair of low-pass and high-pass filters imposes w₁ > w̃₁. In Fig. 8.7c, the Fourier transforms of h and g show conjugate symmetry with respect to their modulation frequencies w̃₁ and w₁. Due to the fact that one-dimensional real signals have conjugate-symmetric spectra, one can neglect the negative half-spectrum [−π, 0] without losing information on the real 1D input signal. In order to cover
Fig. 8.7 Complex wavelets: (from above) (a) scaling function φ, (b) wavelet function ψ, (c) both functions in the frequency domain
the frequency range [0, π] completely, without significant gaps and causing minimal overlap, one can adjust the filters at each level j according to the following relationship:

    w_j = 3w̃_j.   (8.74)

Thus, according to Eq. 8.73, the modulation frequencies at the first (finest) level are (see Fig. 8.7)

    w₁ = 5π/6,   w̃₁ = π/6.   (8.75)
In practice, the modulation frequencies through the levels are subdivided as w_j = w_{j−1}/2. For the case of 2D wavelet analysis, we can use the 1D complex wavelets of Eq. 8.71 and achieve the separability described by Eq. 8.68. These 2D wavelet filters are predominantly first-quadrant filters in the frequency domain. Since real, discrete images contain significant information in both the first and second frequency quadrants, one has to use the complex conjugated filters h̄ and ḡ in addition to h and g. Thus, in order to include the information from the second quadrant in the computation, one has to calculate the matrices D̃_{j,p}, p = 1, 2, 3, of the difference coefficients for each level j. Kingsbury [109] called this algorithm the dual-tree complex wavelet transform, which means that one uses a mirror-processing tree to include the second quadrant. Thus, the complex wavelet analysis at the transition from level j − 1 to level j transforms two complex approximation sub-matrices into eight complex approximation and difference sub-matrices as follows:

    {A_{j−1}, Ã_{j−1}} → {A_j, Ã_j, D_{j,p}, D̃_{j,p}, p = 1, 2, 3},   (8.76)

where Ã_j is the mirror of A_j and D̃_{j,p} is the mirror of D_{j,p}.
8.5.5 Quaternion Wavelet Transform

The quaternion wavelet transform (QWT) is a natural extension of the real and complex wavelet transforms, taking into account the axioms of the quaternion algebra, the quaternionic analytic signal [29], and the separability property described by Eq. 8.68. The QWT is applied to signals of two or more dimensions. Multiresolution analysis can also be straightforwardly extended to the quaternionic case; we can, therefore, improve the power of the phase concept, which is not available for real wavelets and, in the complex case, is limited to a single phase. Thus, in contrast to the similarity distance used in the complex wavelet pyramid [144], we favor the quaternionic phase concept for top-down parameter estimation.
For the quaternionic versions of the wavelet scale function, h, and the wavelet function, g, we choose two quaternionic modulated Gabor filters in quadrature as follows:

    h^q = g(x, y; σ₁, ε) exp(i c₁ω₁x/σ₁) exp(j c₂εω₂y/σ₁)
        = h^q_{ee} + h^q_{oe} i + h^q_{eo} j + h^q_{oo} k,   (8.77)
    g^q = g(x, y; σ₂, ε) exp(i c̃₁ω̃₁x/σ₂) exp(j c̃₂εω̃₂y/σ₂)
        = g^q_{ee} + g^q_{oe} i + g^q_{eo} j + g^q_{oo} k,   (8.78)

where the parameters σ₁, σ₂, c₁, c₂, c̃₁, c̃₂, ω₁, ω₂, ω̃₁, ω̃₂ are selected to fulfill the requirements of Eqs. 8.73 and 8.75. Note that the horizontal axis x is related to i and the vertical axis y is related to j; both imaginary units of the quaternion algebra fulfill the equation k = ij. The right-hand sides of Eqs. 8.77 and 8.78 obey a natural decomposition of a quaternionic analytic function: the subindex ee (even–even) denotes the component symmetric in both arguments, eo (even–odd) and oe (odd–even) denote the components of mixed symmetry, and oo (odd–odd) denotes the component antisymmetric in both arguments. Thus, we can clearly see that h^q and g^q of Eqs. 8.77 and 8.78 are powerful filters for disentangling the symmetries of 2D signals.

At this point, we can show a disadvantageous property of a complex wavelet, namely, the merging of important information in its two filter components. When ε = 1, the even and odd parts of the complex wavelet merge information from two components of the quaternionic wavelet rather than separating this information:

    h_e(x, y) = g(x, y) cos(ω₁x + ω₂y)   (8.79)
              = g(x, y)(cos(ω₁x) cos(ω₂y) − sin(ω₁x) sin(ω₂y))
              = h^q_{ee}(x, y) − h^q_{oo}(x, y),   (8.80)
    h_o(x, y) = g(x, y) sin(ω₁x + ω₂y)
              = g(x, y)(cos(ω₁x) sin(ω₂y) + sin(ω₁x) cos(ω₂y))
              = h^q_{oe}(x, y) + h^q_{eo}(x, y).   (8.81)

This indicates also that, for image analysis using the phase concept, complex wavelets offer only one phase, whereas quaternionic wavelets offer three phases. It is also possible to steer quaternionic wavelets. Kingsbury [109] computes six complex filters by combining the real and imaginary parts of complex wavelets at each level of his wavelet pyramid. In the quaternionic wavelet pyramid, one can also generate these six selective filters, as shown in Fig. 8.8, simply by applying the automorphisms of Eq. 8.12 to a basic quaternion filter. The advantage of our selective quaternion wavelets is that they provide three phases.
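The merging identities of Eqs. 8.79–8.81 can be verified numerically for ε = 1 (the sample grid and the frequencies below are chosen arbitrarily for illustration):

```python
import numpy as np

x = np.linspace(-3, 3, 61)[None, :]
y = np.linspace(-3, 3, 61)[:, None]
w1, w2 = 5.0 * np.pi / 6.0, np.pi / 6.0
g = np.exp(-(x ** 2 + y ** 2) / 2.0)      # Gaussian envelope, epsilon = 1

# Quaternionic components of g(x,y) exp(i w1 x) exp(j w2 y):
h_ee = g * np.cos(w1 * x) * np.cos(w2 * y)
h_oe = g * np.sin(w1 * x) * np.cos(w2 * y)
h_eo = g * np.cos(w1 * x) * np.sin(w2 * y)
h_oo = g * np.sin(w1 * x) * np.sin(w2 * y)

# Even/odd parts of the complex wavelet g(x,y) exp(i (w1 x + w2 y)):
h_e = g * np.cos(w1 * x + w2 * y)
h_o = g * np.sin(w1 * x + w2 * y)

# Eqs. 8.79-8.81: the complex filter merges two quaternionic components.
assert np.allclose(h_e, h_ee - h_oo)
assert np.allclose(h_o, h_oe + h_eo)
```

The two assertions are exactly the angle-sum identities for cosine and sine, which is why the complex wavelet cannot separate the four symmetry components the way the quaternionic one does.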
Fig. 8.8 Quaternion wavelet filters with selective orientations (from the left): 15°, 45°, 75°, −75°, −45°, −15°
Next, we present an important theorem that tells us how we can recover the whole image's energy by using only the energy contained in the first quadrant.

Theorem. Assuming that shift-invariant wavelet filters are used in the procedure, the 2D signal f(x, y) can be fully reconstructed from the quaternionic wavelet transform Wf(u, v)^q by considering only the first quadrant. The perfect reconstruction is carried out using well-designed low-pass and high-pass kernels

    Φ(u, v) = φ(u, v) e^{i2πux} e^{j2πvy},   Ψ(u, v) = ψ(u, v) e^{i2πux} e^{j2πvy}.

Proof. Using the automorphisms of the quaternion algebra, Re(q) = −Re(iqi) = −Re(jqj) = −Re(kqk) (see Eq. 8.12), and taking four times the signal energy of the first quadrant, we can reconstruct f(x, y) as follows:

    f(x, y) = f(x, y)_l + f(x, y)_h
            = 4Re{ ∫₀^∞ ∫₀^∞ (Φ(u, v) + Ψ(u, v)) Wf(u, v)^q du dv }.
Let us consider the signal reconstructed by the low-pass kernel:

    f(x, y)_l = 4Re{ ∫₀^∞ ∫₀^∞ Φ(u, v) Wf(u, v)^q du dv }
              = 4Re{ ∫₀^∞ ∫₀^∞ φ(u, v) e^{i2πux} Wf(u, v)^q e^{j2πvy} du dv }
              = Re{ ∫₀^∞ ∫₀^∞ φ(u, v) e^{i2πux} Wf(u, v)^q e^{j2πvy} du dv }
                − Re{ ∫₀^∞ ∫₀^∞ φ(u, v) e^{i2πux} i Wf(u, v)^q i e^{j2πvy} du dv }
                − Re{ ∫₀^∞ ∫₀^∞ φ(u, v) e^{i2πux} j Wf(u, v)^q j e^{j2πvy} du dv }
                − Re{ ∫₀^∞ ∫₀^∞ φ(u, v) e^{i2πux} k Wf(u, v)^q k e^{j2πvy} du dv }.

The three automorphism terms equal the corresponding integrals over the second, fourth, and third quadrants, obtained by the substitutions u → −u, v → −v, and (u, v) → (−u, −v), respectively. Hence,

    f(x, y)_l = Re{ ∫₀^∞ ∫_{−∞}^{∞} φ(u, v) e^{i2πux} Wf(u, v)^q e^{j2πvy} du dv }
                + Re{ ∫_{−∞}^0 ∫_{−∞}^{∞} φ(u, v) e^{i2πux} Wf(u, v)^q e^{j2πvy} du dv }
              = Re{ ∫_{−∞}^{∞} ∫_{−∞}^{∞} φ(u, v) e^{i2πux} Wf(u, v)^q e^{j2πvy} du dv }.

Note that one can prove the case of the high-pass kernel Ψ(u, v) = ψ(u, v) e^{i2πux} e^{j2πvy} in the same way. Thus, the complete 2D signal is equal to the sum of the outputs of these two kernels, considering only the energy of the first quadrant:

    f(x, y) = 4Re{ ∫₀^∞ ∫₀^∞ Φ(u, v) Wf(u, v)^q du dv }
              + 4Re{ ∫₀^∞ ∫₀^∞ Ψ(u, v) Wf(u, v)^q du dv }
            = Re{ ∫_{−∞}^{∞} ∫_{−∞}^{∞} φ(u, v) e^{i2πux} Wf(u, v)^q e^{j2πvy} du dv }
              + Re{ ∫_{−∞}^{∞} ∫_{−∞}^{∞} ψ(u, v) e^{i2πux} Wf(u, v)^q e^{j2πvy} du dv }
            = f(x, y)_l + f(x, y)_h.   (8.82)
8.5.6 Quaternionic Wavelet Pyramid

The quaternionic wavelet multiresolution analysis can be easily formulated from the basic ideas given in Sects. 8.5.2 and 8.5.3. For the 2D image function f(x, y), a quaternionic wavelet multiresolution is written as

    f(x, y) = A^q_n f + Σ_{j=1}^{n} [D^q_{j,1}f + D^q_{j,2}f + D^q_{j,3}f].   (8.83)

The upper index q indicates a quaternionic 2D signal. We can characterize each approximation function A^q_j f(x, y) and the difference components D^q_{j,p} f(x, y), for p = 1, 2, 3, by means of a 2D scaling function Φ^q(x, y) and its associated wavelet functions Ψ^q_p(x, y) as follows:

    A^q_j f(x, y) = Σ_{k=−∞}^{+∞} Σ_{l=−∞}^{+∞} a_{j,k,l} Φ^q_{j,k,l}(x, y),
    D^q_{j,p} f(x, y) = Σ_{k=−∞}^{+∞} Σ_{l=−∞}^{+∞} d_{j,p,k,l} Ψ^q_{j,p,k,l}(x, y),   (8.84)

where

    Φ^q_{j,k,l}(x, y) = (1/2^j) Φ^q((x − k)/2^j, (y − l)/2^j),
    Ψ^q_{j,p,k,l}(x, y) = (1/2^j) Ψ^q_p((x − k)/2^j, (y − l)/2^j),   (j, k, l) ∈ Z³,   (8.85)

and

    a_{j,k,l} = ⟨f(x, y), Φ^q_{j,k,l}(x, y)⟩,
    d_{j,p,k,l} = ⟨f(x, y), Ψ^q_{j,p,k,l}(x, y)⟩.   (8.86)

In order to carry out a separable quaternionic multiresolution analysis, we decompose the scaling function Φ^q(x, y)_j and the wavelet functions Ψ^q_p(x, y)_j for each level j as follows:

    Φ^q(x, y)_j = φ^i(x)_j φ^j(y)_j,
    Ψ^q_1(x, y)_j = φ^i(x)_j ψ^j(y)_j,
    Ψ^q_2(x, y)_j = ψ^i(x)_j φ^j(y)_j,
    Ψ^q_3(x, y)_j = ψ^i(x)_j ψ^j(y)_j,   (8.87)
Fig. 8.9 Quaternionic Gabor filters in the space and frequency domains. Note that, for visualization, the transfer function was sampled at a higher rate and the central part of the spectrum was erased. (a) (first and second upper rows) Approximation filter Φ; (b) (third and fourth rows) detail filter Ψ₃ (diagonal); (c) (fifth row, from the left) magnitude of the filters in the frequency domain: (c.1) approximation, (c.2) detail, (c.3) depiction of both, (c.4) magnified cross section with orientation of 45°. Note that the approximation and the detail filters are located at frequencies π/6 and 5π/6, respectively, fulfilling the requirements of Eqs. 8.73 and 8.75
where φ^i(x)_j and ψ^i(x)_j are 1D complex filters applied along the rows, and φ^j(y)_j and ψ^j(y)_j are applied along the columns. Note that in φ and ψ we use the imaginary units i and j of the quaternions, which fulfill ij = k. In Fig. 8.9, we show the quaternionic scaling, or approximation, filter Φ^q(x, y)_j and the quaternionic wavelet function Ψ^q_3(x, y)_j, which is designed for detecting diagonal details. Note that, as in Fig. 8.7, the approximation and the detail filters are located at frequencies π/6 and 5π/6, respectively, fulfilling the requirements of Eqs. 8.73 and 8.75. The figure also shows the quaternionic filters in the quaternionic frequency domain, which were transformed using the QFT [29]. With these formulas, we can build quaternionic wavelet pyramids. Figure 8.10 shows the two primary levels of the pyramid (fine to coarse). According
Fig. 8.10 Abstraction of two levels of the quaternionic wavelet pyramid: at each level, the image rows and then the columns are filtered with the quaternionic low-pass and high-pass filters H^q and G^q and subsampled by two (↓2), producing the approximation and the horizontal, vertical, and diagonal detail subbands
to Eq. 8.87, the approximation after the first level, A^q₁ f(x, y), is the output of Φ^q(x, y)₁, and the differences D^q_{1,1}f, D^q_{1,2}f, and D^q_{1,3}f are the outputs of Ψ^q_{1,1}(x, y), Ψ^q_{1,2}(x, y), and Ψ^q_{1,3}(x, y). The procedure continues through the j levels, decimating the image at the outputs of the levels (indicated in Fig. 8.10 within the circle). The quaternionic wavelet analysis from level j − 1 to level j corresponds to the transformation of one quaternionic approximation into a new quaternionic approximation and three quaternionic differences, that is,

    {A^q_{j−1}} → {A^q_j, D^q_{j,p}, p = 1, 2, 3}.   (8.88)

Note that we do not use the idea of a mirror tree expressed in Eq. 8.76. As a result, the quaternionic wavelet tree is a compact and economical processing structure for n-dimensional multiresolution analysis. The procedure of quaternionic wavelet multiresolution analysis, depicted partially in Fig. 8.10, is as follows:
– (i) Take the 2D real signal at level j and convolve it with the scale and wavelet filters H^q_j and G^q_j along the rows of the 2D signal.
– (ii) Convolve H^q_j and G^q_j with the columns of the previous responses of the filters H^q_j and G^q_j.
– (iii) Subsample the responses of these filters by a factor of two (↓2).
– (iv) Take the real part of the approximation at level j as the input to the next level.
This process continues through all the levels j = 1, ..., n, repeating steps (i)–(iv). The theorem in Sect. 8.5.5 indicates that we do not need to take into consideration any quadrant other than the first one; thus, we do not need to create a mirror architecture or a dual tree as in the case of complex wavelet multiresolution analysis [109]. In Kingsbury's paper, the multiresolution architecture outputs four-element
“complex” vectors {a, b, c, d} = a + bi₁ + ci₂ + di₁i₂ that, according to the author, are not quaternions, as they have different algebraic properties. This structure is generated based on concepts of signal and filter theory. Even though the dual-tree wavelet scheme works correctly, the quaternion wavelet transform leads to an architecture that merges the two branches of the dual tree into one single quaternionic tree. The big advantage of the quaternionic wavelet tree is that, with the same amount of computational resources, it offers three phases for analysis using the phase concept. The method of applying the quaternionic phase concept is explained and illustrated with real experiments in Chap. 13.
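Steps (i)–(iv) above can be sketched as separable filtering with the real and imaginary parts of two complex 1D filters, one tied to i for the rows and one tied to j for the columns; the length-2 toy filter below is illustrative only, not a designed Gabor pair:

```python
import numpy as np

def filt_rows(img, f):
    """Convolve each row with the 1D filter f and subsample by two (step iii)."""
    out = np.apply_along_axis(lambda r: np.convolve(r, f, mode="same"), 1, img)
    return out[:, ::2]

def quat_subband(img, row_filt, col_filt):
    """One quaternionic subband (steps i-ii): rows are filtered with a complex
    filter tied to i, columns with one tied to j; the four real outputs are
    the 1, i, j, k components of the quaternionic response."""
    re_r = filt_rows(img, row_filt.real)       # even part along x
    im_r = filt_rows(img, row_filt.imag)       # odd part along x
    q1 = filt_rows(re_r.T, col_filt.real).T    # 1-part (even-even)
    qj = filt_rows(re_r.T, col_filt.imag).T    # j-part (even-odd)
    qi = filt_rows(im_r.T, col_filt.real).T    # i-part (odd-even)
    qk = filt_rows(im_r.T, col_filt.imag).T    # k-part (odd-odd)
    return q1, qi, qj, qk

# Toy complex length-2 filter (illustrative only):
hq = np.array([0.5 + 0.5j, 0.5 - 0.5j])
```

The quaternionic magnitude √(q1² + qi² + qj² + qk²) of each subband then feeds the three-phase analysis, and the real part of the approximation is passed on to the next level (step iv).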
8.5.7 The Tridimensional Clifford Wavelet Transform

In this section, we will use the similitude group SIM(3), denoted by G_s = R⁺ × SO(3) ⊗ R³ = {(s, r_θ, t) | s ∈ R⁺, r_θ ∈ SO(3), t ∈ R³}, where s stands for the dilation parameter, t for a translation vector, and θ for the SO(3) rotation parameters; see Mawardi and Hitzer [133] for a study of the Clifford algebra Cl₃,₀-valued wavelet transform. The group action of G_s on R³ in G₃ is represented in terms of rotors as follows:

    G_s: R³ → R³,   x ↦ sRxR̃ + t.   (8.89, 8.90)

The left Haar measure on G_s is given by

    dλ(s, θ, t) = dμ(s, θ) d³t,   (8.91)

where dμ(s, θ) = (ds/s⁴) dθ and dθ = (1/(8π²)) sin(θ₁) dθ₁ dθ₂ dθ₃. Now, let us define in the 3D geometric algebra G₃ framework a mother wavelet that can be transformed by the action of the similitude group G_s. The group action on the mother wavelet can be formulated in terms of a unitary linear operator:

    U_{s,θ,t}: L²(R³; G₃) → L²(G_s; G₃),
    ψ(x) ↦ U_{s,θ,t} ψ(x) = ψ_{s,θ,t}(x) = (1/s^{3/2}) ψ(r_θ^{−1}((x − t)/s)).   (8.92)

The family of wavelets ψ_{s,θ,t} is known as the daughter Clifford wavelets. Note that the normalization constant s^{−3/2} guarantees that the norm of ψ_{s,θ,t} is independent of s, namely,

    ‖ψ_{s,θ,t}‖_{L²(R³;G₃)} = ‖ψ‖_{L²(R³;G₃)}.   (8.93)
This can be proved easily, with the substitution u = r_θ^{−1}((x − t)/s), for which d³x = s³ det(r_θ) d³u = s³ d³u:

    ‖ψ_{s,θ,t}‖²_{L²(R³;G₃)} = ∫_{R³} Σ_M (1/s³) ψ²_M(r_θ^{−1}((x − t)/s)) d³x
                             = ∫_{R³} Σ_M (1/s³) ψ²_M(u) s³ det(r_θ) d³u
                             = ∫_{R³} Σ_M ψ²_M(u) d³u.

In the G₃ Clifford–Fourier domain, Eq. 8.92 can be represented as follows:

    F{ψ_{s,θ,t}}(w) = e^{−I₃ t·w} s^{3/2} ψ̂(s r_θ^{−1}(w)).   (8.94)

Substituting y = (x − t)/s in the argument of Eq. 8.92 under the Clifford–Fourier integral, we get

    F{ψ_{s,θ,t}}(w) = ∫_{R³} (1/s^{3/2}) ψ(r_θ^{−1}(y)) e^{−I₃ w·(t + sy)} s³ d³y
                    = e^{−I₃ t·w} s^{3/2} ∫_{R³} ψ(r_θ^{−1}(y)) e^{−I₃ sw·y} d³y
                    = e^{−I₃ t·w} s^{3/2} ψ̂(s r_θ^{−1}(w)).   (8.95)

We will call ψ ∈ L²(R³; G₃) an admissible wavelet if

    C_ψ = ∫_{R⁺} ∫_{SO(3)} s³ [ψ̂(s r_θ^{−1}(w))]˜ ψ̂(s r_θ^{−1}(w)) dμ(s, θ)   (8.96)

is an invertible multivector constant and finite at any w ∈ R³, where [·]˜ denotes reversion. We will see later that the admissibility condition is important to guarantee that the Clifford wavelet transform is invertible. Note that for w = 0, ψ̂(0) = ∫_{R³} ψ(x) e^{−I₃ 0·x} d³x = ∫_{R³} ψ(x) d³x must vanish for the scalar part of C_ψ to be finite. Thus, similar to the classical real-valued wavelets, an admissible Clifford-valued mother wavelet ψ ∈ L²(R³; G₃) ought to satisfy

    ∫_{R³} ψ(x) d³x = Σ_M ∫_{R³} ψ_M(x) e_M d³x = 0,   (8.97)

where the ψ_M(x) are real-valued wavelets. This means that the integral of every component of the Clifford mother wavelet is zero, that is, ∫_{R³} ψ_M(x) d³x = 0.

The 3-dimensional Clifford wavelet transform (3D-CWT) with respect to the mother wavelet ψ ∈ L²(R³; G₃) is given by

    T_ψ: L²(R³; G₃) → L²(G_s; G₃),
    f ↦ T_ψ f(s, θ, t) = ∫_{R³} f(x) [ψ_{s,θ,t}(x)]˜ d³x = (f, ψ_{s,θ,t})_{L²(R³;G₃)}.   (8.98)

The Clifford wavelet transform of Eq. 8.98 has a Clifford–Fourier representation given by the following expression:

    T_ψ f(s, θ, t) = (1/(2π)³) ∫_{R³} f̂(w) s^{3/2} [ψ̂(s r_θ^{−1}(w))]˜ e^{I₃ t·w} d³w.   (8.99)

Finally, the inverse tridimensional Clifford wavelet transform (3D-ICWT) is given by the following expression:

    f(x) = ∫_{G_s} T_ψ f(s, θ, t) ψ_{s,θ,t} C_ψ^{−1} dμ d³t
         = ∫_{G_s} (f, ψ_{s,θ,t})_{L²(R³;G₃)} ψ_{s,θ,t} C_ψ^{−1} dμ d³t,   (8.100)

where C_ψ is given by Eq. 8.96.
8.5.8 The Continuous Conformal Geometric Algebra Wavelet Transform In this section, we present the continuous conformal geometric algebra wavelet transform (CGAWT) on the sphere S n1 based on the conformal group of the sphere Gc . A possible description of the group is done in terms of a projective identification of the points of the Euclidean space Rn with rays in the null cone in RnC1;1 . As shown in Sect. 6.4, in conformal geometric algebra GnC1;1 , in general, the conformal transformation can be expressed as a composite of versors for transversion, translation, and rotation as follows: G D D K b T aR˛ ;
(8.101)
as a result, due to the multiplicative nature of the versors, we can avoid complex nonlinear algorithms. In contrast, Cerejeiras et al. [34] used the M¨obius transformation in Rn for the formulation of the continuous wavelet transform and wavelet frames on the sphere. This transformation is expressed in a nonlinear manner as a ratio: ma .x/ D
.x a/ ; a 2 Rn ; jaj < 1: .1 C ax/
(8.102)
In this study, we will consider the space of the square-integrable multivectorvalued functions on the sphere, the space L2 .S n1 /. In this space, the inner product and the norm are defined as follows: Z f .x/g.x/dS.x/; hf; giL2 D S n1 Z ˝ ˛ f .x/f .x/ 0 dS.x/; (8.103) jjf jj2 D 2n S n1
8.5 From Real to Clifford Wavelet Transforms for Multiresolution Analysis
where $\langle\cdot\rangle_0$ stands for the scalar component of the multivector and $\mathrm{d}S(\mathbf{x})$ is the normalized spin($n$)-invariant measure on $S^{n-1}$. We consider the following unitary operators acting on a multivector function $\psi \in L^2(S^{n-1})$: the rotor $R$ and the dilator $D$, with $R, D \in G_{n+1,1}$,

$$\psi(\mathbf{x}) \;\to\; \psi_{\lambda,\theta}(\mathbf{x}) = D\,\psi(\widetilde{R}\,\mathbf{x}\,R). \qquad (8.104)$$

In general, the continuous conformal geometric algebra wavelet transform (CCGAWT) with respect to the mother wavelet $\psi(\mathbf{x}) \in L^2(S^{n-1})$ is given by

$$T_\psi : L^2(S^{n-1}; G_{n+1,1}) \to L^2(G_c; G_{n+1,1}),$$
$$f \mapsto T_\psi f(\lambda,\theta) = \int_{S^{n-1}} f(\mathbf{x})\, \overline{\psi_{\lambda,\theta}(\mathbf{x})}\, \mathrm{d}S(\mathbf{x}) = \big( f, \psi_{\lambda,\theta} \big)_{L^2(S^{n-1}; G_{n+1,1})}. \qquad (8.105)$$
Wiaux et al. [194] proved the correspondence principle between spherical wavelets and Euclidean wavelets by applying the inverse stereographic projection of a wavelet on the plane. Thus, typical functions and wavelets, such as the 2D Gauss function and the 2D Gabor function, can be carried onto the 2-sphere:

$$\psi_{\mathrm{Gauss}}(\mathbf{x}) = e^{-|\mathbf{x}|^2} \;\to\; \psi_{\mathrm{Gauss}}(\theta,\varphi) = \frac{e^{-\tan^2(\theta/2)}}{1+\cos\theta},$$
$$\psi_{G}(\mathbf{x}) = e^{i\mathbf{k}_0\cdot\mathbf{x}}\, e^{-\frac{1}{2}|\mathbf{x}|^2} \;\to\; \psi_{G}(\theta,\varphi) = \frac{e^{i k_0 \tan(\theta/2)\cos(\varphi_0-\varphi)}\, e^{-\frac{1}{2}\tan^2(\theta/2)}}{1+\cos\theta}. \qquad (8.106)$$
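This correspondence principle is easy to probe numerically. The sketch below lifts a planar, scalar-valued wavelet to the sphere via the inverse stereographic parameterization $\mathbf{x} = \tan(\theta/2)(\cos\varphi, \sin\varphi)$ together with the conformal weight $1/(1+\cos\theta)$; the function name and normalization are illustrative assumptions, not the book's multivector construction.

```python
import numpy as np

def lift_to_sphere(psi_plane, theta, phi):
    """Lift a planar wavelet to S^2 by inverse stereographic projection.

    Plane points are parameterized as x = tan(theta/2)(cos phi, sin phi);
    the factor 1/(1 + cos theta) is the conformal weight of the lift.
    Scalar-valued illustration only (an assumption of this sketch).
    """
    r = np.tan(theta / 2.0)
    x = np.stack([r * np.cos(phi), r * np.sin(phi)], axis=-1)
    return psi_plane(x) / (1.0 + np.cos(theta))

# planar 2D Gaussian, psi(x) = exp(-|x|^2)
gauss = lambda x: np.exp(-np.sum(x**2, axis=-1))

theta = np.linspace(0.01, np.pi - 0.01, 64)   # colatitude samples
phi = np.zeros_like(theta)
vals = lift_to_sphere(gauss, theta, phi)
```

Near the north pole ($\theta \to 0$) the lifted Gaussian approaches $1/2$, and it decays to zero at the south pole, as the stereographic picture suggests.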
8.5.9 The n-Dimensional Clifford Wavelet Transform

The similitude group of $\mathbb{R}^n$, SIM($n$), is $G_s = \mathbb{R}^+ \times SO(n) \otimes \mathbb{R}^n = \{(s, r_\theta, \mathbf{t})\,|\, s \in \mathbb{R}^+,\; r_\theta \in SO(n),\; \mathbf{t} \in \mathbb{R}^n\}$, where $s$ stands for the dilation parameter, $\mathbf{t}$ for a translation vector, and $\theta$ for the $SO(n)$ rotation parameters. The n-dimensional Clifford wavelet transform (nD-CWT) with respect to the mother wavelet $\psi \in L^2(\mathbb{R}^n; G_n)$ is given by

$$T_\psi : L^2(\mathbb{R}^n; G_n) \to L^2(G_s; G_n),$$
$$f \mapsto T_\psi f(s,\theta,\mathbf{t}) = \int_{\mathbb{R}^n} f(\mathbf{x})\,\overline{\psi_{s,\theta,\mathbf{t}}(\mathbf{x})}\,\mathrm{d}^n\mathbf{x} = \big( f, \psi_{s,\theta,\mathbf{t}} \big)_{L^2(\mathbb{R}^n; G_n)}. \qquad (8.107)$$
The Clifford wavelet transform of Eq. 8.107 has a Clifford–Fourier representation given by the following expression:

$$T_\psi f(s,\theta,\mathbf{t}) = \frac{1}{(2\pi)^n} \int_{\mathbb{R}^n} \hat{f}(\mathbf{w})\, s^{\frac{n}{2}} \big\{ \overline{\hat{\psi}(s\, r_{-\theta}(\mathbf{w}))} \big\}\, e^{I_n\, \mathbf{t}\cdot\mathbf{w}}\, \mathrm{d}^n\mathbf{w}. \qquad (8.108)$$
Finally, the inverse n-dimensional Clifford wavelet transform (nD-ICWT) is given by the following expression:

$$f(\mathbf{x}) = \int_{G_s} T_\psi f(s,\theta,\mathbf{t})\, \psi_{s,\theta,\mathbf{t}}\, C_\psi^{-1}\, \mathrm{d}\mu\, \mathrm{d}^n\mathbf{t} = \int_{G_s} \big( f, \psi_{s,\theta,\mathbf{t}} \big)_{L^2(\mathbb{R}^n; G_n)}\, \psi_{s,\theta,\mathbf{t}}\, C_\psi^{-1}\, \mathrm{d}\mu\, \mathrm{d}^n\mathbf{t}, \qquad (8.109)$$

where the admissibility constant $C_\psi$ is now given, for $n$ dimensions, by

$$C_\psi = \int_{\mathbb{R}^+} \int_{SO(n)} s^n\, \overline{\hat{\psi}(s\, r_{-\theta}(\mathbf{w}))}\, \hat{\psi}(s\, r_{-\theta}(\mathbf{w}))\, \mathrm{d}\mu. \qquad (8.110)$$
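The Fourier-domain formula of Eq. 8.108 can be sketched numerically in the scalar-valued case, replacing the pseudoscalar $I_n$ by the ordinary imaginary unit and taking $n = 2$; the Mexican-hat mother wavelet and grid sizes below are illustrative choices, not the book's.

```python
import numpy as np

def cwt2d(f, s, theta):
    """Scalar 2D continuous wavelet transform, evaluated over all
    translations t at once via the Fourier formula of Eq. 8.108
    (pseudoscalar replaced by 1j).  Mother wavelet: Mexican hat,
    psi_hat(w) = |w|^2 exp(-|w|^2 / 2)."""
    n = f.shape[0]
    fhat = np.fft.fft2(f)
    w = 2 * np.pi * np.fft.fftfreq(n)
    wx, wy = np.meshgrid(w, w, indexing="ij")
    # rotate the frequencies by -theta, then dilate by s
    c, si = np.cos(theta), np.sin(theta)
    u, v = s * (c * wx + si * wy), s * (-si * wx + c * wy)
    psihat = (u**2 + v**2) * np.exp(-(u**2 + v**2) / 2)
    # s^{n/2} = s for n = 2; the inverse FFT supplies the e^{i t.w} factor
    return np.fft.ifft2(fhat * s * np.conj(psihat)).real

# the response to an isotropic Gaussian blob peaks at the blob's centre
n = 64
y, x = np.mgrid[0:n, 0:n]
f = np.exp(-((x - 40) ** 2 + (y - 24) ** 2) / 8.0)
resp = cwt2d(f, s=2.0, theta=0.0)
peak = np.unravel_index(np.argmax(resp), resp.shape)  # (row 24, col 40)
```

Because the wavelet is admissible (zero mean), the transform responds to localized structure rather than to the signal's DC level.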
8.6 Conclusion

This chapter has shown that low-level image processing improves if signal representation and processing are carried out in a system with rich algebraic properties, such as geometric algebra. In this system, an n-dimensional representation of 2D signals unveils properties that are otherwise obscured when the algorithms are developed using matrix algebra over the real or complex field. Strikingly, the bivector algebra of the geometric algebras allows us to disentangle the symmetries of 2D signals, as in the case of the quaternionic Fourier and wavelet transforms. The chapter has also presented different Clifford–Fourier and wavelet transforms in various dimensions and metrics. This opens up an area for the design and implementation of new filters, convolutions on the sphere, and estimators for the analysis of nD signals over a much wider scope.
Chapter 9
Geometric Algebra of Computer Vision
9.1 Introduction

This chapter presents a mathematical approach based on geometric algebra for the computation of problems in computer vision. We show that geometric algebra is a well-founded and elegant language for expressing and implementing those aspects of linear algebra and projective geometry that are useful for computer vision. Since geometric algebra offers both geometric insight and algebraic computational power, it is useful for tasks such as the computation of projective invariants, camera calibration, and the recovery of shape and motion. We mainly focus on the geometry of multiple uncalibrated cameras and omnidirectional vision. The following section introduces 3D and 4D geometric algebras and formulates the aspects of projective geometry relevant to computer vision within the geometric algebra framework. Given this background, in Sects. 9.2.4–9.2.5, we look at the concepts of projective transformations and the projective split. Section 9.3 presents the algebra of incidence, and Sect. 9.4 the algebra in projective space of points, lines, and planes. An analysis of monocular, binocular, and trinocular geometries is given in Sect. 9.6. We dedicate the following sections to omnidirectional vision, using, however, the conformal geometric algebra framework. The motivation for resorting to this framework is that mirrors can be represented by parameterized spheres, so the computations can be greatly simplified. Conclusions follow in the final section. In this chapter, vectors are notated in boldface type (except for basis vectors) and multivectors appear in bold italics. Lowercase letters denote vectors in the 3D Euclidean space, and uppercase letters denote vectors in the 4D projective space. We also use the notation $G_{p,q,r}$ for an n-dimensional geometric algebra in which $p$ basis vectors square to $+1$, $q$ basis vectors square to $-1$, and $r$ basis vectors square to $0$, so that $p + q + r = n$.
9.2 The Geometric Algebras of 3D and 4D Spaces

The need for a mathematical framework to understand and process digital camera images of the 3D world prompted researchers in the late 1970s to use projective geometry. By using homogeneous coordinates, we were able to embed both 3D
Euclidean visual space in the projective space P 3 or R4 , and the 2D Euclidean space of the image plane in the projective space P 2 or R3 . As a result, inherently nonlinear projective transformations from 3D space to the 2D image space now became linear, and points and directions could be differentiated rather than being represented by the same quantity. The use of projective geometry was indeed a step forward. However, there is still a need [88] for a mathematical system that reconciles projective geometry with multilinear algebra. Indeed, in most of the computer-vision literature these mathematical systems are divorced from one another. Depending on the problem at hand, researchers typically resort to different systems, for example, dual algebra [32] for incidence algebra and the Hamiltonian formulation for motion estimation [198]. Here, we suggest the use of a system that offers all of these mathematical facilities. Unlike matrix and tensor algebra, geometric algebra does not obscure the underlying geometry of the problem. We will, therefore, formulate the main aspects of such problems using geometric algebra, starting with the modeling of 3D visual space and the 2D image plane.
9.2.1 3D Space and the 2D Image Plane To introduce the basic geometric models in computer vision, we consider the imaging of a point X 2 R4 into a point x 2 R3 . We assume that the reader is familiar with the basic concepts of using homogeneous coordinates, which are discussed in greater detail in later sections. The optical center, C , of the camera may be different from the origin of the world coordinate system, O, as depicted in Fig. 9.1.
Fig. 9.1 Pinhole camera model
In the standard matrix representation, the mapping $P : \mathbf{X} \to \mathbf{x}$ is expressed by the homogeneous transformation matrix

$$P = \begin{bmatrix} t_{11} & t_{12} & t_{13} & t_{14} \\ t_{21} & t_{22} & t_{23} & t_{24} \\ t_{31} & t_{32} & t_{33} & t_{34} \end{bmatrix}, \qquad (9.1)$$
which may be decomposed into a product of three matrices,

$$P = K\, P_0\, M_{0c}, \qquad (9.2)$$

where $P_0$, $K$, and $M_{0c}$ will now be defined. $P_0$ is the $3 \times 4$ matrix

$$P_0 = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}, \qquad (9.3)$$

which simply projects down from 4D to 3D and represents a projection from homogeneous coordinates of space to homogeneous coordinates of the image plane. $M_{0c}$ is the $4 \times 4$ matrix containing the rotation and translation that take the world frame $F_0$ to the camera frame $F_c$, and is given explicitly by

$$M_{0c} = \begin{bmatrix} R & \mathbf{t} \\ \mathbf{0}^T & 1 \end{bmatrix}. \qquad (9.4)$$

This Euclidean transformation is described by the extrinsic parameters of rotation ($3 \times 3$ matrix $R$) and translation ($3 \times 1$ vector $\mathbf{t}$). Finally, the $3 \times 3$ matrix $K$ expresses the assumed camera model as an affine transformation between the camera plane and the image coordinate system, so that $K$ is an upper triangular matrix. In the case of the perspective (or pinhole) camera, the matrix $K$, which we now call $K_p$, is given by

$$K_p = \begin{bmatrix} \alpha_u & \gamma & u_0 \\ 0 & \alpha_v & v_0 \\ 0 & 0 & 1 \end{bmatrix}. \qquad (9.5)$$
The five parameters in Kp represent the camera parameters of scaling, shift, and rotation in the camera plane. In this case, the distance from the optical center to the image plane is finite. In later sections, we formulate the perspective camera in the geometric algebra framework. One important task in computer vision is to estimate the matrix of intrinsic camera parameters Kp and the rigid motion given in M0c , in order to be able to reconstruct 3D data from image sequences.
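The decomposition of Eq. 9.2 can be sketched directly in code; the intrinsic values below (focal scalings, principal point) are arbitrary illustrative numbers.

```python
import numpy as np

def projection_matrix(K, R, t):
    """Compose the 3x4 camera matrix P = K P0 M0c of Eq. 9.2.
    K: 3x3 intrinsics (Eq. 9.5); R: 3x3 world-to-camera rotation and
    t: translation 3-vector (the extrinsics of Eq. 9.4)."""
    P0 = np.hstack([np.eye(3), np.zeros((3, 1))])       # Eq. 9.3
    M0c = np.vstack([np.hstack([R, t.reshape(3, 1)]),   # Eq. 9.4
                     [0.0, 0.0, 0.0, 1.0]])
    return K @ P0 @ M0c

def project(P, X):
    """Project a homogeneous world point X (4-vector) to pixel coordinates."""
    x = P @ X
    return x[:2] / x[2]

K = np.array([[800.0, 0.0, 320.0],    # alpha_u, skew, u0
              [0.0, 800.0, 240.0],    # alpha_v, v0
              [0.0, 0.0, 1.0]])
R, t = np.eye(3), np.zeros(3)
P = projection_matrix(K, R, t)
# a point on the optical axis lands at the principal point (320, 240)
pix = project(P, np.array([0.0, 0.0, 2.0, 1.0]))
```

With identity extrinsics, the map reduces to the pure pinhole model, which makes the principal-point check above easy to read off.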
9.2.2 The Geometric Algebra of 3D Euclidean Space

The 3D space is spanned by three basis vectors $\{\sigma_1, \sigma_2, \sigma_3\}$, with $\sigma_i^2 = +1$ for all $i = 1, 2, 3$, and the 3D geometric algebra generated by these basis vectors has $2^3 = 8$ elements:

$$\underbrace{1}_{\text{scalar}},\quad \underbrace{\{\sigma_1, \sigma_2, \sigma_3\}}_{\text{vectors}},\quad \underbrace{\{\sigma_1\sigma_2,\, \sigma_2\sigma_3,\, \sigma_3\sigma_1\}}_{\text{bivectors}},\quad \underbrace{\{\sigma_1\sigma_2\sigma_3\} \equiv I}_{\text{trivector}}. \qquad (9.6)$$

Here, bivectors can be interpreted as oriented areas and trivectors as oriented volumes. Note that we will not use bold for these basis vectors. The highest-grade element is a trivector called the unit pseudoscalar. It can easily be verified that the pseudoscalar $\sigma_1\sigma_2\sigma_3$ squares to $-1$ and commutes with all multivectors (a multivector is a general linear combination of any of the elements of the algebra) in the 3D space. The unit pseudoscalar $I$ is crucial when discussing duality. In a three-dimensional space, we can construct a trivector $a\wedge b\wedge c$, but no 4-vectors exist, since there is no possibility of sweeping the volume element $a\wedge b\wedge c$ over a fourth dimension. The three basis vectors $\{\sigma_i\}$ multiplied by $I$ give the following basis bivectors:

$$I\sigma_1 = \sigma_2\sigma_3, \qquad I\sigma_2 = \sigma_3\sigma_1, \qquad I\sigma_3 = \sigma_1\sigma_2. \qquad (9.7)$$
If we identify $i, j, k$ of the quaternion algebra with the bivectors $\sigma_3\sigma_2$, $\sigma_1\sigma_3$, and $\sigma_2\sigma_1$ (the reverses of $\sigma_2\sigma_3$, $\sigma_3\sigma_1$, $\sigma_1\sigma_2$), we recover the famous Hamilton relations:

$$i^2 = j^2 = k^2 = ijk = -1. \qquad (9.8)$$

In geometric algebra, a rotor $R$ is an even-grade element of the algebra that satisfies the equation $R\widetilde{R} = 1$. The relation between quaternions and rotors is as follows: if $Q = \{q_0, q_1, q_2, q_3\}$ represents a quaternion, then the rotor that performs the same rotation is simply given by

$$R = q_0 + q_1\,(I\sigma_1) - q_2\,(I\sigma_2) + q_3\,(I\sigma_3). \qquad (9.9)$$

The quaternion algebra is, therefore, seen to be a subset of the geometric algebra of three-dimensional space.
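These relations are easy to verify numerically: the Pauli matrices furnish a faithful matrix representation of $G_{3,0,0}$ (a standard fact, used here only as a checking device), with the quaternion units taken as the reversed bivectors $\sigma_3\sigma_2$, $\sigma_1\sigma_3$, $\sigma_2\sigma_1$ so that $ijk = -1$ comes out with the usual sign.

```python
import numpy as np

# Pauli matrices: a matrix representation of the 3D Euclidean GA basis
s1 = np.array([[0, 1], [1, 0]], dtype=complex)
s2 = np.array([[0, -1j], [1j, 0]], dtype=complex)
s3 = np.array([[1, 0], [0, -1]], dtype=complex)
Id = np.eye(2, dtype=complex)

# the pseudoscalar I = s1 s2 s3 squares to -1 and commutes with the basis
I = s1 @ s2 @ s3
assert np.allclose(I @ I, -Id)
assert np.allclose(I @ s1, s1 @ I)

# quaternion units as reversed basis bivectors
i, j, k = s3 @ s2, s1 @ s3, s2 @ s1
for q in (i, j, k):
    assert np.allclose(q @ q, -Id)   # i^2 = j^2 = k^2 = -1
assert np.allclose(i @ j @ k, -Id)   # ijk = -1
```

Running the block silently confirms Eq. 9.8; with the unreversed bivectors, the product $ijk$ would instead come out as $+1$, which is why the ordering matters.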
9.2.3 A 4D Geometric Algebra for Projective Space

For the modeling of the image plane, we use $G_{3,0,0}$, which has the standard Euclidean signature. We will show that if we choose to map between projective space and 3D Euclidean space via the projective split (see Sect. 9.2.5), we are then forced to use the 4D geometric algebra $G_{1,3,0}$ for $P^3$. The Lorentzian metric we are using here has no adverse effects on the operations we outline in this chapter. However, we briefly discuss in a later section how a $\{+ + + +\}$ metric for our 4D space and a different split is being favored in recent research. The Lorentzian 4D algebra has as its vector basis $\gamma_1, \gamma_2, \gamma_3, \gamma_4$, where $\gamma_4^2 = +1$ and $\gamma_k^2 = -1$ for $k = 1, 2, 3$. This then generates the following multivector basis:

$$\underbrace{1}_{\text{scalar}},\;\; \underbrace{\gamma_k}_{4\ \text{vectors}},\;\; \underbrace{\gamma_2\gamma_3,\, \gamma_3\gamma_1,\, \gamma_1\gamma_2,\, \gamma_4\gamma_1,\, \gamma_4\gamma_2,\, \gamma_4\gamma_3}_{6\ \text{bivectors}},\;\; \underbrace{I\gamma_k}_{4\ \text{trivectors}},\;\; \underbrace{I}_{\text{pseudoscalar}}. \qquad (9.10)$$

The pseudoscalar is $I = \gamma_1\gamma_2\gamma_3\gamma_4$, with

$$I^2 = (\gamma_1\gamma_2\gamma_3\gamma_4)(\gamma_1\gamma_2\gamma_3\gamma_4) = (\gamma_1\gamma_2)(\gamma_1\gamma_2)(\gamma_3\gamma_4)(\gamma_3\gamma_4) = -1. \qquad (9.11)$$
The fourth basis vector, 4 , can also be seen as a selected direction for the projective split [14] operation in 4D. We will see shortly that by carrying out the geometric product via 4 , we can associate bivectors of our 4D space with vectors of our 3D space. The role and use of the projective split operation is treated in more detail in a later section.
9.2.4 Projective Transformations

Historically, the success of homogeneous coordinates has partly been due to their ability to represent a general displacement as a single $4 \times 4$ matrix and to linearize nonlinear transformations [57]. The following equations indicate how a projective transformation may be linearized by going up one dimension in the GA framework. In general, a point $(x, y, z)$ in the 3D space is projected onto the image via a transformation of the form

$$x' = \frac{\alpha_1 x + \beta_1 y + \delta_1 z + \epsilon_1}{\tilde{\alpha} x + \tilde{\beta} y + \tilde{\delta} z + \tilde{\epsilon}}, \qquad y' = \frac{\alpha_2 x + \beta_2 y + \delta_2 z + \epsilon_2}{\tilde{\alpha} x + \tilde{\beta} y + \tilde{\delta} z + \tilde{\epsilon}}. \qquad (9.12)$$

This transformation, which is expressed as the ratio of two linear transformations, is indeed nonlinear. In order to convert this nonlinear transformation in $E^3$ into a linear transformation in $R^4$, we define a linear function $f_p$ by mapping vectors onto vectors in $R^4$ such that the action of $f_p$ on the basis vectors $\{\gamma_i\}$ is given by

$$f_p(\gamma_1) = \alpha_1\gamma_1 + \alpha_2\gamma_2 + \alpha_3\gamma_3 + \tilde{\alpha}\gamma_4,$$
$$f_p(\gamma_2) = \beta_1\gamma_1 + \beta_2\gamma_2 + \beta_3\gamma_3 + \tilde{\beta}\gamma_4,$$
$$f_p(\gamma_3) = \delta_1\gamma_1 + \delta_2\gamma_2 + \delta_3\gamma_3 + \tilde{\delta}\gamma_4,$$
$$f_p(\gamma_4) = \epsilon_1\gamma_1 + \epsilon_2\gamma_2 + \epsilon_3\gamma_3 + \tilde{\epsilon}\gamma_4. \qquad (9.13)$$
When we use homogeneous coordinates, a general point $P$ in $E^3$ given by $\mathbf{x} = x\sigma_1 + y\sigma_2 + z\sigma_3$ becomes the point $\mathbf{X} = X\gamma_1 + Y\gamma_2 + Z\gamma_3 + W\gamma_4$ in $R^4$, where $x = X/W$, $y = Y/W$, and $z = Z/W$. Now, using $f_p$, the linear map of $\mathbf{X}$ onto $\mathbf{X}'$ is given by

$$\mathbf{X}' = \sum_{i=1}^{3} \big(\alpha_i X + \beta_i Y + \delta_i Z + \epsilon_i W\big)\,\gamma_i + \big(\tilde{\alpha} X + \tilde{\beta} Y + \tilde{\delta} Z + \tilde{\epsilon} W\big)\,\gamma_4. \qquad (9.14)$$

The coordinates of the vector $\mathbf{x}' = x'\sigma_1 + y'\sigma_2 + z'\sigma_3$ in $E^3$ which correspond to $\mathbf{X}'$ are given by

$$x' = \frac{\alpha_1 X + \beta_1 Y + \delta_1 Z + \epsilon_1 W}{\tilde{\alpha} X + \tilde{\beta} Y + \tilde{\delta} Z + \tilde{\epsilon} W} = \frac{\alpha_1 x + \beta_1 y + \delta_1 z + \epsilon_1}{\tilde{\alpha} x + \tilde{\beta} y + \tilde{\delta} z + \tilde{\epsilon}}, \qquad (9.15)$$

and similarly,

$$y' = \frac{\alpha_2 x + \beta_2 y + \delta_2 z + \epsilon_2}{\tilde{\alpha} x + \tilde{\beta} y + \tilde{\delta} z + \tilde{\epsilon}}, \qquad z' = \frac{\alpha_3 x + \beta_3 y + \delta_3 z + \epsilon_3}{\tilde{\alpha} x + \tilde{\beta} y + \tilde{\delta} z + \tilde{\epsilon}}. \qquad (9.16)$$

If the above represents projection from the world onto a camera image plane, we should take into account the focal length of the camera. This would require that $\alpha_3 = f\tilde{\alpha}$, $\beta_3 = f\tilde{\beta}$, etc. Thus, we can define $z' = f$ (the focal length) independently of the point chosen. The nonlinear transformation in $E^3$ then becomes a linear transformation, $f_p$, in $R^4$. The linear function $f_p$ can then be used to prove the invariant nature of various quantities under projective transformations [117].
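The linearization trick reads directly as code: lift to homogeneous coordinates, apply one matrix, divide by $W$. The coefficient matrix below is an arbitrary illustrative choice, not a calibrated camera.

```python
import numpy as np

def apply_projective(M, p):
    """Apply a linearized projective map (in the spirit of Eq. 9.13) to a
    point of E^3: lift p to homogeneous X = (x, y, z, 1), multiply by the
    4x4 coefficient matrix (row i holds alpha_i, beta_i, delta_i, eps_i,
    the last row the tilde coefficients), then divide by W."""
    X = np.append(np.asarray(p, dtype=float), 1.0)
    Xp = M @ X
    return Xp[:3] / Xp[3]

# a hypothetical coefficient matrix for illustration
M = np.array([[2.0, 0.0, 0.0, 1.0],
              [0.0, 1.0, 3.0, 0.0],
              [0.0, 0.0, 1.0, 0.0],
              [1.0, 0.0, 0.0, 2.0]])
p = np.array([1.0, 2.0, 3.0])
xp = apply_projective(M, p)   # the nonlinear ratio of Eq. 9.12, computed linearly
```

Evaluating the ratios of Eq. 9.12 by hand for this matrix gives $x' = 3/3 = 1$, $y' = 11/3$, $z' = 3/3 = 1$, which the linear route reproduces.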
9.2.5 The Projective Split

The idea of the projective split was introduced by Hestenes [88] in order to connect projective geometry and metric geometry. This is done by associating the even subalgebra of $G_{n+1}$ with the geometric algebra of the next lower dimension, $G_n$. One can define a mapping between the spaces by choosing a preferred direction in $G_{n+1}$, $\gamma_{n+1}$. Then, by taking the geometric product of a vector $X \in G_{n+1}$ and $\gamma_{n+1}$,

$$X\gamma_{n+1} = X\cdot\gamma_{n+1} + X\wedge\gamma_{n+1} = X\cdot\gamma_{n+1}\left(1 + \frac{X\wedge\gamma_{n+1}}{X\cdot\gamma_{n+1}}\right), \qquad (9.17)$$

the vector $\mathbf{x} \in G_n$ can be associated with the bivector $\frac{X\wedge\gamma_{n+1}}{X\cdot\gamma_{n+1}} \in G_{n+1}$. This result can be projectively interpreted as the pencil of all lines passing through the point $\gamma_{n+1}$. In physics, the projective split is called the space–time split, and it relates a space–time system $G_4$ with a Minkowski metric to an observable system $G_3$ with a Euclidean metric.
In computer vision, we are interested in relating elements of projective space with their associated elements in the Euclidean space of the image plane. Optical rays (bivectors) are mapped to points (vectors), optical planes (trivectors) are mapped to lines (bivectors), and optical volumes (4-vectors) to planes (trivector or pseudoscalar). Suppose we choose $\gamma_4$ as the selected direction in $R^4$. We can then define a mapping that associates the bivectors $\gamma_i\gamma_4$, $i = 1, 2, 3$, in $R^4$ with the vectors $\sigma_i$, $i = 1, 2, 3$, in $E^3$:

$$\sigma_1 \equiv \gamma_1\gamma_4, \qquad \sigma_2 \equiv \gamma_2\gamma_4, \qquad \sigma_3 \equiv \gamma_3\gamma_4. \qquad (9.18)$$

Note that in order to preserve the Euclidean structure of the spatial vectors $\{\sigma_i\}$ (i.e., $\sigma_i^2 = +1$), we are forced to choose a non-Euclidean metric for the basis vectors in $R^4$. That is why we select the basis $\gamma_4^2 = +1$, $\gamma_i^2 = -1$, $i = 1, 2, 3$, for $G_{1,3,0}$. This is precisely the metric structure of the Lorentzian space–time used in studies of relativistic physics. We note here that although we have chosen to relate our spaces via the projective split, it is possible to use a Euclidean metric $\{+ + + +\}$ for our 4D space and define the split using reciprocal vectors [150]. It is becoming apparent that this is the preferred procedure, since it generalizes nicely to splits from higher-dimensional spaces. However, for the problems discussed in this chapter, we encounter no problems by using the projective split. Let us now see how we associate points via the projective split. For a vector $X = X_1\gamma_1 + X_2\gamma_2 + X_3\gamma_3 + X_4\gamma_4$ in $R^4$, the projective split is obtained by taking the geometric product of $X$ and $\gamma_4$:

$$X\gamma_4 = X\cdot\gamma_4 + X\wedge\gamma_4 = X\cdot\gamma_4\left(1 + \frac{X\wedge\gamma_4}{X\cdot\gamma_4}\right) \equiv X\cdot\gamma_4\,(1 + \mathbf{x}). \qquad (9.19)$$

According to Eq. 9.18, we can associate $X\wedge\gamma_4/(X\cdot\gamma_4)$ in $R^4$ with the vector $\mathbf{x}$ in $E^3$. Similarly, if we start with a vector $\mathbf{x} = x_1\sigma_1 + x_2\sigma_2 + x_3\sigma_3$ in $E^3$, we represent it in $R^4$ by the vector $X = X_1\gamma_1 + X_2\gamma_2 + X_3\gamma_3 + X_4\gamma_4$ such that

$$\mathbf{x} = \frac{X\wedge\gamma_4}{X\cdot\gamma_4} = \frac{X_1}{X_4}\gamma_1\gamma_4 + \frac{X_2}{X_4}\gamma_2\gamma_4 + \frac{X_3}{X_4}\gamma_3\gamma_4 = \frac{X_1}{X_4}\sigma_1 + \frac{X_2}{X_4}\sigma_2 + \frac{X_3}{X_4}\sigma_3, \qquad (9.20)$$

which implies $x_i = X_i/X_4$ for $i = 1, 2, 3$. This manner of representing $\mathbf{x}$ in a higher-dimensional space can, therefore, be seen to be equivalent to using homogeneous coordinates $X$ for $\mathbf{x}$. Let us now look at the representation of a line $L$ in $R^4$. A line is given by the outer product of two vectors:

$$L = A\wedge B = (L_{14}\,\gamma_1\gamma_4 + L_{24}\,\gamma_2\gamma_4 + L_{34}\,\gamma_3\gamma_4) + (L_{23}\,\gamma_2\gamma_3 + L_{31}\,\gamma_3\gamma_1 + L_{12}\,\gamma_1\gamma_2)$$
$$\phantom{L = A\wedge B} = (L_{14}\,\gamma_1\gamma_4 + L_{24}\,\gamma_2\gamma_4 + L_{34}\,\gamma_3\gamma_4) + I\,(L_{23}\,\gamma_1\gamma_4 + L_{31}\,\gamma_2\gamma_4 + L_{12}\,\gamma_3\gamma_4) = \mathbf{n} + I\mathbf{m}. \qquad (9.21)$$
The six quantities $\{n_i, m_i\}$, $i = 1, 2, 3$, are precisely the Plücker coordinates of the line. The quantities $\{L_{14}, L_{24}, L_{34}\}$ are the coefficients of the spatial part of the bivector, which represents the line direction $\mathbf{n}$. The quantities $\{L_{23}, L_{31}, L_{12}\}$ are the coefficients of the non-spatial part of the bivector, which represents the moment $\mathbf{m}$ of the line. Let us now see how we can relate this line representation to an $E^3$ representation via the projective split. We take a line $L$, joining points $A$ and $B$:

$$L = A\wedge B = \langle AB\rangle_2 = \langle A\,\gamma_4\gamma_4\,B\rangle_2. \qquad (9.22)$$

Here, the notation $\langle M\rangle_k$ tells us to take the grade-$k$ part of the multivector $M$. Now, using our previous expansions of $X\gamma_4$ in the projective split for vectors, we can write

$$L = (A\cdot\gamma_4)(B\cdot\gamma_4)\,\langle(1 + \mathbf{a})(1 - \mathbf{b})\rangle_2, \qquad (9.23)$$

where $\mathbf{a} = \frac{A\wedge\gamma_4}{A\cdot\gamma_4}$ and $\mathbf{b} = \frac{B\wedge\gamma_4}{B\cdot\gamma_4}$ are the $E^3$ representations of $A$ and $B$. Writing $A_4 = A\cdot\gamma_4$ and $B_4 = B\cdot\gamma_4$ then gives us

$$L = A_4 B_4\,\langle 1 + (\mathbf{a} - \mathbf{b}) - \mathbf{a}\mathbf{b}\rangle_2 = A_4 B_4\,\{(\mathbf{a} - \mathbf{b}) - \mathbf{a}\wedge\mathbf{b}\}. \qquad (9.24)$$

Let us now "normalize" the spatial and non-spatial parts of the above bivector (the sign of the wedge term is absorbed into the definition of the moment components $m_i$):

$$L' = \frac{L}{A_4 B_4\,|\mathbf{a}-\mathbf{b}|} = \frac{(\mathbf{a}-\mathbf{b})}{|\mathbf{a}-\mathbf{b}|} - \frac{(\mathbf{a}\wedge\mathbf{b})}{|\mathbf{a}-\mathbf{b}|}
= (n_x\sigma_1 + n_y\sigma_2 + n_z\sigma_3) + (m_x\,\sigma_2\sigma_3 + m_y\,\sigma_3\sigma_1 + m_z\,\sigma_1\sigma_2)
= (n_x\sigma_1 + n_y\sigma_2 + n_z\sigma_3) + I_3\,(m_x\sigma_1 + m_y\sigma_2 + m_z\sigma_3) = \mathbf{n}' + I_3\mathbf{m}'. \qquad (9.25)$$

Here, $I_3 = \sigma_1\sigma_2\sigma_3 = -I_4$. Note that in $E^3$ the line has two components, a vector representing the direction of the line and the dual of a vector (a bivector) representing the moment of the line. This kind of representation completely encodes the position of the line in 3D space by specifying the plane in which the line lies and the perpendicular distance of the line from the origin. Finally, for the plane $\Phi = A\wedge B\wedge C$, the expected result should be $\Phi' = \mathbf{n}' + I_3 d$ (left as Exercise 9.6).
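The direction–moment decomposition has a familiar vector-algebra counterpart, sketched below with the standard Plücker convention $\mathbf{m} = \mathbf{a}\times\mathbf{b}$; whether this matches the book's moment sign depends on the conventions fixed in Eq. 9.25, so treat the orientation as an assumption of this sketch.

```python
import numpy as np

def pluecker(a, b):
    """Direction and moment of the line through Euclidean points a and b,
    i.e. the two components of the split form n' + I3 m' (up to the
    normalization by |a - b| and a sign convention)."""
    n = b - a            # spatial part: the line direction
    m = np.cross(a, b)   # dual part: the moment of the line
    return n, m

a = np.array([1.0, 0.0, 2.0])
b = np.array([3.0, 1.0, 2.0])
n, m = pluecker(a, b)
# every point p on the line satisfies p x n = m: the moment encodes the
# plane containing the line and its distance from the origin
p = a + 0.7 * (b - a)
```

The identity `cross(p, n) == m` for all points `p` of the line is exactly the "position completely encoded" statement of the text, in coordinates.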
9.3 The Algebra of Incidence In this section, we discuss the use of geometric algebra for the algebra of incidence [95]. First, we define the concept of the bracket; then, we discuss duality; and, finally, we show that the basic projective geometry operations of meet and join can be
expressed easily in terms of standard operations within the geometric algebra. We also briefly discuss the linear algebra framework in GA, indicating how it can be used within projective geometry. One of the main reasons for moving to a projective space is so that lines, planes, etc., may be represented as real geometric objects and so that operations of intersection, etc., can be performed using simple manipulations (rather than sets of equations, as in the Euclidean space E 3 ).
9.3.1 The Bracket

In an n-D space, any pseudoscalar will span a hypervolume of dimension $n$. Since, up to scale, there can be only one such hypervolume, all pseudoscalars $P$ are multiples of the unit pseudoscalar $I$, such that $P = \alpha I$, with $\alpha$ being a scalar. We compute this scalar multiple by multiplying the pseudoscalar $P$ by the inverse of $I$:

$$P I^{-1} = \alpha I I^{-1} = \alpha \equiv [P]. \qquad (9.26)$$

Thus, the bracket $[P]$ of the pseudoscalar $P$ is its magnitude, arrived at by multiplication from the right by $I^{-1}$. This bracket is precisely the bracket of the Grassmann–Cayley algebra. The sign of the bracket does not depend on the signature of the space, and as a result it has been a useful quantity for nonmetrical applications of projective geometry. The bracket of $n$ vectors $\{\mathbf{x}_i\}$ is

$$[\mathbf{x}_1 \mathbf{x}_2 \mathbf{x}_3 \cdots \mathbf{x}_n] = [\mathbf{x}_1\wedge\mathbf{x}_2\wedge\mathbf{x}_3\wedge\cdots\wedge\mathbf{x}_n] = (\mathbf{x}_1\wedge\mathbf{x}_2\wedge\mathbf{x}_3\wedge\cdots\wedge\mathbf{x}_n)\,I^{-1}. \qquad (9.27)$$
It can also be shown that this bracket expression is equivalent to the definition of the determinant of the matrix whose row vectors are the vectors $\mathbf{x}_i$. To understand how we can express a bracket in projective space in terms of vectors in Euclidean space, we can expand the pseudoscalar $P$ using the projective split for vectors:

$$P = X_1\wedge X_2\wedge X_3\wedge X_4 = \langle X_1\,\gamma_4\gamma_4\,X_2\,X_3\,\gamma_4\gamma_4\,X_4\rangle_4 = W_1 W_2 W_3 W_4\,\langle(1+\mathbf{x}_1)(1-\mathbf{x}_2)(1+\mathbf{x}_3)(1-\mathbf{x}_4)\rangle_4,$$

where $W_i = X_i\cdot\gamma_4$ from Eq. 9.19. A pseudoscalar part is produced by taking the product of three spatial vectors (there are no spatial bivector–spatial vector terms), such that

$$P = W_1 W_2 W_3 W_4\,\langle -\mathbf{x}_1\mathbf{x}_2\mathbf{x}_3 - \mathbf{x}_1\mathbf{x}_3\mathbf{x}_4 + \mathbf{x}_1\mathbf{x}_2\mathbf{x}_4 + \mathbf{x}_2\mathbf{x}_3\mathbf{x}_4\rangle_4
= W_1 W_2 W_3 W_4\,\langle(\mathbf{x}_2-\mathbf{x}_1)(\mathbf{x}_3-\mathbf{x}_1)(\mathbf{x}_4-\mathbf{x}_1)\rangle_4
= W_1 W_2 W_3 W_4\,\{(\mathbf{x}_2-\mathbf{x}_1)\wedge(\mathbf{x}_3-\mathbf{x}_1)\wedge(\mathbf{x}_4-\mathbf{x}_1)\}. \qquad (9.28)$$
If $W_i = 1$, we can summarize the above relationship between the brackets of four points in $R^4$ and $E^3$ as follows:

$$[X_1 X_2 X_3 X_4] = (X_1\wedge X_2\wedge X_3\wedge X_4)\,I_4^{-1} = \{(\mathbf{x}_2-\mathbf{x}_1)\wedge(\mathbf{x}_3-\mathbf{x}_1)\wedge(\mathbf{x}_4-\mathbf{x}_1)\}\,I_3^{-1}. \qquad (9.29)$$
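The determinant reading of the bracket, and its Euclidean-volume interpretation, can be checked in two lines of numpy (the points below are arbitrary; the overall sign of each determinant depends on the pseudoscalar conventions, so only the magnitudes are compared):

```python
import numpy as np

# four points of R^4 written in homogeneous form (x, y, z, W) with W = 1
x = np.array([[0.0, 0.0, 0.0],
              [2.0, 0.0, 0.0],
              [0.0, 3.0, 0.0],
              [1.0, 1.0, 4.0]])
X = np.hstack([x, np.ones((4, 1))])

# the bracket of Eq. 9.27: determinant of the 4x4 coordinate matrix
b4 = np.linalg.det(X)
# the Euclidean volume form of Eq. 9.29: a 3x3 determinant of differences
b3 = np.linalg.det(x[1:] - x[0])
```

Both quantities measure the (signed) volume of the tetrahedron spanned by the four points, which is why the bracket is so useful for nonmetric, incidence-style computations.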
9.3.2 The Duality Principle and Meet and Join Operations

In order to introduce the concepts of duality, which are so important in projective geometry, we must first define the dual $A^*$ of an $r$-vector $A$ as

$$A^* = A I^{-1}. \qquad (9.30)$$

This notation, $A^*$, relates the ideas of duality to the notion of a Hodge dual in differential geometry. Note that, in general, $I^{-1}$ might not necessarily commute with $A$. We see, therefore, that the dual of an $r$-vector is an $(n-r)$-vector. For example, in 3D space the dual of a vector ($r = 1$) is a plane or bivector ($n - r = 3 - 1 = 2$). By using the ideas of duality, we are then able to relate the inner product to incidence operators in the following manner. In an n-D space, suppose we have an $r$-vector $A$ and an $s$-vector $B$, where the dual of $B$ is given by $B^* = B I^{-1} = B\cdot I^{-1}$. Since $B I^{-1} = B\cdot I^{-1} + B\wedge I^{-1}$, we can replace the geometric product by the inner product alone (in this case, the outer product equals zero, since no element of grade greater than $n$ can exist). Now, using the identity

$$A_r\cdot(B_s\cdot C_t) = (A_r\wedge B_s)\cdot C_t \qquad \text{for } r + s \le t, \qquad (9.31)$$

we can write

$$A\cdot(B I^{-1}) = A\cdot(B\cdot I^{-1}) = (A\wedge B)\cdot I^{-1} = (A\wedge B)\,I^{-1}. \qquad (9.32)$$

This expression can be rewritten using the definition of the dual as follows:

$$A\cdot B^* = (A\wedge B)^*. \qquad (9.33)$$

This equation shows the relationship between the inner and outer products in terms of the duality operator. Now, if $r + s = n$, then $A\wedge B$ is of grade $n$ and is therefore a pseudoscalar. Using Eq. 9.26, it follows that

$$A\cdot B^* = (A\wedge B)^* = (A\wedge B)\,I^{-1} = ([A\wedge B]\,I)\,I^{-1} = [A\wedge B]. \qquad (9.34)$$
We see, therefore, that the bracket relates the inner and outer products to nonmetric quantities. It is via this route that the inner product, normally associated with a metric, can be used in a nonmetric theory such as projective geometry. It is also interesting to note that since duality is expressed as a simple multiplication by an element of the algebra, there is no need to introduce any special operators or any concept of a different space. When we work with lines and planes, however, it will clearly be necessary to employ operations for computing the intersections, or joins, of geometric objects. For this, we will require a means of performing the set-theoretic operations of intersection, $\cap$, and union, $\cup$. If in an n-dimensional geometric algebra the $r$-vector $A$ and the $s$-vector $B$ do not have a common subspace (null intersection), one can define the join of the two vectors as follows:

$$J = A \cup B = A\wedge B, \qquad (9.35)$$

so that the join is simply the outer product (an $(r+s)$-vector) of the two vectors. However, if $A$ and $B$ have common blades, the join is not simply given by the wedge but by the subspace the two vectors span. The join, $J$, can be interpreted as a common dividend of lowest grade and is defined up to a scale factor. The join gives the pseudoscalar if $(r + s) \ge n$. We will use $\cup$ to represent the join only when the blades $A$ and $B$ have a common subspace; otherwise, we will use the ordinary exterior product, $\wedge$, to represent the join. If there exists a $k$-vector $C$ such that for $A$ and $B$ we can write $A = A'C$ and $B = B'C$ for some $A'$ and $B'$, then we can define the intersection, or meet, using the duality principle as follows:

$$(A \cap B)^* = A^* \cup B^*. \qquad (9.36)$$

This is a beautiful result, telling us that the dual of the meet is given by the join of the duals. Since the dual of $A \cap B$ will be taken with respect to the join of $A$ and $B$, we must be careful to specify which space we will use for the dual in Eq. 9.36. However, in most cases of practical interest this join will indeed cover the entire space, and therefore we will be able to obtain a more useful expression for the meet using Eq. 9.33. Thus,

$$A \cap B = \big((A \cap B)^*\big)^* = (A^* \cup B^*)\,I = (A^*\wedge B^*)\,I = A^*\cdot B. \qquad (9.37)$$

The above concepts are discussed further in [95].
9.4 Algebra in Projective Space Having introduced duality, defined the operations of meet and join, and given the geometric approach to linear algebra, we are now ready to carry out geometric computations using the algebra of incidence.
Consider three non-collinear points, $P_1, P_2, P_3$, represented by vectors $\mathbf{x}_1, \mathbf{x}_2, \mathbf{x}_3$ in $E^3$ and by vectors $X_1, X_2, X_3$ in $R^4$. The line $L_{12}$ joining points $P_1$ and $P_2$ can be expressed in $R^4$ by the bivector

$$L_{12} = X_1\wedge X_2. \qquad (9.38)$$

Any point $P$ on the line through $P_1$ and $P_2$, represented in $R^4$ by $X$, will satisfy the equation

$$X\wedge L_{12} = X\wedge X_1\wedge X_2 = 0. \qquad (9.39)$$

This is therefore the equation of the line in $R^4$. In general, such an equation tells us that $X$ belongs to the subspace spanned by $X_1$ and $X_2$, that is,

$$X = \alpha_1 X_1 + \alpha_2 X_2, \qquad (9.40)$$

for some $\alpha_1, \alpha_2$. In computer vision, we can use this equation as a geometric constraint to test whether a point $X$ lies on $L_{12}$. The plane $\Phi_{123}$ passing through points $P_1, P_2, P_3$ is expressed by the following trivector in $R^4$:

$$\Phi_{123} = X_1\wedge X_2\wedge X_3. \qquad (9.41)$$

In 3D space, there are generally three types of intersections we wish to consider: the intersection of a line and a plane, a plane and a plane, and a line and a line. To compute these intersections, we will make use of the following general formula [94], which gives the inner product of an $r$-blade, $A_r = \mathbf{a}_1\wedge\mathbf{a}_2\wedge\cdots\wedge\mathbf{a}_r$, and an $s$-blade, $B_s = \mathbf{b}_1\wedge\mathbf{b}_2\wedge\cdots\wedge\mathbf{b}_s$ (for $s \le r$):

$$B_s\cdot(\mathbf{a}_1\wedge\mathbf{a}_2\wedge\cdots\wedge\mathbf{a}_r) = \sum_{j} \epsilon(j_1 j_2 \ldots j_r)\, B_s\cdot(\mathbf{a}_{j_1}\wedge\mathbf{a}_{j_2}\wedge\cdots\wedge\mathbf{a}_{j_s})\;\mathbf{a}_{j_{s+1}}\wedge\cdots\wedge\mathbf{a}_{j_r}. \qquad (9.42)$$

In this equation, we sum over all the combinations $j = (j_1, j_2, \ldots, j_r)$ such that no two $j_k$'s are the same. The factor $\epsilon(j_1 j_2 \ldots j_r)$ equals $+1$ if $j$ is an even permutation of $(1, 2, 3, \ldots, r)$, and $-1$ if it is an odd permutation.
9.4.1 Intersection of a Line and a Plane

In the space $R^4$, consider the line $A = X_1\wedge X_2$ intersecting the plane $\Phi = Y_1\wedge Y_2\wedge Y_3$. We can compute the intersection point using the meet operation, as follows:

$$A \cap \Phi = (X_1\wedge X_2) \cap (Y_1\wedge Y_2\wedge Y_3) = A^*\cdot\Phi. \qquad (9.43)$$
Here, we have used Eq. 9.37, and we note that in this case the join covers the entire space. Note also that the pseudoscalar $I_4$ in $G_{1,3,0}$ for $R^4$ squares to $-1$, that it commutes with bivectors but anticommutes with vectors and trivectors, and that its inverse is given by $I_4^{-1} = -I_4$. Therefore, we can write

$$A^*\cdot\Phi = (A I^{-1})\cdot\Phi = -(A I)\cdot\Phi. \qquad (9.44)$$

Now, using Eq. 9.42, we can expand the meet, such that

$$A \cap \Phi = -(A I)\cdot(Y_1\wedge Y_2\wedge Y_3) = -\{(A I)\cdot(Y_2\wedge Y_3)\}\,Y_1 - \{(A I)\cdot(Y_3\wedge Y_1)\}\,Y_2 - \{(A I)\cdot(Y_1\wedge Y_2)\}\,Y_3. \qquad (9.45)$$

Noting that $(A I)\cdot(Y_i\wedge Y_j)$ is a scalar, we can evaluate Eq. 9.45 by taking scalar parts. For example, $(A I)\cdot(Y_2\wedge Y_3) = \langle I(X_1\wedge X_2)(Y_2\wedge Y_3)\rangle = I\cdot(X_1\wedge X_2\wedge Y_2\wedge Y_3)$. From the definition of the bracket given earlier, we can see that if $P = X_1\wedge X_2\wedge Y_2\wedge Y_3$, then $[P] = (X_1\wedge X_2\wedge Y_2\wedge Y_3)\,I_4^{-1}$. If we therefore write $[A_1 A_2 A_3 A_4]$ as a shorthand for the magnitude of the pseudoscalar formed from the four vectors, then we can readily see that the meet reduces to

$$A \cap \Phi = [X_1 X_2 Y_2 Y_3]\,Y_1 + [X_1 X_2 Y_3 Y_1]\,Y_2 + [X_1 X_2 Y_1 Y_2]\,Y_3, \qquad (9.46)$$

thus giving the intersection point (a vector in $R^4$).
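Since each bracket is just a $4 \times 4$ determinant, Eq. 9.46 is immediately computable; the points below are an illustrative configuration, not from the text.

```python
import numpy as np

def bracket(*vectors):
    """Bracket [A1 A2 A3 A4]: the determinant of the 4x4 coordinate matrix."""
    return np.linalg.det(np.stack(vectors))

def meet_line_plane(X1, X2, Y1, Y2, Y3):
    """Intersection point of the line X1^X2 with the plane Y1^Y2^Y3,
    via the bracket expansion of Eq. 9.46."""
    return (bracket(X1, X2, Y2, Y3) * Y1
            + bracket(X1, X2, Y3, Y1) * Y2
            + bracket(X1, X2, Y1, Y2) * Y3)

# line through (0,0,-1) and (1,1,1) meets the plane z = 0.5 (spanned by
# three of its points) at (0.75, 0.75, 0.5)
X1 = np.array([0.0, 0.0, -1.0, 1.0])
X2 = np.array([1.0, 1.0, 1.0, 1.0])
Y1 = np.array([0.0, 0.0, 0.5, 1.0])
Y2 = np.array([1.0, 0.0, 0.5, 1.0])
Y3 = np.array([0.0, 1.0, 0.5, 1.0])
Xp = meet_line_plane(X1, X2, Y1, Y2, Y3)
p = Xp[:3] / Xp[3]   # dehomogenize the resulting R^4 vector
```

Note that the result is a homogeneous point, defined only up to scale, exactly as the meet itself is.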
9.4.2 Intersection of Two Planes

The line of intersection of two planes, $\Phi_1 = X_1\wedge X_2\wedge X_3$ and $\Phi_2 = Y_1\wedge Y_2\wedge Y_3$, can be computed via the meet of $\Phi_1$ and $\Phi_2$:

$$\Phi_1 \cap \Phi_2 = (X_1\wedge X_2\wedge X_3) \cap (Y_1\wedge Y_2\wedge Y_3). \qquad (9.47)$$

As in the previous section, this expression can be expanded as

$$\Phi_1 \cap \Phi_2 = \Phi_1^*\cdot(Y_1\wedge Y_2\wedge Y_3) = (\Phi_1^*\cdot Y_1)\,(Y_2\wedge Y_3) + (\Phi_1^*\cdot Y_2)\,(Y_3\wedge Y_1) + (\Phi_1^*\cdot Y_3)\,(Y_1\wedge Y_2).$$

Once again, the join covers the entire space and so the dual is easily formed. Following the arguments of the previous section, we can show that $(\Phi_1 I)\cdot Y_i = [X_1 X_2 X_3 Y_i]$; since the meet is defined only up to a nonzero scale factor, the line of intersection is
$$\Phi_1 \cap \Phi_2 = [X_1 X_2 X_3 Y_1]\,(Y_2\wedge Y_3) + [X_1 X_2 X_3 Y_2]\,(Y_3\wedge Y_1) + [X_1 X_2 X_3 Y_3]\,(Y_1\wedge Y_2), \qquad (9.48)$$

thus producing the line of intersection (a bivector in $R^4$).
9.4.3 Intersection of Two Lines

Two lines will intersect only if they are coplanar. This means that their representations in $R^4$, $A = X_1\wedge X_2$ and $B = Y_1\wedge Y_2$, will satisfy the equation

$$A\wedge B = 0. \qquad (9.49)$$

This fact suggests that the computation of the intersection should be carried out in the 2D Euclidean space, which has an associated 3D projective counterpart, $R^3$. In this plane, the intersection point is given by

$$A \cap B = A^*\cdot B = -(A I_3)\cdot(Y_1\wedge Y_2) = -\big\{\big((A I_3)\cdot Y_1\big)\,Y_2 - \big((A I_3)\cdot Y_2\big)\,Y_1\big\}, \qquad (9.50)$$

where $I_3$ is the pseudoscalar for $R^3$. Once again, we evaluate $(A I_3)\cdot Y_i$ by taking scalar parts:

$$(A I_3)\cdot Y_i = \langle X_1 X_2 I_3 Y_i\rangle = I_3\cdot(X_1\wedge X_2\wedge Y_i) = -[X_1 X_2 Y_i]. \qquad (9.51)$$

The meet can, therefore, be written as

$$A \cap B = [X_1 X_2 Y_1]\,Y_2 - [X_1 X_2 Y_2]\,Y_1, \qquad (9.52)$$
where the bracket $[A_1 A_2 A_3]$ in $R^3$ is understood to mean $(A_1\wedge A_2\wedge A_3)\,I_3^{-1}$. This equation is often an impractical means of performing the intersection of two lines. (See [150] for a method which creates a plane and intersects one of the lines with this plane; see also [53] for a discussion of what information can be gained when the lines do not intersect. See Chap. 5 for a complete treatment of the incidence relations between points, lines, and planes in the n-affine plane.)
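Impractical or not, Eq. 9.52 is a two-determinant computation; the sketch below intersects two lines of the projective plane (the configuration is an illustrative choice).

```python
import numpy as np

def bracket3(A1, A2, A3):
    """Bracket [A1 A2 A3] in R^3: the determinant of the coordinate matrix."""
    return np.linalg.det(np.stack([A1, A2, A3]))

def meet_lines(X1, X2, Y1, Y2):
    """Intersection of the coplanar lines X1^X2 and Y1^Y2 in the projective
    plane, via Eq. 9.52."""
    return bracket3(X1, X2, Y1) * Y2 - bracket3(X1, X2, Y2) * Y1

# the x-axis (through (0,0) and (1,0)) meets the vertical line x = 2
X1, X2 = np.array([0.0, 0.0, 1.0]), np.array([1.0, 0.0, 1.0])
Y1, Y2 = np.array([2.0, 0.0, 1.0]), np.array([2.0, 1.0, 1.0])
Xp = meet_lines(X1, X2, Y1, Y2)
p = Xp[:2] / Xp[2]   # dehomogenize: the intersection is (2, 0)
```

For parallel lines, the third homogeneous coordinate of the result comes out zero: a point at infinity, which is the projective way of saying the lines do not meet in the affine plane.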
9.4.4 Implementation of the Algebra In order to implement the expressions and procedures outlined so far in this chapter, we have used a computer algebra package written for MAPLE. The program can be found in [114] and works with geometric algebras of G1;3;0 and G3;0;0 ; a more
general version of this program, which works with a user-defined metric on an n-D algebra, is in the public domain [3]. Using these packages, we are easily able to simulate the situation of several cameras (or one moving camera) looking at a world scene and to do so entirely in projective (4D) space. Much of the work described in subsequent sections has been tested in MAPLE.
9.5 Projective Invariants

In this section, we use the framework established in the preceding sections to show how standard invariants can be expressed, both elegantly and concisely, using geometric algebra. We begin by looking at algebraic quantities that are invariant under projective transformations, arriving at these invariants by a method that generalizes easily from one dimension to two and three dimensions.
9.5.1 The 1D Cross-Ratio

The fundamental projective invariant of points on a line is the so-called cross-ratio, $\rho$, defined as

$$\rho = \frac{AC\cdot BD}{BC\cdot AD} = \frac{(t_3 - t_1)(t_4 - t_2)}{(t_4 - t_1)(t_3 - t_2)},$$

where $t_1 = |PA|$, $t_2 = |PB|$, $t_3 = |PC|$, and $t_4 = |PD|$. It is fairly easy to show that for the projection through $O$ of the collinear points $A, B, C$, and $D$ onto any line, $\rho$ remains constant. For the 1D case, any point $q$ on the line $L$ can be written as $q = t\sigma_1$ relative to $P$, where $\sigma_1$ is a unit vector in the direction of $L$. We can then move up a dimension to a 2D space, with basis vectors $(\gamma_1, \gamma_2)$, which we will call $R^2$ and in which $q$ is represented by the following vector $Q$:

$$Q = T\gamma_1 + S\gamma_2. \qquad (9.53)$$

Note that, as before, $q$ is associated with a bivector, as follows:

$$\mathbf{q} = \frac{Q\wedge\gamma_2}{Q\cdot\gamma_2} = \frac{T}{S}\,\gamma_1\gamma_2 \equiv \frac{T}{S}\,\sigma_1 = t\sigma_1. \qquad (9.54)$$
When a point on line L is projected onto another line L0 , the distances t and t 0 are related by a projective transformation of the form t0 D
˛t C ˇ : ˛t Q C ˇQ
(9.55)
This nonlinear transformation in E¹ can be made into a linear transformation in R² by defining the linear function f₁, which maps vectors onto vectors in R²:

f₁(γ₁) = αγ₁ + α̃γ₂,
f₁(γ₂) = βγ₁ + β̃γ₂.

Consider two vectors X₁ and X₂ in R², and form the bivector

S₁ = X₁∧X₂ = λ₁I₂,

where I₂ = γ₁γ₂ is the pseudoscalar for R². We can now look at how S₁ transforms under f₁:

S₁′ = X₁′∧X₂′ = f₁(X₁∧X₂) = (det f₁)(X₁∧X₂).  (9.56)

This last step follows because a linear function must map the pseudoscalar onto a multiple of itself, the multiple being the determinant of the function. Suppose that we now select four points of the line L, whose corresponding vectors in R² are {Xᵢ}, i = 1, ..., 4, and consider the ratio R₁ of two wedge products:

R₁ = (X₁∧X₂)/(X₃∧X₄).  (9.57)

Then, under f₁, R₁ → R₁′, where

R₁′ = (X₁′∧X₂′)/(X₃′∧X₄′) = [(det f₁)X₁∧X₂] / [(det f₁)X₃∧X₄].  (9.58)

R₁ is therefore invariant under f₁. However, we want to express our invariants in terms of distances on the 1D line. To do this, we must consider how the bivector S₁ in R² projects down to E¹:

X₁∧X₂ = (T₁γ₁ + S₁γ₂)∧(T₂γ₁ + S₂γ₂) = (T₁S₂ − T₂S₁)γ₁γ₂ = S₁S₂(T₁/S₁ − T₂/S₂)I₂ = S₁S₂(t₁ − t₂)I₂.  (9.59)

In order to form a projective invariant that is independent of the choice of the arbitrary scalars Sᵢ, we must now consider ratios of the bivectors Xᵢ∧Xⱼ (so that det f₁ cancels), and then multiples of these ratios (so that the Sᵢ's cancel). More precisely, consider the following expression:

Inv₁ = [(X₃∧X₁)I₂⁻¹ (X₄∧X₂)I₂⁻¹] / [(X₄∧X₁)I₂⁻¹ (X₃∧X₂)I₂⁻¹].  (9.60)
Then, in terms of distances along the lines, under the projective transformation f₁, Inv₁ goes to Inv₁′, where

Inv₁′ = [S₃S₁(t₃ − t₁) S₄S₂(t₄ − t₂)] / [S₄S₁(t₄ − t₁) S₃S₂(t₃ − t₂)] = [(t₃ − t₁)(t₄ − t₂)] / [(t₄ − t₁)(t₃ − t₂)],  (9.61)

which is independent of the Sᵢ's and is indeed the classical 1D projective invariant, the cross-ratio. Deriving the cross-ratio in this way allows us to easily generalize it to form invariants in higher dimensions.
9.5.2 2D Generalization of the Cross-Ratio

When we consider points in a plane, we once again move up to a space with one higher dimension, which we shall call R³. Let a point P in the plane M be described by the vector x in E², where x = xe₁ + ye₂. In R³, this point is represented by X = Xγ₁ + Yγ₂ + Zγ₃, where x = X/Z and y = Y/Z. As described earlier, we can define a general projective transformation via a linear function f₂ mapping vectors to vectors in R³, such that

f₂(γ₁) = α₁γ₁ + α₂γ₂ + α̃γ₃,
f₂(γ₂) = β₁γ₁ + β₂γ₂ + β̃γ₃,
f₂(γ₃) = δ₁γ₁ + δ₂γ₂ + δ̃γ₃.  (9.62)

Now, consider three vectors (representing non-collinear points) Xᵢ, i = 1, 2, 3, in R³, and form the trivector

S₂ = X₁∧X₂∧X₃ = λ₂I₃,  (9.63)

where I₃ = γ₁γ₂γ₃ is the pseudoscalar for R³. As before, under the projective transformation given by f₂, S₂ transforms to S₂′, where

S₂′ = (det f₂)S₂.  (9.64)

Therefore, the ratio of any two trivectors is invariant under f₂. To project down into E², assuming Xᵢγ₃ = Zᵢ(1 + xᵢ) under the projective split, we then write

S₂I₃⁻¹ = ⟨X₁X₂X₃I₃⁻¹⟩ = ⟨X₁γ₃γ₃X₂X₃γ₃γ₃I₃⁻¹⟩ = Z₁Z₂Z₃⟨(1 + x₁)(1 − x₂)(1 + x₃)γ₃I₃⁻¹⟩,  (9.65)
where the xᵢ represent vectors in E². We can only get a scalar term from the expression within the brackets by calculating the product of γ₃, two spatial vectors, and I₃⁻¹, that is,

S₂I₃⁻¹ = Z₁Z₂Z₃⟨(x₁x₃ − x₁x₂ − x₂x₃)γ₃I₃⁻¹⟩ = Z₁Z₂Z₃{(x₂ − x₁)∧(x₃ − x₁)}I₂⁻¹.  (9.66)

It is therefore clear that we must use multiples of such ratios in our calculations, so that the arbitrary scalars Zᵢ cancel. In the case of four points in a plane, there are only four possible combinations of ZᵢZⱼZₖ, and it is not possible to cancel all the Z's by multiplying two ratios of the form Xᵢ∧Xⱼ∧Xₖ together. For five coplanar points {Xᵢ}, i = 1, ..., 5, however, there are several ways of achieving the desired cancellation. For example,

Inv₂ = [(X₅∧X₄∧X₃)I₃⁻¹ (X₅∧X₂∧X₁)I₃⁻¹] / [(X₅∧X₁∧X₃)I₃⁻¹ (X₅∧X₂∧X₄)I₃⁻¹].

According to Eq. 9.66, we can interpret this ratio in E² as

Inv₂ = [(x₅ − x₄)∧(x₅ − x₃)I₂⁻¹ (x₅ − x₂)∧(x₅ − x₁)I₂⁻¹] / [(x₅ − x₁)∧(x₅ − x₃)I₂⁻¹ (x₅ − x₂)∧(x₅ − x₄)I₂⁻¹] = (A₅₄₃A₅₂₁)/(A₅₁₃A₅₂₄),  (9.67)

where ½Aᵢⱼₖ is the area of the triangle defined by the three vertices xᵢ, xⱼ, xₖ. This invariant is regarded as the 2D generalization of the 1D cross-ratio.
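A quick numerical sanity check of this five-point invariant (a sketch in numpy, not from the text; it assumes only that Aᵢⱼₖ can be computed as a 3×3 determinant of homogeneous coordinates):

```python
import numpy as np

# Five points X1..X5 in homogeneous coordinates (indices 0..4 below).
pts = [np.array(p, float) for p in
       [(0, 0, 1), (1, 0, 1), (0, 1, 1), (1, 1, 1), (2, 3, 1)]]

def A(P, i, j, k):
    # A_ijk is proportional to the signed triangle area: det[X_i X_j X_k]
    return np.linalg.det(np.array([P[i], P[j], P[k]]))

def inv2(P):
    # Inv2 = (A543 * A521) / (A513 * A524), written with 0-based indices
    return A(P, 4, 3, 2) * A(P, 4, 1, 0) / (A(P, 4, 0, 2) * A(P, 4, 1, 3))

H = np.array([[1.0, 2, 0], [0, 1, 1], [1, 0, 3]])   # a projective map, det != 0
scales = [1.5, 0.7, 2.0, 1.1, 0.9]                  # arbitrary homogeneous scales
pts2 = [s * (H @ p) for s, p in zip(scales, pts)]
assert np.isclose(inv2(pts), inv2(pts2))            # invariant under H and scaling
```

Each product in Inv₂ uses X₅ twice and X₁, ..., X₄ once, so both det H and the Z-type scale factors cancel, exactly as argued above.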
9.5.3 3D Generalization of the Cross-Ratio

For general points in E³, we have seen that we move up one dimension to compute in the 4D space R⁴. In this case, the point x = xe₁ + ye₂ + ze₃ in E³ is written as X = Xγ₁ + Yγ₂ + Zγ₃ + Wγ₄, where x = X/W, y = Y/W, z = Z/W. As before, a nonlinear projective transformation in E³ becomes a linear transformation, described by the linear function f₃ in R⁴. Let us consider four vectors {Xᵢ}, i = 1, ..., 4, in R⁴, and form the 4-vector

S₃ = X₁∧X₂∧X₃∧X₄ = λ₃I₄,  (9.68)

where I₄ = γ₁γ₂γ₃γ₄ is the pseudoscalar for R⁴. As before, S₃ transforms to S₃′ under f₃:

S₃′ = X₁′∧X₂′∧X₃′∧X₄′ = (det f₃)S₃.  (9.69)
The ratio of any two 4-vectors is therefore invariant under f₃, and we must take multiples of these ratios to ensure that the arbitrary scale factors Wᵢ cancel. With five general points, there are five possibilities for forming the combinations WᵢWⱼWₖWₗ, and it is a simple matter to show that one cannot form multiples of such ratios so that all the W factors cancel. It is, however, possible to do this if we have six points. One example of such an invariant is

Inv₃ = [(X₁∧X₂∧X₃∧X₄)I₄⁻¹ (X₄∧X₅∧X₂∧X₆)I₄⁻¹] / [(X₁∧X₂∧X₄∧X₅)I₄⁻¹ (X₃∧X₄∧X₂∧X₆)I₄⁻¹].  (9.70)

Using the arguments of the previous sections, we can now write

(X₁∧X₂∧X₃∧X₄)I₄⁻¹ = W₁W₂W₃W₄{(x₂ − x₁)∧(x₃ − x₁)∧(x₄ − x₁)}I₃⁻¹.  (9.71)

We can therefore see that the invariant Inv₃ is the 3D equivalent of the 1D cross-ratio and consists of ratios of volumes,

Inv₃ = (V₁₂₃₄V₄₅₂₆)/(V₁₂₄₅V₃₄₂₆),  (9.72)

where Vᵢⱼₖₗ is the volume of the solid formed by the four vertices xᵢ, xⱼ, xₖ, xₗ. All of these invariants are conventionally well known, but we have outlined here a general process that is straightforward and simple for generating projective invariants in any dimension.
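The volume form of Inv₃ can be checked the same way as its 1D and 2D counterparts (an illustrative sketch: six points in R⁴, a projective map, arbitrary homogeneous rescalings):

```python
import numpy as np

# Six points X1..X6 of projective 3-space, homogeneous coordinates (0-based).
pts = [np.array(p, float) for p in
       [(0, 0, 0, 1), (1, 0, 0, 1), (0, 1, 0, 1),
        (0, 0, 1, 1), (1, 1, 1, 1), (1, 2, 3, 1)]]

def V(P, i, j, k, l):
    # V_ijkl is proportional to the signed tetrahedron volume: det[X_i X_j X_k X_l]
    return np.linalg.det(np.array([P[i], P[j], P[k], P[l]]))

def inv3(P):
    # Inv3 = (V1234 * V4526) / (V1245 * V3426), written with 0-based indices
    return V(P, 0, 1, 2, 3) * V(P, 3, 4, 1, 5) / (V(P, 0, 1, 3, 4) * V(P, 2, 3, 1, 5))

H = np.array([[1.0, 1, 0, 1], [0, 1, 1, 0], [1, 0, 1, 0], [1, 0, 0, 2]])  # det = 3
scales = [1.5, 0.5, 2.0, 1.25, 0.8, 1.1]
pts2 = [s * (H @ p) for s, p in zip(scales, pts)]
assert np.isclose(inv3(pts), inv3(pts2))    # invariant under H and the rescalings
```

As with Inv₂, every index appears the same number of times in numerator and denominator, so det H and the Wᵢ factors cancel.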
9.6 Visual Geometry of n-Uncalibrated Cameras In this section, we analyze the constraints relating the geometry of n-uncalibrated cameras. First, the pinhole camera model for one view will be defined in terms of lines and planes. Then, for two and three views, the epipolar geometry is defined in terms of bilinear and trilinear constraints. Since the constraints are based on the coplanarity of lines, we will only be able to define relationships expressed by a single tensor for up to four cameras. For more than four cameras, the constraints are linear combinations of bilinearities, trilinearities, and quadrilinearities.
9.6.1 Geometry of One View

We begin with the monocular case depicted in Fig. 9.2. Here, the image plane is defined by a vector basis of three arbitrary non-collinear points, A₁, A₂, and A₃, with the optical center given by A₀ (all vectors in R⁴). Thus, {Aᵢ} can be used as a
Fig. 9.2 Projection into a single camera: the monocular case
coordinate basis for the image plane Φ_A = A₁∧A₂∧A₃, so that any point A′ lying in Φ_A can be written as

A′ = α₁A₁ + α₂A₂ + α₃A₃.  (9.73)

We are also able to define a bivector basis {L^A_i} spanning the lines in Φ_A:

L^A_1 = A₂∧A₃,  L^A_2 = A₃∧A₁,  L^A_3 = A₁∧A₂.  (9.74)

The bivectors {L^A_i} together with the optical center allow us to define three planes Φ_A^j, as follows:

Φ_A^1 = A₀∧A₂∧A₃ = A₀∧L^A_1,
Φ_A^2 = A₀∧A₃∧A₁ = A₀∧L^A_2,
Φ_A^3 = A₀∧A₁∧A₂ = A₀∧L^A_3.  (9.75)

We will call the planes Φ_A^j optical planes. Clearly, each is a trivector and can be written as

Φ_A^j = t_{j1}(Iγ₁) + t_{j2}(Iγ₂) + t_{j3}(Iγ₃) + t_{j4}(Iγ₄) ≡ t_{jk}(Iγ_k),  (9.76)

since there are four basis trivectors in our 4D space. These optical planes also clearly intersect the image plane in the lines {L^A_j}. Furthermore, the intersections of the optical planes define a bivector basis that spans the pencil of optical rays (rays passing through the optical center of the camera) in R⁴. Thus,

L_{A1} = Φ_A^2 ∩ Φ_A^3 ∝ A₀∧A₁,
L_{A2} = Φ_A^3 ∩ Φ_A^1 ∝ A₀∧A₂,
L_{A3} = Φ_A^1 ∩ Φ_A^2 ∝ A₀∧A₃,  (9.77)
so that any optical ray resulting from projecting a world point X onto the image plane can be written as A₀∧X = x_j L_{Aj}. We can now interpret the camera matrices, used so widely in computer vision applications, in terms of the quantities defined in this section. The projection of any world point X onto the image plane is denoted x and is given by the intersection of the line A₀∧X with the plane Φ_A. Thus,

x = (A₀∧X) ∩ (A₁∧A₂∧A₃) = X_μ{(A₀∧γ_μ) ∩ (A₁∧A₂∧A₃)},  (9.78)

where μ is summed over 1 to 4. We can now expand the meet given by Eq. 9.78 to get

x = X_j{[A₀∧γ_j∧A₂∧A₃]A₁ + [A₀∧γ_j∧A₃∧A₁]A₂ + [A₀∧γ_j∧A₁∧A₂]A₃}.  (9.79)

Since x = x^k A_k, Eq. 9.79 implies that x = X_j P_{jk} A_k, and therefore that x^k = P_{jk} X_j, where

P_{jk} = [A₀∧γ_j∧L^A_k] ≡ [Φ_A^k∧γ_j] = t_{kj},  (9.80)

since (Iγ_j)∧γ_k = Iδ_{jk}. The matrix P takes X to x and is therefore the standard camera projection matrix. If we define a set of vectors {φ_A^j}, j = 1, 2, 3, which are the duals of the planes {Φ_A^j}, that is, φ_A^j = Φ_A^j I⁻¹, it is then easy to see that

φ_A^j = t_{j1}γ₁ + t_{j2}γ₂ + t_{j3}γ₃ + t_{j4}γ₄.  (9.81)

Thus, we see that the projected point x = x^j A_j may be given by

x^j = X·φ_A^j  or  x = (X·φ_A^j)A_j.  (9.82)

That is, the coefficients in the image plane are formed by projecting X onto the vectors formed by taking the duals of the optical planes. This is, of course, equivalent to the matrix formulation

x = [x¹ x² x³]ᵀ = [ t₁₁ t₁₂ t₁₃ t₁₄ ; t₂₁ t₂₂ t₂₃ t₂₄ ; t₃₁ t₃₂ t₃₃ t₃₄ ] [X₁ X₂ X₃ X₄]ᵀ ≡ PX,  (9.83)

where the rows of P are the coefficients of the dual vectors φ_A^1, φ_A^2, φ_A^3.
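As a numerical illustration (a plain-numpy sketch, not the book's MAPLE implementation), the rows of P can be assembled from the brackets of Eq. 9.79, and the result checked against the defining property of the projection: the image point must lie on the optical ray through A₀ and X.

```python
import numpy as np

rng = np.random.default_rng(3)
A0 = rng.normal(size=4)                         # optical centre (homogeneous)
A = [rng.normal(size=4) for _ in range(3)]      # image-plane basis points A1, A2, A3

def bracket(a, b, c, d):
    # The bracket [a b c d] in R^4, i.e. a 4x4 determinant
    return np.linalg.det(np.array([a, b, c, d]))

# Row k of the camera matrix collects the brackets of Eq. 9.79 against
# the basis vectors gamma_mu (here the canonical basis of R^4).
E4 = np.eye(4)
P = np.array([[bracket(A0, E4[mu], A[(k + 1) % 3], A[(k + 2) % 3])
               for mu in range(4)] for k in range(3)])

X = rng.normal(size=4)                          # a world point
x = P @ X                                       # image coords in the basis {A1,A2,A3}

# The image point x^k A_k lies on the line A0 ^ X, so the three homogeneous
# vectors A0, X and the image point are linearly dependent (rank 2).
pt = sum(x[k] * A[k] for k in range(3))
M = np.column_stack([A0, X, pt])
assert np.linalg.matrix_rank(M, tol=1e-8 * np.linalg.norm(M)) == 2
```

The rank-2 condition is exactly the statement that the meet of Eq. 9.78 lies both on the optical ray and in the image plane.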
The elements of the camera matrix are, therefore, simply the coefficients of each optical plane in the coordinate frame of the world point. They encode the intrinsic and extrinsic camera parameters as given in Eq. 9.2. Next, we consider the projection of world lines in R⁴ onto the image plane. Suppose we have a world line L = X₁∧X₂ joining the points X₁ and X₂. If x₁ = (A₀∧X₁) ∩ Φ_A and x₂ = (A₀∧X₂) ∩ Φ_A (i.e., the intersections of the optical rays with the image plane), then the projected line in the image plane is clearly given by l = x₁∧x₂. Since we can express l in the bivector basis for the plane, we obtain l = l^j L^A_j, where L^A_1 = A₂∧A₃, etc., as defined in Eq. 9.74. From our previous expressions for projections given in Eq. 9.82, we see that we can also write l as follows:

l = x₁∧x₂ = (X₁·φ_A^j)(X₂·φ_A^k)A_j∧A_k ≡ l^p L^A_p,  (9.84)

which tells us that the line coefficients {l^j} are

l¹ = (X₁·φ_A^2)(X₂·φ_A^3) − (X₁·φ_A^3)(X₂·φ_A^2),
l² = (X₁·φ_A^3)(X₂·φ_A^1) − (X₁·φ_A^1)(X₂·φ_A^3),
l³ = (X₁·φ_A^1)(X₂·φ_A^2) − (X₁·φ_A^2)(X₂·φ_A^1).  (9.85)

Using the identity in Eq. 9.36 and the fact that the join of the duals is the dual of the meet, we are then able to deduce identities of the following form for each l^j:

l¹ = (X₁∧X₂)·(φ_A^2∧φ_A^3) = (X₁∧X₂)·[(Φ_A^2 ∩ Φ_A^3)I⁻¹] = L·L_A^1.

We therefore obtain the general result

l^j = L·L_A^j,  (9.86)

where we have defined L_A^j = L_{Aj}I⁻¹ to be the dual of the optical ray L_{Aj}. Thus, we have once again expressed the projection of a line L onto the image plane by contracting L with the set of lines dual to those formed by intersecting the optical planes. We can summarize the two results derived here for the projections of points (X₁ and X₂) and lines (L = X₁∧X₂) onto the image plane:

x₁ = (X₁·φ_A^j)A_j,  x₂ = (X₂·φ_A^j)A_j,  l = (L·L_A^j)L^A_j ≡ l^k L^A_k.  (9.87)
Having formed the sets of dual planes {φ_A^j} and dual lines {L_A^j} for a given image plane, it is then conceptually very straightforward to project any point or line onto that plane. If we express the world and image lines as bivectors with Plücker coefficients, L ↔ (α₁, α₂, α₃, α̃₁, α̃₂, α̃₃) and L^A_p ↔ (β₁, β₂, β₃, β̃₁, β̃₂, β̃₃), we can write Eq. 9.87 as a matrix equation:

l = [l¹ l² l³]ᵀ = [ u₁₁ u₁₂ u₁₃ u₁₄ u₁₅ u₁₆ ; u₂₁ u₂₂ u₂₃ u₂₄ u₂₅ u₂₆ ; u₃₁ u₃₂ u₃₃ u₃₄ u₃₅ u₃₆ ] [α₁ α₂ α₃ α̃₁ α̃₂ α̃₃]ᵀ ≡ P_L l̄,  (9.88)

where l̄ is the vector of Plücker coordinates [α₁, α₂, α₃, α̃₁, α̃₂, α̃₃] and the matrix P_L contains the β's and β̃'s, that is, information about the camera configuration. When we back-project a point x or a line l in the image plane, we produce their duals, that is, a line l_x or a plane Π_l, respectively. These back-projected lines and planes are given by the following expressions:
l_x = A₀∧x = (X·φ_A^j)A₀∧A_j = (X·φ_A^j)L^A_j,  (9.89)

Π_l = A₀∧l = (L·L_A^j)A₀∧L^A_j = (L·L_A^j)Φ_A^j.  (9.90)
9.6.2 Geometry of Two Views

In this and subsequent sections, we work in the projective space R⁴, although a return to 3D Euclidean space will be necessary when we discuss invariants in terms of image coordinates; this will be done via the projective split. Figure 9.3 shows a world point X projecting onto points A′ and B′ in the two image planes Φ_A and Φ_B, respectively. The so-called epipoles E_AB and E_BA correspond to the intersections of the line joining the optical centers with the image planes. Since the points A₀, B₀, A′, B′ are coplanar, we can formulate the bilinear constraint by using the fact that the outer product of these four vectors must vanish. Thus,

A₀∧B₀∧A′∧B′ = 0.  (9.91)

Now, if we let A′ = αᵢAᵢ and B′ = βⱼBⱼ, then Eq. 9.91 can be written as

αᵢβⱼ{A₀∧B₀∧Aᵢ∧Bⱼ} = 0.  (9.92)

Defining F̃ᵢⱼ = {A₀∧B₀∧Aᵢ∧Bⱼ}I⁻¹ ≡ [A₀B₀AᵢBⱼ] gives us

F̃ᵢⱼαᵢβⱼ = 0,  (9.93)
Fig. 9.3 Sketch of binocular projection of a world point
which corresponds in R⁴ to the well-known relationship between the components of the fundamental matrix F [126], or the bilinear constraint in E³, and the image coordinates [126]. This suggests that F̃ can be seen as a linear function mapping two vectors onto a scalar:

F̃(A, B) = {A₀∧B₀∧A∧B}I⁻¹,  (9.94)

so that F̃ᵢⱼ = F̃(Aᵢ, Bⱼ). Note that viewing the fundamental matrix as a linear function means that we have a coordinate-independent description. Now, if we use the projective split to associate our point A′ = αᵢAᵢ in the image plane with its E³ representation a′ = δᵢaᵢ, where aᵢ = (Aᵢ∧γ₄)/(Aᵢ·γ₄), it is not difficult to see that the coefficients are related by

αᵢ = [(A′·γ₄)/(Aᵢ·γ₄)]δᵢ.  (9.95)

Thus, we are able to relate our 4D fundamental matrix F̃ to an observed fundamental matrix F in the following manner:

F̃ₖₗ = (Aₖ·γ₄)(Bₗ·γ₄)Fₖₗ,  (9.96)
so that

αₖF̃ₖₗβₗ = (A′·γ₄)(B′·γ₄)δₖFₖₗεₗ,  (9.97)

where b′ = εᵢbᵢ, with bᵢ = (Bᵢ∧γ₄)/(Bᵢ·γ₄). F is the standard fundamental matrix that we would form from observations.
9.6.3 Geometry of Three Views

The so-called trilinear constraint captures the geometric relationships existing between points and lines in three camera views. Figure 9.4 shows three image planes Φ_A, Φ_B, and Φ_C with bases {Aᵢ}, {Bᵢ}, and {Cᵢ} and optical centers A₀, B₀, C₀.
Fig. 9.4 Model of the trinocular projection of the visual 3D space
The projections of two world points Xᵢ onto the planes occur at the points A′ᵢ, B′ᵢ, C′ᵢ, i = 1, 2. The line joining the world points is L₁₂ = X₁∧X₂, and the projected lines are denoted by L′_A, L′_B, and L′_C. We first define three planes:

Φ′_A = A₀∧A′₁∧A′₂,  Φ′_B = B₀∧B′₁∧B′₂,  Φ′_C = C₀∧C′₁∧C′₂.  (9.98)

It is clear that L₁₂ can be formed by intersecting Φ′_B and Φ′_C:

L₁₂ = Φ′_B ∩ Φ′_C = (B₀∧L′_B) ∩ (C₀∧L′_C).  (9.99)

If L_{A1} = A₀∧A′₁ and L_{A2} = A₀∧A′₂, then we can easily see that L_{A1} and L_{A2} intersect L₁₂ at X₁ and X₂, respectively. We therefore have

L_{A1}∧L₁₂ = 0  and  L_{A2}∧L₁₂ = 0,  (9.100)

which can then be written as

(A₀∧A′ᵢ)∧{(B₀∧L′_B) ∩ (C₀∧L′_C)} = 0  for i = 1, 2.  (9.101)
This suggests that we should define a linear function T that maps a point and two lines onto a scalar, as follows:

T(A′, L′_B, L′_C) = (A₀∧A′)∧{(B₀∧L′_B) ∩ (C₀∧L′_C)}.  (9.102)

Now, using the line bases of the planes Φ_B and Φ_C in a similar manner as was used for the plane Φ_A in Eq. 9.74, we can write

A′ = αᵢAᵢ,  L′_B = l^B_j L^B_j,  L′_C = l^C_k L^C_k.  (9.103)

If we define the components of a tensor as Tᵢⱼₖ = T(Aᵢ, L^B_j, L^C_k), and if A′, L′_B, and L′_C are all derived from projections of the same two world points, then Eq. 9.101 tells us that we can write

Tᵢⱼₖ αᵢ l^B_j l^C_k = 0.  (9.104)

T is the trifocal tensor [83, 173], and Eq. 9.104 is the trilinear constraint. In [80, 173], this constraint was arrived at by considering camera matrices; here, however, Eq. 9.104 is arrived at from purely geometric considerations, namely, that two planes intersect in a line, which in turn intersects with another line. To see how the three projected lines are related, we express the line in image plane A joining A′₁ and A′₂ as the intersection of the plane joining A₀ to the world line L₁₂ with the image plane Φ_A = A₁∧A₂∧A₃:

L′_A = A′₁∧A′₂ = (A₀∧L₁₂) ∩ Φ_A.  (9.105)
Considering L₁₂ as the meet Φ′_B ∩ Φ′_C and using the expansions of L′_A, L′_B, and L′_C given in Eq. 9.103, we can rewrite this equation as

l^A_i L^A_i = l^B_j l^C_k {A₀∧[(B₀∧L^B_j) ∩ (C₀∧L^C_k)]} ∩ Φ_A.  (9.106)

Using the expansion of the meet given in Eq. 9.48, we have

l^A_i L^A_i = l^B_j l^C_k [(A₀∧Aᵢ)∧{(B₀∧L^B_j) ∩ (C₀∧L^C_k)}] L^A_i,  (9.107)

which, when we equate coefficients, gives

l^A_i = Tᵢⱼₖ l^B_j l^C_k.  (9.108)

Thus, we obtain the familiar equation that relates the projected lines in the three views.
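Equation 9.108 can be exercised numerically with the standard matrix construction of the trifocal tensor for cameras P₁ = [I|0], P₂, P₃ (the formula Tᵢ = aᵢb₄ᵀ − a₄bᵢᵀ, with aᵢ, bᵢ the columns of P₂ and P₃, is the usual one from the computer-vision literature, e.g. [83]; it is not derived in the text above):

```python
import numpy as np

rng = np.random.default_rng(6)
P1 = np.hstack([np.eye(3), np.zeros((3, 1))])   # canonical first camera [I|0]
P2 = rng.normal(size=(3, 4))
P3 = rng.normal(size=(3, 4))

# Trifocal tensor slices T_i = a_i b4^T - a4 b_i^T (valid when P1 = [I|0]):
T = np.stack([np.outer(P2[:, i], P3[:, 3]) - np.outer(P2[:, 3], P3[:, i])
              for i in range(3)])

# A world line through two points, projected into the three views:
X1, X2 = rng.normal(size=4), rng.normal(size=4)
l1, l2, l3 = [np.cross(P @ X1, P @ X2) for P in (P1, P2, P3)]

# Line transfer (cf. Eq. 9.108): l1_i is proportional to l2^T T_i l3.
l1_t = np.array([l2 @ T[i] @ l3 for i in range(3)])
assert np.linalg.norm(np.cross(l1_t, l1)) < 1e-6 * np.linalg.norm(l1_t) * np.linalg.norm(l1)
```

The cross-product test checks proportionality only, since homogeneous line coordinates are defined up to scale.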
9.6.4 Geometry of n-Views

If we have n views, let us choose four of them and denote them by A, B, C, and N. As before, we assume that {A_j}, {B_j}, etc., j = 1, 2, 3, define the image planes. Let Φ_A^i = A₀∧Aᵢ∧A′, Φ_B^i = B₀∧Bᵢ∧B′, etc., where A′, B′, etc., are the projections of a world point P onto the image planes. The expression Φ_A^j ∩ Φ_B^k represents a line passing through the world point P, as does Φ_C^l ∩ Φ_N^m. Since these two lines intersect, we have the condition

{Φ_A^j ∩ Φ_B^k}∧{Φ_C^l ∩ Φ_N^m} = 0.  (9.109)

Consider also the world line L = X₁∧X₂ that projects down to l_a, l_b, l_c, l_n in the four image planes (see Fig. 9.5). We know from the previous sections that it is possible to write L in terms of these image lines as the meet of two planes in various ways, for example,

L = (A₀∧l_a) ∩ (B₀∧l_b),  (9.110)

L = (C₀∧l_c) ∩ (N₀∧l_n).  (9.111)

Now, since L∧L = 0, we can consider l_a = ℓ^i_a L^A_i, etc., and then write

ℓ^i_a ℓ^j_b ℓ^k_c ℓ^m_n [(A₀∧L^A_i) ∩ (B₀∧L^B_j)]∧[(C₀∧L^C_k) ∩ (N₀∧L^N_m)] = 0,  (9.112)

which can be further expressed as

ℓ^i_a ℓ^j_b ℓ^k_c ℓ^m_n Q_{ijkm} = 0.  (9.113)
Fig. 9.5 Model of the tetraocular projection of the visual 3D space
Here, Q is the so-called quadrifocal tensor, and Eq. 9.113 is the quadrilinear constraint recently discussed in [83]. The above constraint in terms of lines is straightforward, but it is also possible to find a relationship between point coordinates and Q. To do this, we expand Eq. 9.109 as follows:

α_r β_s δ_t ν_u {[(A₀∧L^A_{jr}) ∩ (B₀∧L^B_{ks})]∧[(C₀∧L^C_{lt}) ∩ (N₀∧L^N_{mu})]} = 0,  (9.114)

where we have used the notation L^A_{jr} = A_j∧A_r ≡ ε_{ijr}L^A_i. Thus, we can also write the above equation as

α_r β_s δ_t ν_u ε_{i₁jr} ε_{i₂ks} ε_{i₃lt} ε_{i₄mu} Q_{i₁i₂i₃i₄} = 0,  (9.115)

for any {j, k, l, m}.
9.7 Omnidirectional Vision We know that traditional perspective cameras have a narrow field of view. One effective way to increase the visual field is to use a catadioptric sensor, which consists of a conventional camera and a convex mirror [22]. In order to be able to model the catadioptric sensor geometrically, it must satisfy the restriction that all the measurements of light intensity pass through only one point in space (effective viewpoint). The complete class of mirrors that satisfy such a restriction was analyzed by Baker and Nayar [5]. In [69], a unifying theory for central catadioptric systems was introduced. They showed that central catadioptric projection is equivalent to a projective mapping
Table 9.1 Mirror parameters λ and μ of the unified catadioptric projection

  Mirror        λ                    μ
  Parabolic     1                    2p − 1
  Hyperbolic    d/√(d² + 4p²)        d(1 − 2p)/√(d² + 4p²)
  Elliptical    d/√(d² + 4p²)        d(1 − 2p)/√(d² + 4p²)
Fig. 9.6 The catadioptric unifying model expressed with conformal geometric algebra entities
from a sphere to a plane. This is done by projecting the points on the unit sphere (centered at the origin), with respect to the point N = (0, 0, λ), onto a plane orthogonal to the Z-axis and with Hesse distance μ. The parameters λ and μ are functions of the mirror parameters p and d (see Table 9.1), where the latus rectum or focal chord [28] is 4p, and d is the distance between the two focal points.
9.7.1 Omnidirectional Vision and Geometric Algebra

The central catadioptric projection in terms of conformal geometric algebra was first introduced in [15]. In this work, the authors showed how the unified theory of catadioptric projection can be handled effectively and easily in the conformal geometric algebra framework. For the catadioptric image formation, we only need three entities (see Fig. 9.6). The first one is a unit sphere S (not necessarily centered at the origin of the coordinate system). The second one is a point N, which is at a distance λ from the sphere center. Finally, the third entity is a plane Π, which is orthogonal to the line S∧N∧e∞ and at a distance μ from the sphere center. Observe that this is a more general definition of the unified model. Recall that the plane equation Π = n̂ + δe∞ has two unknowns: the vector n ∈ R³ and the scalar δ. The vector n can be extracted from
the orthogonal line to the plane; thus,

n = (S∧N∧e∞)I₃⁻¹,  (9.116)

and n̂ = n/|n|. The distance from the sphere center to the plane can be calculated with S·Π; since we know that the distance from the sphere to the plane is μ, we have S·Π = μ. Thus,

S·Π = (c + ½(c² − ρ²)e∞ + e₀)·(n̂ + δe∞) = c·n̂ − δ = μ,  (9.117)

and then δ = c·n̂ − μ. Therefore, the equation of the plane is

Π = n̂ + δe∞ = n̂ + (c·n̂ − μ)e∞ = n̂ + (S·n̂ − μ)e∞.  (9.118)
An interesting thing to note in this definition of the model is that we never refer to any coordinate system. This is because conformal geometric algebra is a coordinate-free framework, which allows us to define our model without reference to any particular frame. In the next section, we see how points in space are projected onto the catadioptric image through this model.
9.7.2 Point Projection

A point x ∈ R³, represented by X in the conformal space, is projected to a point Q on the catadioptric image in a two-step projection. The first step is the projection of the point X onto the sphere S; this means that we must find the line on which the point X and the center of the sphere lie. This can easily be done using the line

L₁ = X∧S∧e∞,  (9.119)

and then the intersection of the line with the sphere is

Z = S·L₁,  (9.120)

which is a point pair (Z = P₁∧P₂). From it, we take the nearest point with

P₁ = (Z + |Z|)/(Z·e∞).  (9.121)
The second step is the projection of the point P₁ onto the catadioptric image plane. This is done by intersecting the line

L₂ = N∧P₁∧e∞  (9.122)
Fig. 9.7 Point projection in the catadioptric unifying model, which is expressed in the conformal geometric algebra framework
with the plane Π; that is,

Q = L₂·Π.  (9.123)
With these simple steps, we can project any point in the 3D visual space to the catadioptric image through the unit sphere (see Fig. 9.7).
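The two steps can be sketched with plain vectors instead of conformal entities (an illustration only; we place the sphere at the origin and the image plane at z = −μ — a sign convention that varies between authors — and the function name is ours):

```python
import numpy as np

def catadioptric_project(x, lam, mu):
    """Unified model, Sect. 9.7.2: project x centrally onto the unit sphere,
    then from N = (0, 0, lam) onto the image plane z = -mu."""
    p = x / np.linalg.norm(x)              # step 1: central projection to sphere
    N = np.array([0.0, 0.0, lam])
    t = (-mu - lam) / (p[2] - lam)         # step 2: ray N + t(p - N) meets plane
    return N + t * (p - N)

q = catadioptric_project(np.array([1.0, 2.0, 3.0]), lam=1.0, mu=0.5)
assert abs(q[2] + 0.5) < 1e-12             # the result lies on the image plane
```

For λ = 1 (the parabolic mirror) this reduces to a stereographic projection from the sphere's north pole, which is the classical parabolic catadioptric map.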
9.7.3 Inverse Point Projection

In the previous section, we saw how a point in space is projected onto the catadioptric image. In this section, we consider the inverse process: given a point on the catadioptric image, how can we recover the original 3D point? First, let Q be a point in the catadioptric image; the first step is the projection of the point Q onto the sphere S. This is done by intersecting the line

L₂ = Q∧N∧e∞  (9.124)

with the sphere S, that is,

Z = L₂·S.  (9.125)

From the point pair Z, we extract the point

P₁ = (Z + |Z|)/(Z·e∞),  (9.126)

which is nearest to Q. The second step is to find the original line L₁ with

L₁ = S∧P₁∧e∞.  (9.127)
The original 3D point X lies on the line L1 , but the exact point cannot be found because a single view does not allow us to know the projective depth.
9.8 Invariants in the Conformal Space

The previously mentioned invariants can also be calculated using conformal geometric algebra. The reason for doing so is that our omnidirectional system model is developed in the CGA framework, so it is natural to relate the omnidirectional system to the theory of invariants. To compute the 1D cross-ratio invariant, we first embed the 2D point x ∈ G₂ as the conformal point X ∈ G₃,₁ using (6.14). Now, recall that the outer product of r conformal points can be described with (6.88). If you observe the A_r term (6.89), you will see that it encodes the outer product of the Euclidean points embedded in the conformal points. Therefore, the outer product of r conformal points, besides representing geometric objects, can also be used to calculate projective invariants. Consider the outer product of two conformal points X₁, X₂ ∈ G₃,₁; that is,

X₁∧X₂ = A_r − A⁺_r e₀ − ½A⁻_r e∞ − ½A^±_r E
      = (x₁∧x₂) − (x₂ − x₁)e₀ − ½(x₁²x₂ − x₂²x₁)e∞ − ½(x₂² − x₁²)E.  (9.128)

Note that it represents a point pair, or 1D sphere. Also note that the term A_r contains the outer product of the Euclidean points x₁ and x₂. Now, to extract the A_r term from a conformal geometric entity, we can do the following:

X₁∧X₂∧E = A_r∧E − A⁺_r e₀∧E − ½A⁻_r e∞∧E − ½A^±_r E∧E = A_r∧E,  (9.129)

since e₀∧E = e∞∧E = E∧E = 0. Thus, the outer product of a conformal geometric entity with E gives us the term A_r = x₁∧x₂ = δe₁e₂ multiplied by E, which is

A_r∧E = x₁∧x₂∧E = δe₁e₂e₊e₋ = δe₁₂₊₋,  (9.130)

where I₃,₁ = e₁₂₊₋ is the pseudoscalar of G₃,₁. If we multiply the above equation by I₃,₁⁻¹, we obtain δ = (A_r∧E)I₃,₁⁻¹. Therefore, the 1D cross-ratio of four conformal points X₁, X₂, X₃, X₄ can be calculated with

C₁(X₁, X₂, X₃, X₄) = [(X₃∧X₄∧E)I₃,₁⁻¹ (X₁∧X₂∧E)I₃,₁⁻¹] / [(X₁∧X₃∧E)I₃,₁⁻¹ (X₂∧X₄∧E)I₃,₁⁻¹].  (9.131)
A similar formulation applies to the 2D and 3D cross-ratios. The 2D cross-ratio can be calculated with

C₂(X₁, X₂, X₃, X₄, X₅) = [(X₅∧X₄∧X₃∧E)I₄,₁⁻¹ (X₅∧X₂∧X₁∧E)I₄,₁⁻¹] / [(X₅∧X₁∧X₃∧E)I₄,₁⁻¹ (X₅∧X₂∧X₄∧E)I₄,₁⁻¹],  (9.132)

where I₄,₁ = e₁₂₃₊₋ is the pseudoscalar for G₄,₁. Observe that in this case we are working with circles (or 2D spheres) instead of point pairs, since the outer product of three points leads to a circle. The 3D cross-ratio can be calculated with

C₃(X₁, X₂, X₃, X₄, X₅, X₆) = [(X₂∧X₃∧X₄∧X₅∧E)I₅,₁⁻¹ (X₁∧X₄∧X₅∧X₆∧E)I₅,₁⁻¹] / [(X₁∧X₄∧X₂∧X₅∧E)I₅,₁⁻¹ (X₃∧X₄∧X₆∧X₅∧E)I₅,₁⁻¹].  (9.133)
9.8.1 Invariants and Omnidirectional Vision

We have seen how to calculate projective invariants from circles using conformal geometric algebra, and, in Sect. 9.7, how to model the omnidirectional vision system with conformal geometric algebra. Now we combine both ideas. The first thing to note is that the projective invariants do not hold on the catadioptric image; they do hold, however, on the image sphere. Thus, if we project the points on the catadioptric image to the unit sphere, we can recover the projective invariant. To clarify this, we explain the 1D projective case, which can be seen as a cross section of the 2D case. First, let S be the unit sphere centered at the origin, defined as

S = e₀ − ½e∞.  (9.134)

Also, let the Euclidean points q₁, q₂, ..., qₙ with conformal representation

Qᵢ = qᵢ + ½qᵢ²e∞ + e₀,  for i = 1, ..., n,  (9.135)

be points in the catadioptric image plane. Remember that the numbers of points n needed in the 1D and 2D cases are four and five, respectively. Note that in the 1D case the points Qᵢ lie on the intersection of the plane e₃ (i.e., the plane passing through the origin with normal e₃) with the catadioptric image plane (Fig. 9.8). Using Eqs. 9.124–9.126, we project the points Qᵢ on the catadioptric image onto the sphere to get the points Pᵢ (in the 1D case these points lie on a circle that is the intersection of the sphere with the e₃-plane); see Fig. 9.8. To compare the invariants on the sphere with the invariants on the projective plane, we define the projective plane Π_p as

Π_p = e₂ + e∞,  (9.136)
Fig. 9.8 Projection of the points Qi in the catadioptric image to the points Pi onto the sphere
which is the plane with normal e₂ and a Hesse distance equal to 1. Now, to project the points Pᵢ onto the projective image plane (see Fig. 9.9), we find the line L₁,ᵢ that passes through the center of the sphere and the point Pᵢ with Eq. 9.127. Then the lines L₁,ᵢ are intersected with the projective plane to find the points Uᵢ with

Uᵢ = L₁,ᵢ·Π_p  for i = 1, ..., n.  (9.137)
The point Uᵢ is called a flat point, which is the outer product of a conformal point with the null vector e∞ (the point at infinity). To obtain the conformal point from the flat point, we can use

Vᵢ = (Uᵢ∧e₀ + (Uᵢ·E)E) + ½(Uᵢ∧e₀ + (Uᵢ·E)E)²e∞ + e₀.  (9.138)
Once we have the points Pᵢ on the sphere S and the points Vᵢ on the plane Π_p, we calculate their respective 1D invariants with Eqs. 9.60 and 9.131; thus,

δ = C₁(P₁, P₂, P₃, P₄) = C₁(V₁, V₂, V₃, V₄).  (9.139)
In the 2D case, we do the same, but instead we use (9.132) (Fig. 9.10) to calculate the invariants (Figs. 9.11 and 9.12). Therefore, we now know that if we project the points on the catadioptric image onto the sphere, we can compute the projective invariants.
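The claim that cross-ratios survive on the sphere but not on the catadioptric image can be verified in the 1D cross-section with elementary vector arithmetic (a sketch with hypothetical mirror values λ, μ; all helper names are ours):

```python
import numpy as np

def cross_ratio(t):
    t1, t2, t3, t4 = t
    return (t3 - t1) * (t4 - t2) / ((t4 - t1) * (t3 - t2))

# Four collinear world points in the x-z plane (1D cross-section, Sect. 9.8.1):
xs = np.array([-1.0, 0.0, 1.0, 2.5])
world = [np.array([x, 2.0]) for x in xs]            # points on the line z = 2
sphere = [w / np.linalg.norm(w) for w in world]     # central projection onto the
                                                    # unit circle (image sphere)

def project(p, center_z, plane_z):
    # Project p from the centre (0, center_z) onto the line z = plane_z.
    t = (plane_z - center_z) / (p[1] - center_z)
    return t * p[0]

lam, mu = 0.8, 1.0                                  # hypothetical mirror setting
catadioptric = [project(p, lam, -mu) for p in sphere]   # through N = (0, lam)
central = [project(p, 0.0, 1.0) for p in sphere]        # back through the centre

r_world = cross_ratio(xs)
assert abs(r_world - cross_ratio(np.array(central))) < 1e-12   # preserved
assert abs(r_world - cross_ratio(np.array(catadioptric))) > 1e-3  # broken
```

Reprojection through the sphere's center composes two central projections and so preserves the cross-ratio; projection from N does not, because the points on the circle are no longer collinear with N and the original line.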
Fig. 9.9 Points on the sphere projected onto the projective image plane
9.8.2 Projective and Permutation p²-Invariants

Equation 9.132 is a projective invariant; however, it is permutation-sensitive. In [135], the authors introduce what they call projective and permutation p²-invariants. Besides removing the dependence on the labeling, the p²-invariants use a redundant representation that significantly increases the tolerance to positional errors and allows the design of less sensitive correspondence algorithms. Given the five points (P₁, P₂, ..., P₅) on the sphere, we calculate two independent projective invariants as follows:

τ₁ = C₂(P₁, P₂, P₃, P₄, P₅),
τ₂ = C₂(P₂, P₁, P₃, P₄, P₅).  (9.140)

With these two values, we calculate the components αᵢ of a five-dimensional vector v = α₁e₁ + α₂e₂ + α₃e₃ + α₄e₄ + α₅e₅ ∈ G₅ as

α₁ = J(τ₁),  α₂ = J(τ₂),  α₃ = J(τ₂/τ₁),
α₄ = J((τ₂ − 1)/(τ₁ − 1)),  α₅ = J(τ₁[τ₂ − 1]/(τ₂[τ₁ − 1])),  (9.141)
Fig. 9.10 (a) Points Qi on the catadioptric image projected onto the sphere as the points Pi . These points define the four circles necessary to calculate the 2D invariant on the sphere. (b) Points Pi projected onto the plane ˘p as the points Vi and the four circles formed with them to calculate the 2D invariant on the plane
Fig. 9.11 The figure shows how five coplanar points on the space are projected in the catadioptric image plane through a sphere
Fig. 9.12 Points Qi on the catadioptric image projected onto the sphere; these points define the four circles necessary to calculate the 2D invariant on the sphere
where J is defined as in [135], that is,

$$ J(\tau) = \frac{2\tau^6 - 6\tau^5 + 9\tau^4 - 8\tau^3 + 9\tau^2 - 6\tau + 2}{\tau^6 - 3\tau^5 + 3\tau^4 - \tau^3 + 3\tau^2 - 3\tau + 1}. \tag{9.142} $$
In this way, the obtained v-invariant is independent of the order of the points used to calculate it.
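A direct transcription of J into code makes the permutation tolerance easy to verify numerically: both the numerator and the denominator of Eq. 9.142 are palindromic and symmetric about τ → 1 − τ, so J(τ) = J(1/τ) = J(1 − τ). A short sketch (naming is our own):

```python
def J(tau):
    """The function J of Eq. 9.142, with the coefficients as given in [135]."""
    num = 2*tau**6 - 6*tau**5 + 9*tau**4 - 8*tau**3 + 9*tau**2 - 6*tau + 2
    den = tau**6 - 3*tau**5 + 3*tau**4 - tau**3 + 3*tau**2 - 3*tau + 1
    return num / den

# Invariance under the cross-ratio's six-fold relabeling ambiguity:
tau = 0.3
assert abs(J(tau) - J(1.0 / tau)) < 1e-9    # J(tau) == J(1/tau)
assert abs(J(tau) - J(1.0 - tau)) < 1e-9    # J(tau) == J(1 - tau)
```

This is the property that makes the components α₁, …, α₅ of Eq. 9.141 insensitive to the labeling of the five points.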
9.9 Conclusion

This chapter has outlined the use of geometric algebra as a framework for analysis and computation in computer vision. In particular, the framework for projective geometry was described and the analysis of tensorial relations between multiple camera views was presented in a wholly geometric fashion. The projective geometry operations of meet and join are easily expressed analytically and easily computed in geometric algebra. Indeed, it is the ease with which we can perform the algebra of incidence (intersections of lines, planes, etc.) that simplifies many of the otherwise complex tensorial relations. The concept of duality has been discussed and used specifically in projecting down from the world to image planes; in geometric algebra, duality is a particularly simple concept and one in which the nonmetric properties of the inner product become apparent.
9.10 Exercises

9.1 Compute in P² the intersecting point of the lines A = x₁ ∧ x₂ and B = y₁ ∧ y₂. The homogeneous coordinates of the involved points are x₁ = (3, 1, 1), x₂ = (1, 1, 1), y₁ = (2, 1, 1), and y₂ = (2, 0, 1).
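As a plain-coordinate cross-check of Exercise 9.1 (outside the geometric-algebra formalism): in P², the join of two points and the meet of two lines are both computed by the vector cross product of homogeneous 3-vectors. A sketch:

```python
def cross(u, v):
    """3-vector cross product; in P^2 it implements both the join of two
    points (giving a line) and the meet of two lines (giving a point)."""
    return (u[1]*v[2] - u[2]*v[1],
            u[2]*v[0] - u[0]*v[2],
            u[0]*v[1] - u[1]*v[0])

x1, x2 = (3, 1, 1), (1, 1, 1)
y1, y2 = (2, 1, 1), (2, 0, 1)

A = cross(x1, x2)          # line through x1 and x2
B = cross(y1, y2)          # line through y1 and y2
p = cross(A, B)            # their meet: the intersecting point
p_euclidean = (p[0] / p[2], p[1] / p[2])
```

The meet comes out proportional to (2, 1, 1), i.e., the two lines intersect at the point y₁ itself.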
9.2 Points and planes are in P³. Find in P³ the relative positions of the points P = (2, 4, 4, 1) and Q = (2, 4, 5, 1) with respect to the plane E = A₁ ∧ A₂ ∧ A₃, where A₁ = (2, 1, 1, 6), A₂ = (1, 1, 1, 0), and A₃ = (1, 0, 0, 4). Explain these results. (Hint: The join P ∧ E = dI spans a quadrivector, where the scalar d corresponds to the Hesse distance or foot. Thus, you must compute the brackets [PE] and [QE]. When the bracket is positive, the point is at the right, and when it is negative, the point is at the left.)

9.3 In P³ compute the relative orientation of the lines L₁ = A₁ ∧ A₂ and L₂ = A₃ ∧ A₄, where A₁ = (2, 1, 3, 1), A₂ = (1, 3, 5, 2), A₃ = (1, 2, 1, 4), and A₄ = (3, 1, 2, 4). (Hint: The join L₁ ∧ L₂ = dI spans a quadrivector, where the scalar d corresponds to the Hesse distance or foot. Thus, you must compute the bracket [L₁L₂] > 0 and interpret your result using the right-hand rule.)

9.4 In P³ compute the intersection of the following planes: E₁ = A₁ ∧ A₂ ∧ A₃ and E₂ = B₁ ∧ B₂ ∧ B₃, where A₁ = (2, 1, 3, 1), A₂ = (1, 3, 5, 2), A₃ = (1, 2, 1, 4), and B₁ = (4, 2, 6, 2), B₂ = (4, 3, 5, 1), B₃ = (3, 6, 3, 12).

9.5 In P² compute the intersecting point p of the line a ∧ b and the line passing through the point c, which lies off the line a ∧ b, for the following points: a = (2, 3, 1), b = (10, 10, 1), and c = (5, 9, 1). (Hint: First compute the line passing through point c using the direction orthogonal to the line a ∧ b; then compute the meet of these lines.)

9.6 Derive the mapping of a plane φ = A ∧ B ∧ C of the projective space to the projective plane. The expected result should be φ′ = n′ + I₃d.

9.7 Prove the Simson rule using the join of three projected points. If the projecting point p lies on the circumference, the join of the projected points is zero. Take three arbitrary points lying on a unit circumference and a fourth one nearby. Use CLICAL for your computations.
(Hint: First compute the projected points as the meet of three lines passing through the point p orthogonal to the triangle sides. The triangle is formed by three arbitrary points lying on the circumference.)

9.8 Prove the Pascal theorem using the incidence algebra in P². Given six points lying on a conic, compute six intersecting lines using the join operation. The meet of these lines should give three intersecting points, and the join of these three intersecting points should be zero. Use Clifford 10.0 and any six points belonging to a conic of your choice.

9.9 Prove the Desargues theorem using the algebra of incidence. Let x₁, x₂, x₃ and y₁, y₂, y₃ be the vertices of two triangles in P², and suppose that the meets of the lines fulfill the equations (x₁ ∧ x₂) ∩ (y₁ ∧ y₂) = z₃, (x₂ ∧ x₃) ∩ (y₂ ∧ y₃) = z₁, and (x₃ ∧ x₁) ∩ (y₃ ∧ y₁) = z₂. Then the join of these points z₁ ∧ z₂ ∧ z₃ = 0 if and only if a point p exists such that x₁ ∧ y₁ ∧ p = x₂ ∧ y₂ ∧ p = x₃ ∧ y₃ ∧ p = 0. In this problem, use Clifford 4.0 and define two triangles of your choice.
9.10 A point p lies on the circumcircle of an arbitrary triangle with vertices x₁, y₁, and z₁. From the point p, draw three perpendiculars to the three sides of the triangle to meet the circle at points x₂, y₂, and z₂, respectively. Using incidence algebra, show that the lines x₁ ∧ x₂, y₁ ∧ y₂, and z₁ ∧ z₂ are parallel.

9.11 Consider a world point X ∈ P³ projected onto two image planes (see Fig. 9.3). Show that the bilinear constraint can be expressed as α_i F̃_{ij} β_j = 0, where α_i and β_j are tensor notations. (Hint: Use the geometric constraint A₀ ∧ B₀ ∧ A′ ∧ B′ = 0, where A₀ and B₀ are the optical centers of the images and A′ = α_i A_i and B′ = β_j B_j are the image points spanned using three arbitrary image points A_i or B_j. The epipolar plane is given by A₀ ∧ B₀ ∧ A′.)

9.12 Consider two non-intersecting world lines L₁₂ and L₃₄. Their projected lines intersect in the points α_i and β_j. Show that α_i lies on the epipolar line passing through β_j.

9.13 Consider a world line L₁₂ ∈ P³ projected onto three image planes (see Fig. 9.4). The projected line onto the first image plane Φ_A is given by

$$ l_i^A L_i^A = \Big[ A_0 \wedge l_j^B l_k^C \big\{ (B_0 \wedge L_j^B) \cap (C_0 \wedge L_k^C) \big\} \Big] \cap \Phi_A. \tag{9.143} $$

Geometrically, this equation means that the optical planes B₀ ∧ l_j^B L_j^B and C₀ ∧ l_k^C L_k^C intersect in the line L₁₂. Now the join operation of L₁₂ with A₀ builds the optical plane that intersects the image plane Φ_A in the line l_i^A L_i^A (a linear combination of three arbitrary image lines L_i^A). Expand the equation to get the coefficients l_i^A in terms of the trilinear constraint or trifocal tensor as follows:

$$ l_i^A = T_{ijk}\, l_j^B l_k^C. $$

9.14 Write two sets of equations for the Pascal theorem, one set involving two cameras and the second set involving three. Note that the brackets relate the cameras via the bilinear or trilinear constraints. See Sect. 14.2 for a discussion of conics and the Pascal theorem.
Chapter 10
Geometric Neuralcomputing
10.1 Introduction

It appears that for biological creatures, the external world may be internalized in terms of intrinsic geometric representations. We can formalize the relationships between the physical signals of external objects and the internal signals of a biological creature by using extrinsic vectors to represent those signals coming from the world and intrinsic vectors to represent those signals originating in the internal world. We can also assume that external and internal worlds employ different reference coordinate systems. If we consider the acquisition and coding of knowledge to be a distributed and differentiated process, we can imagine that there should exist various domains of knowledge representation that obey different metrics and that can be modeled using different vectorial bases. How is it possible that nature has acquired through evolution such tremendous representational power for dealing with such complicated signal processing [111]? In a stimulating series of articles, Pellionisz and Llinás [147, 148] claim that the formalization of geometrical representation seems to be a dual process involving the expression of extrinsic physical cues built by intrinsic central nervous system vectors. These vectorial representations, related to reference frames intrinsic to the creature, are covariant for perception analysis and contravariant for action synthesis. The geometric mapping between these two vectorial spaces can thus be implemented by a neural network that performs as a metric tensor [148]. Along this line of thought, we can use Clifford, or geometric, algebra to offer an alternative to the tensor analysis that has been employed since 1980 by Pellionisz and Llinás for the perception and action cycle (PAC) theory. Tensor calculus is covariant, which means that it requires transformation laws for defining coordinate-independent relationships.
Clifford, or geometric, algebra is more attractive than tensor analysis because it is coordinate-free, and because it includes spinors, which tensor theory does not. The computational efficiency of geometric algebra has also been confirmed in various challenging areas of mathematical physics [50]. The other mathematical system used to describe neural networks is matrix analysis. But, once again, geometric algebra better captures the geometric characteristics of the problem independent of a coordinate reference system, and it offers other computational
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9_10, © Springer-Verlag London Limited 2010
advantages that matrix algebra does not, for example, bivector representation of linear operators in the null cone, incidence relations (meet and join operations), and the conformal group in the horosphere. Initial attempts at applying geometric algebra to neural geometry have already been described in earlier papers [8, 10, 89, 90]. In this chapter, we demonstrate that standard feedforward networks generalize within geometric algebra. We present the geometric multilayer perceptron and the geometric radial basis function network. The chapter also introduces the Clifford support vector machines (CSVM) as a generalization of the real- and complex-valued support vector machines using Clifford geometric algebra. In this framework, we handle the design of kernels involving the Clifford or geometric product. In this approach, one redefines the optimization variables as multivectors, which allows us to have a multivector as output. Therefore, we can represent multiple classes according to the dimension of the geometric algebra in which we work. We show that one can apply the CSVM to classification, regression, and interpolation. The CSVM is an attractive approach for the MIMO processing of high-dimensional geometric entities.
10.2 Real-Valued Neural Networks

The approximation of nonlinear mappings using neural networks is useful in various aspects of signal processing, such as pattern classification, prediction, system modeling, and identification. This section reviews the fundamentals of standard real-valued feedforward architectures. For the approximation of a continuous function g(x), Cybenko [46] used the superposition of weighted functions

$$ y(\mathbf{x}) = \sum_{j=1}^{N} w_j\, s_j\big( \mathbf{w}_j^T \mathbf{x} + \theta_j \big), \tag{10.1} $$
where s(·) is a continuous discriminatory function like a sigmoid, w_j ∈ R, and x, θ_j, w_j ∈ Rⁿ. Finite sums of the form of Eq. 10.1 are dense in C⁰(Iₙ) if |g_k(x) − y_k(x)| < ε for a given ε > 0 and all x ∈ [0, 1]ⁿ. This is called a density theorem and is a fundamental concept in approximation theory and nonlinear system modeling [46, 101]. A structure with k outputs y_k, having several layers using logistic functions, is known as the multilayer perceptron (MLP) [166]. The output of any neuron of a hidden layer or of the output layer can be represented in a similar way,

$$ o_j = f_j\!\left( \sum_{i=1}^{N_i} w_{ji} x_{ji} + \theta_j \right), \qquad y_k = f_k\!\left( \sum_{j=1}^{N_j} w_{kj} o_{kj} + \theta_k \right), \tag{10.2} $$
where f_j(·) is logistic and f_k(·) is logistic or linear. Linear functions at the outputs are often used for pattern classification. In some tasks of pattern classification, a hidden layer is necessary, whereas in some tasks of automatic control, two hidden layers may be required. Hornik [101] showed that standard multilayer feedforward networks are able to approximate any measurable function to a desired degree of accuracy. Thus, they can be seen as universal approximators. In the case of a training failure, we should attribute any error to inadequate learning, an incorrect number of hidden neurons, or a poorly defined deterministic relationship between the input and output patterns. Poggio and Girosi [155] developed the radial basis function (RBF) network, which consists of a superposition of weighted Gaussian functions,

$$ y_j(\mathbf{x}) = \sum_{i=1}^{N} w_{ji}\, G_i\big( D_i(\mathbf{x} - \mathbf{t}_i) \big), \tag{10.3} $$

where y_j is the j-th output, w_{ji} ∈ R, G_i is a Gaussian function, D_i is an N × N dilatation diagonal matrix, and x, t_i ∈ Rⁿ. The vector t_i is a translation vector. This architecture is supported by regularization theory.
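Equation 10.3 can be sketched directly. The following toy evaluation (names are our own, and the unit-variance Gaussian G(u) = exp(−|u|²) is an illustrative choice, not prescribed by the text) shows the roles of the weights, centers, and diagonal dilatations:

```python
import math

def rbf_output(x, weights, centers, scales):
    """Evaluate one RBF-network output: y(x) = sum_i w_i * G(D_i (x - t_i)).

    Each diagonal dilatation matrix D_i is stored as a vector of scales;
    G(u) = exp(-|u|^2) is used as the Gaussian.
    """
    y = 0.0
    for w, t, d in zip(weights, centers, scales):
        u = [di * (xi - ti) for di, xi, ti in zip(d, x, t)]
        y += w * math.exp(-sum(ui * ui for ui in u))
    return y

x = [0.5, -0.5]
centers = [[0.5, -0.5], [2.0, 2.0]]      # t_1 coincides with x
scales = [[1.0, 1.0], [1.0, 1.0]]
weights = [3.0, 1.0]
y = rbf_output(x, weights, centers, scales)
# y is close to 3.0: the first Gaussian fires fully, the second is far away
```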
10.3 Complex MLP and Quaternionic MLP

An MLP is defined to be in the complex domain when its weights, activation function, and outputs are complex-valued. The selection of the activation function is not a trivial matter. For example, the extension of the sigmoid function from R to C,

$$ f(z) = \frac{1}{1 + e^{-z}}, \tag{10.4} $$

where z ∈ C, is not allowed, because this function is analytic and unbounded [68]; this is also true for the functions tanh(z) and e^{z²}. We believe these kinds of activation functions exhibit problems with convergence in training due to their singularities. The necessary conditions that a complex activation f(z) = a(x, y) + i b(x, y) has to fulfill are: f(z) must be nonlinear in x and y; the partial derivatives a_x, a_y, b_x, and b_y must exist (with a_x b_y ≠ b_x a_y); and f(z) must not be entire. Accordingly, Georgiou and Koutsougeras [68] proposed the formulation

$$ f(z) = \frac{z}{c + \frac{1}{r}|z|}, \tag{10.5} $$

where c, r ∈ R⁺. These authors thus extended the traditional real-valued back-propagation learning rule to the complex-valued rule of the complex multilayer perceptron (CMLP).
Arena et al. [2] introduced the quaternionic multilayer perceptron (QMLP), which is an extension of the CMLP. The weights, activation functions, and outputs of this net are represented in terms of quaternions [78]. Arena et al. chose the following non-analytic bounded function:

$$ f(q) = f(q_0 + q_1 i + q_2 j + q_3 k) = \frac{1}{1 + e^{-q_0}} + \frac{1}{1 + e^{-q_1}}\, i + \frac{1}{1 + e^{-q_2}}\, j + \frac{1}{1 + e^{-q_3}}\, k, \tag{10.6} $$

where f(·) is now the function for quaternions. These authors proved that superpositions of such functions accurately approximate any continuous quaternionic function defined in the unit polydisc of Cⁿ. The extension of the training rule to the QMLP was demonstrated in [2].
10.4 Geometric Algebra Neural Networks

Real, complex, and quaternionic neural networks can be further generalized within the geometric algebra framework, in which the weights, the activation functions, and the outputs are now represented using multivectors. For the real-valued neural networks discussed in Sect. 10.2, the vectors are multiplied with the weights, using the scalar product. For geometric neural networks, the scalar product is replaced by the geometric product.
10.4.1 The Activation Function

The activation function of Eq. 10.5, used for the CMLP, was extended by Pearson and Bisset [146] for a type of Clifford MLP by applying different Clifford algebras, including quaternion algebra. We propose here an activation function that will affect each multivector basis element. This function was introduced independently by the authors [10] and is in fact a generalization of the function of Arena et al. [2]. The function for an n-dimensional multivector m is given by

$$ \begin{aligned} \mathbf{f}(m) &= \mathbf{f}\big( m_0 + m_i e_i + m_j e_j + m_k e_k + \cdots + m_{ij}\, e_i \wedge e_j + \cdots + m_{ijk}\, e_i \wedge e_j \wedge e_k + \cdots + m_n\, e_1 \wedge e_2 \wedge \cdots \wedge e_n \big) \\ &= f(m_0) + f(m_i) e_i + f(m_j) e_j + f(m_k) e_k + \cdots + f(m_{ij})\, e_i \wedge e_j + \cdots + f(m_{ijk})\, e_i \wedge e_j \wedge e_k + \cdots + f(m_n)\, e_1 \wedge e_2 \wedge \cdots \wedge e_n, \end{aligned} \tag{10.7} $$

where f(·) is written in bold to distinguish it from the notation used for a single-argument function f(·). The values of f(·) can be of the sigmoid or Gaussian type.
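Equation 10.7 amounts to applying the scalar activation independently to each blade coefficient. A sketch for G_{3,0,0}, whose multivectors have 2³ = 8 coefficients (the list layout is our own convention):

```python
import math

def sigmoid(t):
    return 1.0 / (1.0 + math.exp(-t))

# A multivector of G_{3,0,0} stored as 8 coefficients on the basis
# [1, e1, e2, e3, e1^e2, e1^e3, e2^e3, e1^e2^e3].
def multivector_activation(m, f=sigmoid):
    """Eq. 10.7: apply the scalar activation f to every blade coefficient."""
    return [f(c) for c in m]

m = [0.0, 1.0, -1.0, 2.0, 0.5, -0.5, 3.0, -3.0]
out = multivector_activation(m)
# out[0] is sigmoid(0) = 0.5; every other coefficient is squashed the same way
```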
10.4.2 The Geometric Neuron

The McCulloch–Pitts neuron uses the scalar product of the input vector and its weight vector [166]. The extension of this model to the geometric neuron requires the substitution of the scalar product with the Clifford or geometric product, that is,

$$ w^T x + \theta \;\Rightarrow\; wx + \theta = w \cdot x + w \wedge x + \theta. \tag{10.8} $$

Fig. 10.1 McCulloch–Pitts neuron and geometric neuron

Figure 10.1 shows in detail the McCulloch–Pitts neuron and the geometric neuron. This figure also depicts how the input pattern is formatted in a specific geometric algebra. The geometric neuron outputs a richer kind of pattern. We can illustrate this with an example in G_{3,0,0}:

$$ \begin{aligned} o &= \mathbf{f}(wx + \theta) = \mathbf{f}(s_0 + s_1 e_1 + s_2 e_2 + s_3 e_3 + s_4 e_1 e_2 + s_5 e_1 e_3 + s_6 e_2 e_3 + s_7 e_1 e_2 e_3) \\ &= f(s_0) + f(s_1) e_1 + f(s_2) e_2 + f(s_3) e_3 + f(s_4) e_1 e_2 + f(s_5) e_1 e_3 + f(s_6) e_2 e_3 + f(s_7) e_1 e_2 e_3, \end{aligned} \tag{10.9} $$

where f is the activation function defined in Eq. 10.7, and s_i ∈ R. If we use the McCulloch–Pitts neuron in the real-valued neural network, the output is simply the scalar given by

$$ o = f\!\left( \sum_i^N w_i x_i + \theta \right). \tag{10.10} $$

The geometric neuron outputs a signal with more geometric information:

$$ o = \mathbf{f}(wx + \theta) = \mathbf{f}(w \cdot x + w \wedge x + \theta). \tag{10.11} $$

It has both a scalar product like the McCulloch–Pitts neuron,

$$ f(w \cdot x + \theta) = f(s_0) \equiv f\!\left( \sum_i^N w_i x_i + \theta \right), \tag{10.12} $$
and also the outer product given by

$$ \mathbf{f}(w \wedge x + \theta) = f(s_1) e_1 + f(s_2) e_2 + f(s_3) e_3 + f(s_4) e_1 e_2 + f(s_5) e_1 e_3 + f(s_6) e_2 e_3 + f(s_7) e_1 e_2 e_3. \tag{10.13} $$

Note that the outer product gives the scalar cross-products between the individual components of the vector, which are nothing more than the multivector components of points or lines (vectors), planes (bivectors), and volumes (trivectors). This characteristic can be used for the implementation of geometric preprocessing in the extended geometric neural network. To a certain extent, this kind of neural network resembles the higher-order neural networks of [149]. However, an extended geometric neural network uses not only a scalar product of higher order, but also all the necessary scalar cross-products for carrying out a geometric cross-correlation. Figure 10.2 shows a geometric network with its extended first layer.
Fig. 10.2 Geometric neural network with extended input layer
In conclusion, a geometric neuron can be seen as a kind of geometric correlation operator, which, in contrast to the McCulloch–Pitts neuron, offers not only points but higher-grade multivectors such as lines, planes, spheres and hyper-volumes for interpolation.
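The scalar-plus-bivector split of Eqs. 10.11–10.13 can be made concrete for vector-valued inputs and weights in G_{3,0,0}. The following sketch (coefficient layout is our own convention) computes wx + θ before the activation is applied:

```python
def geometric_neuron(w, x, theta):
    """Pre-activation of a geometric neuron in G_{3,0,0} for vector-valued
    w and x: wx + theta = w.x + w^x + theta.  The result is stored on the
    basis [1, e1, e2, e3, e1^e2, e1^e3, e2^e3, e1^e2^e3]; theta may be a
    full multivector bias.
    """
    s = list(theta)                               # start from the bias
    s[0] += w[0]*x[0] + w[1]*x[1] + w[2]*x[2]     # inner part: scalar w.x
    s[4] += w[0]*x[1] - w[1]*x[0]                 # outer part: e1^e2
    s[5] += w[0]*x[2] - w[2]*x[0]                 # e1^e3
    s[6] += w[1]*x[2] - w[2]*x[1]                 # e2^e3
    return s

w, x = (1.0, 2.0, 3.0), (4.0, 5.0, 6.0)
theta = [0.1] + [0.0] * 7
o = geometric_neuron(w, x, theta)
# o[0] = w.x + theta_0 = 32.1; o[4..6] carry the wedge (orientation) terms
```

The bivector coefficients are exactly the "scalar cross-products" the text refers to: they vanish when w and x are parallel and encode their relative orientation otherwise.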
10.4.3 Feedforward Geometric Neural Networks

Figure 10.3 depicts standard neural network structures for function approximation in the geometric algebra framework. Here, the inner vector product has been extended to the geometric product, and the activation functions are according to (10.7). Equation 10.1 of Cybenko's model in geometric algebra is

$$ y(x) = \sum_{j=1}^{N} w_j\, \mathbf{f}\big( w_j \cdot x + w_j \wedge x + \theta_j \big). \tag{10.14} $$

The extension of the MLP is straightforward. The equations using the geometric product for the outputs of hidden and output layers are given by
Fig. 10.3 Geometric network structures for approximation: (a) Cybenko's, (b) GRBF network, (c) GMLP_{p,q,r}. In the diagrams, "gp" denotes the geometric product, and x, w, y, and θ are multivectors
$$ o_j = \mathbf{f}_j\!\left( \sum_{i=1}^{N_i} w_{ji} \cdot x_{ji} + w_{ji} \wedge x_{ji} + \theta_j \right), \qquad y_k = \mathbf{f}_k\!\left( \sum_{j=1}^{N_j} w_{kj} \cdot o_{kj} + w_{kj} \wedge o_{kj} + \theta_k \right). \tag{10.15} $$

In radial basis function networks, the dilatation operation, given by the diagonal matrix D_i, can be implemented by means of the geometric product with a dilator $\boldsymbol{D}_i = e^{\alpha_i E / 2}$ [94], that is,

$$ D_i(x - t_i) \;\Rightarrow\; \boldsymbol{D}_i (x - t_i) \widetilde{\boldsymbol{D}}_i, \tag{10.16} $$

$$ y_k(x) = \sum_{j=1}^{N} w_{kj}\, \mathbf{G}_j\big( \boldsymbol{D}_j (x_{ji} - t_j) \widetilde{\boldsymbol{D}}_j \big). \tag{10.17} $$

Note that in the case of the geometric RBF we are also using an activation function according to (10.7). Equation 10.17 with w_{kj} ∈ R represents the equation of an RBF architecture for multivectors of 2ⁿ dimension, which is isomorphic to a real-valued RBF network with 2ⁿ-dimensional input vectors. In Sect. 10.5, we show that we can use support vector machines for the automatic generation of an RBF network for multivector processing.
10.4.4 Generalized Geometric Neural Networks

One major advantage of using Clifford geometric algebra in neurocomputing is that the nets function for all types of multivectors: real, complex, double (or hyperbolic), and dual, as well as for different types of computing models, like horospheres and null cones (see [18, 91]). The chosen multivector basis for a particular geometric algebra G_{p,q,r} defines the signature of the involved subspaces. The signature is computed by squaring the pseudoscalar: if I² = −1, the net will use complex numbers; if I² = 1, the net will use double or hyperbolic numbers; and if I² = 0, the net will use dual numbers (a degenerate geometric algebra). For example, for G_{0,2,0}, we can have a quaternion-valued neural network; for G_{1,1,0}, a hyperbolic MLP; for G_{0,3,0}, a hyperbolic (double) quaternion-valued RBF; for G_{3,0,0}, a net which works in the entire Euclidean three-dimensional geometric algebra; for G⁺_{4,1,0}, a net which works in the horosphere; or, finally, for G⁺_{3,3,0}, a net which uses only the bivector null cone. The conjugation involved in the training learning rule depends on whether we are using complex, hyperbolic, or dual-valued geometric neural networks, and varies according to the signature of the geometric algebra (see Eqs. 10.21–10.23).
10.4.5 The Learning Rule

This section demonstrates the multidimensional generalization of the gradient descent learning rule in geometric algebra. This rule can be used for training the geometric MLP (GMLP) and for tuning the weights of the geometric RBF (GRBF). Previous learning rules for the real-valued MLP, complex MLP [68], and quaternionic MLP [2] are special cases of this extended rule.
10.4.6 Multidimensional Back-Propagation Training Rule

The norm of a multivector x for the learning rule is given by

$$ |x| = (x \mid x)^{1/2} = \left( \sum_A [x]_A^2 \right)^{1/2}. \tag{10.18} $$

The geometric neural network with n inputs and m outputs approximates the target mapping function

$$ Y_t : (\mathcal{G}_{p,q,r})^n \to (\mathcal{G}_{p,q,r})^m, \tag{10.19} $$

where (G_{p,q,r})ⁿ is the n-dimensional module over the geometric algebra G_{p,q,r} [146]. The error at the output of the net is measured according to the metric

$$ E = \frac{1}{2} \int_{x \in X} |Y_w - Y_t|^2, \tag{10.20} $$
where X is some compact subset of the Clifford module (G_{p,q,r})ⁿ involving the product topology derived from Eq. 10.18 for the norm, and where Y_w and Y_t are the learned and target mapping functions, respectively. The back-propagation algorithm [166] is a procedure for updating the weights and biases. This algorithm is a function of the negative derivative of the error function (Eq. 10.20) with respect to the weights and biases themselves. The computation of this procedure is straightforward, and here we will only give the main results. The updating equation for the multivector weights of any hidden j-layer is

$$ \Delta w_{ij}(t+1) = \eta \left[ \left( \sum_{k}^{N_k} \delta_{kj} \otimes w_{kj} \right) \odot F'(\mathrm{net}_{ij}) \right] \otimes o_i + \alpha\, \Delta w_{ij}(t); \tag{10.21} $$

for any k-output with a nonlinear activation function,

$$ \Delta w_{jk}(t+1) = \eta\, (y_{kt} - y_{ka}) \odot F'(\mathrm{net}_{jk}) \otimes o_j + \alpha\, \Delta w_{jk}(t); \tag{10.22} $$
and for any k-output with a linear activation function,

$$ \Delta w_{jk}(t+1) = \eta\, (y_{kt} - y_{ka}) \otimes o_j + \alpha\, \Delta w_{jk}(t). \tag{10.23} $$
In the above equations, F is the activation function defined in Eq. 10.7, t is the update step, η and α are the learning rate and the momentum, respectively, ⊗ is the Clifford or geometric product, ⊙ is the scalar product, and (·)‾ is the multivector anti-involution (reversion or conjugation). In the case of the non-Euclidean G_{0,3,0}, (·)‾ corresponds to the simple conjugation. Each neuron now consists of p + q + r units, one for each multivector component. The biases are also multivectors and are absorbed as usual in the sum of the activation signal, here defined as net_{ij}. In the learning rules (Eqs. 10.21–10.23), the computation of the geometric product and the anti-involution varies depending on the geometric algebra being used [156]. To illustrate, the conjugation required in the learning rule for quaternion algebra is x̄ = x₀ − x₁e₁ − x₂e₂ − x₃e₁e₂, where x ∈ G_{0,2,0}.
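For G_{0,2,0} the geometric product and the conjugation used in Eqs. 10.21–10.23 are exactly the Hamilton quaternion product and quaternion conjugation. A sketch (our own coefficient layout) verifying that x ⊗ x̄ reduces to the pure scalar |x|²:

```python
def gp(a, b):
    """Clifford product in G_{0,2,0} on the basis [1, e1, e2, e1e2]
    (e1^2 = e2^2 = -1).  Under 1, i, j, k -> 1, e1, e2, e1e2 this is the
    Hamilton quaternion product."""
    a0, a1, a2, a3 = a
    b0, b1, b2, b3 = b
    return (a0*b0 - a1*b1 - a2*b2 - a3*b3,
            a0*b1 + a1*b0 + a2*b3 - a3*b2,
            a0*b2 - a1*b3 + a2*b0 + a3*b1,
            a0*b3 + a1*b2 - a2*b1 + a3*b0)

def conj(a):
    """Conjugation used in the learning rule: x0 - x1 e1 - x2 e2 - x3 e1e2."""
    return (a[0], -a[1], -a[2], -a[3])

x = (1.0, 2.0, -1.0, 0.5)
norm_sq = gp(x, conj(x))   # pure scalar |x|^2 = 6.25; all other parts vanish
```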
10.4.7 Simplification of the Learning Rule Using the Density Theorem

Given X and Y as compact subsets belonging to (G_{p,q})ⁿ and (G_{p,q})ᵐ, respectively, and considering Y_t : X → Y a continuous function, for any ε > 0 we are able to find some coefficients w₁, w₂, w₃, …, w_{N_j} ∈ R and some multivectors y₁, y₂, y₃, …, y_{N_j} ∈ G_{p,q} and θ₁, θ₂, θ₃, …, θ_{N_j} ∈ G_{p,q} so that the following inequality is valid:

$$ E(Y_t, Y_w) = \sup\left\{ \left| Y_t(x) - \sum_{j=1}^{N_j} w_j\, \mathbf{f}_j\!\left( \sum_{i=1}^{N_i} w_i x + \theta_i \right) \right| \;:\; x \in X \right\} < \varepsilon, \tag{10.24} $$

where f_j is the multivector activation function of Eq. 10.7. Here, the approximation is given by

$$ S = \sum_{j=1}^{N_j} w_j\, \mathbf{f}_j\!\left( \sum_{i=1}^{N_i} w_i x + \theta_i \right), \tag{10.25} $$

which is the subset of the class of functions C⁰(G_{p,q}) with the norm

$$ |Y_t| = \sup_{x \in X} |Y_t(x)|. \tag{10.26} $$

And finally, since Eq. 10.24 is true, we can say that S is dense in C⁰(G_{p,q}). The density theorem presented here is the generalization of the one used for the quaternionic MLP by Arena et al. [2].
The density theorem shows that the weights of the output layer for the training of geometric feedforward networks can be real values. Therefore, the training of the output layer can be simplified; that is, the output weight multivectors can be the scalars of the blades of grade k. This k-grade element of the multivector is selected by convenience (see Eq. 1.56).
10.4.8 Learning Using the Appropriate Geometric Algebras

The primary reason for processing signals within a geometric algebra framework is to have access to representations with a strong geometric character and to take advantage of the geometric product. It is important, however, to consider the type of geometric algebra that should be used for any specific problem. For some applications, the decision to use the model of a particular geometric algebra is straightforward. However, in other cases, without some a priori knowledge of the problem, it may be difficult to assess which model will provide the best results. If our pre-existing knowledge of the problem is limited, we must explore the various network topologies in different geometric algebras. This requires some orientation in the different geometric algebras that could be used. Since each geometric algebra is either isomorphic to a matrix algebra over R, C, or H, or simply the tensor product of these algebras, we must take great care in choosing the geometric algebras. Porteous [156] showed the isomorphisms

$$ \mathcal{G}_{p+1,q} = R_{p+1,q} \cong \mathcal{G}_{q+1,p} = R_{q+1,p}, \tag{10.27} $$

and presented the following expressions for completing the universal table of geometric algebras:

$$ \mathcal{G}_{p,q+4} = R_{p,q+4} \cong R_{p,q} \otimes R_{0,4}, \qquad R_{0,4} \cong \mathbb{H}(2), $$
$$ \mathcal{G}_{p,q+8} = R_{p,q+8} \cong R_{p,q} \otimes R_{0,8}, \qquad R_{0,8} \cong \mathbb{R}(16), \tag{10.28} $$

where ⊗ stands for the real tensor product of two algebras. Equation 10.28 is known as the periodicity theorem [156]. We can use Table 10.1, which presents the Clifford, or geometric, algebras up to dimension 16, to search for the appropriate geometric algebras. The entries of Table 10.1 correspond to the p and q of G_{p,q}, and each table element is isomorphic to the geometric algebra G_{p,q}.

Table 10.1 Clifford or geometric algebras up to dimension 16
p\q |  0      1      2      3      4      5
 0  |  R      C      H      ²H     H(2)   C(4)
 1  |  ²R     R(2)   C(2)   H(2)   ²H(2)  H(4)
 2  |  R(2)   ²R(2)  R(4)   C(4)   H(4)   ²H(4)
 3  |  C(2)   R(4)   ²R(4)  R(8)   C(8)   H(8)
 4  |  H(2)   C(4)   R(8)   ²R(8)  R(16)  C(16)

Here ²A denotes the direct sum A ⊕ A, and A(k) the algebra of k × k matrices over A.
Examples from this table are the geometric algebras R ≅ G_{0,0}, R_{0,1} ≅ C ≅ G_{0,1}, H ≅ G_{0,2}, and R_{1,1} ≅ ²R ≅ G_{1,1}; C(2) ≅ C ⊗ R(2) ≅ G_{3,0} ≅ G_{1,2} for the 3D space; and H(2) ≅ G_{1,3} for the 4D space.
10.5 Support Vector Machines in Geometric Algebra

The support vector (SV) machine developed by Vladimir N. Vapnik [190] applies optimization methods for learning. Using SV machines, we can generate a type of two-layer network and RBF networks, as well as networks with other kernels. Our idea is to generate neural networks by using SV machines in conjunction with geometric algebra, and thereby in the neural processing of multivectors. We will call our approach the support multivector machine (SMVM). We review SV machines briefly and then explain the SMVM. The SV machine maps the input space Rᵈ into a high-dimensional feature space H, given by Φ : Rᵈ → H, satisfying a kernel K(x_i, x_j) = Φ(x_i)·Φ(x_j) that fulfills Mercer's condition [190]. The SV machine constructs an optimal hyperplane in the feature space that divides the data into two clusters. SV machines build the mapping

$$ f(x) = \operatorname{sign}\left( \sum_{\text{support vectors}} y_i \alpha_i K(x_i, x) - b \right). \tag{10.29} $$
The coefficients α_i in the separable case (and analogously in the nonseparable case) are found by maximizing the functional based on Lagrange coefficients,

$$ W(\alpha) = \sum_{i=1}^{l} \alpha_i - \frac{1}{2} \sum_{i,j}^{l} \alpha_i \alpha_j\, y_i y_j\, K(x_i, x_j), \tag{10.30} $$

subject to the constraints $\sum_{i=1}^{l} \alpha_i y_i = 0$, where α_i ≥ 0, i = 1, 2, …, l. This functional coincides with the functional for finding the optimal hyperplane. Examples of SV machine kernels include

$$ K(x, x_i) = [(x \cdot x_i) + 1]^d \quad \text{(polynomial learning machines)}, \tag{10.31} $$
$$ K_\gamma(|x - x_i|) = \exp\{-\gamma |x - x_i|^2\} \quad \text{(radial basis function machines)}, \tag{10.32} $$
$$ K(x, x_i) = S\big( v (x \cdot x_i) + c \big) \quad \text{(two-layer neural networks)}. \tag{10.33} $$
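The three kernels of Eqs. 10.31–10.33 are straightforward to code. In the sketch below the sigmoid S is taken as tanh, and the parameter names (d, gamma, v, c) are our own:

```python
import math

def poly_kernel(x, xi, d=2):
    """Eq. 10.31: polynomial learning-machine kernel [(x.xi) + 1]^d."""
    return (sum(a * b for a, b in zip(x, xi)) + 1.0) ** d

def rbf_kernel(x, xi, gamma=1.0):
    """Eq. 10.32: radial basis function kernel exp(-gamma |x - xi|^2)."""
    return math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(x, xi)))

def sigmoid_kernel(x, xi, v=1.0, c=-1.0):
    """Eq. 10.33 with S = tanh; note this kernel satisfies Mercer's
    condition only for some choices of (v, c)."""
    return math.tanh(v * sum(a * b for a, b in zip(x, xi)) + c)

x = (1.0, 2.0)
k_poly = poly_kernel(x, x)   # (|x|^2 + 1)^2 = 36
k_rbf = rbf_kernel(x, x)     # exp(0) = 1
```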
10.6 Linear Clifford Support Vector Machines for Classification

For the case of the Clifford SVM for classification, we represent the data set in a certain Clifford algebra G_n, where n = p + q + r and where any multivector basis element squares to +1, −1, or 0 depending on whether it belongs to the p, q, or r basis vectors,
respectively. We consider the general case of an input comprising D multivectors and one multivector output; that is, each data ith-vector has D multivector entries x_i = [x_{i1}, x_{i2}, …, x_{iD}]^T, where x_{ij} ∈ G_n and D is its dimension. Thus, the ith-vector dimension is D × 2ⁿ, and each data ith-vector x_i ∈ G_n^D. This ith-vector will be associated with one output of the 2ⁿ possibilities given by the following multivector output:

$$ y_i = y_{is} + y_{ie_1} e_1 + y_{ie_2} e_2 + \cdots + y_{iI} I \;\in\; \{\pm 1 \pm e_1 \pm e_2 \cdots \pm I\}, $$

where the first subindex s stands for the scalar part. For example, for quaternions (or G_{0,2,0}) there are 2² = 4 outputs: y_i = y_{is} + y_{ie₂e₃} e₂e₃ + y_{ie₃e₁} e₃e₁ + y_{ie₁e₂} e₁e₂ ∈ {±1 ± e₂e₃ ± e₃e₁ ± e₁e₂}. For the classification, the CSVM separates these multivector-valued samples into 2ⁿ groups by selecting a good enough function from the set of functions

$$ f(x) = \tilde{w}^T x + b, \tag{10.34} $$

where x, w ∈ G_n^D and f(x), b ∈ G_n. An entry of the optimal hyperplane

$$ w = [w_1, w_2, \ldots, w_k, \ldots, w_D]^T \tag{10.35} $$

is given by

$$ w_k = w_{ks} + \cdots + w_{k e_1 e_2}\, e_1 e_2 + \cdots + w_{kI}\, I \;\in\; \mathcal{G}_n. \tag{10.36} $$
Let us look at this function equation in detail:

$$ f(x) = \tilde{w}^T x + b = [\tilde{w}_1, \tilde{w}_2, \ldots, \tilde{w}_D]^T [x_1, x_2, \ldots, x_D] + b = \sum_{i=1}^{D} \tilde{w}_i x_i + b, \tag{10.37} $$

where $\tilde{w}_i x_i$ corresponds to the Clifford product of two multivectors and $\tilde{w}_i$ is the reversion of the multivector w_i. Next, we introduce a structural risk similar to that of the real-valued SVM for classification. Using a loss function similar to Vapnik's ε-insensitive one, we utilize the following linear-constraint quadratic programming for the primal equation:

$$ \min\; L(w, b, \xi)_P = \frac{1}{2} \tilde{w}^T w + C \sum_{i,j} \xi_{ij}, \tag{10.38} $$

subject to

$$ y_{ij}\,(f(x_i))_j = y_{ij}\,(\tilde{w}^T x_i + b)_j \geq 1 - \xi_{ij}, \qquad \xi_{ij} \geq 0 \quad \text{for all } i, j, $$
where ξ_{ij} stands for the slack variables, i indicates the data ith-vector, and j indexes the multivector component; that is, j = 1 for the coefficient of the scalar part, j = 2 for the coefficient of e₁, …, j = 2ⁿ for the coefficient of I. By using Lagrange multiplier techniques [30, 60], we obtain the Wolfe dual programming of Eq. 10.38:

$$ \max\; L(w, b, \xi)_D = \sum_{i,j} \alpha_{ij} - \frac{1}{2} \tilde{w}^T w, \tag{10.39} $$

subject to a^T 1 = 0, where all the Lagrange multipliers for each entry w_k should fulfill 0 ≤ (α_{ks})_j ≤ C, 0 ≤ (α_{ke₁})_j ≤ C, …, 0 ≤ (α_{ke₁e₂})_j ≤ C, …, 0 ≤ (α_{kI})_j ≤ C for j = 1, …, l. In a^T 1 = 0, 1 denotes a vector of all ones, and the entries of the vector

$$ a = \big[ [a_{1s}, a_{1e_1}, a_{1e_2}, \ldots, a_{1e_1e_2}, \ldots, a_{1I}], \;\ldots,\; [a_{ks}, a_{ke_1}, a_{ke_2}, \ldots, a_{ke_1e_2}, \ldots, a_{kI}], \;\ldots,\; [a_{Ds}, a_{De_1}, a_{De_2}, \ldots, a_{De_1e_2}, \ldots, a_{DI}] \big] \tag{10.40} $$

are given by

$$ a_{ks}^T = [(\alpha_{ks})_1 (y_{ks})_1,\; (\alpha_{ks})_2 (y_{ks})_2,\; \ldots,\; (\alpha_{ks})_l (y_{ks})_l], $$
$$ a_{ke_1}^T = [(\alpha_{ke_1})_1 (y_{ke_1})_1,\; \ldots,\; (\alpha_{ke_1})_l (y_{ke_1})_l], \;\; \ldots, $$
$$ a_{kI}^T = [(\alpha_{kI})_1 (y_{kI})_1,\; (\alpha_{kI})_2 (y_{kI})_2,\; \ldots,\; (\alpha_{kI})_l (y_{kI})_l]; \tag{10.41} $$
note that the vector $\boldsymbol{a}^T$ has dimension $(D\,2^n\,l)\times 1$. Consider the optimal weight vector $\boldsymbol{w}$ in Eq. 10.39; an entry of it is given by $\boldsymbol{w}_k$, see Eq. 10.36. Each of the components of $\boldsymbol{w}_k$ is computed by applying the KKT conditions to the Lagrangian of Eq. 10.38 using $l$ multivector samples, as follows:
$$w_{ks} = \sum_{j=1}^{l}(\alpha_s)_j(y_s)_j(x_{ks})_j,\qquad w_{ke_1} = \sum_{j=1}^{l}(\alpha_{e_1})_j(y_{e_1})_j(x_{ke_1})_j,\ \ \ldots,\ \ w_{kI} = \sum_{j=1}^{l}(\alpha_I)_j(y_I)_j(x_{kI})_j, \tag{10.42}$$
where $(\alpha_s)_j, (\alpha_{e_1})_j, \ldots, (\alpha_I)_j$, $j = 1, \ldots, l$, are the Lagrange multipliers.
10.6 Linear Clifford Support Vector Machines for Classification
291
The threshold $\boldsymbol{b} \in \mathcal{G}_n$ can be computed by using the KKT conditions with the Clifford support vectors, as follows:
$$\boldsymbol{b} = b_s + b_{e_1}e_1 + \cdots + b_{e_1e_2}e_1e_2 + \cdots + b_I I = \sum_{j=1}^{l}\big(\boldsymbol{y}_j - \widetilde{\boldsymbol{w}}^T\boldsymbol{x}_j\big)\big/\,l. \tag{10.43}$$
However, it is desirable to formulate a compact representation of the ensuing Gram matrix involving multivector components; this will certainly help in the programming of the algorithm. For this, let us first consider the Clifford product $\widetilde{\boldsymbol{w}}^T\boldsymbol{w}$, which can be expressed as follows:
$$\widetilde{\boldsymbol{w}}^T\boldsymbol{w} = \langle\widetilde{\boldsymbol{w}}^T\boldsymbol{w}\rangle_s + \langle\widetilde{\boldsymbol{w}}^T\boldsymbol{w}\rangle_{e_1} + \langle\widetilde{\boldsymbol{w}}^T\boldsymbol{w}\rangle_{e_2} + \cdots + \langle\widetilde{\boldsymbol{w}}^T\boldsymbol{w}\rangle_I. \tag{10.44}$$
Since $\boldsymbol{w}$ has the components presented in (10.42), Eq. 10.44 can be rewritten as follows:
$$\widetilde{\boldsymbol{w}}^T\boldsymbol{w} = \boldsymbol{a}_s^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_s\boldsymbol{a}_s + \cdots + \boldsymbol{a}_s^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_{e_1e_2}\boldsymbol{a}_{e_1e_2} + \cdots + \boldsymbol{a}_s^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_I\boldsymbol{a}_I + \boldsymbol{a}_{e_1}^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_s\boldsymbol{a}_s + \cdots + \boldsymbol{a}_{e_1}^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_{e_1e_2}\boldsymbol{a}_{e_1e_2} + \cdots + \boldsymbol{a}_{e_1}^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_I\boldsymbol{a}_I + \cdots + \boldsymbol{a}_I^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_s\boldsymbol{a}_s + \boldsymbol{a}_I^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_{e_1}\boldsymbol{a}_{e_1} + \cdots + \boldsymbol{a}_I^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_{e_1e_2}\boldsymbol{a}_{e_1e_2} + \cdots + \boldsymbol{a}_I^T\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_I\boldsymbol{a}_I.$$
Renaming the matrices of the $t$-grade parts of $\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_t$, we rewrite the previous equation as
$$\widetilde{\boldsymbol{w}}^T\boldsymbol{w} = \boldsymbol{a}_s^T H_s\boldsymbol{a}_s + \boldsymbol{a}_s^T H_{e_1}\boldsymbol{a}_{e_1} + \boldsymbol{a}_s^T H_{e_1e_2}\boldsymbol{a}_{e_1e_2} + \cdots + \boldsymbol{a}_s^T H_I\boldsymbol{a}_I + \boldsymbol{a}_{e_1}^T H_s\boldsymbol{a}_s + \boldsymbol{a}_{e_1}^T H_{e_1}\boldsymbol{a}_{e_1} + \cdots + \boldsymbol{a}_{e_1}^T H_{e_1e_2}\boldsymbol{a}_{e_1e_2} + \cdots + \boldsymbol{a}_{e_1}^T H_I\boldsymbol{a}_I + \cdots + \boldsymbol{a}_I^T H_s\boldsymbol{a}_s + \boldsymbol{a}_I^T H_{e_1}\boldsymbol{a}_{e_1} + \cdots + \boldsymbol{a}_I^T H_{e_1e_2}\boldsymbol{a}_{e_1e_2} + \cdots + \boldsymbol{a}_I^T H_I\boldsymbol{a}_I. \tag{10.45}$$
We gather the submatrices of the $t$-grade parts of $\langle\widetilde{\boldsymbol{x}}^T\boldsymbol{x}\rangle_t$ in a positive semidefinite matrix $H$, which is the expected generalized Gram matrix:
$$H = \begin{bmatrix} H_s & H_{e_1} & H_{e_2} & \cdots & H_{e_1e_2} & \cdots & H_I\\ H_{e_1}^T & H_s & \cdots & H_{e_1e_2} & \cdots & H_I & \\ H_{e_2}^T & H_{e_1}^T & H_s & \cdots & H_{e_1e_2} & \cdots & \\ \vdots & & & \ddots & & & \vdots\\ H_I^T & \cdots & H_{e_1e_2}^T & \cdots & H_{e_2}^T & H_{e_1}^T & H_s \end{bmatrix}. \tag{10.46}$$
Note that the diagonal entries are equal to $H_s$ and, since $H$ is a symmetric matrix, the lower submatrices are transposed. Finally, using the previous definitions and equations, the Wolfe dual programming can be written as follows:
$$\max L(\boldsymbol{w}, \boldsymbol{b}, \xi)_D = \sum_{i,j}\alpha_{ij} - \frac{1}{2}\boldsymbol{a}^T H\boldsymbol{a}, \tag{10.47}$$
subject to $\boldsymbol{a}^T\boldsymbol{1} = 0$, $0 \leq (\alpha_{ks})_j \leq C$, $0 \leq (\alpha_{ke_1})_j \leq C, \ldots, 0 \leq (\alpha_{ke_1e_2})_j \leq C, \ldots, 0 \leq (\alpha_{kI})_j \leq C$ for $j = 1, \ldots, l$, where $\boldsymbol{a}$ is given by Eq. 10.40.
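To make the construction of the generalized Gram matrix concrete, the following is a small numeric sketch for the quaternion case $\mathcal{G}_{0,2,0}$ (an illustration, not the book's MAPLE code; the function names are ours). It computes the four component matrices $\langle\widetilde{\boldsymbol{x}}_m\boldsymbol{x}_j\rangle_t$ that fill the blocks of Eq. 10.46:

```python
import numpy as np

def qmul(p, q):
    # Hamilton product of quaternions stored as (s, i, j, k) coefficients
    s1, x1, y1, z1 = p
    s2, x2, y2, z2 = q
    return np.array([s1*s2 - x1*x2 - y1*y2 - z1*z2,
                     s1*x2 + x1*s2 + y1*z2 - z1*y2,
                     s1*y2 + y1*s2 + z1*x2 - x1*z2,
                     s1*z2 + z1*s2 + x1*y2 - y1*x2])

def qrev(q):
    # reversion in G(0,2): the bivector (imaginary) parts change sign
    return np.array([q[0], -q[1], -q[2], -q[3]])

def component_gram_matrices(X):
    """X: (l, 4) array of quaternion-valued samples.  Returns the four
    l x l matrices of the t-grade parts <x~_m x_j>_t (scalar, e2e3,
    e3e1, e1e2), i.e. the submatrices entering Eq. 10.46."""
    l = len(X)
    H = np.zeros((4, l, l))
    for m in range(l):
        for j in range(l):
            H[:, m, j] = qmul(qrev(X[m]), X[j])
    return H
```

As a sanity check, the scalar-part block equals the ordinary Gram matrix $XX^T$, and the bivector-part blocks are antisymmetric — which is why the mirrored blocks in Eq. 10.46 appear transposed.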
10.7 Nonlinear Clifford Support Vector Machines for Classification

For nonlinear Clifford-valued classification problems, we require a Clifford-algebra-valued kernel $K(\boldsymbol{x}, \boldsymbol{y})$. In order to fulfill the Mercer theorem, we resort to a component-wise Clifford-algebra-valued mapping
$$\boldsymbol{x} \in \mathcal{G}_n \rightarrow \Phi(\boldsymbol{x}) = \Phi_s(\boldsymbol{x}) + \Phi_{e_1}(\boldsymbol{x})e_1 + \Phi_{e_2}(\boldsymbol{x})e_2 + \cdots + I\,\Phi_I(\boldsymbol{x}) \in \mathcal{G}_n.$$
In general, we build a Clifford kernel $K(\boldsymbol{x}_m, \boldsymbol{x}_j)$ by taking the Clifford product between the conjugate of $\boldsymbol{x}_m$ and $\boldsymbol{x}_j$, as follows:
$$K(\boldsymbol{x}_m, \boldsymbol{x}_j) = \widetilde{\Phi}(\boldsymbol{x}_m)\Phi(\boldsymbol{x}_j). \tag{10.48}$$
Next, as an illustration, we present kernels using different geometric algebras. According to the Mercer theorem, there exists a mapping $u: \mathcal{G} \rightarrow \mathcal{F}$ that maps the multivectors $\boldsymbol{x} \in \mathcal{G}_n$ into the complex Euclidean space:
$$\boldsymbol{x} \rightarrow u(\boldsymbol{x}) = u_r(\boldsymbol{x}) + I\,u_I(\boldsymbol{x}). \tag{10.49}$$
Recall that the center of the geometric algebra, that is, $\{s, I = e_1e_2\}$, is isomorphic to $\mathbb{C}$:
$$K(\boldsymbol{x}_m, \boldsymbol{x}_n) = \widetilde{u}(\boldsymbol{x}_m)u(\boldsymbol{x}_n) = \big(u(\boldsymbol{x}_m)_s u(\boldsymbol{x}_n)_s + u(\boldsymbol{x}_m)_I u(\boldsymbol{x}_n)_I\big) + I\big(u(\boldsymbol{x}_m)_s u(\boldsymbol{x}_n)_I - u(\boldsymbol{x}_m)_I u(\boldsymbol{x}_n)_s\big) = \big(k(\boldsymbol{x}_m, \boldsymbol{x}_n)_{ss} + k(\boldsymbol{x}_m, \boldsymbol{x}_n)_{II}\big) + I\big(k(\boldsymbol{x}_m, \boldsymbol{x}_n)_{Is} - k(\boldsymbol{x}_m, \boldsymbol{x}_n)_{sI}\big) = H_r + I\,H_I. \tag{10.50}$$
For the quaternion-valued Gabor kernel function, we use $i = e_2e_3$, $j = e_3e_1$, $k = e_1e_2$. The Gaussian-window Gabor kernel function reads
$$K(\boldsymbol{x}_m, \boldsymbol{x}_n) = g(\boldsymbol{x}_m, \boldsymbol{x}_n)\,e^{i\,\boldsymbol{w}_0^T(\boldsymbol{x}_m - \boldsymbol{x}_n)}, \tag{10.51}$$
where the normalized Gaussian window function is given by
$$g(\boldsymbol{x}_m, \boldsymbol{x}_n) = \frac{1}{\sqrt{2\pi}\,\rho}\exp\left(-\frac{\|\boldsymbol{x}_m - \boldsymbol{x}_n\|^2}{2\rho^2}\right), \tag{10.52}$$
and the variables $\boldsymbol{w}_0$ and $\boldsymbol{x}_m - \boldsymbol{x}_n$ stand for the frequency and space domains, respectively. Unlike the Hartley transform or the 2D complex Fourier transform, this kernel function nicely separates the even and odd components of the given signal, that is,
$$K(\boldsymbol{x}_m, \boldsymbol{x}_n) = K(\boldsymbol{x}_m, \boldsymbol{x}_n)_s + K(\boldsymbol{x}_m, \boldsymbol{x}_n)_{e_2e_3} + K(\boldsymbol{x}_m, \boldsymbol{x}_n)_{e_3e_1} + K(\boldsymbol{x}_m, \boldsymbol{x}_n)_{e_1e_2}$$
$$= g(\boldsymbol{x}_m, \boldsymbol{x}_n)\cos(\boldsymbol{w}_0^T\boldsymbol{x}_m)\cos(\boldsymbol{w}_0^T\boldsymbol{x}_n) + g(\boldsymbol{x}_m, \boldsymbol{x}_n)\cos(\boldsymbol{w}_0^T\boldsymbol{x}_m)\sin(\boldsymbol{w}_0^T\boldsymbol{x}_n)\,i + g(\boldsymbol{x}_m, \boldsymbol{x}_n)\sin(\boldsymbol{w}_0^T\boldsymbol{x}_m)\cos(\boldsymbol{w}_0^T\boldsymbol{x}_n)\,j + g(\boldsymbol{x}_m, \boldsymbol{x}_n)\sin(\boldsymbol{w}_0^T\boldsymbol{x}_m)\sin(\boldsymbol{w}_0^T\boldsymbol{x}_n)\,k.$$
Since $g(\boldsymbol{x}_m, \boldsymbol{x}_n)$ fulfills Mercer's condition, it is straightforward to prove that the components $k(\boldsymbol{x}_m, \boldsymbol{x}_n)_u$ in the above equations satisfy these conditions as well. After defining these kernels, we can proceed with the formulation of the SVM conditions. We substitute the mapped data $\Phi_i(\boldsymbol{x}) = \sum_{u=1}^{n}\langle\Phi_i(\boldsymbol{x})\rangle_u$ into the linear function $f(\boldsymbol{x}) = \widetilde{\boldsymbol{w}}^T\Phi(\boldsymbol{x}) + \boldsymbol{b}$. The problem can be stated in a fashion similar to (10.43)–(10.47). In fact, we can replace the kernel function in (10.47) to accomplish the Wolfe dual programming and thereby obtain the kernel function group for nonlinear classification:
$$H_s = \big[K_s(\boldsymbol{x}_m, \boldsymbol{x}_j)\big]_{m,j=1,\ldots,l},\qquad H_{e_1} = \big[K_{e_1}(\boldsymbol{x}_m, \boldsymbol{x}_j)\big]_{m,j=1,\ldots,l},\qquad \vdots\qquad H_{e_n} = \big[K_{e_n}(\boldsymbol{x}_m, \boldsymbol{x}_j)\big]_{m,j=1,\ldots,l},\qquad H_I = \big[K_I(\boldsymbol{x}_m, \boldsymbol{x}_j)\big]_{m,j=1,\ldots,l}. \tag{10.53}$$
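A minimal numeric sketch of the quaternion Gabor kernel components follows (assuming the separable cos/sin form above; the function names are ours, not the book's):

```python
import numpy as np

def gauss_window(xm, xn, rho=1.0):
    # normalized Gaussian window of Eq. 10.52
    diff = np.asarray(xm, float) - np.asarray(xn, float)
    return np.exp(-diff @ diff / (2 * rho**2)) / (np.sqrt(2 * np.pi) * rho)

def quat_gabor_kernel(xm, xn, w0, rho=1.0):
    """Four components (s, e2e3, e3e1, e1e2) of the quaternion-valued
    Gabor kernel in its separable cos/sin form."""
    g = gauss_window(xm, xn, rho)
    a, b = np.dot(w0, xm), np.dot(w0, xn)
    return np.array([g * np.cos(a) * np.cos(b),   # scalar part
                     g * np.cos(a) * np.sin(b),   # e2e3 part
                     g * np.sin(a) * np.cos(b),   # e3e1 part
                     g * np.sin(a) * np.sin(b)])  # e1e2 part
```

Each component is a product of the Gaussian window (an RBF kernel) and a separable factor $f(\boldsymbol{x}_m)f(\boldsymbol{x}_n)$, so the scalar component yields a positive semidefinite Gram matrix, consistent with the Mercer argument above.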
In the same way, we can use the kernel functions to replace the scalar product of the input data in (10.46). Now, for the valency-state classification, one uses the output function of the nonlinear Clifford SVM given by
$$\boldsymbol{y} = \operatorname{csign}_m\big[f(\boldsymbol{x})\big] = \operatorname{csign}_m\big[\widetilde{\boldsymbol{w}}^T\Phi(\boldsymbol{x}) + \boldsymbol{b}\big], \tag{10.54}$$
where $m$ stands for the state valency.
10.8 Clifford SVM for Regression

The representation of the data set for the Clifford SVM for regression is the same as for the Clifford SVM for classification; we represent the data set in a certain Clifford algebra $\mathcal{G}_n$. Each data $i$th-vector has multivector entries $\boldsymbol{x}_i = [\boldsymbol{x}_{i1}, \boldsymbol{x}_{i2}, \ldots, \boldsymbol{x}_{iD}]^T$, where $\boldsymbol{x}_{ij} \in \mathcal{G}_n$ and $D$ is its dimension. Let $(\boldsymbol{x}_1, \boldsymbol{y}_1), (\boldsymbol{x}_2, \boldsymbol{y}_2), \ldots, (\boldsymbol{x}_j, \boldsymbol{y}_j), \ldots, (\boldsymbol{x}_l, \boldsymbol{y}_l)$ be the training set of independently and identically distributed multivector-valued sample pairs, where each label is given by $\boldsymbol{y}_i = y_{si} + y_{e_1i}e_1 + y_{e_2i}e_2 + \cdots + y_{Ii}I$. The regression problem using multivectors is to find a multivector-valued function $f(\boldsymbol{x})$ that has at most an $\varepsilon$-deviation from the actually obtained targets $\boldsymbol{y}_i \in \mathcal{G}_n$ for all the training data and, at the same time, is as smooth as possible. We will use a multivector-valued $\varepsilon$-insensitive loss function and arrive at the formulation of Vapnik [190]:
$$\min L(\boldsymbol{w}, \boldsymbol{b}, \xi) = \frac{1}{2}\widetilde{\boldsymbol{w}}^T\boldsymbol{w} + C\sum_{i,j}(\xi_{ij} + \bar\xi_{ij})$$
subject to
$$\big(\boldsymbol{y}_i - \widetilde{\boldsymbol{w}}^T\boldsymbol{x}_i - \boldsymbol{b}\big)_j \leq (\varepsilon + \xi)_j,\qquad \big(\widetilde{\boldsymbol{w}}^T\boldsymbol{x}_i + \boldsymbol{b} - \boldsymbol{y}_i\big)_j \leq (\varepsilon + \bar\xi)_j,\qquad \xi_{ij} \geq 0,\ \bar\xi_{ij} \geq 0 \quad\text{for all } i, j, \tag{10.55}$$
where $\boldsymbol{w}, \boldsymbol{x} \in \mathcal{G}_n^D$, and $(\cdot)_j$ extracts the scalar accompanying a multivector base. Next, we proceed as in Sect. 10.6. Since the expression for the orientation of the optimal hyperplane is similar to that of Eq. 10.35, the components of an entry $\boldsymbol{w}_k$ of the optimal hyperplane $\boldsymbol{w}$ are computed using $l$ multivector samples as follows:
$$w_{ks} = \sum_{j=1}^{l}\big((\alpha_{ks})_j - (\bar\alpha_{ks})_j\big)(x_{ks})_j,\qquad w_{ke_1} = \sum_{j=1}^{l}\big((\alpha_{ke_1})_j - (\bar\alpha_{ke_1})_j\big)(x_{ke_1})_j,\ \ \ldots,$$
$$w_{ke_2e_3} = \sum_{j=1}^{l}\big((\alpha_{ke_2e_3})_j - (\bar\alpha_{ke_2e_3})_j\big)(x_{ke_2e_3})_j,\ \ \ldots,\qquad w_{kI} = \sum_{j=1}^{l}\big((\alpha_{kI})_j - (\bar\alpha_{kI})_j\big)(x_{kI})_j. \tag{10.56}$$
We can now redefine the entries of the vector $\boldsymbol{a}$ of Eq. 10.40 for a vector of $D$ multivectors as follows:
$$\boldsymbol{a} = \big[[\hat a_{1s}, \hat a_{1e_1}, \hat a_{1e_2}, \ldots, \hat a_{1I}], \ldots, [\hat a_{ks}, \hat a_{ke_1}, \hat a_{ke_2}, \ldots, \hat a_{kI}], \ldots, [\hat a_{Ds}, \hat a_{De_1}, \hat a_{De_2}, \ldots, \hat a_{DI}]\big], \tag{10.57}$$
and the entries for the $k$th element are computed using $l$ samples as follows:
$$\hat a_{ks}^T = [(\alpha_{ks1} - \bar\alpha_{ks1}), (\alpha_{ks2} - \bar\alpha_{ks2}), \ldots, (\alpha_{ksl} - \bar\alpha_{ksl})],$$
$$\hat a_{ke_1}^T = [(\alpha_{ke_11} - \bar\alpha_{ke_11}), \ldots, (\alpha_{ke_1l} - \bar\alpha_{ke_1l})],$$
$$\vdots$$
$$\hat a_{kI}^T = [(\alpha_{kI1} - \bar\alpha_{kI1}), (\alpha_{kI2} - \bar\alpha_{kI2}), \ldots, (\alpha_{kIl} - \bar\alpha_{kIl})]. \tag{10.58}$$
Now, we can rewrite the Clifford product $\widetilde{\boldsymbol{w}}^T\boldsymbol{w}$, as we did in (10.44)–(10.45), and rewrite the primal problem as follows:
$$\min \frac{1}{2}\boldsymbol{a}^T H\boldsymbol{a} + C\sum_{i,j}(\xi_{ij} + \bar\xi_{ij})$$
subject to
$$\big(\boldsymbol{y} - \widetilde{\boldsymbol{w}}^T\boldsymbol{x} - \boldsymbol{b}\big)_j \leq (\varepsilon + \xi)_j,\qquad \big(\widetilde{\boldsymbol{w}}^T\boldsymbol{x} + \boldsymbol{b} - \boldsymbol{y}\big)_j \leq (\varepsilon + \bar\xi)_j,\qquad \xi_{ij} \geq 0,\ \bar\xi_{ij} \geq 0 \quad\text{for all } i, j. \tag{10.59}$$
Thereafter, we write straightforwardly the dual of (10.59) for solving the regression problem:
$$\max\ -\bar{\boldsymbol{\alpha}}^T(\varepsilon + \boldsymbol{y}) - \boldsymbol{\alpha}^T(\varepsilon - \boldsymbol{y}) - \frac{1}{2}\boldsymbol{a}^T H\boldsymbol{a}$$
subject to
$$\sum_{j=1}^{l}(\alpha_{sj} - \bar\alpha_{sj}) = 0,\qquad \sum_{j=1}^{l}(\alpha_{e_1j} - \bar\alpha_{e_1j}) = 0,\ \ \ldots,\qquad \sum_{j=1}^{l}(\alpha_{Ij} - \bar\alpha_{Ij}) = 0,$$
$$0 \leq (\alpha_s)_j \leq C,\ 0 \leq (\alpha_{e_1})_j \leq C,\ \ldots,\ 0 \leq (\alpha_{e_1e_2})_j \leq C,\ \ldots,\ 0 \leq (\alpha_I)_j \leq C,\quad j = 1, \ldots, l,$$
$$0 \leq (\bar\alpha_s)_j \leq C,\ 0 \leq (\bar\alpha_{e_1})_j \leq C,\ \ldots,\ 0 \leq (\bar\alpha_{e_1e_2})_j \leq C,\ \ldots,\ 0 \leq (\bar\alpha_I)_j \leq C,\quad j = 1, \ldots, l. \tag{10.60}$$
As explained in Sect. 10.7, for nonlinear regression we utilize a particular kernel for computing $k(\boldsymbol{x}_m, \boldsymbol{x}_n) = \widetilde{\Phi}(\boldsymbol{x}_m)\Phi(\boldsymbol{x}_n)$. We can use the kernels described in Sect. 10.7. By the use of other loss functions, such as the Laplace, complex, or polynomial ones, one can extend Eq. 10.60 to include extra constraints.
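The component-wise $\varepsilon$-insensitive loss underlying Eq. 10.55 can be sketched in a few lines (an illustration; the ordering of the $2^n$ blade coefficients in the arrays is our own convention):

```python
import numpy as np

def eps_insensitive_loss(y, f, eps):
    """Component-wise epsilon-insensitive loss over the multivector
    coefficients: zero inside the eps-tube, linear outside (Eq. 10.55).
    y, f: arrays of the 2^n blade coefficients of target and prediction."""
    r = np.abs(np.asarray(y, float) - np.asarray(f, float)) - eps
    return float(np.sum(np.maximum(r, 0.0)))
```

For example, a quaternion target $1$ predicted as $1.05 + 0.2\,e_2e_3$ with $\varepsilon = 0.1$ incurs loss only from the bivector coefficient ($0.2 - 0.1 = 0.1$); the scalar deviation $0.05$ stays inside the tube.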
10.9 Conclusion

According to the literature, there are basically two mathematical systems used in neural computing: tensor algebra and matrix algebra. In contrast, the author has chosen to use the coordinate-free system of Clifford, or geometric, algebra for the analysis and design of feedforward neural networks. Our work shows that real-, complex-, and quaternion-valued neural networks are simply particular cases of geometric algebra multidimensional neural networks, and that some can be generated using support multivector machines. Also, in this chapter the real-valued SVM is generalized to the Clifford-valued SVM and is used for classification, regression, and interpolation. In particular, the generation of RBF networks in geometric algebra is easier using an SVM, which allows one to find the optimal parameters automatically. The CSVM accepts multiple multivector inputs and multivector outputs, like a MIMO architecture, which allows for multiclass applications. We can use the CSVM over complex, quaternion, or hypercomplex numbers according to our needs. The use of feedforward neural networks and SV machines within the geometric algebra framework widens their sphere of applicability and, furthermore, expands our understanding of their use for multidimensional learning. The experimental analysis confirms the potential of geometric neural networks and Clifford-valued SVMs for a variety of real applications using multidimensional representations, such as in graphics, augmented reality, machine learning, computer vision, medical image processing, and robotics.
Part IV
Geometric Computing of Robot Kinematics and Dynamics
Chapter 11
Kinematics
11.1 Introduction

This chapter presents the formulation of robot manipulator kinematics within the geometric algebra framework. In this algebraic system, the 3D Euclidean motion of points, lines, and planes can be advantageously represented using the algebra of motors. The computational complexity of direct and indirect kinematics and other problems concerning robot manipulators are dependent on the robot's degrees of freedom as well as its geometric characteristics. Our approach makes possible a direct algebraic formulation of the concrete problem in such a way that it reflects the underlying geometric structure. This is achieved by describing parts of the problem based on motor representations of points, lines, planes, circles, and spheres where necessary. The chapter presents the formulation and computation of closed-form solutions of direct and indirect kinematics for standard robot manipulators and a simple example of a grasping task. The flexible method presented here widens the current standard point or line representation–based approaches for the treatment of problems related to robot manipulators. We present the computation of the inverse kinematics in motor algebra using points, lines, and planes, and as an extension we solve the same problem by using conformal geometric algebra, which also allows us to represent and use circles and spheres.
11.2 Elementary Transformations of Robot Manipulators

The study of the rigid motion of objects in 3D space plays an important role in robotics. In order to linearize the rigid motion of the Euclidean space, homogeneous coordinates are normally utilized. That is why, in the geometric algebra framework, we choose the special or degenerate geometric algebra to extend the algebraic system from the 3D Euclidean space to the 4D space. In this system, we can nicely model the motion of points, lines, and planes with computational advantages and geometric insight. Let us start with a description of the basic elements of robot manipulators in terms of the special or degenerate geometric algebra $\mathcal{G}^+_{3,0,1}$, or motor algebra.
The most basic parts of a robot manipulator are the revolute joints, prismatic joints, connecting links, and the end-effector. In the following sections, we will treat the kinematics of the prismatic and revolute manipulator parts using the 4D geometric algebra $\mathcal{G}^+_{3,0,1}$ and will illustrate an end-effector grasping task.
11.2.1 The Denavit–Hartenberg Parameterization

The computation of the direct or inverse kinematics requires both an exact description of the robot manipulator's structure and of its configuration. The descriptive approach used most often is known as the Denavit–Hartenberg (DH) procedure [48]. It is based on a uniform description of the position of the reference coordinate system of a particular joint relative to the next joint in consideration. Figure 11.3a shows how coordinate frames are attached to a joint of a robot manipulator. Table 11.1 presents the specifications of two robot manipulators, the SCARA and the Stanford manipulators, shown in Figs. 11.1 and 11.2, respectively; it tells us whether each joint is for rotation (revolute) or for translation (prismatic). The transformation of the reference coordinate system between two joints will be called a jointtransition. Figure 11.3b shows the screws involved in a jointtransition according to the Denavit–Hartenberg parameters. The frame, or reference coordinate system, related to the $i$th joint is attached at the end of this link and is denoted $F_i$. In Table 11.1, the entry "v" indicates that the involved parameter of the joint is variable, and the entry "c" indicates that it is constant. The position and orientation of the end-effector in relation to the reference coordinate system of the robot basis can be computed by linking all jointtransitions. In this way, we obtain the direct kinematics straightforwardly. Conversely, in the case of the inverse kinematics, given the position and orientation of the end-effector, we must find values of the variable parameters of the
Table 11.1 Kinematic configuration of two robotic manipulators

Robot type  Link  Revolute θ_i  v/c  Prismatic d_i  v/c  Twist angle α_i  Link length l_i
SCARA       1     θ_1           v    d_1            c    0                l_1
            2     θ_2           v    d_2            c    0                l_2
            3     θ_3           v    0                   0                0
            4     0                  d_4            v    0                0
Stanford    1     θ_1           v    d_1            c    90°              0
            2     θ_2           v    d_2            c    90°              0
            3     0                  d_3            v    0                0
            4     θ_4           v    0                   90°              0
            5     θ_5           v    0                   90°              0
            6     θ_6           v    d_6            c    0                0
Fig. 11.1 SCARA-type manipulator according to DH parameters given in Table 11.1 (variable parameters are circled)
jointtransitions that satisfy this requirement. In the following sections, we will provide more details for the computation of the direct and inverse kinematics of robot manipulators.
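For reference, a single jointtransition in the sequence $T_z(d_i)R_z(\theta_i)T_x(l_i)R_x(\alpha_i)$ can be written in the familiar homogeneous-matrix form — the matrix counterpart of the motor introduced in the next section (a sketch, not part of the book's MAPLE code; since $T_z$ and $R_z$ commute, this equals the classical DH matrix):

```python
import numpy as np

def dh_transform(theta, d, l, alpha):
    """Homogeneous matrix of one DH jointtransition
    T_z(d) R_z(theta) T_x(l) R_x(alpha)."""
    ct, st = np.cos(theta), np.sin(theta)
    ca, sa = np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, l * ct],
                     [st,  ct * ca, -ct * sa, l * st],
                     [0.,  sa,       ca,      d],
                     [0.,  0.,       0.,      1.]])
```

With all parameters zero the transition is the identity; with $\theta = 90°$, $d = 1$, $l = 2$ the frame origin maps to $(0, 2, 1)$, as one checks by following the screws in sequence.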
11.2.2 Representations of Prismatic and Revolute Transformations

The transformation of any point, line, or plane between the coordinate systems $F_{i-1}$ and $F_i$ is revolute when the involved degree of freedom is only a variable angle $\theta_i$, and it is prismatic when the degree of freedom is only a variable length $d_i$. The transformation motor ${}^{i-1}M_i$ between $F_i$ and $F_{i-1}$ consists of a sequence of two screw transformations, one fixed (i.e., $M_{x\hat{\alpha}_i}$) and another variable (i.e., $M_{z\hat{\theta}_i}$); see Fig. 11.3b. Note that we use dual angles (see [11]):
$$\hat{\theta}_i = \theta_i + I d_i, \tag{11.1}$$
$$\hat{\alpha}_i = \alpha_i + I l_i. \tag{11.2}$$
Fig. 11.2 Stanford-type manipulator according to DH parameters in Table 11.1 (variable parameters are encircled)
In the revolute case, the latter equation uses the angle $\theta_i$ as the variable parameter, and in the prismatic case the displacement $d_i$ is the variable parameter. The transformation reads
$${}^{i-1}M_i = M_{z\hat{\theta}_i}M_{x\hat{\alpha}_i} = T_{zd_i}R_{z\theta_i}T_{xl_i}R_{x\alpha_i} = \left(1 + \frac{I}{2}\begin{pmatrix}0\\0\\d_i\end{pmatrix}\right)R_{z\theta_i}\left(1 + \frac{I}{2}\begin{pmatrix}l_i\\0\\0\end{pmatrix}\right)R_{x\alpha_i}. \tag{11.3}$$
For the sake of clarity, the dual bivectors of the translators are given as column vectors, which makes the variable parameters explicit.
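Because $\mathcal{G}^+_{3,0,1}$ is isomorphic to the algebra of dual quaternions, the motor of Eq. 11.3 can be prototyped numerically as a dual quaternion. The sketch below assumes the standard dual-quaternion conventions (half-angle rotors, translator $T = 1 + \tfrac{\varepsilon}{2}\boldsymbol{t}$, and the combined quaternion-plus-dual conjugate on the right of the point transform); all names are ours:

```python
import numpy as np

def qmul(p, q):
    # Hamilton product of quaternions (s, x, y, z)
    s1, x1, y1, z1 = p
    s2, x2, y2, z2 = q
    return np.array([s1*s2 - x1*x2 - y1*y2 - z1*z2,
                     s1*x2 + x1*s2 + y1*z2 - z1*y2,
                     s1*y2 + y1*s2 + z1*x2 - x1*z2,
                     s1*z2 + z1*s2 + x1*y2 - y1*x2])

def qconj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

class Motor:
    """Motor modeled as a unit dual quaternion r + eps*d."""
    def __init__(self, r, d):
        self.r, self.d = np.asarray(r, float), np.asarray(d, float)
    def __mul__(self, o):
        return Motor(qmul(self.r, o.r),
                     qmul(self.r, o.d) + qmul(self.d, o.r))

def rotor_z(theta):
    return Motor([np.cos(theta/2), 0, 0, np.sin(theta/2)], [0, 0, 0, 0])

def rotor_x(alpha):
    return Motor([np.cos(alpha/2), np.sin(alpha/2), 0, 0], [0, 0, 0, 0])

def translator(t):
    # T = 1 + (eps/2) t
    return Motor([1, 0, 0, 0], [0, t[0]/2, t[1]/2, t[2]/2])

def joint_motor(theta, d, l, alpha):
    # M = T_z(d) R_z(theta) T_x(l) R_x(alpha), cf. Eq. 11.3
    return translator([0, 0, d]) * rotor_z(theta) * translator([l, 0, 0]) * rotor_x(alpha)

def transform_point(M, p):
    # X = 1 + eps*p  ->  X' = M X Mc, with Mc the combined
    # (quaternion + dual-number) conjugate of M
    X = Motor([1, 0, 0, 0], [0, *p])
    Mc = Motor(qconj(M.r), -qconj(M.d))
    Y = M * X * Mc
    return Y.d[1:]
```

For instance, the jointtransition with $\theta = 90°$, $d = 1$, $l = 2$, $\alpha = 0$ carries the origin to $(0, 2, 1)$: first the translation $l$ along $x$, then the rotation about $z$, then the translation $d$ along $z$.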
Fig. 11.3 Motor description of a manipulator joint: (a) $i$th joint of a robot manipulator and attached coordinate frames according to Denavit–Hartenberg; the encircled $\theta_i$ is the variable parameter; (b) transformation from frame $F_i$ to $F_{i-1}$ (motor ${}^{i-1}M_i$ consists of two screw transformations, $M_{x\hat{\alpha}_i}$ and $M_{z\hat{\theta}_i}$)
Since ${}^{i-1}M_i\,{}^{i-1}\widetilde{M}_i = 1$, we obtain
$${}^iM_{i-1} = \widetilde{M}_{x\hat{\alpha}_i}\widetilde{M}_{z\hat{\theta}_i} = \widetilde{R}_{x\alpha_i}\widetilde{T}_{xl_i}\widetilde{R}_{z\theta_i}\widetilde{T}_{zd_i}. \tag{11.4}$$
Be aware that, for the rest of the chapter, ${}^jM_i$ will denote a motor transformation from $F_i$ to $F_j$. We will now give general expressions for the transformation of points, lines, and planes using a single parameter, $\theta_i$ or $d_i$, as the variable and two fixed parameters, $\alpha_i$ and $l_i$. For the joint depicted in Fig. 11.3b, a revolute transformation will take place only when $\theta_i$ varies and a prismatic transformation only when $d_i$ varies. Now, taking a point $X$ represented in the frame $F_{i-1}$, we can describe its transformation from $F_{i-1}$ to $F_i$ using the motor algebra [11], with either $\theta_i$ or $d_i$ as the variable parameter. We will call this transformation a forward transformation. The multivector representation of point $X$ related to the frame $F_i$ will be expressed as ${}^iX$, with
$${}^iX = {}^iM_{i-1}\;{}^{i-1}X\;{}^i\overline{\widetilde{M}}_{i-1} = \widetilde{M}_{x\hat{\alpha}_i}\widetilde{M}_{z\hat{\theta}_i}\;{}^{i-1}X\;\overline{\widetilde{M}}_{z\hat{\theta}_i}\overline{\widetilde{M}}_{x\hat{\alpha}_i} = 1 + I\,{}^i\boldsymbol{x}, \tag{11.5}$$
where ${}^i\boldsymbol{x}$ is a bivector representing the 3D position of $X$ referred to $F_i$. A transformation in the reverse sense will be called a backward transformation. In this case, a point $X$ referenced in the frame $F_i$ is transformed to the frame $F_{i-1}$ as follows:
$${}^{i-1}X = {}^{i-1}M_i\;{}^iX\;{}^{i-1}\overline{\widetilde{M}}_i = M_{z\hat{\theta}_i}M_{x\hat{\alpha}_i}\;{}^iX\;\overline{\widetilde{M}}_{x\hat{\alpha}_i}\overline{\widetilde{M}}_{z\hat{\theta}_i} = 1 + I\,{}^{i-1}\boldsymbol{x}. \tag{11.6}$$
Note that the motor applied from the right side is not purely conjugated, as is the case with the line. The same will also be true for a plane [11]. Consider a line $L$ represented in the frame $F_{i-1}$ by ${}^{i-1}L = {}^{i-1}\boldsymbol{n} + I\,{}^{i-1}\boldsymbol{m}$, where $\boldsymbol{n}$ and $\boldsymbol{m}$ are bivectors indicating the orientation and the moment of the line, respectively. We can write the forward transformation of this line related to the frame $F_i$ according to [11]:
$${}^iL = {}^iM_{i-1}\;{}^{i-1}L\;{}^i\widetilde{M}_{i-1} = \widetilde{M}_{x\hat{\alpha}_i}\widetilde{M}_{z\hat{\theta}_i}\;{}^{i-1}L\;M_{z\hat{\theta}_i}M_{x\hat{\alpha}_i} = {}^i\boldsymbol{n} + I\,{}^i\boldsymbol{m}. \tag{11.7}$$
Its backward transformation reads
$${}^{i-1}L = {}^{i-1}M_i\;{}^iL\;{}^{i-1}\widetilde{M}_i = M_{z\hat{\theta}_i}M_{x\hat{\alpha}_i}\;{}^iL\;\widetilde{M}_{x\hat{\alpha}_i}\widetilde{M}_{z\hat{\theta}_i} = {}^{i-1}\boldsymbol{n} + I\,{}^{i-1}\boldsymbol{m}. \tag{11.8}$$
Finally, the forward transformation of a plane $H$ represented in $F_{i-1}$ reads
$${}^iH = {}^iM_{i-1}\;{}^{i-1}H\;{}^i\overline{\widetilde{M}}_{i-1} = {}^i\boldsymbol{n} + I\,{}^i d_H, \tag{11.9}$$
and, as above, its backward transformation equation is
$${}^{i-1}H = {}^{i-1}M_i\;{}^iH\;{}^{i-1}\overline{\widetilde{M}}_i = {}^{i-1}\boldsymbol{n} + I\,{}^{i-1}d_H. \tag{11.10}$$
11.2.3 Grasping by Using Constraint Equations

In this section we illustrate grasping as a manipulation-related task that involves the positioning of a two-finger grasper in front of a static object. Figure 11.4 shows the grasper and the considered object $O$. The manipulator must move the grasper near the object, and together they must fulfill some conditions in order to grasp the object firmly. To determine the overall transformation ${}^0M_n$ that moves the grasper into an appropriate grasping position, we require that ${}^0M_n$ fulfill the three constraints below. In order to formulate these constraints, we can take advantage of the point, line, and plane representations of the
Fig. 11.4 Two-finger grasper approaching an object
motor algebra. In the following explanation, we assume that the representations of the geometric entities attached to the object $O$ in frame $F_0$ are known.

Attitude Condition The grasping movement of the two fingers should lie in the reference plane $H_O$ of $O$; that is, the $yz$-plane of the end-effector frame $F_n$ should be equal to the reference plane $H_O$. The attitude condition can be simply formulated in terms of a plane equation, as follows:
$${}^0M_n\;{}^nH_{yz_n}\;{}^0\overline{\widetilde{M}}_n - {}^0H_O = 0, \tag{11.11}$$
where ${}^nH_{yz_n} = (1, 0, 0)^T + I\,0 = (1, 0, 0)^T$ (see Fig. 11.4).
Alignment Condition After the application of the motor ${}^0M_n$, the grasper and the object should be in a parallel alignment; that is, the direction of the $y$-axis and the line $L_O$ should be the same. The alignment condition can be simply expressed in terms of a line equation,
$$\big\langle {}^0M_n\;{}^nL_{y_n}\;{}^0\widetilde{M}_n\big\rangle_d - \big\langle {}^0L_O\big\rangle_d = 0, \tag{11.12}$$
where ${}^nL_{y_n} = (0, 1, 0)^T + I(0, 0, 0)^T = (0, 1, 0)^T$ and $\langle L\rangle_d$ denotes the direction component of the line $L$.

Touching Condition The motion ${}^0M_n$ should also guarantee that the grasper is in the correct grasping position; that is, the origin $P_{o_n}$ of the end-effector frame $F_n$ should touch the reference point $X_O$ of $O$. The formulation of this constraint in our framework is
$${}^0M_n\;{}^nP_{o_n}\;{}^0\overline{\widetilde{M}}_n - {}^0X_O = 0. \tag{11.13}$$
With these three conditions, we can obtain constraints on the components of ${}^0M_n$, and we can determine ${}^0M_n$ numerically. The next step is to determine the variable joint parameters of the robot manipulator that lead to the position and orientation of the end-effector frame $F_n$ described by ${}^0M_n$. This problem is called the inverse kinematics problem of robot manipulators and is treated in Sect. 11.4.
11.3 Direct Kinematics of Robot Manipulators

The determination of the direct kinematics involves the computation of the position and orientation of the end-effector, or frame $F_n$, given the parameters of the jointtransitions (see Fig. 11.5). In this section, we show how the direct kinematics can be computed when we use points, lines, or planes as geometric objects. The notation we use for points, lines, and planes is shown in Fig. 11.6. The direct kinematics for the general case of a manipulator with $n$ joints can be written as
$${}^0M_n = {}^0M_1\,{}^1M_2\,{}^2M_3\cdots{}^{n-1}M_n = \prod_{i=1}^{n}{}^{i-1}M_i. \tag{11.14}$$
We can formulate straightforwardly the direct kinematics in terms of point, line, or plane representations as follows:
$${}^0X = {}^0M_n\;{}^nX\;{}^0\overline{\widetilde{M}}_n = \prod_{i=1}^{n}{}^{i-1}M_i\;\;{}^nX\;\;\prod_{i=1}^{n}{}^{n-i}\overline{\widetilde{M}}_{n+1-i},$$
$${}^0L = \prod_{i=1}^{n}{}^{i-1}M_i\;\;{}^nL\;\;\prod_{i=1}^{n}{}^{n-i}\widetilde{M}_{n+1-i},$$
$${}^0H = \prod_{i=1}^{n}{}^{i-1}M_i\;\;{}^nH\;\;\prod_{i=1}^{n}{}^{n-i}\overline{\widetilde{M}}_{n+1-i}. \tag{11.15}$$
Fig. 11.5 Direct and inverse kinematics
Fig. 11.6 Notations for origin, coordinate axis, and coordinate planes for frame-specific entities
Now, let us write the motor ${}^0M_4$ for the direct kinematics for points, lines, and planes in the form of Eq. 11.15 for the SCARA manipulator specified by the Denavit–Hartenberg parameters of Table 11.1. First, using Eq. 11.14 with $n = 4$, we can easily express the required motor ${}^0M_4$ as follows:
$${}^0M_4 = {}^0M_1\,{}^1M_2\,{}^2M_3\,{}^3M_4 = \big(M_{z\hat{\theta}_1}M_{x\hat{\alpha}_1}\big)\cdots\big(M_{z\hat{\theta}_4}M_{x\hat{\alpha}_4}\big) = \big(T_{zd_1}R_{z\theta_1}T_{xl_1}R_{x\alpha_1}\big)\cdots\big(T_{zd_4}R_{z\theta_4}T_{xl_4}R_{x\alpha_4}\big)$$
$$= \left(1 + \frac{I}{2}\begin{pmatrix}0\\0\\d_1\end{pmatrix}\right)R_{z\theta_1}\left(1 + \frac{I}{2}\begin{pmatrix}l_1\\0\\0\end{pmatrix}\right)\left(1 + \frac{I}{2}\begin{pmatrix}0\\0\\d_2\end{pmatrix}\right)R_{z\theta_2}\left(1 + \frac{I}{2}\begin{pmatrix}l_2\\0\\0\end{pmatrix}\right)R_{z\theta_3}\left(1 + \frac{I}{2}\begin{pmatrix}0\\0\\d_4\end{pmatrix}\right). \tag{11.16}$$
Note that translators with zero translation and rotors with zero angle become 1. Applying the motor ${}^0M_4$ from the left and ${}^0\overline{\widetilde{M}}_4$ from the right for point and plane equations, and the motor ${}^0M_4$ from the left and ${}^0\widetilde{M}_4$ from the right for line equations, as indicated by Eq. 11.15, we derive the direct kinematics equations of points, lines, and planes for the SCARA robot manipulator.
11.3.1 MAPLE Program for Motor Algebra Computations

Since the nature of our approach requires symbolic computation, we chose the MAPLE computer package to develop a comfortable program for computations in the frame of different geometric algebras. To employ the program, we simply have to specify the vector basis of the selected geometric algebra. The program also employs a variety of useful algebraic operators to carry out computations involving reversion, Clifford conjugation, inner and wedge operations, rotations, translations, motors, extraction of the $i$-blade of a multivector, etc. Here, as a first illustration, we use the program for computations in the motor algebra framework $\mathcal{G}^+_{3,0,1}$ to compute the direct kinematics equation of the origin $P_{o_4}$ of $F_4$ for the SCARA manipulator specified by the Denavit–Hartenberg parameters of Table 11.1. Figure 11.7 shows the frames and the point $P_{o_4}$ referred to $F_0$. The final result of the computer calculation is
$${}^0P_{o_4} = {}^0M_4\;{}^4P_{o_4}\;{}^0\overline{\widetilde{M}}_4 = {}^0M_4\left(1 + I\begin{pmatrix}0\\0\\0\end{pmatrix}\right){}^0\overline{\widetilde{M}}_4 = 1 + I\begin{pmatrix}l_2\cos(\theta_1 + \theta_2) + l_1\cos(\theta_1)\\ l_2\sin(\theta_1 + \theta_2) + l_1\sin(\theta_1)\\ d_1 + d_2 + d_4\end{pmatrix}. \tag{11.17}$$
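The closed-form result of Eq. 11.17 can be cross-checked numerically by composing the four jointtransitions of Table 11.1 as homogeneous matrices (an independent sketch with our own function names; $T_z$ and $R_z$ commute, so the matrix below matches the screw order of Eq. 11.3):

```python
import numpy as np

def dh(theta, d, l, alpha):
    # T_z(d) R_z(theta) T_x(l) R_x(alpha) as a homogeneous matrix
    ct, st, ca, sa = np.cos(theta), np.sin(theta), np.cos(alpha), np.sin(alpha)
    return np.array([[ct, -st * ca,  st * sa, l * ct],
                     [st,  ct * ca, -ct * sa, l * st],
                     [0.,  sa,       ca,      d],
                     [0.,  0.,       0.,      1.]])

def scara_end_position(th1, th2, th3, d1, d2, d4, l1, l2):
    # SCARA chain of Table 11.1:
    # (th1,d1,l1,0) (th2,d2,l2,0) (th3,0,0,0) (0,d4,0,0)
    M = dh(th1, d1, l1, 0) @ dh(th2, d2, l2, 0) @ dh(th3, 0, 0, 0) @ dh(0, d4, 0, 0)
    return M[:3, 3]
```

The numeric position agrees with the symbolic MAPLE result: $x = l_2\cos(\theta_1+\theta_2) + l_1\cos\theta_1$, $y = l_2\sin(\theta_1+\theta_2) + l_1\sin\theta_1$, $z = d_1 + d_2 + d_4$, with $\theta_3$ affecting only the orientation.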
Fig. 11.7 Representation ${}^0P_{o_4}$ of $P_{o_4}$ in frame $F_0$ is computed using ${}^0M_4$
11.4 Inverse Kinematics of Robot Manipulators Using Motor Algebra

Since calculations for the inverse kinematics are more complex than those for the direct kinematics, our aim should be to find a systematic approach that exploits the point, line, and plane motor algebra representations. Unfortunately, the procedure for the computation of the inverse kinematics is not amenable to a general formulation as in the case of the direct kinematics equation (11.14). That is why we are compelled to choose a real robot manipulator and compute its inverse kinematics in order to show all the characteristics of the computational assumptions. The Stanford robot manipulator is well known among researchers concerned with the design of strategies for the symbolic computation of the inverse kinematics. According to Table 11.1, the variable parameters to be computed are $\theta_1$, $\theta_2$, $\theta_4$, $\theta_5$, $\theta_6$, and $d_3$. Using the Stanford robot manipulator, we will show that the motor algebra approach gives us the freedom to switch between the point, line, or plane representation according to the geometrical circumstances. This is one of the most important advantages of our motor algebra approach. Mechanically, the Stanford manipulator can be divided into two basic components. The first comprises joints 1, 2, and 3 and is dedicated to the task of general positioning. The second comprises joints 4, 5, and 6 and is dedicated to the wristlike orientation of the end-effector. Since the philosophy of our approach relies on the application of the point, line, or plane representation where it is needed, we must first evaluate, for any given case, which of these three representations is suitable for the jointtransitions. In this way, a better geometric insight is guaranteed and
the solution method is more easily developed. The first three joints of the Stanford manipulator are used to position the origin of the coordinate frame F3 . Therefore, we apply a point representation to describe this part of the problem. The last three joints are used to achieve the desired orientation of the end-effector frame. For the formulation of this subproblem, we use a line and a plane representation because these entities allow us to model orientations.
11.4.1 The Rendezvous Method

The next important step is to represent the motor transformations from the beginning of a chain of jointtransitions to their completion, and vice versa, as depicted in Fig. 11.8. As a result, we obtain a set of equations for each meeting point. At each of these points, the forward equation is equivalent to the backward equation. We use these equalities as a guide in the computation of the unknowns. We will call this procedure the rendezvous method. This simple idea has proved to be very useful as a strategy for the solution of the inverse kinematics.
11.4.2 Computing $\theta_1$, $\theta_2$, and $d_3$ Using a Point

In the case of the Stanford manipulator, the orientation and position of frame $F_6$ uniquely determine the position of frame $F_3$. An explanation of this relation follows.
Fig. 11.8 Rendezvous method: If ${}^iX$ and ${}^jX$ are known, we can compute ${}^kX$ for each $i \leq k \leq j$ in one of two different ways: (1) by successive forward transformations of ${}^iX$, and (2) by successive backward transformations of ${}^jX$
The position of frame $F_3$ with respect to $F_0$ is described by the multivector representation ${}^0P_{o_3}$ of $P_{o_3}$ in $F_0$. By a successive forward transformation applied on ${}^3P_{o_3} = 1$, we find the representation ${}^6P_{o_3}$ of $P_{o_3}$ in $F_6$:
$${}^6P_{o_3} = {}^6M_3\;{}^3P_{o_3}\;{}^6\overline{\widetilde{M}}_3 = 1 - I\begin{pmatrix}0\\0\\d_6\end{pmatrix}. \tag{11.18}$$
Now we can compute ${}^0P_{o_3}$ by
$${}^0P_{o_3} = {}^0M_6\;{}^6P_{o_3}\;{}^0\overline{\widetilde{M}}_6 = {}^0M_6\left(1 - I\begin{pmatrix}0\\0\\d_6\end{pmatrix}\right){}^0\overline{\widetilde{M}}_6 = 1 + I\begin{pmatrix}P_x\\P_y\\P_z\end{pmatrix}. \tag{11.19}$$
Note that ${}^0M_6$ is given. The vector $(P_x, P_y, P_z)^T$ describes the position of the origin $P_{o_3}$ of frame $F_3$ in frame $F_0$ for a given overall transformation ${}^0M_6$. Now we can apply the rendezvous method, since we know the representation of $P_{o_3}$ in the two different frames $F_0$ and $F_3$ (see Fig. 11.9).
Fig. 11.9 The rendezvous method applied to $P_{o_3}$ in order to determine the equations shown in Table 11.2 (the equations of the rendezvous frame $F_1$ are chosen to compute the variable parameters $\theta_1$, $\theta_2$, and $d_3$)
Table 11.2 Rendezvous equations obtained for $P_{o_3}$ regarding frames $F_0$, $F_1$, $F_2$, and $F_3$

Frame  Eq.  Forward                                               Backward
F_0    1    P_x                                                 = d_3 s_2 c_1 − d_2 s_1
       2    P_y                                                 = d_3 s_2 s_1 + d_2 c_1
       3    P_z                                                 = d_3 c_2 + d_1
F_1    1    P_y s_1 + P_x c_1                                   = d_3 s_2
       2    d_1 − P_z                                           = −d_3 c_2
       3    P_y c_1 − P_x s_1                                   = d_2
F_2    1    −P_z s_2 + d_1 s_2 + P_x c_1 c_2 + P_y s_1 c_2      = 0
       2    d_2 − P_y c_1 + P_x s_1                             = 0
       3    P_z c_2 − d_1 c_2 + P_x c_1 s_2 + P_y s_1 s_2       = d_3
F_3    1    −P_z s_2 + d_1 s_2 + P_x c_1 c_2 + P_y s_1 c_2      = 0
       2    d_2 − P_y c_1 + P_x s_1                             = 0
       3    P_z c_2 − d_1 c_2 + P_x c_1 s_2 + P_y s_1 s_2 − d_3 = 0
By applying successive forward transformations, we obtain
$${}^1P_{o_3} = {}^1M_0\;{}^0P_{o_3}\;{}^1\overline{\widetilde{M}}_0,\qquad {}^2P_{o_3} = {}^2M_1\;{}^1P_{o_3}\;{}^2\overline{\widetilde{M}}_1,\qquad {}^3P_{o_3} = {}^3M_2\;{}^2P_{o_3}\;{}^3\overline{\widetilde{M}}_2. \tag{11.20}$$
These computations were carried out with our MAPLE program, which calculated the left-hand sides of the four groups of equations in Table 11.2. On the other hand, by applying successive backward transformations to the origin of $F_3$, given by
$${}^3P_{o_3} = 1 + I\begin{pmatrix}0\\0\\0\end{pmatrix} = 1, \tag{11.21}$$
we get
$${}^2P_{o_3} = {}^2M_3\;{}^3P_{o_3}\;{}^2\overline{\widetilde{M}}_3 = 1 + I\begin{pmatrix}0\\0\\d_3\end{pmatrix},$$
$${}^1P_{o_3} = {}^1M_2\;{}^2P_{o_3}\;{}^1\overline{\widetilde{M}}_2 = 1 + I\begin{pmatrix}d_3\sin(\theta_2)\\-d_3\cos(\theta_2)\\d_2\end{pmatrix},$$
$${}^0P_{o_3} = {}^0M_1\;{}^1P_{o_3}\;{}^0\overline{\widetilde{M}}_1 = 1 + I\begin{pmatrix}d_3\sin(\theta_2)\cos(\theta_1) - d_2\sin(\theta_1)\\ d_3\sin(\theta_2)\sin(\theta_1) + d_2\cos(\theta_1)\\ d_3\cos(\theta_2) + d_1\end{pmatrix}. \tag{11.22}$$
These equations correspond to the right-hand sides of the four groups of equations in Table 11.2. For simplicity, we have used the abbreviation $s_i$ for $\sin(\theta_i)$ and $c_i$ for $\cos(\theta_i)$. Using the third equation of the rendezvous frame $F_1$, we compute
$$\theta_1 = \operatorname{arctan2}(x_{1/2}, y_{1/2}), \tag{11.23}$$
where
$$x_{1/2} = \frac{P_y\,y_{1/2} - d_2}{P_x},\qquad y_{1/2} = \frac{P_y d_2 \pm P_x\sqrt{P_x^2 + P_y^2 - d_2^2}}{P_x^2 + P_y^2}, \tag{11.24}$$
and
$$\operatorname{arctan2}(x, y) = \begin{cases}\arctan(x/y) & y > 0,\\ \pi/2 & y = 0 \text{ and } x > 0,\\ \text{undefined} & y = 0 \text{ and } x = 0,\\ -\pi/2 & y = 0 \text{ and } x < 0,\\ \arctan(x/y) + \pi & y < 0.\end{cases} \tag{11.25}$$
This gives two values for $\theta_1$. Now, let us look for $d_3$ and $\theta_2$. For that, we consider the first and second equations of the rendezvous frame $F_1$. With $a_{1/2} = P_y x_{1/2} + P_x y_{1/2}$ and $b = P_z - d_1$, we obtain two values for $d_3$. Since for the Stanford manipulator $d_3$ must be positive, we choose
$$d_{3_{1/2}} = \sqrt{a_{1/2}^2 + b^2}. \tag{11.26}$$
Using this value in Eqs. 11.21 and 11.22, we compute straightforwardly
$$\theta_2 = \operatorname{arctan2}\left(\frac{a_{1/2}}{d_{3_{1/2}}},\; \frac{b}{d_{3_{1/2}}}\right). \tag{11.27}$$
11.4.3 Computing $\theta_4$ and $\theta_5$ Using a Line

These variables will be computed using the jointtransition from $F_3$ to $F_6$. Given the geometric characteristics of the manipulator, an appealing option is to use the line representation to set up an appropriate equation system. The representation ${}^0L_{z_6}$ of the line $L_{z_6}$ in frame $F_0$ can be computed using ${}^0M_6$:
$${}^0L_{z_6} = {}^0M_6\;{}^6L_{z_6}\;{}^0\widetilde{M}_6 = {}^0M_6\left(\begin{pmatrix}0\\0\\1\end{pmatrix} + I\begin{pmatrix}0\\0\\0\end{pmatrix}\right){}^0\widetilde{M}_6. \tag{11.28}$$
11.4 Inverse Kinematics of Robot Manipulators Using Motor Algebra
Since the z-axis of the $F_6$ frame crosses the origin of $F_3$, we can see that the z-axis line related to this frame has zero moment. Thus, we can claim that $L^z_6$ in the $F_3$ frame is

${}^3L^z_6 = {}^3M_0\;{}^0L^z_6\;{}^3\widetilde{M}_0 = (A_x,\, A_y,\, A_z)^T + I\,(0,\,0,\,0)^T.$   (11.29)

Note that ${}^3M_0$ is known since we have already computed $\theta_1$, $\theta_2$, and $d_3$. Now, applying successive forward transformations, as follows,

${}^4L^z_6 = {}^4M_3\;{}^3L^z_6\;{}^4\widetilde{M}_3, \qquad {}^5L^z_6 = {}^5M_4\;{}^4L^z_6\;{}^5\widetilde{M}_4, \qquad {}^6L^z_6 = {}^6M_5\;{}^5L^z_6\;{}^6\widetilde{M}_5,$   (11.30)

we obtain the left-hand sides of the four groups of equations in Table 11.3. The z-axis line $L^z_6$ of $F_6$ represented in $F_6$ has zero moment, and can therefore be expressed as

${}^6L^z_6 = (0,\,0,\,1)^T + I\,(0,\,0,\,0)^T.$   (11.31)

Next, applying successive backward transformations, we obtain

${}^5L^z_6 = {}^5M_6\;{}^6L^z_6\;{}^5\widetilde{M}_6, \qquad {}^4L^z_6 = {}^4M_5\;{}^5L^z_6\;{}^4\widetilde{M}_5, \qquad {}^3L^z_6 = {}^3M_4\;{}^4L^z_6\;{}^3\widetilde{M}_4.$   (11.32)
Using our MAPLE program, we then compute the right-hand sides of the four groups of equations in Table 11.3.

Table 11.3 Rendezvous equations obtained for $L^z_6$ regarding frames $F_3$, $F_4$, $F_5$, and $F_6$ (in each row, the forward transformation yields the left-hand side and the backward transformation the right-hand side)

Frame $F_3$:
1: $A_x = c_4 s_5$
2: $A_y = s_4 s_5$
3: $A_z = c_5$

Frame $F_4$:
1: $A_y s_4 + A_x c_4 = s_5$
2: $A_z = c_5$
3: $A_y c_4 - A_x s_4 = 0$

Frame $F_5$:
1: $-A_z s_5 + A_x c_4 c_5 + A_y s_4 c_5 = 0$
2: $A_y c_4 - A_x s_4 = 0$
3: $A_z c_5 + A_x c_4 s_5 + A_y s_4 s_5 = 1$

Frame $F_6$:
1: $A_x s_4 s_6 - A_y c_4 s_6 + A_y s_4 c_5 c_6 + A_x c_4 c_5 c_6 - A_z s_5 c_6 = 0$
2: $-A_x s_4 c_6 + A_y c_4 c_6 + A_y s_4 c_5 s_6 + A_x c_4 c_5 s_6 - A_z s_5 s_6 = 0$
3: $A_z c_5 + A_x c_4 s_5 + A_y s_4 s_5 = 1$
Finally, we will consider the equations of rendezvous frame $F_4$. Using the third equations of Table 11.3, we compute

$\theta_4 = \arctan2(x_{1/2},\ y_{1/2}),$   (11.33)

where

$x_{1/2} = \dfrac{A_y\,y_{1/2}}{A_x} = \pm\dfrac{A_y}{\sqrt{A_x^2 + A_y^2}}, \qquad y_{1/2} = \pm\dfrac{A_x}{\sqrt{A_x^2 + A_y^2}}.$   (11.34)

This results in two values for $\theta_4$, which when substituted into the first and second equations of Table 11.3 help us to find two solutions for $\theta_5$:

$\theta_5 = \arctan2(s_5,\ c_5) = \arctan2(A_y s_4 + A_x c_4,\ A_z).$   (11.35)
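For the wrist direction of Table 11.3, $A = (c_4s_5,\, s_4s_5,\, c_5)^T$, the extraction of $\theta_4$ and $\theta_5$ can be checked by a numeric round trip. A hedged sketch (our own illustrative function, under the sign conventions reconstructed above; it returns the branch with positive $\sin\theta_5$):

```python
import math

def wrist_angles(Ax, Ay, Az):
    """Recover (theta4, theta5) from the z6-axis direction A in frame F3,
    following Eqs. (11.33)-(11.35); '+' branch of Eq. (11.34)."""
    r = math.hypot(Ax, Ay)
    theta4 = math.atan2(Ay / r, Ax / r)                    # Eq. (11.34)
    s5 = Ay * math.sin(theta4) + Ax * math.cos(theta4)     # first/second eqs.
    theta5 = math.atan2(s5, Az)                            # Eq. (11.35)
    return theta4, theta5

# round trip: build A from known joint angles, then recover them
t4, t5 = 0.7, 0.4
A = (math.cos(t4) * math.sin(t5),
     math.sin(t4) * math.sin(t5),
     math.cos(t5))
```

The second branch ($-$ sign in Eq. 11.34) gives $\theta_4 + \pi$ together with $-\sin\theta_5$, the usual wrist-flip solution.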
11.4.4 Computing θ6 Using a Plane Representation

Since $\theta_1$, $\theta_2$, $d_3$, $\theta_4$, and $\theta_5$ are now known, we can compute the motor ${}^5M_0$. The yz-plane $H^{yz}_6$ represented in $F_6$ has a Hesse distance of zero; thus,

${}^6H^{yz}_6 = (1,\,0,\,0)^T + I\,0 = (1,\,0,\,0)^T.$   (11.36)

The transformation of this plane to $F_0$ reads

${}^0H^{yz}_6 = {}^0M_6\;{}^6H^{yz}_6\;{}^0\widetilde{M}_6 = {}^0M_6\,(1,\,0,\,0)^T\,{}^0\widetilde{M}_6.$   (11.37)

Now we compute ${}^5H^{yz}_6$:

${}^5H^{yz}_6 = {}^5M_0\;{}^0H^{yz}_6\;{}^5\widetilde{M}_0 = (N_x,\, N_y,\, N_z)^T + I\,{}^5d_{H^{yz}_6}.$   (11.38)

The orientation bivector $(N_x, N_y, N_z)^T$ describes the orientation of the yz-plane of frame $F_6$ in frame $F_5$, given the values of the joint variables $\theta_1$, $\theta_2$, $\theta_4$, $\theta_5$, and $d_3$. By applying a forward transformation from $F_5$ to $F_6$, we obtain

${}^6H^{yz}_6 = {}^6M_5\;{}^5H^{yz}_6\;{}^6\widetilde{M}_5.$   (11.39)
Table 11.4 Rendezvous equations obtained for $H^{yz}_6$ regarding frames $F_5$ and $F_6$ (in each row, the forward transformation yields the left-hand side and the backward transformation the right-hand side)

Frame $F_5$:
1: $N_x = c_6$
2: $N_y = s_6$
3: $N_z = 0$

Frame $F_6$:
1: $N_y s_6 + N_x c_6 = 1$
2: $N_x s_6 - N_y c_6 = 0$
3: $N_z = 0$
Then, using our MAPLE program, we calculate the left-hand sides of the two groups of equations in Table 11.4. Since the values for $\theta_1$, $\theta_2$, $d_3$, $\theta_4$, and $\theta_5$ are not unique, we will obtain different values for the equations. By applying ${}^5M_6$ to ${}^6H^{yz}_6$, we next obtain the right-hand sides of the two groups of equations in Table 11.4:

${}^5H^{yz}_6 = {}^5M_6\;{}^6H^{yz}_6\;{}^5\widetilde{M}_6 = {}^5M_6\,(1,\,0,\,0)^T\,{}^5\widetilde{M}_6 = (\sin\theta_6,\ \cos\theta_6,\ 0)^T.$   (11.40)

Finally, we will consider the equations of the rendezvous frame $F_5$. Using the first and second equations in Table 11.4, we can compute $\theta_6$ by

$\theta_6 = \arctan2(s_6,\ c_6) = \arctan2(N_x,\ N_y).$   (11.41)
Note that since there are two values for $\theta_4$ and two values for $\theta_5$, there is more than one solution for $\theta_6$.
11.5 Inverse Kinematics Using the 3D Affine Plane

In this section we compute the inverse kinematics of a robot manipulator using the framework of the 3D affine plane (see Chap. 5). On the one hand, one can use a geometric algebra $G_{p,q,0}$ with a Minkowski metric for computations of projective geometry and the algebra of incidence, as is the case for computer vision problems (see Chap. 9). On the other hand, one can use a degenerate algebra for computations involving rigid motions, as is often the case in the field of robotics (see, e.g., our use of the motor algebra $G^+_{3,0,1}$ in Sect. 11.4 or the Clifford algebra $G_{3,0,1}$ in [169]). In a more general way, the 3D affine plane allows calculations that can involve both 3D rigid transformations and the meet and join operations of the algebra of incidence.
Note that in this section we assume that the projection $P_A$ back to the affine plane is always carried out, and thus we need not make it explicit in the formulas. In the procedure, after the equations to compute the angles $\theta_i$ have been found, simple dual trigonometric relations can then be used to determine the angles. In order to show the methodology at a symbolic level, we will avoid showing these trigonometric computation details. The transformation $M_t$ of a robot manipulator that takes the end-effector from its home position to a configuration determined by the n degrees of freedom of the joint angles $\theta_1, \theta_2, \ldots, \theta_n$ is given by

$M_t = M_1 M_2 M_3 \cdots M_n,$   (11.42)

where the screw versor of a joint, $M_i = T_i R_i$, depends on the angle $\theta_i$. The task of inverse kinematics is to calculate the angles $\theta_i$ for a given final configuration of the end-effector. Robot manipulators are typically equipped with parallel revolute axes as well as intersecting axes. The latter can be located at the end-effector or at the home position. Two typical configurations are illustrated in Fig. 11.10a,b. The mechanical characteristics of the robot manipulators can be used to simplify the computations – by considering either the invariant plane $\pi_h$, in the
Fig. 11.10 Robot manipulators: (a) (top) intersecting revoluted line axes at the end-effector; or (b) (bottom) at the home position
case of three parallel revolute line axes (see Fig. 11.10a), or the invariant point $p^h$, in the case of intersecting revolute line axes (see Fig. 11.10b). We can solve the inverse kinematics problem by breaking it up into a series of separate equations, using the strategy suggested by Selig [169] (see Chap. 11). We will illustrate the procedure for a robot with 6 DOF. First, we rearrange the terms of Eq. 11.42:

$M_2 M_3 M_4 = \widetilde{M}_1 M_t \widetilde{M}_6 \widetilde{M}_5.$   (11.43)

In the case of three parallel joints, we can isolate them by considering the common perpendicular plane $\pi_h$ that satisfies the equation

$\pi_h = M_2 M_3 M_4\;\pi_h\;\widetilde{M}_4\widetilde{M}_3\widetilde{M}_2 = \widetilde{M}_1 M_t \widetilde{M}_6(\widetilde{M}_5\,\pi_h\,M_5)M_6\widetilde{M}_t M_1.$   (11.44)

In the case of a meeting point $p^h$, we can isolate the three coincident joints as follows:

$p^h = M_2 M_3 M_4\;p^h\;\widetilde{M}_4\widetilde{M}_3\widetilde{M}_2 = \widetilde{M}_1 M_t \widetilde{M}_6\widetilde{M}_5\;p^h\;M_5 M_6\widetilde{M}_t M_1.$   (11.45)

In this way, we have separated the problem into two systems of equations:

$\pi_h = \widetilde{M}_1 M_t \widetilde{M}_6\widetilde{M}_5\;\pi_h\;M_5 M_6\widetilde{M}_t M_1$   (11.46)

or

$p^h = \widetilde{M}_1 M_t \widetilde{M}_6\widetilde{M}_5\;p^h\;M_5 M_6\widetilde{M}_t M_1,$   (11.47)
$M_2 M_3 M_4 = \widetilde{M}_1 M_t \widetilde{M}_6\widetilde{M}_5 = M'.$   (11.48)
We can now compute $\theta_1$, $\theta_5$, $\theta_6$ with the help of either Eq. 11.46 (see Fig. 11.10a) or Eq. 11.47 (see Fig. 11.10b). Then, by using these results and Eq. 11.48, we can solve for $\theta_2$, $\theta_3$, $\theta_4$. Let us demonstrate how the procedure works for the case of three intersecting revolute joint axes in the common plane at the end-effector (see Fig. 11.10a). When the plane $\pi_h$ (perpendicular to the line axes $l_2$, $l_3$, and $l_4$) is rotated about the end-joint, the point $p^h_i$ on the line axis of the revolute end-joint $l^h_6$ remains invariant. Using a meet operation and Eq. 11.46, the angle $\theta_6$ can be eliminated:

$p^h_i = (\widetilde{M}_1 M_t\;\pi_h\;\widetilde{M}_t M_1)\cap l^h_6 = (\widetilde{M}_6\widetilde{M}_5\;\pi_h\;M_5 M_6)\cap l^h_6 = (\widetilde{M}_5\;\pi_h\;M_5)\cap l^h_6.$   (11.49)
In the case of a robot manipulator in the home position, the revolute joint axes are at the manipulator basis. Equation 11.47 shows that the point $p^h$ is an invariant for the parallel fourth and fifth line axes. Thus, we can use the equation

$M_6\widetilde{M}_t\;p^h\;M_t\widetilde{M}_6 = \widetilde{M}_5\widetilde{M}_4\;p^h\;M_4 M_5$   (11.50)

to solve for the angles $\theta_4$ and $\theta_5$. Using the line $l_5$ and $p^h$, we get the invariant plane

$\pi^h_i = M_6\widetilde{M}_t\;p^h\;M_t\widetilde{M}_6 \wedge l_5.$   (11.51)

The 3D coordinates of this plane are known in advance and correspond to the x–z plane, or $e_{32}$; thus, Eq. 11.51 allows us to solve for the angle $\theta_6$. Having determined angle $\theta_6$, we can now use Eq. 11.50 to complete the calculation of $\theta_4$ and $\theta_5$. Now consider the three coincident line axes $l^h_1$, $l^h_2$, $l^h_3$ given in Fig. 11.10b. We can isolate the angle $\theta_2$ by considering the invariant relation based on the meet of two of these lines. Thus,

$(M_2\,l^h_3\,\widetilde{M}_2)\cap l^h_1 = (M'\,l^h_3\,\widetilde{M}')\cap l^h_1 = p^h_0,$   (11.52)

where $M' = M_1 M_2 M_3$ and $p^h_0$ is the invariant intersecting point. When the lines are parallel, as shown in Fig. 11.10a, we can use the same invariant relation, but now we consider the intersecting point to be at infinity, so that $M' = M_2 M_3 M_4$.
11.6 Inverse Kinematics Using Conformal Geometric Algebra

In this section, we briefly describe the procedural steps to compute the inverse kinematics of a 5-DOF robot arm using the conformal geometric algebra framework. Above, we presented similar computations using the motor algebra $G^+_{3,0,1}$ and the 3D affine plane; next we use the conformal geometric algebra $G_{4,1}$. Our purpose is that, by studying these examples, the reader learns to compute using geometric algebra and gains a better understanding of the potential and limitations of the different computational framework models of geometric algebra.

Objective: Find the joint angles that place the robot arm at the point $p_t$ in such a way that the gripper stays parallel to the plane $\pi_t$.

Step 1: Find the position of the point $p_2$:

$l^y = e_2 E$
$s_t = p_t - \dfrac{d_3^2}{2}\,e_\infty$
$z_t = s_t \wedge \pi_t$
$j^z = z_t \wedge e_\infty$
$l^d = d(p_t, l^y) \wedge p_t \wedge e_\infty$
$j_{l_d} = j^z \wedge (l^d E)$
$PP_2 = j_{l_d} \wedge z_t$
$p_2 = \dfrac{PP_2 + \sqrt{PP_2 \cdot PP_2}}{PP_2 \cdot e_\infty},$

where $d(p_t, l^y)$ stands for the directed distance between $p_t$ and $l^y$.

Step 2: Find the point $p_1$:

$\pi_1 = l^y \wedge p_2 = e_2E \wedge p_2$
$s_1 = e_0 - \dfrac{d_1^2}{2}\,e_\infty$
$s_2 = p_2 - \dfrac{d_2^2}{2}\,e_\infty$
$\pi_2 = e_3 I_c$
$PP_1 = s_1 \wedge s_2 \wedge \pi_1$
$p_1 = \dfrac{PP_1 + \sqrt{PP_1 \cdot PP_1}}{PP_1 \cdot e_\infty}.$

Step 3: Find the point $p_0$:

$\pi_1 = l^y \wedge p_2$
$s_0 = e_0 - \dfrac{d_0^2}{2}\,e_\infty$
[Figure: geometric construction for the 5-DOF arm, showing the spheres $s_1$ and $s_2$, the points $p_1$ and $p_2$, the lines $l_1$, $l_2$, and $l^y$, the planes $\pi_1$ and $\pi_2$, the axes $z_1$ and $z_2$, the distances $d_1$ and $d_2$, and the angles $\theta_2$ and $\theta_3$]
$\pi_0 = e_2 - h\,e_\infty$
$z_0 = s_0 \wedge \pi_0$
$PP_0 = z_0 \wedge \pi_1$
$p_0 = \dfrac{PP_0 + \sqrt{PP_0 \cdot PP_0}}{PP_0 \cdot e_\infty}.$

Step 4: Compute the lines $l_1$, $l_2$, and $l_3$:

$l_1 = p_0 \wedge p_1 \wedge e_\infty$
$l_2 = p_1 \wedge p_2 \wedge e_\infty$
$l_3 = p_2 \wedge p_t \wedge e_\infty.$   (11.53)

Step 5: Compute the angles:

$\cos\theta_1 = \dfrac{\pi_1 \cdot \pi_2}{|\pi_1||\pi_2|}, \qquad \cos\theta_2 = \dfrac{l_1 \cdot l^y}{|l_1||l^y|},$
$\cos\theta_3 = \dfrac{l_1 \cdot l_2}{|l_1||l_2|}, \qquad \cos\theta_4 = \dfrac{l_1 \cdot l_3}{|l_1||l_3|},$
$\cos\theta_5 = \dfrac{l_3 \cdot l_2}{|l_3||l_2|}.$
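Steps 1–3 repeatedly intersect spheres with other entities and then extract one point of the resulting point pair via $p = (PP + \sqrt{PP\cdot PP})/(PP\cdot e_\infty)$. The same two-solution structure can be mimicked in ordinary vector algebra by intersecting a sphere with a line; the following sketch is illustrative only (plain NumPy, not the conformal framework, and the function name is ours):

```python
import numpy as np

def sphere_line_points(center, radius, p0, u):
    """Intersect the sphere |x - center| = radius with the line p0 + t*u
    (u a unit vector). The two roots of the quadratic play the role of
    the two points encoded in a conformal point pair."""
    w = p0 - center
    b = np.dot(u, w)
    disc = b * b - (np.dot(w, w) - radius ** 2)
    if disc < 0:
        raise ValueError("line misses the sphere")
    t1, t2 = -b + np.sqrt(disc), -b - np.sqrt(disc)
    return p0 + t1 * u, p0 + t2 * u

# sphere of radius 1 centered at (1,0,0), line through the origin along x
q_a, q_b = sphere_line_points(np.array([1., 0., 0.]), 1.0,
                              np.zeros(3), np.array([1., 0., 0.]))
```

A negative discriminant corresponds to an imaginary point pair, i.e., an unreachable target for the given link lengths.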
Next, we present a procedure for computing the inverse kinematics of a pan-tilt unit (PTU).

Objective: Find the joint angles of the PTU that orient its principal ray toward the target point $p_t$.
Step 1: Find $p_2$:

$s_1 = p_1 - \dfrac{d_2^2}{2}\,e_\infty$
$D = d(p_1, p_t)$
$s_2 = p_1 - \dfrac{(D - d_2^2)}{2}\,e_\infty$
$l^y = p_0 \wedge p_1 \wedge e_\infty$
$\pi_1 = l^y \wedge p_t$
$PP_2 = s_1 \wedge s_2 \wedge \pi_1.$

Step 2: Compute the line and the plane:

$l_2 = p_1 \wedge p_2 \wedge e_\infty$
$\pi_2 = l^y \wedge e_3.$

Step 3: Compute the tilt and pan angles:

$\cos\theta_{tilt} = \dfrac{l^y \cdot l_2}{|l^y||l_2|}, \qquad \cos\theta_{pan} = \dfrac{\pi_1 \cdot \pi_2}{|\pi_1||\pi_2|},$

where $d(p_1, p_t)$ stands for the directed distance between $p_1$ and $p_t$.
11.7 Conclusion This chapter presented the application of the algebra of motors for the treatment of the direct and inverse kinematics of robot manipulators. When dealing with 3D rigid motion, it is usual to use homogeneous coordinates in the 4D space to linearize this nonlinear 3D transformation. With the same effect, we model the prismatic and revolute motion of points, lines, and planes using motors that are equivalent to screws. In our approach, we can also use the representation of planes, and this expands the useful geometric language for the treatment of robotic problems. In addition, we illustrate the use of the 3D affine plane for computing inverse kinematics, taking advantage of incidence algebra operations to simplify computations that use intersections and unions of points, lines, and planes under rigid motion. The chapter has shown the flexibility of the motor algebra approach for the solution of the direct and inverse kinematics of robot manipulators. To solve for the robot’s inverse kinematics, we have shown that, depending on our needs, we can use representations either of points, lines, or planes, which help to enormously reduce the complexity of the computation without losing geometric insight.
Similarly, the use of incidence algebra in the affine plane framework allows us to define geometric constraints, or flags, by taking advantage of the meet and join operations, thereby allowing easy identification of the most important geometric objects involved. For the computation of inverse kinematics, we have also used the conformal geometric algebra, which besides points, lines, and planes also uses circles and spheres; this broad set of geometric entities helps greatly to simplify the complex computation of the inverse kinematics. The main contribution of this chapter has been to show that our approach offers more flexibility while at the same time preserving geometric insight during computation. The author believes that the versatility of the geometric algebra framework offers an alternative approach to the increasingly complex applications of multilink mechanisms.
Chapter 12
Dynamics
12.1 Introduction

The study of the kinematics and dynamics of robot mechanisms has employed different frameworks, such as vector calculus, quaternion algebra, or linear algebra; the last is used most often. However, in these frameworks, handling kinematics and dynamics involving only points and lines is very complicated. In the previous chapter, the motor algebra was used to treat the kinematics of robot manipulators using points, lines, and planes. We also used the conformal geometric algebra, whose representation additionally includes circles and spheres. The use of additional geometric entities helps even more to reduce the representational and computational difficulties. In this chapter, we show that the mathematical treatment of dynamics indeed becomes much easier using the conformal geometric algebra framework.
12.2 Differential Kinematics

The direct kinematics for serial robot arms is a succession of motors and is valid for points, lines, planes, circles, and spheres:

$Q' = \prod_{i=1}^{n} M_i \; Q \; \prod_{i=1}^{n} \widetilde{M}_{n-i+1}.$   (12.1)

The direct kinematics equation (12.1) can be used for points as

$x'_p = \prod_{i=1}^{n} M_i \; x_p \; \prod_{i=1}^{n} \widetilde{M}_{n-i+1}.$   (12.2)

This equation can be used in conformal geometric algebra, using motors to represent 3D rigid transformations, similarly as with the motor algebra (see Sect. 3.6). Now we produce an expression for differential kinematics via the total differentiation of (12.2):
12 Dynamics
$dx'_p = \sum_{j=1}^{n} \partial_{q_j}\!\left(\prod_{i=1}^{n} M_i \; x_p \; \prod_{i=1}^{n} \widetilde{M}_{n-i+1}\right) dq_j.$   (12.3)
Each term of the sum is the product of two functions in $q_j$, and the differential can be written as

$dx'_p = \sum_{j=1}^{n}\left[\partial_{q_j}\!\left(\prod_{i=1}^{j}M_i\right)\prod_{i=j+1}^{n}M_i\;x_p\prod_{i=1}^{n}\widetilde{M}_{n-i+1} + \prod_{i=1}^{n}M_i\;x_p\prod_{i=1}^{n-j}\widetilde{M}_{n-i+1}\;\partial_{q_j}\!\left(\prod_{i=n-j+1}^{n}\widetilde{M}_{n-i+1}\right)\right]dq_j.$   (12.4)

Since $M = e^{-\frac{1}{2}qL}$, the differential of the motor is $d(M) = -\frac{1}{2}MLdq$; thus, we can write the partial differential of the motors' product as follows:

$\partial_{q_j}\!\left(\prod_{i=1}^{j}M_i\right) = -\frac{1}{2}\prod_{i=1}^{j}M_i\,L_j = -\frac{1}{2}\prod_{i=1}^{j-1}M_i\;L_jM_j.$   (12.5)

Similarly, the differential of $\widetilde{M} = e^{\frac{1}{2}qL}$ gives us $d(\widetilde{M}) = \frac{1}{2}\widetilde{M}Ldq$, and the differential of the product is

$\partial_{q_j}\!\left(\prod_{i=n-j+1}^{n}\widetilde{M}_{n-i+1}\right) = \frac{1}{2}\,\widetilde{M}_jL_j\prod_{i=n-j+2}^{n}\widetilde{M}_{n-i+1}.$   (12.6)

Replacing (12.5) and (12.6) in (12.4), we get

$dx'_p = \sum_{j=1}^{n}\left[-\frac{1}{2}\prod_{i=1}^{j-1}M_i\,L_jM_j\prod_{i=j+1}^{n}M_i\;x_p\prod_{i=1}^{n}\widetilde{M}_{n-i+1} + \frac{1}{2}\prod_{i=1}^{n}M_i\;x_p\prod_{i=1}^{n-j}\widetilde{M}_{n-i+1}\,\widetilde{M}_jL_j\prod_{i=1}^{j-1}\widetilde{M}_{j-i}\right]dq_j,$   (12.7)

which can be further simplified as

$dx'_p = -\frac{1}{2}\sum_{j=1}^{n}\prod_{i=1}^{j-1}M_i\left[L_j\left(\prod_{i=j}^{n}M_i\;x_p\prod_{i=1}^{n-j+1}\widetilde{M}_{n-i+1}\right) - \left(\prod_{i=j}^{n}M_i\;x_p\prod_{i=1}^{n-j+1}\widetilde{M}_{n-i+1}\right)L_j\right]\prod_{i=1}^{j-1}\widetilde{M}_{j-i}\;dq_j.$   (12.8)

Note that the product of a vector with an r-vector is given by (see [94])

$a \cdot B_r = \frac{1}{2}\left(aB_r + (-1)^{r+1}B_ra\right).$   (12.9)

Using Eq. 12.9, we can further simplify (12.8). As $L$ is a bivector and $x_p$ is a vector, we rewrite (12.8) as follows:

$dx'_p = \sum_{j=1}^{n}\left[\prod_{i=1}^{j-1}M_i\left(\left(\prod_{i=j}^{n}M_i\;x_p\prod_{i=1}^{n-j+1}\widetilde{M}_{n-i+1}\right)\cdot L_j\right)\prod_{i=1}^{j-1}\widetilde{M}_{j-i}\right]dq_j.$   (12.10)

As with points, all the transformations in conformal geometric algebra can also be applied to lines; thus,

$dx'_p = \sum_{j=1}^{n}\left[\left(\prod_{i=1}^{j-1}M_i\prod_{i=j}^{n}M_i\;x_p\prod_{i=1}^{n}\widetilde{M}_{n-i+1}\right)\cdot\left(\prod_{i=1}^{j-1}M_i\,L_j\prod_{i=1}^{j-1}\widetilde{M}_{j-i}\right)\right]dq_j.$   (12.11)

Since $\prod_{i=1}^{j-1}M_i\prod_{i=j}^{n}M_i = \prod_{i=1}^{n}M_i$, we have

$dx'_p = \sum_{j=1}^{n}\left[\left(\prod_{i=1}^{n}M_i\;x_p\prod_{i=1}^{n}\widetilde{M}_{n-i+1}\right)\cdot\left(\prod_{i=1}^{j-1}M_i\,L_j\prod_{i=1}^{j-1}\widetilde{M}_{j-i}\right)\right]dq_j.$   (12.12)

Recall Eq. 12.2 of direct kinematics; since $x'_p$ appears again in Eq. 12.12, we can replace (12.2) in (12.12) to get

$dx'_p = \sum_{j=1}^{n}\left[x'_p\cdot\left(\prod_{i=1}^{j-1}M_i\,L_j\prod_{i=1}^{j-1}\widetilde{M}_{j-i}\right)\right]dq_j.$   (12.13)

If we define $L'_j$ as a function of $L_j$ as follows:

$L'_j = \prod_{i=1}^{j-1}M_i\,L_j\prod_{i=1}^{j-1}\widetilde{M}_{j-i},$   (12.14)

we get a very compact expression of differential kinematics:

$dx'_p = \sum_{j=1}^{n}x'_p\cdot L'_j\;dq_j.$   (12.15)
In this way, we can finally write

$\dot{x}'_p = \left(x'_p\cdot L'_1 \;\; \cdots \;\; x'_p\cdot L'_n\right)\begin{pmatrix}\dot q_1\\ \vdots\\ \dot q_n\end{pmatrix}.$   (12.16)
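Equation (12.16) is a Jacobian relation: column j is the velocity contribution $x'_p\cdot L'_j$ of joint j at the current configuration. For revolute joints, that contribution corresponds in plain vector terms to $\omega_j\times(x'_p - p_j)$, where $\omega_j$ is the unit axis direction and $p_j$ a point on the axis. A minimal NumPy sketch under that assumption (illustrative names, not the book's code):

```python
import numpy as np

def point_jacobian(axes, pivots, x):
    """Columns J[:, j] = w_j x (x - p_j): the vector counterpart of
    x'_p . L'_j in Eq. (12.16), for revolute joints with unit axis w_j
    through point p_j, both already expressed in the base frame."""
    return np.stack([np.cross(w, x - p) for w, p in zip(axes, pivots)],
                    axis=1)

# planar 2R arm, both axes along z, unit link lengths
q1, q2 = 0.3, 0.5
axes = [np.array([0., 0., 1.]), np.array([0., 0., 1.])]
p1 = np.zeros(3)
p2 = np.array([np.cos(q1), np.sin(q1), 0.])                 # joint 2
x = p2 + np.array([np.cos(q1 + q2), np.sin(q1 + q2), 0.])   # end effector
J = point_jacobian(axes, [p1, p2], x)
```

Multiplying `J` by the joint-rate vector reproduces $\dot x'_p$ exactly as in (12.16).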
12.3 Dynamics

In this section, we describe the equations of kinetic and potential energy in terms of geometric algebra. Based on these equations and using the Lagrange equation, we synthesize the dynamic model of any n-degrees-of-freedom serial robot.
12.3.1 Kinetic Energy

We introduce the mass center into our analysis in order to formulate an expression that describes the kinetic energy of a system of particles.

Kinetic Energy of a System of Particles

We consider a system of n particles, as shown in Fig. 12.1. The total relative kinetic energy, K, of the system is given by

$K = \sum_{i=1}^{n}\frac{1}{2}m_iV_i^2.$   (12.17)

Fig. 12.1 System of particles with their mass center
Now we will rewrite Eq. 12.17 to introduce the mass center. Here $r_i$ represents the distance to the particle, $r_c$ the distance to the mass center, and $c_i$ the distance from the mass center to the particle:

$r_i = r_c + c_i.$   (12.18)

The time derivative of Eq. 12.18 is

$\dot r_i = \dot r_c + \dot c_i.$   (12.19)

Therefore, the velocity of the i-th particle, expressed through its velocity $\dot c_i$ relative to the mass center, is given by

$V_i = V_c + \dot c_i.$   (12.20)

By substitution of Eq. 12.20 in the expression of kinetic energy (12.17), we obtain

$K = \sum_{i=1}^{n}\frac{1}{2}m_i(V_c + \dot c_i)^2 = \sum_{i=1}^{n}\frac{1}{2}m_iV_c^2 + \sum_{i=1}^{n}m_iV_c\cdot\dot c_i + \frac{1}{2}\sum_{i=1}^{n}m_i\dot c_i^2.$   (12.21)

As $V_c$ is not related to the sum index i, we can extract it:

$K = \frac{1}{2}V_c^2\sum_{i=1}^{n}m_i + V_c\cdot\frac{d}{dt}\sum_{i=1}^{n}m_ic_i + \frac{1}{2}\sum_{i=1}^{n}m_i\dot c_i^2.$   (12.22)

As $m = \sum_{i=1}^{n}m_i$ is the total mass of the system, and considering that $\sum_{i=1}^{n}m_ic_i$ is by definition equal to zero,

$K = \frac{1}{2}mV_c^2 + \frac{1}{2}\sum_{i=1}^{n}m_i\dot c_i^2.$   (12.23)
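The decomposition (12.23) is easy to confirm numerically; a minimal sketch with made-up data (illustrative only):

```python
import numpy as np

rng = np.random.default_rng(0)
m = rng.uniform(1.0, 2.0, size=5)      # particle masses
v = rng.normal(size=(5, 3))            # particle velocities

# total kinetic energy computed directly, Eq. (12.17)
K_direct = 0.5 * np.sum(m * np.sum(v ** 2, axis=1))

# split into mass-center motion plus relative motion, Eq. (12.23)
M = m.sum()
Vc = (m[:, None] * v).sum(axis=0) / M  # mass-center velocity
vr = v - Vc                            # velocities relative to the center
K_split = 0.5 * M * (Vc @ Vc) + 0.5 * np.sum(m * np.sum(vr ** 2, axis=1))
```

The cross term of (12.21) vanishes because $\sum_i m_i\dot c_i = \sum_i m_i(V_i - V_c) = mV_c - mV_c = 0$.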
In conclusion, the kinetic energy with respect to a reference system can be considered as the sum of two parts: (1) the kinetic energy of the total mass moving with respect to this reference system at the velocity of the mass center, plus (2) the kinetic energy of the particles moving with respect to the mass center (moment of inertia).

The Kinetic Energy of a Robot Arm

We denote by $x_i$ the mass center i in its initial position and by $x'_i$ the mass center as a function of the joint variables (see Fig. 12.2). Similarly, we denote the joint axis i as $L_i$ and, as a function of the joint variables, as $L'_i$.
Fig. 12.2 Robot arm Adept Six 600
Recall the direct kinematics equation (12.1) that relates $x_i$ to $x'_i$ and $L_i$ to $L'_i$ and is written using conformal geometric algebra as

$x'_i = M_1M_2\cdots M_i\;x_i\;\widetilde{M}_i\cdots\widetilde{M}_2\widetilde{M}_1,$   (12.24)
$L'_i = M_1M_2\cdots M_{i-1}\;L_i\;\widetilde{M}_{i-1}\cdots\widetilde{M}_2\widetilde{M}_1.$   (12.25)

We have seen that the kinetic energy is equal to the sum of the energy related to the velocity of the mass center and the energy related to the moment of inertia. So the kinetic energy of link i is computed as

$K_i = \frac{1}{2}m_i\dot{x}_i'^2 + \frac{1}{2}I_i\left(\sum_{j=1}^{i}\dot q_j\right)^2,$   (12.26)

where $I_i$ is the inertia of link i and $\dot x'_i$ represents the velocity of the mass center $x'_i$. The velocity of the mass center is computed using the equation of differential kinematics (12.15), explained in Sect. 12.2:

$\dot x'_i = x'_i\cdot\left(\sum_{j=1}^{i}L'_j\dot q_j\right).$   (12.27)
Replacing Eq. 12.27 in (12.26), we have the expression of kinetic energy in conformal geometric algebra:

$K_i = \frac{1}{2}m_i\left[x'_i\cdot\left(\sum_{j=1}^{i}L'_j\dot q_j\right)\right]^2 + \frac{1}{2}I_i\left(\sum_{j=1}^{i}\dot q_j\right)^2.$   (12.28)

The total kinetic energy of the arm is given by the expression $\sum_{i=1}^{n}K_i$, where n is the number of degrees of freedom. In order to simplify the explanation, we separate the kinetic energy $K = K_v + K_I$ into two components, $K_v$ and $K_I$, defined as

$K_v = \frac{1}{2}\sum_{i=1}^{n}m_i\left[x'_i\cdot\left(\sum_{j=1}^{i}L'_j\dot q_j\right)\right]^2,$   (12.29)
$K_I = \frac{1}{2}\sum_{i=1}^{n}I_i\left(\sum_{j=1}^{i}\dot q_j\right)^2.$   (12.30)
We will attend first to $K_v$ and later to $K_I$. The objective is to simplify the expression of the total kinetic energy in the arm:

$K_v = \frac{1}{2}\sum_{i=1}^{n}m_i\left[x'_i\cdot\left(\sum_{j=1}^{i}L'_j\dot q_j\right)\right]^2.$   (12.31)

The square of the velocity's magnitude is equal to the dot product of the vector with itself:

$K_v = \frac{1}{2}\sum_{i=1}^{n}m_i\left[x'_i\cdot\left(\sum_{j=1}^{i}L'_j\dot q_j\right)\right]\cdot\left[x'_i\cdot\left(\sum_{j=1}^{i}L'_j\dot q_j\right)\right];$   (12.32)

introducing the term $x'_i$ into the sum,

$K_v = \frac{1}{2}\sum_{i=1}^{n}m_i\left(\sum_{j=1}^{i}(x'_i\cdot L'_j)\dot q_j\right)\cdot\left(\sum_{j=1}^{i}(x'_i\cdot L'_j)\dot q_j\right).$   (12.33)

Evaluating the sums for j from 1 to i gives

$K_v = \frac{1}{2}\sum_{i=1}^{n}m_i\left(x'_i\cdot L'_1\,\dot q_1 + \cdots + x'_i\cdot L'_i\,\dot q_i\right)\cdot\left(x'_i\cdot L'_1\,\dot q_1 + \cdots + x'_i\cdot L'_i\,\dot q_i\right).$   (12.34)

Evaluating the sum for i from 1 to n gives

$K_v = \frac{1}{2}\big[m_1(x'_1\cdot L'_1\dot q_1)\cdot(x'_1\cdot L'_1\dot q_1) + m_2(x'_2\cdot L'_1\dot q_1 + x'_2\cdot L'_2\dot q_2)\cdot(x'_2\cdot L'_1\dot q_1 + x'_2\cdot L'_2\dot q_2) + \cdots + m_n(x'_n\cdot L'_1\dot q_1 + \cdots + x'_n\cdot L'_n\dot q_n)\cdot(x'_n\cdot L'_1\dot q_1 + \cdots + x'_n\cdot L'_n\dot q_n)\big].$   (12.35)

Reorganizing the terms and extracting $\dot q$, we have

$2K_v = \big[m_1(x'_1\cdot L'_1)\cdot(x'_1\cdot L'_1) + \cdots + m_n(x'_n\cdot L'_1)\cdot(x'_n\cdot L'_1)\big]\dot q_1^2$
$\quad + \big[m_2(x'_2\cdot L'_2)\cdot(x'_2\cdot L'_2) + \cdots + m_n(x'_n\cdot L'_2)\cdot(x'_n\cdot L'_2)\big]\dot q_2^2$
$\quad + \cdots + m_n(x'_n\cdot L'_n)\cdot(x'_n\cdot L'_n)\dot q_n^2$
$\quad + 2\big[m_2(x'_2\cdot L'_1)\cdot(x'_2\cdot L'_2) + \cdots + m_n(x'_n\cdot L'_1)\cdot(x'_n\cdot L'_2)\big]\dot q_1\dot q_2$
$\quad + \cdots + 2m_n(x'_n\cdot L'_{n-1})\cdot(x'_n\cdot L'_n)\dot q_{n-1}\dot q_n.$   (12.36)

Thanks to this decomposition, it is easy to see the symmetry of the terms; they can be organized in matrix form to get a better compression of this equation:

$K_v = \frac{1}{2}\left(\dot q_1 \;\cdots\; \dot q_n\right)M_v\begin{pmatrix}\dot q_1\\ \vdots\\ \dot q_n\end{pmatrix},$   (12.37)

where $M_v$ is equal to

$M_v = \begin{pmatrix} \sum_{j=1}^{n}m_j(x'_j\cdot L'_1)\cdot(x'_j\cdot L'_1) & \cdots & \sum_{j=n}^{n}m_j(x'_j\cdot L'_1)\cdot(x'_j\cdot L'_n)\\ \vdots & \ddots & \vdots\\ \sum_{j=n}^{n}m_j(x'_j\cdot L'_n)\cdot(x'_j\cdot L'_1) & \cdots & \sum_{j=n}^{n}m_j(x'_j\cdot L'_n)\cdot(x'_j\cdot L'_n) \end{pmatrix},$   (12.38)

and each element $M_{v_{ij}}$ of the matrix $M_v$ is computed using

$M_{v_{ij}} = \sum_{k=\mathrm{Max}(i,j)}^{n}m_k(x'_k\cdot L'_i)\cdot(x'_k\cdot L'_j).$   (12.39)
Note that the symmetric matrix $M_v$ can be separated into the product of three matrices, two triangular and one diagonal:

$M_v = \begin{pmatrix} x'_1\cdot L'_1 & x'_2\cdot L'_1 & \cdots & x'_n\cdot L'_1\\ 0 & x'_2\cdot L'_2 & \cdots & x'_n\cdot L'_2\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & x'_n\cdot L'_n \end{pmatrix} \begin{pmatrix} m_1 & 0 & \cdots & 0\\ 0 & m_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & m_n \end{pmatrix} \begin{pmatrix} x'_1\cdot L'_1 & 0 & \cdots & 0\\ x'_2\cdot L'_1 & x'_2\cdot L'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ x'_n\cdot L'_1 & x'_n\cdot L'_2 & \cdots & x'_n\cdot L'_n \end{pmatrix}.$   (12.40)

Now we define the matrix m and the matrix V as

$m := \begin{pmatrix} m_1 & 0 & \cdots & 0\\ 0 & m_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & m_n \end{pmatrix},$   (12.41)

$V := \begin{pmatrix} x'_1\cdot L'_1 & 0 & \cdots & 0\\ x'_2\cdot L'_1 & x'_2\cdot L'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ x'_n\cdot L'_1 & x'_n\cdot L'_2 & \cdots & x'_n\cdot L'_n \end{pmatrix}.$   (12.42)

Then we can write Eq. 12.40 as follows:

$M_v = V^TmV.$   (12.43)

The elements of the matrix V are vectors and the elements of m are scalars, which means that the contribution of the kinetic energy produced by mass displacements with respect to the reference frame can be easily computed as

$K_v = \frac{1}{2}\dot q^TV^TmV\dot q.$   (12.44)
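The structure of V and Eq. (12.44) can be checked on a planar two-link arm, replacing the geometric-algebra entries $x'_i\cdot L'_j$ by their vector counterparts $\omega_j\times(x'_i - p_j)$ for revolute joints. The following is an illustrative sketch under that assumption, not the book's code:

```python
import numpy as np

# planar 2R arm: revolute axes along z at the origin and at joint 2
q, dq = np.array([0.3, 0.5]), np.array([0.7, -0.2])
l1 = l2 = 1.0
m = np.array([1.0, 2.0])
z = np.array([0., 0., 1.])
p = [np.zeros(3), l1 * np.array([np.cos(q[0]), np.sin(q[0]), 0.])]
x = [0.5 * p[1],                                      # mass centers
     p[1] + 0.5 * l2 * np.array([np.cos(q.sum()), np.sin(q.sum()), 0.])]

# V[i][j] = velocity contribution of joint j at mass center i (x'_i . L'_j)
V = [[np.cross(z, x[i] - p[j]) if j <= i else np.zeros(3)
      for j in range(2)] for i in range(2)]

# Eq. (12.39): Mv[i][j] = sum_k m_k (x'_k . L'_i).(x'_k . L'_j)
Mv = np.array([[sum(m[k] * (V[k][i] @ V[k][j]) for k in range(2))
                for j in range(2)] for i in range(2)])

# kinetic energy of the mass centers two ways: direct sum vs. Eq. (12.44)
v = [sum(V[i][j] * dq[j] for j in range(2)) for i in range(2)]
K_direct = 0.5 * sum(m[i] * (v[i] @ v[i]) for i in range(2))
K_quad = 0.5 * dq @ Mv @ dq
```

The lower-triangular zero pattern of V reflects the fact that joint j does not move the mass centers of the links preceding it.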
Now we follow a similar procedure for the component of the kinetic energy $K_I$:

$K_I = \frac{1}{2}\sum_{i=1}^{n}I_i\left(\sum_{j=1}^{i}\dot q_j\right)^2.$   (12.45)

Evaluating the sums for i and j from 1 to n, we get

$K_I = \frac{1}{2}\left[I_1\dot q_1^2 + I_2(\dot q_1 + \dot q_2)^2 + \cdots + I_n(\dot q_1 + \cdots + \dot q_n)^2\right].$   (12.46)

Expanding the expression, extracting $\dot q$, and writing in matrix form give us

$K_I = \frac{1}{2}\left(\dot q_1 \;\cdots\; \dot q_n\right)M_I\begin{pmatrix}\dot q_1\\ \vdots\\ \dot q_n\end{pmatrix},$   (12.47)

where

$M_I = \begin{pmatrix} \sum_{i=1}^{n}I_i & \sum_{i=2}^{n}I_i & \cdots & \sum_{i=n}^{n}I_i\\ \sum_{i=2}^{n}I_i & \sum_{i=2}^{n}I_i & \cdots & \sum_{i=n}^{n}I_i\\ \vdots & \vdots & \ddots & \vdots\\ \sum_{i=n}^{n}I_i & \sum_{i=n}^{n}I_i & \cdots & \sum_{i=n}^{n}I_i \end{pmatrix}.$   (12.48)

The matrix $M_I$ can be written as the product of two matrices $\delta$ and I if we define them as

$M_I = \delta I = \begin{pmatrix} 1 & 1 & \cdots & 1\\ 0 & 1 & \cdots & 1\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & 1 \end{pmatrix} \begin{pmatrix} I_1 & 0 & \cdots & 0\\ I_2 & I_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ I_n & I_n & \cdots & I_n \end{pmatrix}.$   (12.49)

In such a way, the component of the kinetic energy due to the movement of the links around their mass centers is given by

$K_I = \frac{1}{2}\dot q^T\delta I\dot q.$   (12.50)

In conclusion, we have an expression to compute the total kinetic energy of the serial robot using the axes of the robot and the mass centers in conformal geometric algebra:

$K = \frac{1}{2}\dot q^T(V^TmV + \delta I)\dot q.$   (12.51)

Note that this expression allows us to compute the kinetic energy without derivatives.
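The factorization $M_I = \delta I$ of Eq. (12.49) and its quadratic form (12.47) can be verified numerically; a small illustrative sketch (made-up inertias):

```python
import numpy as np

I = np.array([0.4, 0.3, 0.2, 0.1])            # link inertias I_1..I_n
n = len(I)

# entry (i,j) of M_I is the sum of I_k for k >= max(i,j), Eq. (12.48)
MI = np.array([[I[max(i, j):].sum() for j in range(n)] for i in range(n)])

delta = np.triu(np.ones((n, n)))              # upper-triangular ones
Imat = np.tril(np.ones((n, n))) * I[:, None]  # row k carries I_k up to col k

# quadratic form vs. the explicit sum of Eq. (12.46)
dq = np.array([0.5, -0.3, 0.2, 0.1])
K_I = 0.5 * sum(I[i] * dq[:i + 1].sum() ** 2 for i in range(n))
```

The identity holds because $(\delta I)_{ij} = \sum_{k\ge i} I_{kj} = \sum_{k\ge\max(i,j)} I_k$.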
12.3.2 Potential Energy

In contrast to the kinetic energy, the potential energy does not depend on the velocity, but on the position of each link of the serial robot. Thanks to the equations of direct kinematics (12.1), we can compute the position $x'_i$ of each link. In order to obtain the potential energy $U_i$, we compute the dot product of these points and the force applied to each point:

$U_i = x'_i\cdot F_i.$   (12.52)

Here the potential energy is due to conservative forces such as gravity; then $F_i = m_ig\,e_2$. The total potential energy of the system is equal to the sum of all $U_i$:

$U = \sum_{i=1}^{n}x'_i\cdot F_i.$   (12.53)
12.3.3 Lagrange's Equations

The dynamic equations of a robot can be computed based on Newton's equations, but the formulation becomes complicated when the number of degrees of freedom increases. For this reason, we use Lagrange's equations of movement. The Lagrangian £ is defined as the difference between the kinetic and potential energy of the system:

$£ = K - U.$   (12.54)

Lagrange's equations of movement are given by

$\dfrac{d}{dt}\dfrac{\partial £}{\partial\dot q} - \dfrac{\partial £}{\partial q} = \tau.$   (12.55)

We first compute the partial derivative of £ with respect to $\dot q$:

$\dfrac{\partial £}{\partial\dot q} = \dfrac{\partial K}{\partial\dot q} - \dfrac{\partial U}{\partial\dot q} = \dfrac{\partial K}{\partial\dot q}.$   (12.56)

Note that the partial derivative of U with respect to $\dot q$ is always zero, since U does not depend on the joint velocities $\dot q$. Replacing K from Eq. 12.51 in Eq. 12.56 gives

$\dfrac{\partial £}{\partial\dot q} = \dfrac{\partial}{\partial\dot q}\left(\dfrac{1}{2}\dot q^T(V^TmV + \delta I)\dot q\right) = (V^TmV + \delta I)\dot q.$   (12.57)
In order to simplify the notation, the matrix M is defined as $M = M_v + M_I = V^TmV + \delta I$, and Eq. 12.57 is now written as

$\dfrac{\partial £}{\partial\dot q} = M\dot q.$   (12.58)

On the other hand, the partial derivative of £ with respect to q is given by

$\dfrac{\partial £}{\partial q} = \dfrac{\partial K}{\partial q} - \dfrac{\partial U}{\partial q} = \dfrac{1}{2}\dot q^T\dfrac{\partial M}{\partial q}\dot q - \dfrac{\partial U}{\partial q}.$   (12.59)

Replacing Eqs. 12.58 and 12.59 in the Lagrange equation (12.55), we get

$\dfrac{d}{dt}[M\dot q] - \dfrac{1}{2}\dot q^T\dfrac{\partial M}{\partial q}\dot q + \dfrac{\partial U}{\partial q} = \tau.$   (12.60)

Carrying out the time derivative in Eq. 12.60 gives

$M\ddot q + \dot M\dot q - \dfrac{1}{2}\dot q^T\dfrac{\partial M}{\partial q}\dot q + \dfrac{\partial U}{\partial q} = \tau.$   (12.61)

In order to simplify the expression, we rename parts of the equation as follows:

$C = \dot M - \dfrac{1}{2}\dot q^T\dfrac{\partial M}{\partial q},$   (12.62)
$G = \dfrac{\partial U}{\partial q},$   (12.63)

where C is the Coriolis and centrifugal matrix and G is the vector of gravitational components. Therefore, we can write the dynamic equation for a serial robot with n degrees of freedom:

$M\ddot q + C\dot q + G = \tau.$   (12.64)
(12.64)
Now we analyze the G matrix, looking for an equation that allows us to get it without partial derivatives. Using Eq. 12.53, we write @U @ GD D @q @q
n X
! Fi
x 0i
:
(12.65)
i D1
Since the forces Fi D mi ge2 are produced by gravity, they do not depend on the joints positions q: GD
n X i D1
Fi
@ 0 xi : @q
(12.66)
12.3 Dynamics
337
Recalling the equation of differential kinematics (12.15), we know that 0 0 xi 0 B @ 0 Bx i xi D B @ @q
1 L01 L02 C C :: C : : A
(12.67)
x 0i L0i
Evaluating the sum of Eq. 12.66 from i D 1 to n and introducing in each term the evaluation of Eq. 12.67 give us 0
0 0 0 0 1 1 1 x 01 L01 x 2 L01 x n L01 B 0 C Bx 0 L0 C B x 0 L0 C 2C 2C B B 2 B n C G D B : C F1 C B : C F2 C C B : C Fn : @ :: A @ :: A @ :: A x 0n L0n
0
0
(12.68)
Equation 12.68 is written as a matrix: 0
x 01 L01 B 0 B GDB :: @ :
x 02 L01 x 02 L02 :: :
:: :
0
0
10 1 F1 x 0n L01 BF2 C x 0n L02 C CB C CB : C: :: A @ :: A :
x 0n L0n
(12.69)
Fn
As one can see, this matrix is basically the transpose of the matrix V given in Eq. 12.42. Calling F the vector with components $F_i$, we can finally write the equation

$G = V^TF.$   (12.70)

Furthermore, F is given by the product of two matrices:

$F = \begin{pmatrix} m_1 & 0 & \cdots & 0\\ 0 & m_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & m_n \end{pmatrix}\begin{pmatrix} g\,e_2\\ g\,e_2\\ \vdots\\ g\,e_2 \end{pmatrix} = ma,$   (12.71)

where a is a vector of accelerations. Equation 12.71 allows us to separate the constant matrices from the variables of the serial robot:

$G = V^Tma.$   (12.72)
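Equation (12.72) states that the gravity vector assembled from the columns of V is the gradient of the potential energy. On a planar two-link arm this can be checked against a finite-difference gradient; an illustrative sketch, using the same vectorial stand-in for $x'_i\cdot L'_j$ as before and assuming gravity acts along $-e_2$, so that $U(q)=\sum_i m_i g\,y_i(q)$:

```python
import numpy as np

g, m = 9.81, np.array([1.0, 2.0])
l1 = l2 = 1.0
z = np.array([0., 0., 1.])

def centers(q):
    """Mass centers and joint pivots of a planar 2R arm."""
    p2 = l1 * np.array([np.cos(q[0]), np.sin(q[0]), 0.])
    c1 = 0.5 * p2
    c2 = p2 + 0.5 * l2 * np.array([np.cos(q.sum()), np.sin(q.sum()), 0.])
    return [c1, c2], [np.zeros(3), p2]

def U(q):
    """Potential energy sum of m_i g y_i."""
    xs, _ = centers(q)
    return sum(m[i] * g * xs[i][1] for i in range(2))

q = np.array([0.3, 0.5])
xs, ps = centers(q)
V = [[np.cross(z, xs[i] - ps[j]) if j <= i else np.zeros(3)
      for j in range(2)] for i in range(2)]
# component j of G = V^T F picks up the e2 part of each column entry
G = np.array([sum(m[i] * g * V[i][j][1] for i in range(2))
              for j in range(2)])
```

Here the derivative $\partial x'_i/\partial q_j$ of Eq. (12.67) is realized as the vector $\omega_j\times(x'_i-p_j)$, whose $e_2$ component is all that gravity sees.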
Finally, we have a short and useful equation to compute the vector G using the information of the joints' axes. Now we analyze the Coriolis matrix C. In fact, there are many ways to compute this matrix, and there are many matrices C that satisfy the dynamic equation (12.64). Although we already have an equation to compute the matrix C (12.62), we will look for a simpler equation that avoids derivatives. Based on the properties of the matrices M and C, we know that

$\dot M = C + C^T.$   (12.73)

On the other hand, we know

$M = V^TmV + \delta I.$   (12.74)

Computing the time derivative of M produces

$\dot M = \dfrac{d}{dt}M = \dfrac{d}{dt}\left(V^TmV + \delta I\right),$   (12.75)
$\dot M = \dfrac{d}{dt}\left(V^TmV\right),$   (12.76)
$\dot M = V^Tm\dot V + \dot V^TmV,$   (12.77)
$\dot M = V^Tm\dot V + (V^Tm\dot V)^T.$   (12.78)

Taking into account Eqs. 12.73 and 12.78, we have a short and clear equation to compute the matrix C without derivatives:

$C = V^Tm\dot V.$   (12.79)
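The property $\dot M = C + C^T$ with $C = V^Tm\dot V$ can be confirmed numerically by differentiating along the joint-velocity direction; a sketch on the planar two-link arm (illustrative, with the vectorial stand-in for $x'_i\cdot L'_j$ and finite differences replacing the symbolic $\dot V$):

```python
import numpy as np

m = np.array([1.0, 2.0])
l1 = l2 = 1.0
z = np.array([0., 0., 1.])

def V_of(q):
    """V matrix (shape 2x2x3) of velocity-contribution vectors."""
    p = [np.zeros(3), l1 * np.array([np.cos(q[0]), np.sin(q[0]), 0.])]
    x = [0.5 * p[1],
         p[1] + 0.5 * l2 * np.array([np.cos(q.sum()), np.sin(q.sum()), 0.])]
    return np.array([[np.cross(z, x[i] - p[j]) if j <= i else np.zeros(3)
                      for j in range(2)] for i in range(2)])

def Mv_of(q):
    V = V_of(q)
    return np.einsum('k,kia,kja->ij', m, V, V)   # Eq. (12.43)

q, dq, h = np.array([0.3, 0.5]), np.array([0.7, -0.2]), 1e-5
V = V_of(q)
Vdot = (V_of(q + h * dq) - V_of(q - h * dq)) / (2 * h)   # dV/dt along qdot
C = np.einsum('k,kia,kja->ij', m, V, Vdot)               # C = V^T m Vdot
Mdot = (Mv_of(q + h * dq) - Mv_of(q - h * dq)) / (2 * h)
```

Note that $C = V^Tm\dot V$ is only one valid choice of Coriolis matrix; any C satisfying (12.73) yields the same torque $C\dot q$ in (12.64).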
The last statement holds because we can compute the matrix $\dot V$ without derivatives, just as a function of the joint values q, $\dot q$, and the axes of the robot. In order to compute the element $\dot V_{ij}$, the time derivative of $x'_i\cdot L'_j$ is needed; using Eq. 12.27,

$\dot V_{ij} = \dfrac{d}{dt}(x'_i\cdot L'_j) = \dot x'_i\cdot L'_j + x'_i\cdot\dot L'_j,$   (12.80)

$\dot V_{ij} = \sum_{k=1}^{i}\left((x'_i\cdot L'_k)\cdot L'_j\right)\dot q_k + \dfrac{1}{2}\sum_{k=1}^{j-1}x'_i\cdot\left(L'_jL'_k - L'_kL'_j\right)\dot q_k.$   (12.81)

Note that $\dot V_{ij} = 0$ whenever $j > i$, because $V_{ij} = 0$ identically. These equations for $\dot V$ may seem confusing, so we rewrite them as matrices to give a clearer explanation of the method to compute $\dot V$. It is possible to write the matrix V as the product of two matrices:

$V = \begin{pmatrix} x'_1 & 0 & \cdots & 0\\ 0 & x'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & x'_n \end{pmatrix}\begin{pmatrix} L'_1 & 0 & \cdots & 0\\ L'_1 & L'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ L'_1 & L'_2 & \cdots & L'_n \end{pmatrix} = XL;$   (12.82)
then

$\dot V = \dot XL + X\dot L,$   (12.83)

with

$\dot X = \begin{pmatrix} \dot x'_1 & 0 & \cdots & 0\\ 0 & \dot x'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & \dot x'_n \end{pmatrix}.$   (12.84)

Computing $\dot x'_i$ is simple using the equation of differential kinematics (12.15):

$\begin{pmatrix}\dot x'_1\\ \vdots\\ \dot x'_n\end{pmatrix} = XL\dot q = V\dot q.$   (12.85)

On the other hand,

$\dot L = \begin{pmatrix} \dot L'_1 & 0 & \cdots & 0\\ \dot L'_1 & \dot L'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ \dot L'_1 & \dot L'_2 & \cdots & \dot L'_n \end{pmatrix}.$   (12.86)

To compute $\dot L'_i$, which represents the velocity of axis i produced by the rotation around the previous axes, we can do the following:

$\begin{pmatrix}\dot L'_1\\ \dot L'_2\\ \vdots\\ \dot L'_n\end{pmatrix} = \dfrac{1}{2}\left[\begin{pmatrix} L'_1L'_1 & 0 & \cdots & 0\\ L'_1L'_2 & L'_2L'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ L'_1L'_n & L'_2L'_n & \cdots & L'_nL'_n \end{pmatrix} - \begin{pmatrix} L'_1L'_1 & 0 & \cdots & 0\\ L'_2L'_1 & L'_2L'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ L'_nL'_1 & L'_nL'_2 & \cdots & L'_nL'_n \end{pmatrix}\right]\dot q.$   (12.87)
In conclusion, using Eqs. 12.57 and 12.79, we have rewritten the dynamic equation of a serial robot with n degrees of freedom: .V T mV C ıI /qR C V T mVP qP C V T F D ; ıI qR C V T .mV qR C mVP qP C F / D :
(12.88) (12.89)
340
12 Dynamics
This decomposition allows us to see the components of the inertia, centrifugal, and gravity forces. Replacing $F$ with $ma$ and extracting $m$ gives
$$ \delta I \ddot{q} + V^T m (V\ddot{q} + \dot{V}\dot{q} + a) = \tau. \qquad (12.90) $$
Finally, Eq. 12.90 is the dynamic equation of an $n$-degrees-of-freedom serial robot whose matrix elements are multivectors of the geometric algebra $G_{4,1,0}$. $\delta$ and $a$ are constant, known matrices, while $m$ and $I$ depend on the robot parameters but are also constant. Only the matrix $V$ (Eq. 12.82) and, therefore, $\dot{V}$ (Eq. 12.83) change over time.

Example 1. As an example, we compare the classical and the proposed approach by computing the dynamics of a 2-DOF robot arm; see Fig. 12.3. First, we compute the matrices $V$ and $\dot{V}$, which will be used to compute $M$, $C$, and $G$:
$$ V = \begin{pmatrix} C'_1 \cdot L'_1 & 0\\ C'_2 \cdot L'_1 & C'_2 \cdot L'_2 \end{pmatrix}, \qquad (12.91) $$
$$ \dot{V}_{1,1} = \big((C_1 \cdot L_1)\cdot L_1\big)\,\dot{q}_1 + \big((C_1 \cdot L_2)\cdot L_1\big)\,\dot{q}_2, $$
Fig. 12.3 Sequence of movement of a 2-DOF robot arm
$$ \dot{V}_{2,1} = \big((C_2 \cdot L_1)\cdot L_1\big)\,\dot{q}_1 + \big((C_2 \cdot L_2)\cdot L_1\big)\,\dot{q}_2, $$
$$ \dot{V}_{2,2} = \big((C_2 \cdot L_1)\cdot L_2\big)\,\dot{q}_1 + \big((C_2 \cdot L_2)\cdot L_2\big)\,\dot{q}_2 + \frac{1}{2}\big(C_2 \cdot (L_2 L_1 - L_1 L_2)\big)\,\dot{q}_1. $$
The matrix $M$ is computed classically as
$$ M_{1,1} = m_1 l_{c1}^2 + m_2 l_1^2 + m_2 l_{c2}^2 + 2 m_2 l_1 l_{c2}\cos(q_2) + I_1 + I_2, $$
$$ M_{1,2} = m_2 l_{c2}^2 + m_2 l_1 l_{c2}\cos(q_2) + I_2, $$
$$ M_{2,2} = m_2 l_{c2}^2 + I_2. $$
Using conformal geometric algebra instead gives
$$ M_{1,1} = m_1 (V_{1,1}\cdot V_{1,1}) + m_2 (V_{2,1}\cdot V_{2,1}), $$
$$ M_{1,2} = m_2 (V_{2,1}\cdot V_{2,2}), $$
$$ M_{2,2} = m_2 (V_{2,2}\cdot V_{2,2}). $$
The element $M_{2,1} = M_{1,2}$, and $M_I$ is computed as
$$ M_I = \begin{pmatrix} 1 & 1\\ 0 & 1 \end{pmatrix}\begin{pmatrix} I_1 & 0\\ I_2 & I_2 \end{pmatrix} = \begin{pmatrix} I_1 + I_2 & I_2\\ I_2 & I_2 \end{pmatrix}. $$
Similarly, the matrix $G$ is computed classically using
$$ G_1 = (m_1 l_{c1} + m_2 l_1)\, g \sin(q_1) + m_2 g l_{c2}\sin(q_1 + q_2), $$
$$ G_2 = m_2 g l_{c2}\sin(q_1 + q_2). $$
In conformal geometry, the entries are given by
$$ G_1 = m_1 (V_{1,1}\cdot g e_2) + m_2 (V_{2,1}\cdot g e_2), $$
$$ G_2 = m_2 (V_{2,2}\cdot g e_2). $$
Finally, the Coriolis matrix is given classically by
$$ C_{1,1} = -m_2 l_1 l_{c2}\sin(q_2)\,\dot{q}_2, $$
$$ C_{2,1} = m_2 l_1 l_{c2}\sin(q_2)\,\dot{q}_1, $$
$$ C_{1,2} = -m_2 l_1 l_{c2}\sin(q_2)\,(\dot{q}_1 + \dot{q}_2), $$
and in geometric algebra by
$$ C_{1,1} = m_1 (V_{1,1}\cdot \dot{V}_{1,1}) + m_2 (V_{1,2}\cdot \dot{V}_{2,1}), $$
$$ C_{2,1} = m_2 (V_{2,2}\cdot \dot{V}_{2,1}), $$
$$ C_{1,2} = m_2 (V_{1,2}\cdot \dot{V}_{2,2}), $$
where $C_{2,2} = 0$. This is a simple example, but when the number of degrees of freedom increases, the new approach, in contrast to the classical one, preserves its clarity.

Example 2. In this example, we show the form of the matrices for a 6-DOF serial robot, such as the Adept Six 600 robot arm (see Fig. 12.4). The matrix $V$ is
$$ V = \begin{pmatrix}
C'_1\cdot L'_1 & 0 & 0 & 0 & 0 & 0\\
C'_2\cdot L'_1 & C'_2\cdot L'_2 & 0 & 0 & 0 & 0\\
C'_3\cdot L'_1 & C'_3\cdot L'_2 & C'_3\cdot L'_3 & 0 & 0 & 0\\
C'_4\cdot L'_1 & C'_4\cdot L'_2 & C'_4\cdot L'_3 & C'_4\cdot L'_4 & 0 & 0\\
C'_5\cdot L'_1 & C'_5\cdot L'_2 & C'_5\cdot L'_3 & C'_5\cdot L'_4 & C'_5\cdot L'_5 & 0\\
C'_6\cdot L'_1 & C'_6\cdot L'_2 & C'_6\cdot L'_3 & C'_6\cdot L'_4 & C'_6\cdot L'_5 & C'_6\cdot L'_6
\end{pmatrix}, \qquad (12.92) $$
the matrix $m$ is
$$ m = \mathrm{diag}(m_1,\, m_2,\, m_3,\, m_4,\, m_5,\, m_6), \qquad (12.93) $$
and the matrix $a$ is
$$ a = \big(g e_2,\; g e_2,\; g e_2,\; g e_2,\; g e_2,\; g e_2\big)^T. \qquad (12.94) $$

Fig. 12.4 Sequence of movement of a 6-DOF robot arm
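The classical 2-DOF entries of Example 1 can be cross-checked numerically against the standard passivity property of the Euler–Lagrange formulation, namely that $\dot{M} - 2C$ is skew-symmetric (a well-known property of serial-robot dynamics, not stated in the text). The link parameters below are arbitrary illustrative values:

```python
import numpy as np

# Illustrative link parameters (assumptions, not taken from the text).
m1, m2, l1, lc1, lc2, I1, I2 = 1.2, 0.8, 0.5, 0.25, 0.2, 0.03, 0.02

def M(q2):
    """Classical inertia matrix of the 2-DOF arm (Example 1)."""
    h = m2 * l1 * lc2 * np.cos(q2)
    M11 = m1*lc1**2 + m2*l1**2 + m2*lc2**2 + 2*h + I1 + I2
    M12 = m2*lc2**2 + h + I2
    M22 = m2*lc2**2 + I2
    return np.array([[M11, M12], [M12, M22]])

def C(q2, qd1, qd2):
    """Classical Coriolis matrix of the 2-DOF arm (Example 1)."""
    h = m2 * l1 * lc2 * np.sin(q2)
    return np.array([[-h*qd2, -h*(qd1 + qd2)],
                     [ h*qd1,  0.0]])

q2, qd1, qd2 = 0.7, 0.3, -0.4
# M depends only on q2, so dM/dt = (dM/dq2) * qd2 (central difference).
eps = 1e-6
dM = (M(q2 + eps) - M(q2 - eps)) / (2*eps) * qd2

# Passivity check: N = dM/dt - 2C must be skew-symmetric.
N = dM - 2*C(q2, qd1, qd2)
assert np.allclose(N + N.T, 0.0, atol=1e-6)
```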
12.4 Complexity Analysis

12.4.1 Computing M

The component of the kinetic energy is $K_v = V^T m V$:
$$ M_v = \begin{pmatrix}
V_{1,1} & V_{2,1} & \cdots & V_{n,1}\\
0 & V_{2,2} & \cdots & V_{n,2}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & V_{n,n}
\end{pmatrix}
\begin{pmatrix}
m_1 & 0 & \cdots & 0\\
0 & m_2 & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & m_n
\end{pmatrix}
\begin{pmatrix}
V_{1,1} & 0 & \cdots & 0\\
V_{2,1} & V_{2,2} & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
V_{n,1} & V_{n,2} & \cdots & V_{n,n}
\end{pmatrix}. \qquad (12.95) $$
The number of computations for $mV$ is $3n(n+1)/2$, and the product of the two remaining matrices is
$$ M_v = \begin{pmatrix}
V_{1,1} & V_{2,1} & \cdots & V_{n,1}\\
0 & V_{2,2} & \cdots & V_{n,2}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & V_{n,n}
\end{pmatrix}
\begin{pmatrix}
m_1 V_{1,1} & 0 & \cdots & 0\\
m_2 V_{2,1} & m_2 V_{2,2} & \cdots & 0\\
\vdots & \vdots & \ddots & \vdots\\
m_n V_{n,1} & m_n V_{n,2} & \cdots & m_n V_{n,n}
\end{pmatrix}. \qquad (12.96) $$
The number of products for each component is given by 0
n Bn 1 B Bn 2 B B : @ :: 1
n1 n1 n2 :: :
n2 n2 n2 :: :
:: :
1 1 1C C 1C C: C A
1
1
1
The total number of computations is less than or equal to
(12.97)
1 3 1 2 1 n C n C n. 3 2 6
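The bound can be checked by summing the count matrix (12.97) directly. Its entry $(i,j)$, with rows and columns indexed from 1, is $n + 1 - \max(i,j)$: row 1 reads $(n, n-1, \ldots, 1)$, row 2 reads $(n-1, n-1, n-2, \ldots, 1)$, and so on. A short verification sketch:

```python
import numpy as np

def total_products(n):
    """Sum of the count matrix (12.97): entry (i,j) = n + 1 - max(i,j)."""
    i, j = np.indices((n, n)) + 1           # 1-based row/column indices
    return int((n + 1 - np.maximum(i, j)).sum())

# The closed form (1/3)n^3 + (1/2)n^2 + (1/6)n = (2n^3 + 3n^2 + n)/6
# quoted in the text matches the matrix sum exactly.
for n in range(1, 30):
    assert total_products(n) == (2 * n**3 + 3 * n**2 + n) // 6
```

For instance, for $n = 2$ the matrix is $\begin{pmatrix}2&1\\1&1\end{pmatrix}$, whose sum is 5, and the closed form gives $(16+12+2)/6 = 5$ as well.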
12.4.2 Computing G
$$ G = V^T F. \qquad (12.98) $$
The matrix $G$ is computed as
$$ G = \begin{pmatrix}
V_{1,1} & V_{2,1} & \cdots & V_{n,1}\\
0 & V_{2,2} & \cdots & V_{n,2}\\
\vdots & \vdots & \ddots & \vdots\\
0 & 0 & \cdots & V_{n,n}
\end{pmatrix}
\begin{pmatrix} F_1\\ F_2\\ \vdots\\ F_n \end{pmatrix}. \qquad (12.99) $$
The number of products for each component is given by
$$ \begin{pmatrix} n\\ n-1\\ n-2\\ \vdots\\ 1 \end{pmatrix}. \qquad (12.100) $$
Since $1 + 2 + 3 + \cdots + n = n(n+1)/2$, the total number of products needed to compute $G$ is at most $n(n+1)/2$.
12.5 Conclusion

This chapter has shown the advantages of using the geometric algebra framework to solve problems in the dynamics of robot manipulators. Traditional approaches based on vector calculus, quaternion algebra, or linear algebra are cumbersome for handling kinematics and dynamics involving points, lines, and planes. Conformal geometric algebra offers a more complete repertoire of geometric primitives, and it also offers versors, which are a much more efficient representation of linear transformations. When dealing with Euler–Lagrange equations for robot dynamics, depending on the degrees of freedom of the system, the equations involve large inertia and Coriolis tensors. How can those tensors be treated within geometric algebra? One way is to disentangle them into matrices whose entries are inner products between the screw axes and the mass centers of the robot limbs. In this manner, we avoid quadratic entries in the tensors. Reformulating these tensors in this way using conformal geometric algebra facilitates plant identification: we can measure the screw axes and carry out the inner products between these lines and the mass centers. After this study, one can see that it would be very difficult to factorize these tensors using matrix algebra. We hope that the reader will appreciate this illustration and will be stimulated to apply conformal geometric algebra to develop new algorithms for system identification and for the nonlinear control of robot mechanisms.
Part V
Applications I: Image Processing, Computer Vision, and Neurocomputing
Chapter 13
Applications of Lie Filters, and Quaternion Fourier and Wavelet Transforms
This chapter first presents Lie operators for key-point detection in the affine plane. This approach is motivated by certain evidence from the human visual system; these Lie filters therefore appear to be very useful for implementing, in the near future, a humanoid vision system. The second part of the chapter presents an application of the quaternion Fourier transform as preprocessing for neural computing. In a new way, the 1D acoustic signals of spoken French words are represented as 2D signals in the frequency and time domains. These images are then convolved in the quaternion Fourier domain with a quaternion Gabor filter for the extraction of features. This approach allows us to greatly reduce the dimension of the feature vector. Two methods of feature extraction are tested. The feature vectors were used for the training of a simple MLP, a TDNN, and a system of neural experts. The improvement in the classification rate of the neural network classifiers is very encouraging, which amply justifies the preprocessing in the quaternion frequency domain. This work also suggests the application of the quaternion Fourier transform to other image processing tasks. The third part of the chapter presents the theory and practicalities of the quaternion wavelet transform (QWT). This work generalizes the real and complex wavelet transforms and derives a quaternionic wavelet pyramid for multiresolution analysis using the quaternionic phase concept. As an illustration, we present an application of the discrete QWT to optical flow estimation. For the estimation of motion through different resolution levels, we use a similarity distance evaluated by means of the quaternionic phase concept and a confidence mask.
13.1 Lie Filters in the Affine Plane

This section carries out the computations in the affine plane $A_{e_3}(N^2)$ for image analysis. We utilize the Lie algebra of the affine plane, explained in Chap. 5, for the design of image filters to detect visual invariants. As an illustration, we apply these filters to the recognition of hand gestures.
13.1.1 The Design of an Image Filter

In the experiment, we used simulated images of the optical flow for two motions: a rotational plus a translational motion (see Fig. 13.1a), and a dilation plus a translational motion (see Fig. 13.2a). The experiment uses only bivector computations to determine the type of motion, the axis of rotation, and/or the center of the dilation. To study the motions in the affine plane, we used the Lie algebra of bivectors of the neutral affine plane $A_{e_3}(N^2)$; see Sect. 5.6. The computations were carried out with the help of a computer program that we wrote in C++. Each flow vector at any point of the image was coded as $x = x e_1 + y e_2 + e_3 \in N^3$. At each point of the flow image, we applied the commutator product of the six bivectors of Eq. 5.86. Using the resultant coefficients of the vectors, the computer program determined which type of differential invariant or motion was present. Figure 13.1b shows the result of convolving, via the geometric product, the bivector with a Gaussian kernel of size $5\times 5$. Figure 13.1c presents this result using the output of the kernel; the white center of the image indicates the lowest magnitude. Figure 13.2 shows the results for the case of an expanding flow. Comparing Fig. 13.1c with Fig. 13.2c, we note the duality of the differential invariants: the invariant of the rotation is its center point, whereas the invariant of the expansion is a line.
Fig. 13.1 Detection of visual invariants: (a) rotation (Lr ) and translational flow (Lx ) fields, (b) convolving via geometric product with a Gaussian kernel, (c) magnitudes of the convolution
Fig. 13.2 Detection of visual invariants: (a) expansion (Ls ) and translational flow (Lx ) fields, (b) convolving via the geometric product with a Gaussian kernel, (c) magnitudes of the convolution
13.1.2 Recognition of Hand Gestures

Another interesting application, suggested by the seminal paper of Hoffman [98], is to recognize a gesture using the key points of an image along with the previous Lie operators arranged in a detection structure, as depicted in Fig. 13.3. These Lie filters may be seen as perceptrons, which play an important role in image preprocessing in the human visual system. It is believed [98] that during the first years of human life, some kinds of Lie operators combine to build the higher-dimensional Lie algebra SO(4,1). In this sense, we assume that the outputs of the Lie operators are linearly combined with an outstar output according to the following equation:
$$ O_\alpha(x, y) = w_1 L_x(x,y) + w_2 L_y(x,y) + w_3 L_r(x,y) + w_4 L_s(x,y) + w_5 L_b(x,y) + w_6 L_B(x,y), \qquad (13.1) $$
where the weights $w_i$ can be adjusted by a supervised training procedure. If the desired feature or key point $\alpha$ at the point $(x, y)$ is detected, the output $O_\alpha(x, y)$ goes to zero.
Fig. 13.3 Lie perceptron arrangement for feature detection
Fig. 13.4 Gesture detection: (a) (top images) gestures for robot guidance ( follow, stop, and explore), and (b) (lower images) detected gestures by the robot vision system using Lie operators
Table 13.1 Weights of the Lie perceptron arrangement for the detection of hand gestures

    Hand gesture   Lx   Ly   Lr   Ls   Lb   LB   Tolerance
    Fingertip       0    0    9    4   11    9   10%
    Stop            0    0    3    1    1    4   10%
    Fist            0    0    2    2    2    1   10%
Figure 13.4a shows hand gestures given to a robot. By applying the Lie perceptron arrangement to the hand region (see Fig. 13.4b), the robot can detect whether it should follow, stop, or move in circles. Table 13.1 presents the weights $w_i$ needed to detect the three gestures. Detection is decided as follows:
$$ O_\alpha(x, y) \le \min + \frac{(\max - \min)\,\mathrm{Tolerance}}{100} \;\rightarrow\; \text{detection of a feature type}, $$
where min and max correspond to the minimal and maximal Lie operator outputs, respectively.
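The outstar combination (Eq. 13.1) and the tolerance test can be sketched as follows. The Lie operator responses below are synthetic stand-ins (in the text they come from the six filters $L_x, L_y, L_r, L_s, L_b, L_B$), and the weight vector is shaped like one row of Table 13.1:

```python
import numpy as np

rng = np.random.default_rng(1)
responses = rng.normal(size=(6, 32, 32))  # six synthetic operator output images

# Weights in the style of one gesture row of Table 13.1 (e.g., "Stop").
w = np.array([0.0, 0.0, 3.0, 1.0, 1.0, 4.0])
tolerance = 10.0  # percent

# Eq. 13.1: weighted (outstar) combination of the six operator outputs.
O = np.tensordot(w, responses, axes=1)    # shape (32, 32)

# Tolerance test: points within 10% of the minimum count as detections.
lo, hi = O.min(), O.max()
detected = O <= lo + (hi - lo) * tolerance / 100.0
assert detected.any()   # at least the minimizing pixel is always detected
```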
13.2 Representation of Speech as 2D Signals In this work, we use the psycho-acoustical model of a loudness meter suggested by Zwicker [199]. This meter model is depicted in Fig. 13.5a. The output is a 2D representation of sound loudness over time and frequency. The motivation of this work is to use the loudness image in order to take advantage of the variation in time of the frequency components of the sound. A brief explanation of this meter model follows.
Fig. 13.5 From the top: (a) The psycho-acoustical model of loudness meter suggested by Zwicker, (b) 2D representation (vertical outputs of the 20 filters, horizontal axis is the time), (c) a 3D energy representation of (b) where the energy levels are represented along the z-coordinate, (d) main loudness signal or total output of the psycho-acoustical model (the sum of the 20 channels)
The sound pressure is picked up by a microphone and converted to an electrical signal, which in turn is amplified. Thereafter, the signal is attenuated to produce the same loudness in a diffuse and free-sound field. In order to take into account the frequency dependence of the sound coming from the exterior and passing through the outer ear, a transmission factor is utilized. The signal is then filtered by a filter bank with filter bands dependent on the critical band rate (Barks). In this work,
we have taken only the first 20 Barks, because this study is restricted to the band of speech signals. At the output of the filters, the energy of each filter signal is calculated to obtain the maximal critical band level varying with time. Having 20 of these outputs, a 2D sound representation can be formed, as presented in Fig. 13.5b. At each filter output, the loudness is computed, taking into account temporal and frequency effects, according to the following equation:
$$ N' = 0.068 \left(\frac{E_{TQ}}{s\,E_0}\right)^{0.25} \left[\left(1 - s + s\,\frac{E}{E_{TQ}}\right)^{0.25} - 1\right] \frac{\mathrm{sone}}{\mathrm{Bark}}, $$
where $E_{TQ}$ stands for the excitation level at the threshold in quiet, $E$ is the main excitation level, $E_0$ is the excitation corresponding to the reference intensity $I_0 = 10^{-12}\,\mathrm{W/m^2}$, $s$ stands for the masking index, and one sone is equivalent to 40 phons. To obtain the main loudness value, the specific loudness of each critical band is added. Figure 13.5c depicts the time–loudness evolution in 3D, where the loudness levels are represented along the z-coordinate.
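The specific-loudness formula, as reconstructed above, can be sketched directly; the parameter values in the check below are illustrative assumptions, not values from the text:

```python
def specific_loudness(E, E_TQ, E0, s):
    """Zwicker-style specific loudness N' in sone/Bark (as reconstructed above)."""
    return 0.068 * (E_TQ / (s * E0)) ** 0.25 * (
        (1.0 - s + s * E / E_TQ) ** 0.25 - 1.0
    )

# At the threshold in quiet (E == E_TQ) the bracketed factor vanishes, so N' = 0.
assert abs(specific_loudness(E=2.0, E_TQ=2.0, E0=1.0, s=0.5)) < 1e-12
# Above threshold the specific loudness is positive.
assert specific_loudness(E=8.0, E_TQ=2.0, E0=1.0, s=0.5) > 0.0
```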
13.3 Preprocessing of Speech 2D Representations Using the QFT and Quaternionic Gabor Filter

This section presents two methods of preprocessing. The first is simple to formulate; however, we show that its extracted features do not yield a high classification rate, because the sounds of the consonants of the phonemes are not well recognized.

13.3.1 Method 1

A quaternion Gabor filter is used for the preprocessing (see Fig. 13.6a). This filter is convolved, using the quaternion Fourier transform, with an image of $80\times 80$ pixels ($5\cdot 16$ [channels] $\times$ $800/10$ [ms]). In order to have an $80\times 80$ square matrix for the QFT for each of the 16 channels, we copied each row 5 times, for 80 rows. We used only the first approximation of the psycho-acoustical model, which comprises only the first 16 channels. The feature extraction is done in the quaternion frequency domain by searching for features along the lines of expected maximum energy (see Fig. 13.6b). After an analysis of several images, we found the best place for these lines in the four images of the quaternion image. Note that this approach first expands the original image from $80\times 80 = 1600$ to $4\times 1600 = 6400$ values and then reduces this result to a feature vector of length 16 [channels] $\times$ 8 [analysis lines] $= 128$. This clearly explains the motivation of our approach; we use a four-dimensional
Fig. 13.6 Method 1: (from the upper left corner) (a) quaternion Gabor filter, (b) selected 128 features according to energy level (16 channels and 8 analysis lines in the four r, i, j, k images, $16\times 8 = 128$), (c) stack of feature vectors for 10 numbers and 29 speakers (the ordinate axis shows the words of the first 10 French numbers and the abscissa the time in milliseconds), (d) the stack of the French word neuf spoken by 29 speakers (also presented in (c))
representation of the image for searching features along these lines; as a result, we can effectively reduce the dimension of the feature vector. Figure 13.6c shows the feature vectors for the phonemes of the 10 decimal numbers spoken by 29 different female or male speakers. For example, for the word neuf of Fig. 13.6c, we have stacked the feature vectors of the 29 speakers, as Fig. 13.6d shows. The consequence of considering only the first 16 channels was a notable loss of high-frequency components, making the detection of the consonants difficult. We also noticed that in the first method, using a wide quaternion Gabor filter, even though higher levels of energy were detected, the detection of the consonants did not succeed. Conversely, using a narrow filter, we were able to detect the consonants, but the detected information about the vowels was very poor. This indicated that we should filter only those zones where changes of sound between consonants and vowels take place. The second method, presented next, is the implementation of this key idea.
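For a real image $f$, the two-sided QFT (assuming the common convention with the $i$-exponential on the left and the $j$-exponential on the right) separates into cosine/sine transforms, because $e^{-ia} f\, e^{-jb} = f(\cos a\cos b - i\sin a\cos b - j\cos a\sin b + k\sin a\sin b)$. A minimal sketch, cross-checked against the ordinary complex 2D DFT:

```python
import numpy as np

def qft2(f):
    """Two-sided quaternion Fourier transform of a real image f.

    F(u,v) = sum_{x,y} exp(-i 2pi ux/M) f(x,y) exp(-j 2pi vy/N),
    returned as its four real components (r, i, j, k parts).
    """
    M, N = f.shape
    ax = 2 * np.pi * np.outer(np.arange(M), np.arange(M)) / M
    ay = 2 * np.pi * np.outer(np.arange(N), np.arange(N)) / N
    Cx, Sx = np.cos(ax), np.sin(ax)
    Cy, Sy = np.cos(ay), np.sin(ay)
    Fr = Cx @ f @ Cy.T       # real part
    Fi = -Sx @ f @ Cy.T      # i part
    Fj = -Cx @ f @ Sy.T      # j part
    Fk = Sx @ f @ Sy.T       # k part
    return Fr, Fi, Fj, Fk

f = np.random.default_rng(2).normal(size=(8, 8))
Fr, Fi, Fj, Fk = qft2(f)

# Collapsing j -> i recovers the complex 2D DFT:
# Re(FFT2) = Fr - Fk and Im(FFT2) = Fi + Fj.
F = np.fft.fft2(f)
assert np.allclose(F.real, Fr - Fk)
assert np.allclose(F.imag, Fi + Fj)
```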
13.3.2 Method 2

The second method does not convolve the whole image with the filter. This method uses all 20 channels of the main loudness signal. First, we detect the inflection points where the changes of sound take place, particularly those between consonants and vowels. These inflection points are found by taking the first derivative of the main loudness signal with respect to time (see Fig. 13.7a for the word sept). Imagine that someone says ssssseeeeeeepppppth; using the inflection points, one detects two transition regions of 60 ms, one for se and another for ept (see Fig. 13.7b). By filtering these two regions with a narrow quaternion Gabor filter, we split each region into two, separating s from e (first region) and e from pth (second region). The four strips represent what happens before and after the vowel e (see Fig. 13.8a). The feature vector is built by tracing a line through the maximum levels of each strip. We obtain four feature columns: column 1 for s,
Fig. 13.7 (a) Method 2: the main loudness signals for sept and neuf spoken by two speakers, (b) determination of the analysis strips using the lines of the inflection points (20 filter responses at the ordinate axis and the time in milliseconds at the abscissa axis), (c) narrow-band quaternion Gabor filter
Fig. 13.8 Method 2: (from the left) (a) selected strips of the quaternion images for the words sept and neuf spoken by two speakers; (b) zoom of a strip of the $j$ component of the quaternionic image and the four columns selected for feature extraction (20 channels $\times$ 4 lines $= 80$ features)
column 2 for e, column 3 for e, and column 4 for pth (see Fig. 13.8b). Finally, one builds one feature vector of length 80 by arranging the four columns (20 features each). Note that the second method reduces the feature vector length even more, from 128 to 80. Figure 13.9a shows the feature vectors of length 80 for the phonemes of the 10 decimal numbers spoken by 29 different female or male speakers. For example, for the neuf of Fig. 13.7b, we have stacked the feature vectors of the 29 speakers, as shown in Fig. 13.9b.
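The transition detection of method 2 can be sketched on a synthetic main loudness signal: consonant/vowel transitions show up as extrema of the first time derivative, and each detected instant defines a 60-ms analysis strip around it (the signal and threshold below are illustrative assumptions):

```python
import numpy as np

# Synthetic main loudness signal: two "vowel" bumps over 800 ms.
t = np.linspace(0.0, 0.8, 800)                    # seconds, ~1 ms steps
loudness = np.exp(-((t - 0.25) / 0.08) ** 2) \
         + 0.6 * np.exp(-((t - 0.55) / 0.06) ** 2)

d = np.gradient(loudness, t)                      # first time derivative
mag = np.abs(d)

# Candidate transition instants: local maxima of |d| above a threshold.
idx = np.where((mag[1:-1] > mag[:-2]) & (mag[1:-1] > mag[2:])
               & (mag[1:-1] > 0.5 * mag.max()))[0] + 1

# Each detected instant defines a 60-ms analysis strip around it.
strips = [(t[i] - 0.03, t[i] + 0.03) for i in idx]
assert len(strips) >= 2   # rising and falling edges of the bumps are found
```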
13.4 Recognition of French Phonemes Using Neurocomputing

The features extracted using method 1 were used for training a multilayer perceptron, depicted in Fig. 13.10a. We used a training set of 10 male and 9 female speakers; each one spoke the first 10 French numbers (zéro, un, deux, trois, quatre, cinq, six, sept, huit, neuf); thus, the training set comprises 190 samples. After the training, a set of 100 spoken words was used for testing method 1 (10 numbers spoken by 5 male and 5 female speakers makes 100 samples). The recognition rate achieved was 87%. For the second approach, the features were extracted using method 2. The structure used for recognition consists of an assembly of three neural experts regulated by a neural network arbitrator. In Fig. 13.10b, the arbitrator is called main, and each neural expert is dedicated to recognizing a certain group of spoken words. The recognition rate achieved was 98%. The great improvement in the recognition rate is mainly due to the preprocessing of method 2 and
Fig. 13.9 Method 2: (a) stack of feature vectors of 29 words spoken by 29 speakers, (b) the stack for the word neuf spoken by 29 speakers
Fig. 13.10 (a) Neural network used for method 1, (b) group of neural networks used for method 2
the use of a neural expert system. We carried out a similar test using a set of 150 training samples (10 male and 5 female speakers) and 150 recall samples (5 male and 10 female speakers); the recognition rate achieved was a bit lower than 97%. This may be due to the lower number of female speakers during training and their higher number in the recall. This means that the system specializes itself better for samples spoken by males.
Table 13.2 Comparison of the methods (M = male and F = female)

    Method               Training speakers  Test speakers  Samples/speaker  Test samples  Rate
    1 (MLP)              10 M, 9 F          5 M, 5 F       1                100           87%
    2 (Neural Experts)   10 M, 9 F          5 M, 5 F       1                100           98%
    2 (Neural Experts)   10 M, 5 F          5 M, 10 F      10               1500          97%
    3 (TDNN)             10 M, 5 F          5 M, 10 F      10               1500          93.8%
In order to compare with a standard method used in speech processing, we resorted to the time-delay neural network (TDNN) [136]. We used the Stuttgart Neural Network Simulator (SNNS); the input data in the format required by the SNNS were generated using Matlab. The selected TDNN architecture was as follows: (input layer) 20 inputs for the 20 features that code the preprocessed spoken numbers, 4 delays of length 4; (hidden layer) 10 units, 2 delays of length 2; (output layer) 10 units for the 10 different numbers. We trained the TDNN using 1,500 samples, 150 samples for each spoken number. We used the SNNS learning function timedelaybackprop and the update function Timedelay order. During the learning, we carried out 1,000 cycles, that is, 1,000 iterations for each spoken number. We trained the neural network in two ways: (i) one TDNN with 10 outputs; and (ii) a set of three TDNNs, where each neural network was devoted to learning a small disjoint set of spoken numbers. Of the two, the best result was obtained using the single TDNN, which achieved a 93.8% recognition rate. Table 13.2 summarizes the test results; the letters F and M stand for female and male speakers, respectively. Table 13.2 shows that the best architecture was the one composed of a set of neural experts (method 2). Clearly, the TDNN performed better than the MLP. Taking into account the performance of method 2 and of the TDNN, we find that the preprocessing using the quaternion Fourier transform played a major role.
13.5 Application of QWT

The motion estimation using quaternionic wavelet filters is inferred from measurements of phase changes in the filter outputs. The accuracy of such an estimation depends on how well our algorithm deals with the correspondence problem, which can be seen as a generalization of the aperture problem depicted in Fig. 13.11. This section deals with the estimation of the optical flow in terms of the estimation of the image disparity. The disparity of a pair of images $f_1(x,y)$ and $f_2(x,y)$ is computed by determining the local displacement that satisfies $f_1(x,y) = f_2(x + d_x, y + d_y) = f_2(\mathbf{x} + \mathbf{d})$, where $\mathbf{d} = (d_x, d_y)$. The range of $\mathbf{d}$ has to be small compared with the image size; thus, the observed features always have to lie within a small neighborhood in both images. In order to estimate the optical flow using the quaternionic wavelet pyramid, we first compute the quaternionic phase at each level, then the confidence measure, and finally the optical flow itself. The first two steps are explained in the next sections. Thereafter, we show two examples of optical flow estimation.
Fig. 13.11 The aperture problem
13.5.1 Estimation of the Quaternionic Phase

The estimation of the disparity using the concept of local phase begins with the assumption that a pair of successive images is related as follows:
$$ f_1(\mathbf{x}) = f_2(\mathbf{x} + \mathbf{d}(\mathbf{x})), \qquad (13.2) $$
where $\mathbf{d}(\mathbf{x})$ is the unknown displacement vector. Assuming that the phase varies linearly (hence the importance of shift-invariant filters), the displacement $\mathbf{d}(\mathbf{x})$ can be computed as
$$ d_x(\mathbf{x}) = \frac{\phi_2(\mathbf{x}) - \phi_1(\mathbf{x}) + (2n + k)\pi}{2\pi u_{\mathrm{ref}}}, \qquad d_y(\mathbf{x}) = \frac{\theta_2(\mathbf{x}) - \theta_1(\mathbf{x}) + m\pi}{2\pi v_{\mathrm{ref}}}, \qquad (13.3) $$
with reference frequencies $(u_{\mathrm{ref}}, v_{\mathrm{ref}})$ that are not known a priori. Here $\phi(\mathbf{x})$ and $\theta(\mathbf{x})$ are the first two components of the quaternionic local phase of the quaternionic filter response. We choose $n, m \in \mathbb{Z}$ so that $d_x$ and $d_y$ are within a valid range. Depending on $m$, $k$ is defined as
$$ k = \begin{cases} 0, & \text{if } m \text{ is even},\\ 1, & \text{if } m \text{ is odd}. \end{cases} \qquad (13.4) $$
A good disparity estimation is achieved if $(u_{\mathrm{ref}}, v_{\mathrm{ref}})$ are chosen well. There are two methods of dealing with the problem: (i) the constant model, where $u_{\mathrm{ref}}$ and $v_{\mathrm{ref}}$ are chosen as the central frequencies of the filters; (ii) the model for the complex case, called the local model, which supposes that the phase takes the same value $\Phi_1(\mathbf{x}) = \Phi_2(\mathbf{x} + \mathbf{d})$ at two corresponding points of both images. Thus, one estimates $\mathbf{d}$ by approximating $\Phi_2$ via a first-order Taylor series expansion about $\mathbf{x}$:
$$ \Phi_2(\mathbf{x} + \mathbf{d}) \approx \Phi_2(\mathbf{x}) + (\mathbf{d}\cdot\nabla)\,\Phi_2(\mathbf{x}), \qquad (13.5) $$
where we write $\Phi = (\phi, \theta)$. Solving Eq. 13.5 for $\mathbf{d}$, we obtain the estimated disparity of the local model. In our experiments, we assume that $\phi$ varies along the $x$-direction and $\theta$ along $y$. Under this assumption, the disparity (Eq. 13.3) can be estimated using the following reference frequencies:
$$ u_{\mathrm{ref}} = \frac{1}{2\pi}\frac{\partial \phi_1}{\partial x}(\mathbf{x}), \qquad v_{\mathrm{ref}} = \frac{1}{2\pi}\frac{\partial \theta_1}{\partial y}(\mathbf{x}). \qquad (13.6) $$
At locations where $u_{\mathrm{ref}}$ or $v_{\mathrm{ref}}$ is zero, Eqs. 13.6 are undefined. One can discard these locations using a confidence mask, as explained in the next section.
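A 1D sketch of the local-model disparity estimate (Eqs. 13.3 and 13.6), using synthetic linear phase signals instead of real filter outputs (the frequency and displacement below are illustrative assumptions):

```python
import numpy as np

x = np.arange(64, dtype=float)
u0 = 0.02                       # cycles/pixel; small enough to avoid phase wrapping
d_true = 3.0                    # displacement in pixels

phi1 = 2 * np.pi * u0 * x               # phase of the first image
phi2 = 2 * np.pi * u0 * (x + d_true)    # phase of the shifted image

# Eq. 13.6: local reference frequency from the phase gradient.
u_ref = np.gradient(phi1, x) / (2 * np.pi)

# Eq. 13.3 with n = k = 0; wrap the phase difference into (-pi, pi].
dphi = np.angle(np.exp(1j * (phi2 - phi1)))
d_est = dphi / (2 * np.pi * u_ref)

assert np.allclose(d_est, d_true)
```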
13.5.2 Confidence Interval

In neighborhoods where the energy of the filtered image is low, the local phase cannot be estimated reliably; similarly, at points where the local frequency is zero, the estimation of the disparity is impossible, because the disparity is computed as the phase difference divided by the local frequency. We therefore need a confidence measurement that indicates the quality of the estimation at a given point. For this purpose, we can design a simple binary confidence mask. With complex filters, this is done by checking whether the filter response is sufficiently large. In the case of quaternionic filters, we need two confidence measurements, one for $\phi$ and one for $\theta$. According to the multiplication rules of the quaternions, the first two components of the quaternionic phase are defined at almost every point as
$$ \phi(\mathbf{x}) = \arg_i\!\big(k^q(\mathbf{x})\,\overline{\beta(k^q(\mathbf{x}))}\big), \qquad \theta(\mathbf{x}) = \arg_j\!\big(\overline{\alpha(k^q(\mathbf{x}))}\,k^q(\mathbf{x})\big), \qquad (13.7) $$
where $k^q$ is the quaternionic filter response, and $\arg_i$ and $\arg_j$ were defined in Eq. 3.34. The projections $\alpha$ and $\beta$ of the quaternionic filter response are computed according to Eq. 3.35. Using these angles and projections, we can extend the well-known confidence measurement for complex filters,
$$ \mathrm{Conf}(\mathbf{x}) = \begin{cases} 1, & \text{if } |k(\mathbf{x})| > \epsilon,\\ 0, & \text{otherwise}, \end{cases} \qquad (13.8) $$
to the quaternionic case:
$$ C_h(k^q(\mathbf{x})) = \begin{cases} 1, & \text{if } \mathrm{mod}_i\!\big(k^q(\mathbf{x})\,\overline{\beta(k^q(\mathbf{x}))}\big) > \epsilon,\\ 0, & \text{otherwise}, \end{cases} \qquad (13.9) $$
$$ C_v(k^q(\mathbf{x})) = \begin{cases} 1, & \text{if } \mathrm{mod}_j\!\big(\overline{\alpha(k^q(\mathbf{x}))}\,k^q(\mathbf{x})\big) > \epsilon,\\ 0, & \text{otherwise}. \end{cases} \qquad (13.10) $$
Given the outputs $k_1^q$ and $k_2^q$ of the quaternionic filters applied to the two images, we can implement the following confidence masks:
$$ \mathrm{Conf}_h(\mathbf{x}) = C_h(k_1^q(\mathbf{x}))\,C_h(k_2^q(\mathbf{x})), \qquad (13.11) $$
$$ \mathrm{Conf}_v(\mathbf{x}) = C_v(k_1^q(\mathbf{x}))\,C_v(k_2^q(\mathbf{x})), \qquad (13.12) $$
where $\mathrm{Conf}_h$ and $\mathrm{Conf}_v$ are the confidence measurements for the horizontal and vertical disparity, respectively. One need not use these measurements separately, however, because they are fully identical. This follows from the identity, valid for any quaternion $q$,
$$ \mathrm{mod}_i\!\big(q\,\overline{\beta(q)}\big) = \mathrm{mod}_j\!\big(\overline{\alpha(q)}\,q\big), \qquad (13.13) $$
which can be checked straightforwardly by expanding both sides. We conclude that, for the responses of the two quaternionic filters on either image, the horizontal and vertical confidence measurements are identical:
$$ \mathrm{Conf}_h(\mathbf{x}) = \mathrm{Conf}_v(\mathbf{x}). \qquad (13.14) $$
For a more detailed explanation of the quaternionic confidence interval, the reader can consult the Ph.D. thesis of Bülow [29].
13.5.3 Discussion on Similarity Distance and the Phase Concept

In the case of the complex conjugated wavelet analysis, the similarity distance $S_j((x,y),(x',y'))$ for any pair of image points on the reference image $f(x,y)$ and the matched image (after a small motion) $f'(x',y')$ is defined using the six differential components $D_{j,p}, \tilde{D}_{j,p}$, $p = 1, 2, 3$, of Eq. 8.76. The best match is achieved by finding the $u, v$ that minimize
$$ \min_{u,v}\, S_j\big((k,l),(k'+u,\, l'+v)\big). \qquad (13.15) $$
The authors [127, 144] show that, with the continuous interpolation of Eq. 13.15, one can obtain a quadratic expression for the similarity distance:
$$ S_{j,p}\big((k,l),(k'+u,\, l'+v)\big) = s_1(u-u_0)^2 + s_2(v-v_0)^2 + s_3(u-u_0)(v-v_0) + s_4, \qquad (13.16) $$
where $(u_0, v_0)$ is the minimum point of the similarity-distance surface $S_{j,p}$, the coefficients $s_1, s_2, s_3$ describe its curvature, and $s_4$ is the minimum value of the similarity distance $S_j$. The parameters of this approximate quadratic surface provide a subpixel-accurate motion estimate and an accompanying confidence measure. By its utilization in the complex-valued discrete wavelet transform hierarchy, the authors claim to handle the aperture problem successfully. In the case of the quaternion wavelet transform, we directly use the motion information captured in the three phases of the detail filters. The approach is linear, due
to the linearity of the polar representation of the quaternion filter. The confidence throughout the pyramid levels is assured by the bilinear confidence measure given by Eq. 13.12, and the motion is computed by the linear evaluation of the disparity equations (13.3). The local-model approach helps to estimate $u_{\mathrm{ref}}$ and $v_{\mathrm{ref}}$ by evaluating Eqs. 13.6. The estimation is not of a quadratic nature, like Eq. 13.16; an extension to this kind of quadratic motion estimation constitutes an extra avenue for further improvement of the application of the QWT to multiresolution analysis. Fleet [59] claims that phase-based disparity estimation is limited to estimating the component of the disparity vector that is normal to an oriented structure in the image, since the image is believed to be intrinsically one-dimensional almost everywhere. In contrast, the quaternion confidence measure (see Eq. 13.12) singles out those regions where the horizontal and vertical displacements can reliably be estimated simultaneously. Thus, by using the quaternionic phase concept, the full displacement vectors are evaluated locally at those points where the aperture problem can be circumvented.
13.5.4 Optical Flow Estimation

In this section, we show the estimation of the optical flow for the Rubik's cube and Hamburg taxi image sequences. We used the following quaternionic scaling and wavelet filters:
$$ h^q = g(x, \sigma_1, \epsilon)\,\exp\!\left(i\,\frac{c_1\omega_1 x}{\sigma_1}\right)\exp\!\left(j\,\frac{c_2\epsilon\omega_2 y}{\sigma_1}\right), \qquad (13.17) $$
$$ g^q = g(x, \sigma_2, \epsilon)\,\exp\!\left(i\,\frac{\tilde{c}_1\tilde{\omega}_1 x}{\sigma_2}\right)\exp\!\left(j\,\frac{\tilde{c}_2\epsilon\tilde{\omega}_2 y}{\sigma_2}\right), \qquad (13.18) $$
with $\sigma_1 = \pi/6$ and $\sigma_2 = 5\pi/6$, so that the filters are in quadrature, and $c_1 = \tilde{c}_1 = 3$, $\omega_1 = \omega_2 = 1$, and $\epsilon = 1$. The resulting quaternionic mask is also subsampled through the levels of the pyramid. For the estimation of the optical flow, we use two successive images of the sequence; thus, two quaternionic wavelet pyramids are generated. For our examples, we computed four levels. According to Eq. 8.88, at each level of each pyramid we obtain 16 images, accounting for the four quaternionic outputs (the approximation $\Phi$ and the details $\Psi_1$ (horizontal), $\Psi_2$ (vertical), $\Psi_3$ (diagonal)). The phases are evaluated according to Eqs. 3.4. Figure 13.13 shows the magnitudes and phases obtained at level $j$ using two successive Rubik's images. After we have computed the phases, we estimate the disparity images using Eqs. 13.3, where the reference frequencies $u_{\mathrm{ref}}$ and $v_{\mathrm{ref}}$ are calculated according to Eq. 13.6. We apply the confidence mask according to the guidelines given in Sect. 13.5.2, as shown in Fig. 13.14a. After the estimated disparity has been filtered by the confidence mask, we estimate the optical flow at each point by computing a velocity vector
in terms of the horizontal and vertical details. Then, using the information from the diagonal detail, we adjust the final orientation of the velocity vector. Since the procedure starts from the highest level (top-down), the resulting matrix of optical flow vectors is expanded to the size of the next level, as shown for one level in Fig. 13.12. The algorithm then estimates the optical flow at the new level. The result is compared with that of the expanded previous level, and the velocity vectors of the previous level fill gaps in the new level.
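The quaternionic filter pair of Eqs. 13.17 and 13.18 can be realized as four real-valued masks per filter. The sketch below is not the author's implementation (the Gaussian window g and the exact scaling of the phase arguments are assumptions), but it shows the decomposition exp(iu)exp(jv) = cos u cos v + i sin u cos v + j cos u sin v + k sin u sin v, using ij = k, into the four quaternion components:

```python
import numpy as np

# Sketch of a quaternionic Gabor-type mask in the spirit of Eq. 13.17.
# A quaternion q = q0 + i q1 + j q2 + k q3 is stored as four real masks.
# Parameter names (sigma, eps, c1, c2, w1, w2) follow the text; the exact
# argument scaling and the Gaussian window are assumptions.
def quat_gabor_mask(size=15, sigma=np.pi/6, eps=1.0,
                    c1=3.0, c2=3.0, w1=1.0, w2=1.0):
    r = size // 2
    y, x = np.mgrid[-r:r+1, -r:r+1].astype(float)
    g = np.exp(-(x**2 + (eps*y)**2) / (2.0*sigma**2))  # Gaussian window
    u = c1*w1*x / sigma        # phase of the exp(i ...) factor
    v = c2*eps*w2*y / sigma    # phase of the exp(j ...) factor
    return (g*np.cos(u)*np.cos(v),   # scalar part
            g*np.sin(u)*np.cos(v),   # i part
            g*np.cos(u)*np.sin(v),   # j part
            g*np.sin(u)*np.sin(v))   # k part

q0, q1, q2, q3 = quat_gabor_mask()
```

Convolving an image with the four masks yields a quaternion-valued response from which the three phases (φ, θ, ψ) can be extracted at each pyramid level.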
Fig. 13.12 Procedure for the estimation of the disparity at one level
Fig. 13.13 The magnitude and phase images for the Rubik's sequence at a certain level j: (upper row) the approximation Φ and (next rows) the details Ψ₁ (horizontal), Ψ₂ (vertical), Ψ₃ (diagonal)
Fig. 13.14 (a) Confidence mask, (b) estimated optical flow
This procedure is continued down to the bottom level. In this way, the estimation is refined smoothly, and the well-defined optical flow vectors are passed from level to level, increasing the confidence of the vectors at the finest level. It is unavoidable that some artifacts survive at the final stage. A final refinement can be applied by imposing a magnitude threshold and deleting isolated small vectors. Figure 13.14b presents the computed optical flow for a pair of images of the Rubik's image sequence. Next, we present the computation of the optical flow of the Hamburg taxi sequence using the QWT. Figure 13.15a shows a pair of successive images; Fig. 13.15b shows the confidence matrices at the four levels, and Fig. 13.15c presents the fused horizontal and vertical disparities and the resulting optical flow at a high level of resolution. For comparison, we applied the method of Bernard [23], which uses real-valued discrete wavelets. In Fig. 13.16, we can see that our method yields better results. Based
Fig. 13.15 (a) (top row) A pair of successive images, (b) (middle row) confidence matrices at four levels, (c) (bottom) horizontal and vertical disparities and optical flow
Fig. 13.16 Optical flow computed using discrete real-valued wavelets. (Top) Optical flow at fourth level. (Bottom) Optical flow at third, second, and first levels
on the results, we can conclude that our procedure using a quaternionic wavelet pyramid and the phase concept for the parameter estimation works very well in both experiments. We believe that the computation of the optical flow using the quaternionic phase concept should be considered by researchers and practitioners as an effective alternative for multiresolution analysis.
13.6 Conclusion

In the first section, the chapter shows the application of feature detectors using Lie perceptrons in the 2D affine plane. The filters are not only applicable to optical flow; they may also be used for the detection of key points. Combinations of Lie operators yield complex structures for detecting more sophisticated visual geometry. The second part presents the preprocessing for neural computing using the quaternion Fourier transform. We applied this technique for the recognition of the first ten French numbers spoken by different speakers. In our approach, we expanded the signal representation to a higher-dimensional space, where we search for features along selected lines, allowing us to reduce the dimensionality of the feature vector considerably. The results also show that the method manages to separate the sounds of vowels and consonants, which is quite rare in the speech-processing literature. For the recognition, we used a neural-experts architecture. The third part of this chapter introduced the theory and practicalities of the QWT, so that the reader can apply it to a variety of problems making use of the quaternionic phase concept. We extended Mallat's multiresolution analysis using quaternion wavelets. These kernels are more efficient than the Haar quaternion wavelets. A big advantage of our approach is that it offers three phases at each level of the pyramid, which can be used for a powerful top-down parameter estimation. As an illustration, in the experimental part we apply the QWT to optical flow estimation. We believe that this chapter can be very useful for researchers and practitioners interested in understanding and applying the quaternion Fourier and wavelet transforms.
Chapter 14
Invariants Theory in Computer Vision and Omnidirectional Vision
14.1 Introduction

This chapter demonstrates that geometric algebra provides a simple mechanism for unifying current approaches in the computation and application of projective invariants using n-uncalibrated cameras. First, we describe Pascal's theorem as a type of projective invariant, and then the theorem is applied to computing camera-intrinsic parameters. The fundamental projective invariant, the cross-ratio, is studied in one, two, and three dimensions, using a single view and then n views. Next, by using the observations of two and three cameras, we apply projective invariants to the tasks of computing the view-center of a moving camera and of simplifying visually guided grasping. The chapter also presents a geometric approach for the computation of shape and motion using projective invariants within a purely geometric algebra framework [94, 95]. Different approaches for projective reconstruction have utilized projective depth [172, 181], projective invariants [45], and factorization methods [154, 186, 187] (factorization methods incorporate projective depth calculations). We compute projective depth using projective invariants, which depend on the use of the fundamental matrix or trifocal tensor. Using these projective depths, we are then able to initiate a projective reconstruction procedure to compute shape and motion. We also apply the algebra of incidence in the development of geometric inference rules to extend 3D reconstruction. The geometric procedures presented here contribute to the design and implementation of perception-action cycle (PAC) systems, as depicted in Fig. 14.1. In the last section, we present a robust technique for landmark identification using omnidirectional vision and the projective and permutation p²-invariants. Here, the use of the permutation p²-invariant makes the identification of projective invariants in the projective space more robust. This chapter is organized as follows: Sect. 
14.2 briefly explains conics and Pascal’s theorem. Section 14.3 demonstrates a method for computing intrinsic camera parameters using Pascal’s theorem. Section 14.4 studies in detail the projective invariant cross-ratio in 1D, 2D, and 3D, as well as the generation of projective invariants in 3D. Section 14.5 presents the theory of 3D projective invariants using n-uncalibrated cameras. Section 14.6 illustrates the use of projective invariants
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9 14, c Springer-Verlag London Limited 2010
Fig. 14.1 Abstraction of biological and artificial PAC systems
for a simplified task of visually guided robot-grasping. The problem of camera self-localization is discussed in Section 14.7. Computation of projective depth using projective invariants in terms of the trifocal tensor is given in Section 14.8. The treatment of projective reconstruction and the role of the algebra of incidence in completing the 3D shape is given in Section 14.9. Section 14.10 presents landmark identification using omnidirectional vision and projective invariants. Section 14.11 is devoted to the conclusions.
14.2 Conics and Pascal’s Theorem The role of conics and quadrics is well known in projective geometry [171]. This knowledge led to the solution of crucial problems in computer vision [140]. In the last decade, Kruppa’s equations, which rely on the conics concept, have been used to compute intrinsic camera parameters [130]. In the present work, we further explore the conics concept and use Pascal’s theorem to establish an equation system with clear geometric transparency. We then explain the role of conics and that of Pascal’s theorem in relation to fundamental projective invariants. Our work here is based primarily on that of Hestenes and Ziegler [95], that is, on an interpretation of linear algebra together with projective geometry within a Clifford algebra framework.
In order to use projective geometry in computer vision, we utilize homogeneous coordinate representations, which allow us to embed both the 3D Euclidean visual space in the 3D projective space P³ (or R⁴) and the 2D Euclidean space of the image plane in the 2D projective space P² (or R³). Using the geometric algebra framework, we select for P² the 3D Euclidean geometric algebra G₃,₀,₀, and for P³ the 4D geometric algebra G₁,₃,₀. The reader should refer to Chap. 9 for more details relating to the geometry of n cameras. Any geometric object of P³ will be mapped linearly and projectively to P² via a projective transformation. For example, the projective mapping of a quadric at infinity in the projective space P³ results in a conic in the projective plane P². Let us first consider a pencil of lines lying on the plane. Any pencil of lines may be well defined by the bivector addition of two of its lines: l = l_a + s l_b, with s ∈ R ∪ {−∞, +∞}. If two pencils of lines l and l′ = l′_a + s′ l′_b can be related one to one so that l = l′ for s = s′, we say that they are in projective correspondence. Using this idea, we show that the set of intersecting points of lines in projective correspondence builds a conic. The intersecting points x of the pencils of lines l and l′ satisfy, for s = s′, the constraints

x∧l = x∧l_a + s x∧l_b = 0,
x∧l′ = x∧l′_a + s x∧l′_b = 0.   (14.1)
The elimination of the scalar s yields a second-order geometric-product equation in x:

(x∧l_a)(x∧l′_b) − (x∧l_b)(x∧l′_a) = 0.   (14.2)
We can also derive the parameterized conic equation by simply computing the intersecting point x by means of the meet of the pencils of lines, as follows:

x = (l_a + s l_b) ∩ (l′_a + s l′_b) = l_a ∩ l′_a + s(l_a ∩ l′_b + l_b ∩ l′_a) + s² l_b ∩ l′_b.   (14.3)
Let us, for now, define the involved lines as a wedge of points, l_a = a∧b, l_b = a∧b′, l′_a = a′∧b, and l′_b = a′∧b′, such that l_a ∩ l′_a = b and l_b ∩ l′_b = b′ (see Fig. 14.2a). By substituting b″ = l_a ∩ l′_b + l_b ∩ l′_a = d + d′ into Eq. 14.3, we get

x = b + s b″ + s² b′,   (14.4)
which represents a nondegenerate conic for b∧b″∧b′ = b∧(d + d′)∧b′ ≠ 0. Now, using this equation, let us recompute the original pencils of lines. By defining l₁ = b″∧b′, l₂ = b′∧b, and l₃ = b∧b″, we can use Eq. 14.4 to compute the projective pencils of lines:

b∧x = s b∧b″ + s² b∧b′ = s(l₃ − s l₂),
b′∧x = b′∧b + s b′∧b″ = l₂ − s l₁.   (14.5)
Fig. 14.2 Pencils of lines related to a conic: (a) two projective pencils of lines used to generate the conic, (b) Pascal’s theorem
By considering the points a, a′, b, and b′ and some other point y lying on the conic depicted in Fig. 14.2a, and by using Eq. 14.1, we get the bracketed expression

[yab][ya′b′] − I [yab′][ya′b] = 0,   I = ([yab][ya′b′]) / ([yab′][ya′b]),   (14.6)
for some scalar I ≠ 0. This equation is well known and represents a projective invariant, a concept which has been used quite a lot in real applications of computer vision. Sections 14.4 and 14.5 show this invariant using brackets of points, bilinearities, and the trifocal tensor (see also Bayro-Corrochano [13] and Lasenby [116]). We can evaluate I of Eq. 14.6 in terms of some other point c′ to develop a conic equation that can be fully expressed in brackets:

[yab][ya′b′] − ([c′ab][c′a′b′] / ([c′ab′][c′a′b])) [yab′][ya′b] = 0,
[yab][ya′b′][ab′c′][a′bc′] − [yab′][ya′b][abc′][a′b′c′] = 0.   (14.7)
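The bracket form of the conic (second line of Eq. 14.7) can be checked numerically. The sketch below uses hypothetical point names and generates points on a conic as the image of the unit circle under an invertible map H (an assumed construction); the bracket [pqr] is the determinant of the 3×3 matrix of homogeneous coordinates:

```python
import numpy as np

# Points on a conic: image of the unit circle under an invertible map H.
H = np.array([[2., 1., 0.], [0.5, 1., 1.], [0., 0.3, 1.]])

def conic_pt(t):
    return H @ np.array([np.cos(t), np.sin(t), 1.0])

def br(p, q, r):                      # the bracket [pqr]
    return np.linalg.det(np.column_stack([p, q, r]))

a, ap, b, bp, cp, y = (conic_pt(t) for t in (0.3, 1.1, 2.0, 2.9, 4.2, 5.5))

# Second line of Eq. 14.7: vanishes for any y on the conic through a,a',b,b',c'.
lhs = br(y, a, b)*br(y, ap, bp)*br(a, bp, cp)*br(ap, b, cp) \
    - br(y, a, bp)*br(y, ap, b)*br(a, b, cp)*br(ap, bp, cp)
print(abs(lhs))  # ~0
```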
Again, the resulting equation is well known; it tells us that any conic is uniquely determined by five points in general position: a, a′, b, b′, and c′. Now, considering Fig. 14.2b, we can identify three collinear intersecting points α₁, α₂, and α₃. By using the collinearity constraint and the pencils of lines in projective correspondence, we can now write a very useful equation:

((a′∧b) ∩ (c′∧c)) ∧ ((a′∧a) ∩ (b′∧c)) ∧ ((c′∧a) ∩ (b′∧b)) = 0,   (14.8)

where the three meets are the points α₁, α₂, and α₃, respectively. This expression is a geometric formulation, in brackets, of Pascal's theorem, which says that the three intersecting points of the lines which connect opposite
vertices of a hexagon circumscribed by a conic are collinear. Equation (14.8) is used in the following section for computing intrinsic camera parameters.
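Pascal's theorem (Eq. 14.8) is easy to illustrate numerically. In the sketch below (an assumed construction, not the book's code), six points on a conic are produced as the image of the unit circle under an invertible map H; the line through p and q is the cross product p × q, and the meet of two lines is again a cross product:

```python
import numpy as np

H = np.array([[1., 0.4, 0.2], [0.1, 1.2, 0.5], [0.3, 0., 1.]])

def conic_pt(t):
    return H @ np.array([np.cos(t), np.sin(t), 1.0])

def meet(p1, q1, p2, q2):   # ((p1 ^ q1) meet (p2 ^ q2)) in the plane
    m = np.cross(np.cross(p1, q1), np.cross(p2, q2))
    return m / np.linalg.norm(m)   # normalize for a stable collinearity test

a, b, c, ap, bp, cp = (conic_pt(t) for t in (0.2, 1.0, 1.7, 2.6, 3.9, 5.1))

# The three meets of Eq. 14.8:
alpha1 = meet(ap, b, cp, c)
alpha2 = meet(ap, a, bp, c)
alpha3 = meet(cp, a, bp, b)
coll = np.linalg.det(np.column_stack([alpha1, alpha2, alpha3]))
print(abs(coll))  # ~0: alpha1, alpha2, alpha3 are collinear
```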
14.3 Computing Intrinsic Camera Parameters

This section presents a technique within the geometric algebra framework for computing intrinsic camera parameters. In the previous section, it was shown that Eq. 14.7 can be reformulated to express the constraint of Eq. 14.8, known as Pascal's theorem. Since Pascal's equation fulfills a property of any conic, we should also be able to use it to compute intrinsic camera parameters. Let us consider three intersecting points which are collinear and fulfill Eq. 14.8. Figure 14.3 shows the first camera image, where the projected rotated points of the conic at infinity, RᵀA, RᵀB, RᵀA′, RᵀB′, and RᵀC′, are
Fig. 14.3 A demonstration of Pascal's theorem on the conic at infinity and on the images of n-uncalibrated cameras
a = K[R|0]RᵀA = KA, b = K[R|0]RᵀB = KB, a′ = K[R|0]RᵀA′ = KA′, b′ = K[R|0]RᵀB′ = KB′, and c′ = K[R|0]C′ = KC′. The point c = KKᵀl_c depends upon the camera-intrinsic parameters and upon the line l_c tangent to the conic, computed in terms of the epipole e = [p₁, p₂, p₃]ᵀ and a point lying at infinity upon the line of the first camera: l_c = [p₁, p₂, p₃]ᵀ × [1, λ, 0]ᵀ. Now, using this expression for l_c, we can simplify Eq. 14.8 to obtain the equations for the α's in terms of brackets,
([a′bc′]c − [a′bc]c′) ∧ ([a′ab′]c − [a′ac]b′) ∧ ([c′ab′]b − [c′ab]b′) = 0   (14.9)
⟺
([KA′ KB KC′](KKᵀl_c) − [KA′ KB KKᵀl_c](KC′)) ∧ ([KA′ KA KB′](KKᵀl_c) − [KA′ KA KKᵀl_c](KB′)) ∧ ([KC′ KA KB′](KB) − [KC′ KA KB](KB′)) = 0,
i.e., det(K)³ K(([A′BC′](Kᵀl_c) − [A′B(Kᵀl_c)]C′) ∧ ([A′AB′](Kᵀl_c) − [A′A(Kᵀl_c)]B′) ∧ ([C′AB′]B − [C′AB]B′)) = 0   (14.10)
⟺
([A′BC′](Kᵀl_c) − [A′B(Kᵀl_c)]C′) ∧ ([A′AB′](Kᵀl_c) − [A′A(Kᵀl_c)]B′) ∧ ([C′AB′]B − [C′AB]B′) = 0   (14.11)
⟺
α₁ ∧ α₂ ∧ α₃ = 0, with
α₁ = [A′BC′](Kᵀl_c) − [A′B(Kᵀl_c)]C′,
α₂ = [A′AB′](Kᵀl_c) − [A′A(Kᵀl_c)]B′,
α₃ = [C′AB′]B − [C′AB]B′.   (14.12)
Note that in Eq. 14.11 the scalars det(K)³ and K are cancelled out, thereby simplifying the expression for the α's. The computation of the intrinsic parameters should take into account two possible situations: either the intrinsic parameters remain stationary while the camera is in motion, or they vary. By using Eq. 14.12, we are able to develop, after one camera movement, a set of eight quadratic equations, from which we can compute four intrinsic camera parameters (see [17] for the technical details of the computer algorithm).
14.4 Projective Invariants

In this section, we use the framework established in Chap. 9 to show how standard invariants can be expressed both elegantly and concisely using geometric algebra. We begin by looking at algebraic quantities that are invariant under projective transformations, arriving at these invariants using a method which can be easily generalized from one dimension to two and three dimensions.
14.4.1 The 1D Cross-Ratio

The fundamental projective invariant of points on a line is the so-called cross-ratio, ρ, defined as

ρ = (AC/BC)(BD/AD) = ((t₃ − t₁)(t₄ − t₂)) / ((t₃ − t₂)(t₄ − t₁)),

where t₁ = |PA|, t₂ = |PB|, t₃ = |PC|, and t₄ = |PD|. It is fairly easy to show that, for the projection through O of the collinear points A, B, C, and D onto any line, ρ remains constant. For the 1D case, any point q on the line L can be written as q = t γ₁ relative to P, where γ₁ is a unit vector in the direction of L. We can then move up a dimension to a 2D space, with basis vectors (γ₁, γ₂), which we will call R² and in which q is represented by the following vector Q:

Q = T γ₁ + S γ₂.   (14.13)

Note that, as before, q is associated with a bivector, as follows:

q = (Q∧γ₂)/(Q·γ₂) = (T/S) γ₁ = t γ₁.   (14.14)
When a point on line L is projected onto another line L′, the distances t and t′ are related by a projective transformation of the form

t′ = (αt + β) / (α̃t + β̃).   (14.15)
This nonlinear transformation in E¹ can be made into a linear transformation in R² by defining the linear function f₁, which maps vectors onto vectors in R²:

f₁(γ₁) = α γ₁ + α̃ γ₂,
f₁(γ₂) = β γ₁ + β̃ γ₂.

Consider two vectors X₁ and X₂ in R². Now form the bivector

S₁ = X₁∧X₂ = λ₁ I₂,

where I₂ = γ₁γ₂ is the pseudoscalar for R². We can now look at how S₁ transforms under f₁:

S₁′ = X₁′∧X₂′ = f₁(X₁∧X₂) = (det f₁)(X₁∧X₂).   (14.16)
This last step follows because a linear function must map a pseudoscalar onto a multiple of itself, the multiple being the determinant of the function. Suppose that we now select four points of the line L, whose corresponding vectors in R² are
{Xᵢ}, i = 1, …, 4, and consider the ratio R₁ of two wedge products:

R₁ = (X₁∧X₂) / (X₃∧X₄).   (14.17)

Then, under f₁, R₁ → R₁′, where

R₁′ = (X₁′∧X₂′) / (X₃′∧X₄′) = ((det f₁) X₁∧X₂) / ((det f₁) X₃∧X₄).   (14.18)
R₁ is, therefore, invariant under f₁. However, we want to express our invariants in terms of distances on the 1D line. To do this, we must consider how the bivector S₁ in R² projects down to E¹:

X₁∧X₂ = (T₁γ₁ + S₁γ₂)∧(T₂γ₁ + S₂γ₂) = (T₁S₂ − T₂S₁)γ₁γ₂ = S₁S₂(T₁/S₁ − T₂/S₂)I₂ = S₁S₂(t₁ − t₂)I₂.   (14.19)
In order to form a projective invariant that is independent of the choice of the arbitrary scalars Sᵢ, we must now consider ratios of the bivectors Xᵢ∧Xⱼ (so that det f₁ cancels), and then multiples of these ratios (so that the Sᵢ's cancel). More precisely, consider the following expression:

Inv1 = ([(X₃∧X₁)I₂⁻¹][(X₄∧X₂)I₂⁻¹]) / ([(X₄∧X₁)I₂⁻¹][(X₃∧X₂)I₂⁻¹]).   (14.20)

Then, in terms of distances along the lines, under the projective transformation f₁, Inv1 goes to Inv1′, where

Inv1′ = (S₃S₁(t₃ − t₁) S₄S₂(t₄ − t₂)) / (S₄S₁(t₄ − t₁) S₃S₂(t₃ − t₂)) = ((t₃ − t₁)(t₄ − t₂)) / ((t₄ − t₁)(t₃ − t₂)),   (14.21)
which is independent of the Si ’s and is indeed the 1D classical projective invariant, the cross-ratio. Deriving the cross-ratio in this way allows us to easily generalize it to form invariants in higher dimensions.
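The invariance of the cross-ratio under the map of Eq. 14.15 is easy to verify numerically; a small sketch with arbitrarily chosen parameter values:

```python
# Cross-ratio of four collinear points, as in Eq. 14.21.
def cross_ratio(t1, t2, t3, t4):
    return ((t3 - t1) * (t4 - t2)) / ((t4 - t1) * (t3 - t2))

# The Moebius-type projective map of Eq. 14.15 (parameter values arbitrary).
def proj(t, alpha=2.0, beta=-1.0, alpha_t=0.5, beta_t=3.0):
    return (alpha * t + beta) / (alpha_t * t + beta_t)

ts = (0.0, 1.0, 2.5, 4.0)
before = cross_ratio(*ts)            # = 1.25
after = cross_ratio(*map(proj, ts))  # same value after the projective map
print(before, after)
```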
14.4.2 2D Generalization of the Cross-Ratio When we consider points in a plane, we once again move up to a space with one higher dimension, which we shall call R3 . Let a point P in the plane M be described by the vector x in E 2 , where x D x 1 C y 2 . In R3 , this point will be represented by X D X1 C Y 2 C Z3 , where x D X=Z and y D Y =Z. As described in Chap. 9, we can define a general projective transformation via a linear function f 2
by mapping vectors to vectors in R³, such that

f₂(γ₁) = α₁γ₁ + α₂γ₂ + α̃γ₃,
f₂(γ₂) = β₁γ₁ + β₂γ₂ + β̃γ₃,
f₂(γ₃) = δ₁γ₁ + δ₂γ₂ + δ̃γ₃.   (14.22)

Now, consider three vectors (representing non-collinear points) Xᵢ, i = 1, 2, 3, in R³, and form the trivector

S₂ = X₁∧X₂∧X₃ = λ₂ I₃,   (14.23)

where I₃ = γ₁γ₂γ₃ is the pseudoscalar for R³. As before, under the projective transformation given by f₂, S₂ transforms to S₂′, where

S₂′ = (det f₂) S₂.   (14.24)
Therefore, the ratio of any two trivectors is invariant under f₂. To project down into E², assuming that Xᵢγ₃ = Zᵢ(1 + xᵢ) under the projective split, we then write

S₂I₃⁻¹ = ⟨X₁X₂X₃I₃⁻¹⟩ = ⟨X₁γ₃γ₃X₂X₃γ₃γ₃I₃⁻¹⟩ = Z₁Z₂Z₃⟨(1 + x₁)(1 − x₂)(1 + x₃)γ₃I₃⁻¹⟩,   (14.25)

where the xᵢ represent vectors in E². We can only get a scalar term from the expression within the brackets by calculating the product of a vector, two spatial vectors, and I₃⁻¹, that is,

S₂I₃⁻¹ = Z₁Z₂Z₃⟨(x₁x₃ − x₁x₂ − x₂x₃)γ₃I₃⁻¹⟩ = Z₁Z₂Z₃{(x₂ − x₁)∧(x₃ − x₁)}I₂⁻¹.   (14.26)
It is, therefore, clear that we must use multiples of the ratios in our calculations, so that the arbitrary scalars Zᵢ cancel. In the case of four points in a plane, there are only four possible combinations of ZᵢZⱼZₖ, and it is not possible to cancel all the Z's by multiplying two ratios of the form Xᵢ∧Xⱼ∧Xₖ together. For five coplanar points {Xᵢ}, i = 1, …, 5, however, there are several ways of achieving the desired cancellation. For example,

Inv2 = ([(X₅∧X₄∧X₃)I₃⁻¹][(X₅∧X₂∧X₁)I₃⁻¹]) / ([(X₅∧X₁∧X₃)I₃⁻¹][(X₅∧X₂∧X₄)I₃⁻¹]).
According to Eq. 14.26, we can interpret this ratio in E² as
Inv2 = ([(x₅ − x₄)∧(x₅ − x₃)I₂⁻¹][(x₅ − x₂)∧(x₅ − x₁)I₂⁻¹]) / ([(x₅ − x₁)∧(x₅ − x₃)I₂⁻¹][(x₅ − x₂)∧(x₅ − x₄)I₂⁻¹]) = (A₅₄₃ A₅₂₁) / (A₅₁₃ A₅₂₄),   (14.27)

where ½Aᵢⱼₖ is the area of the triangle defined by the three vertices xᵢ, xⱼ, xₖ. This invariant is regarded as the 2D generalization of the 1D cross-ratio.
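A numerical sketch of Eq. 14.27 (the five points and the map H below are arbitrary choices): the ratio of signed triangle areas is unchanged by a plane projective transformation, because every point occurs equally often in numerator and denominator.

```python
import numpy as np

def area2(p, q, r):   # twice the signed area of triangle (p, q, r)
    return np.linalg.det(np.column_stack([q - p, r - p]))

def inv2(x):          # Inv2 = (A543 * A521) / (A513 * A524), Eq. 14.27
    return (area2(x[5], x[4], x[3]) * area2(x[5], x[2], x[1])) / \
           (area2(x[5], x[1], x[3]) * area2(x[5], x[2], x[4]))

def hmap(H, p):       # apply a 3x3 projective map to an inhomogeneous point
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

x = {1: np.array([0., 0.]), 2: np.array([1., 0.]), 3: np.array([0., 1.]),
     4: np.array([1., 1.]), 5: np.array([2., 3.])}
H = np.array([[1., .2, .1], [-.1, 1., .3], [.05, -.02, 1.]])  # non-affine map
xp = {i: hmap(H, p) for i, p in x.items()}
print(inv2(x), inv2(xp))  # both 3.0 up to rounding
```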
14.4.3 3D Generalization of the Cross-Ratio

For general points in E³, we have seen that we move up one dimension to compute in the 4D space R⁴. In this case, the point x = x σ₁ + y σ₂ + z σ₃ in E³ is written as X = X γ₁ + Y γ₂ + Z γ₃ + W γ₄, where x = X/W, y = Y/W, z = Z/W. As before, a nonlinear projective transformation in E³ becomes a linear transformation, described by the linear function f₃ in R⁴. Let us consider vectors {Xᵢ}, i = 1, …, 4, in R⁴, and form the 4-vector

S₃ = X₁∧X₂∧X₃∧X₄ = λ₃ I₄,   (14.28)

where I₄ = γ₁γ₂γ₃γ₄ is the pseudoscalar for R⁴. As before, S₃ transforms to S₃′ under f₃:

S₃′ = X₁′∧X₂′∧X₃′∧X₄′ = (det f₃) S₃.   (14.29)
The ratio of any two 4-vectors is, therefore, invariant under f₃, and we must take multiples of these ratios to ensure that the arbitrary scale factors Wᵢ cancel. With five general points, there are five possibilities for forming the combinations WᵢWⱼWₖWₗ, and it is a simple matter to show that one cannot form multiples of ratios such that the W factors cancel. It is, however, possible to do this if we have six points. One example of such an invariant is

Inv3 = ([(X₁∧X₂∧X₃∧X₄)I₄⁻¹][(X₄∧X₅∧X₂∧X₆)I₄⁻¹]) / ([(X₁∧X₂∧X₄∧X₅)I₄⁻¹][(X₃∧X₄∧X₂∧X₆)I₄⁻¹]).   (14.30)
Using the arguments of the previous sections, we can now write

(X₁∧X₂∧X₃∧X₄)I₄⁻¹ = W₁W₂W₃W₄{(x₂ − x₁)∧(x₃ − x₁)∧(x₄ − x₁)}I₃⁻¹.   (14.31)
We can, therefore, see that the invariant Inv3 is the 3D equivalent of the 1D cross-ratio and consists of ratios of volumes,
Inv3 = (V₁₂₃₄ V₄₅₂₆) / (V₁₂₄₅ V₃₄₂₆),   (14.32)
where Vᵢⱼₖₗ is the volume of the solid formed by the four vertices xᵢ, xⱼ, xₖ, xₗ. All of these invariants are classically well known, but we have outlined here a general process, straightforward and simple, for generating projective invariants in any dimension.
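A numerical sketch of Eq. 14.32 (the six points and the 4×4 map T below are arbitrary choices): the ratio of tetrahedron volumes survives a 3D projective transformation, since every point occurs equally often in numerator and denominator.

```python
import numpy as np

def vol(p, q, r, s):  # six times the signed volume of tetrahedron (p,q,r,s)
    return np.linalg.det(np.column_stack([q - p, r - p, s - p]))

def inv3(x):          # Inv3 = (V1234 * V4526) / (V1245 * V3426), Eq. 14.32
    return (vol(x[1], x[2], x[3], x[4]) * vol(x[4], x[5], x[2], x[6])) / \
           (vol(x[1], x[2], x[4], x[5]) * vol(x[3], x[4], x[2], x[6]))

def proj3(T, p):      # apply a 4x4 projective map to an inhomogeneous point
    q = T @ np.append(p, 1.0)
    return q[:3] / q[3]

x = {1: np.array([0., 0., 0.]), 2: np.array([1., 0., 0.]),
     3: np.array([0., 1., 0.]), 4: np.array([0., 0., 1.]),
     5: np.array([1., 1., 1.]), 6: np.array([1., 2., 3.])}
T = np.eye(4); T[3, :3] = [0.05, -0.03, 0.02]; T[0, 1] = 0.4  # non-affine map
xp = {i: proj3(T, p) for i, p in x.items()}
print(inv3(x), inv3(xp))  # both 0.2 up to rounding
```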
14.4.4 Generation of 3D Projective Invariants

Any 3D point may be written in G₁,₃,₀ as Xₙ = Xₙγ₁ + Yₙγ₂ + Zₙγ₃ + Wₙγ₄, and its projected image point written in G₃,₀,₀ as xₙ = xₙσ₁ + yₙσ₂ + zₙσ₃, where xₙ = Xₙ/Wₙ, yₙ = Yₙ/Wₙ, and zₙ = Zₙ/Wₙ. The 3D projective basis consists of four base points and a fifth point for normalization:

X₁ = (1, 0, 0, 0)ᵀ,  X₂ = (0, 1, 0, 0)ᵀ,  X₃ = (0, 0, 1, 0)ᵀ,  X₄ = (0, 0, 0, 1)ᵀ,  X₅ = (1, 1, 1, 1)ᵀ.
Any other point Xᵢ ∈ ⟨G₁,₃,₀⟩₁ can then be expressed by

Xᵢ = XᵢX₁ + YᵢX₂ + ZᵢX₃ + WᵢX₄,
(14.33)
with (Xᵢ, Yᵢ, Zᵢ, Wᵢ) as homogeneous projective coordinates of Xᵢ in the basis {X₁, X₂, X₃, X₄, X₅}. The first four base points, projected to the projective plane, can be used as a projective basis of ⟨G₃,₀,₀⟩₁ if no three of them are collinear:

x₁ = (1, 0, 0)ᵀ,  x₂ = (0, 1, 0)ᵀ,  x₃ = (0, 0, 1)ᵀ,  x₄ = (1, 1, 1)ᵀ.
Using this basis, we can express, in bracket notation, the 3D projective coordinates Xₙ, Yₙ, Zₙ of any 3D point, as well as its 2D projected coordinates xₙ, yₙ:

Xₙ/Wₙ = ([234n][1235]) / ([2345][123n]),  Yₙ/Wₙ = ([134n][1235]) / ([1345][123n]),  Zₙ/Wₙ = ([124n][1235]) / ([1245][123n]),   (14.34)

xₙ/wₙ = ([23n][124]) / ([234][12n]),  yₙ/wₙ = ([13n][124]) / ([134][12n]).   (14.35)
These equations show projective invariant relationships, and they can be used, for example, to compute the position of a moving camera (see Sect. 14.7).
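Equation 14.34 can be checked numerically: with brackets computed as 4×4 determinants, the ratio below is unchanged by an arbitrary invertible map T, because each point occurs equally often in numerator and denominator and the factors det(T) cancel (a sketch; the test point and the map T are arbitrary choices):

```python
import numpy as np

def br4(*cols):   # the bracket [ijkl] as a 4x4 determinant
    return np.linalg.det(np.column_stack(cols))

# The projective basis of Sect. 14.4.4 and a test point X_n with X_n/W_n = 0.5.
X = {1: np.array([1., 0., 0., 0.]), 2: np.array([0., 1., 0., 0.]),
     3: np.array([0., 0., 1., 0.]), 4: np.array([0., 0., 0., 1.]),
     5: np.array([1., 1., 1., 1.])}
Xn = np.array([2., -1., 3., 4.])

T = np.array([[1., .2, 0., 0.], [0., 1., .1, 0.],
              [.3, 0., 1., 0.], [0., 0., -.2, 1.]])  # arbitrary invertible map
P = {i: T @ v for i, v in X.items()}
pn = T @ Xn

# X_n/W_n = [234n][1235] / ([2345][123n]), Eq. 14.34.
ratio = (br4(P[2], P[3], P[4], pn) * br4(P[1], P[2], P[3], P[5])) / \
        (br4(P[2], P[3], P[4], P[5]) * br4(P[1], P[2], P[3], pn))
print(ratio)  # 0.5 = X_n/W_n
```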
The projective structure and its projection on the 2D image can be expressed according to the following geometric constraint, as presented by Carlsson [33]:

M (X₀, Y₀, Z₀, W₀)ᵀ = 0,   (14.36)

where each observed point i = 5, 6, 7, … contributes to M the pair of rows

( 0,  −wᵢYᵢ,  yᵢZᵢ,  (yᵢ − wᵢ)Wᵢ ),
( wᵢXᵢ,  0,  −xᵢZᵢ,  (xᵢ − wᵢ)Wᵢ ),

and X₀, Y₀, Z₀, W₀ are the coordinates of the view-center point. Since the matrix is of rank < 4, any determinant of four of its rows vanishes. Considering (X₅, Y₅, Z₅, W₅) = (1, 1, 1, 1) as a normalizing point, and taking the determinant formed by the first four rows of Eq. 14.36, we get a geometric-constraint equation involving six points (see Quan [159]):

(w₅y₆ − x₅y₆)X₆Z₆ + (x₅y₆ − x₅w₆)X₆W₆ + (x₅w₆ − y₅w₆)X₆Y₆ + (y₅x₆ − w₅x₆)Y₆Z₆ + (y₅w₆ − y₅x₆)Y₆W₆ + (w₅x₆ − w₅y₆)Z₆W₆ = 0.   (14.37)
Carlsson [33] showed that Eq. 14.37 can also be derived using Plücker–Grassmann relations, computed as the Laplace expansion of the 4×8 rectangular matrix involving the same six points as above:

[X₀, X₁, X₂, X₃, X₄, X₅, X₆, X₇] = [X₀X₁X₂X₃][X₄X₅X₆X₇] − [X₀X₁X₂X₄][X₃X₅X₆X₇] + [X₀X₁X₂X₅][X₃X₄X₆X₇] − [X₀X₁X₂X₆][X₃X₄X₅X₇] + [X₀X₁X₂X₇][X₃X₄X₅X₆] = 0.   (14.38)

By using four functions like Eq. 14.38 in terms of the permutations of six points as indicated by their subindices in the table below,

X₀  X₁  X₂  X₃  X₄  X₅  X₆  X₇
0   1   5   1   2   3   4   5
0   2   6   1   2   3   4   6
0   3   5   1   2   3   4   5
0   4   6   1   2   3   4   6
we get an expression in which bracketed terms having two identical points vanish:

[0152][1345] − [0153][1245] + [0154][1235] = 0,
[0216][2346] − [0236][1246] + [0246][1236] = 0,
[0315][2345] + [0325][1345] + [0345][1235] = 0,
[0416][2346] + [0426][1346] − [0436][1246] = 0.   (14.39)
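Equation 14.38 is an instance of the quadratic Plücker–Grassmann relation, which holds for arbitrary vectors in R⁴ and can be verified numerically (a sketch; the vectors are random):

```python
import numpy as np

# Eight arbitrary vectors X0..X7 in R^4; the bracket is a 4x4 determinant.
rng = np.random.default_rng(3)
X = rng.standard_normal((8, 4))

def br(i, j, k, l):
    return np.linalg.det(X[[i, j, k, l]].T)

# The alternating sum of Eq. 14.38 vanishes identically.
s = (br(0, 1, 2, 3)*br(4, 5, 6, 7) - br(0, 1, 2, 4)*br(3, 5, 6, 7)
     + br(0, 1, 2, 5)*br(3, 4, 6, 7) - br(0, 1, 2, 6)*br(3, 4, 5, 7)
     + br(0, 1, 2, 7)*br(3, 4, 5, 6))
print(abs(s))  # ~0 for any choice of the eight vectors
```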
It is easy to prove that the bracketed terms of image points can be written in the form [xᵢxⱼxₖ] = wᵢwⱼwₖ[K][X₀XᵢXⱼXₖ], where [K] is the matrix of the intrinsic parameters [140]. Now, if we substitute in Eq. 14.39 all the brackets which contain the point X₀ with image points, and if we then organize all the products of brackets as a 4×4 matrix, we end up with the singular matrix

( 0,            [125][1345],  [135][1245],  [145][1235] )
( [216][2346],  0,            [236][1246],  [246][1236] )
( [315][2345],  [325][1345],  0,            [345][1235] )
( [416][2346],  [426][1346],  [436][1246],  0           ).   (14.40)
Here, the scalars wᵢwⱼwₖ[K] of each matrix entry cancel each other. Now, after taking the determinant of this matrix and rearranging the terms conveniently, we obtain the following useful bracket polynomial:

[125][346][1236][1246][1345][2345] − [126][345][1235][1245][1346][2346]
+ [135][246][1236][1245][1346][2345] − [136][245][1235][1246][1345][2346]
+ [145][236][1235][1246][1346][2345] − [146][235][1236][1245][1345][2346] = 0.   (14.41)

Surprisingly, this bracketed expression is exactly the shape constraint for six points given by Quan [159],

i₁I₁ + i₂I₂ + i₃I₃ + i₄I₄ + i₅I₅ + i₆I₆ = 0,   (14.42)
where i₁ = [125][346], i₂ = −[126][345], …, i₆ = −[146][235] and I₁ = [1236][1246][1345][2345], I₂ = [1235][1245][1346][2346], …, I₆ = [1236][1245][1345][2346]
Fig. 14.4 The action of grasping a box
are the relative linear invariants in P² and P³, respectively. Using the shape constraint, we are now ready to generate invariants for different purposes. Let us illustrate this with an example (see Fig. 14.4). According to the figure, a configuration of six points indicates whether or not the end-effector is grasping properly. To test this situation, we can use an invariant generated from the constraint of Eq. 14.41. In this particular situation, we recognize two planes, thus [1235] = 0 and [2346] = 0. By substituting the six points into Eq. 14.41, we can cancel out some brackets, thereby reducing the equation to

[125][346][1236][1246][1345][2345] − [135][246][1236][1245][1346][2345] = 0,   (14.43)
[125][346][1246][1345] − [135][246][1245][1346] = 0,   (14.44)
or

Inv = ([(X₁∧X₂∧X₄∧X₅)I₄⁻¹][(X₁∧X₃∧X₄∧X₆)I₄⁻¹]) / ([(X₁∧X₂∧X₄∧X₆)I₄⁻¹][(X₁∧X₃∧X₄∧X₅)I₄⁻¹])
= ([(x₁∧x₂∧x₅)I₃⁻¹][(x₃∧x₄∧x₆)I₃⁻¹]) / ([(x₁∧x₃∧x₅)I₃⁻¹][(x₂∧x₄∧x₆)I₃⁻¹]).   (14.45)

In this equation, any bracket of P³ after the projective mapping becomes

(X₁∧X₂∧X₄∧X₅)I₄⁻¹ = W₁W₂W₄W₅{(x₂ − x₁)∧(x₄ − x₁)∧(x₅ − x₁)}I₃⁻¹.   (14.46)
The constraint (14.41) ensures that the WᵢWⱼWₖWₗ constants are always cancelled. Furthermore, we can interpret the invariant Inv, the equivalent of the 1D cross-ratio, as a ratio of volumes in P³ and as a ratio of triangle areas in P²:

Inv = (V₁₂₄₅ V₁₃₄₆) / (V₁₂₄₆ V₁₃₄₅) = (A₁₂₅ A₃₄₆) / (A₁₃₅ A₂₄₆).   (14.47)
In other words, we can interpret this invariant in P³ as a relation of 4-vectors, or volumes, built from points lying on a quadric. After projection into P², it relates areas of triangles encircled by conics. Utilizing this invariant, we can check whether or not the grasper is holding the box correctly. Note that by using the observed image points of the 3D points, we can compute this invariant and see if the relation of the triangle areas corresponds well with the parameters for firm grasping. In other words, if the points of the grasper are located some distance away from the object, the invariant will have a different value than if the points X₁, X₅ of the grasper are nearer to the points X₂, X₃ of the object.
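The area-ratio form of Eq. 14.47 can be evaluated directly from observed image points, and it does not depend on the camera: any projective map of the image plane leaves it unchanged, since each of the six points appears exactly once in the numerator and once in the denominator. A sketch (the six points and the map H are arbitrary choices):

```python
import numpy as np

def area2(p, q, r):   # twice the signed area of triangle (p, q, r)
    return np.linalg.det(np.column_stack([q - p, r - p]))

def grasp_inv(x):     # Inv = (A125 * A346) / (A135 * A246), Eq. 14.47
    return (area2(x[1], x[2], x[5]) * area2(x[3], x[4], x[6])) / \
           (area2(x[1], x[3], x[5]) * area2(x[2], x[4], x[6]))

def hmap(H, p):
    q = H @ np.array([p[0], p[1], 1.0])
    return q[:2] / q[2]

x = {1: np.array([0., 0.]), 2: np.array([2., 0.]), 3: np.array([0., 2.]),
     4: np.array([2., 2.]), 5: np.array([1., 3.]), 6: np.array([3., 1.])}
H = np.array([[1., .1, .2], [-.1, 1., .3], [.02, .01, 1.]])  # non-affine map
xp = {i: hmap(H, p) for i, p in x.items()}
print(grasp_inv(x), grasp_inv(xp))  # both -3.0 up to rounding
```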
14.5 3D Projective Invariants from Multiple Views

In the previous section, the projective invariant was explained within the context of homogeneous projective coordinates derived from a single image. Since, in general, objects in 3D space are observed from different positions, it would be convenient to be able to extend the projective invariant in terms of the linear constraints imposed by the geometry of two, three, or more cameras.
14.5.1 Projective Invariants Using Two Views

Let us consider a 3D projective invariant derived from Eq. 14.41:

Inv3 = ([X₁X₂X₃X₄][X₄X₅X₂X₆]) / ([X₁X₂X₄X₅][X₃X₄X₂X₆]).   (14.48)
The computation of the bracket

[1234] = (X₁∧X₂∧X₃∧X₄)I₄⁻¹ = ((X₁∧X₂)∧(X₃∧X₄))I₄⁻¹

of four points from R⁴, mapped onto camera images with optical centers A₀ and B₀, suggests the use of a binocular model based on incidence algebra techniques, as discussed in Chap. 9. Defining the lines
14 Invariants Theory in Computer Vision and Omnidirectional Vision

$$L_{12} = X_1 \wedge X_2 = (A_0 \wedge L^A_{12}) \cap (B_0 \wedge L^B_{12}),$$
$$L_{34} = X_3 \wedge X_4 = (A_0 \wedge L^A_{34}) \cap (B_0 \wedge L^B_{34}),$$
where the lines $L^A_{ij}$ and $L^B_{ij}$ are mappings of the line $L_{ij}$ onto the two image planes, results in the expression

$$[1234] = [A_0\, B_0\, A'_{1234}\, B'_{1234}]. \qquad (14.49)$$
Here, $A'_{1234}$ and $B'_{1234}$ are the points of intersection of the lines $L^A_{12}$ and $L^A_{34}$, or $L^B_{12}$ and $L^B_{34}$, respectively. These points, lying on the image planes, can be expanded using the mappings of three points $X_i$, say, $X_1, X_2, X_3$, to the image planes. In other words, considering $A_j$ and $B_j$, $j = 1, 2, 3$, as projective bases, we can expand the vectors

$$A'_{1234} = \alpha_{1234,1} A_1 + \alpha_{1234,2} A_2 + \alpha_{1234,3} A_3,$$
$$B'_{1234} = \beta_{1234,1} B_1 + \beta_{1234,2} B_2 + \beta_{1234,3} B_3.$$

Then, using Eq. 9.97 from Chap. 9, we can express

$$[1234] = \sum_{i,j=1}^{3} \tilde{F}_{ij}\, \alpha_{1234,i}\, \beta_{1234,j} = \boldsymbol{\alpha}^T_{1234} \tilde{F} \boldsymbol{\beta}_{1234}, \qquad (14.50)$$
where $\tilde{F}$ is the fundamental matrix given in terms of the projective basis embedded in $R^4$, and $\boldsymbol{\alpha}_{1234} = (\alpha_{1234,1}, \alpha_{1234,2}, \alpha_{1234,3})$ and $\boldsymbol{\beta}_{1234} = (\beta_{1234,1}, \beta_{1234,2}, \beta_{1234,3})$ are corresponding points. The ratio

$$\mathrm{Inv}_{3F} = \frac{(\boldsymbol{\alpha}^T_{1234} \tilde{F} \boldsymbol{\beta}_{1234})(\boldsymbol{\alpha}^T_{4526} \tilde{F} \boldsymbol{\beta}_{4526})}{(\boldsymbol{\alpha}^T_{1245} \tilde{F} \boldsymbol{\beta}_{1245})(\boldsymbol{\alpha}^T_{3426} \tilde{F} \boldsymbol{\beta}_{3426})} \qquad (14.51)$$
is therefore seen to be an invariant using the views of two cameras [32]. Note that Eq. 14.51 is invariant for whichever values of the 4 components of the vectors $A_i, B_i, X_i$, etc., are chosen. If we attempt to express the invariant of Eq. 14.51 in terms of what we actually observe, we may be tempted to express it in terms of the homogeneous Cartesian image coordinates $a'_i, b'_i$ and the fundamental matrix $F$ calculated from these image coordinates. In order to avoid this, it is necessary to transfer the computations of Eq. 14.51, carried out in $R^4$, to $R^3$. Thus, if we define $\tilde{F}$ by

$$\tilde{F}_{kl} = (A_k \cdot \gamma_4)(B_l \cdot \gamma_4)\, F_{kl} \qquad (14.52)$$

and consider the relationships $\alpha_{ij} = \frac{A'_i \cdot \gamma_4}{A_j \cdot \gamma_4}\, a_{ij}$ and $\beta_{ij} = \frac{B'_i \cdot \gamma_4}{B_j \cdot \gamma_4}\, b_{ij}$, we can claim

$$\alpha_{ik}\, \tilde{F}_{kl}\, \beta_{il} = (A'_i \cdot \gamma_4)(B'_i \cdot \gamma_4)\, a_{ik}\, F_{kl}\, b_{il}. \qquad (14.53)$$
If $F$ is subsequently estimated by some method, then $\tilde{F}$ as defined in Eq. 14.52 will also act as a fundamental matrix or bilinear constraint in $R^4$. Now, let us look again at the invariant $\mathrm{Inv}_{3F}$. As we demonstrated earlier, we can write the invariant as

$$\mathrm{Inv}_{3F} = \frac{(\mathbf{a}^T_{1234} F \mathbf{b}_{1234})(\mathbf{a}^T_{4526} F \mathbf{b}_{4526})\, \phi_{1234}\, \phi_{4526}}{(\mathbf{a}^T_{1245} F \mathbf{b}_{1245})(\mathbf{a}^T_{3426} F \mathbf{b}_{3426})\, \phi_{1245}\, \phi_{3426}}, \qquad (14.54)$$

where $\phi_{pqrs} = (A'_{pqrs} \cdot \gamma_4)(B'_{pqrs} \cdot \gamma_4)$. Therefore, we see that the ratio of the terms $\mathbf{a}^T F \mathbf{b}$, which resembles the expression for the invariant in $R^4$ but uses only the observed coordinates and the estimated fundamental matrix, will not be an invariant. Instead, we need to include the factors $\phi_{1234}$, etc., which do not cancel. They are formed as follows (see [13]): since $a'_3$, $a'_4$, and $a'_{1234}$ are collinear, we can write $a'_{1234} = \lambda_{1234}\, a'_4 + (1 - \lambda_{1234})\, a'_3$. Then, by expressing $A'_{1234}$ as the intersection of the line joining $A'_1$ and $A'_2$ with the plane through $A_0, A'_3, A'_4$, we can use the projective split and equate terms, so that

$$\frac{(A'_{1234} \cdot \gamma_4)(A'_{4526} \cdot \gamma_4)}{(A'_{3426} \cdot \gamma_4)(A'_{1245} \cdot \gamma_4)} = \frac{\lambda_{1245}(\lambda_{3426} - 1)}{\lambda_{4526}(\lambda_{1234} - 1)}. \qquad (14.55)$$
Note that the values of $\lambda$ are readily obtainable from the images. The factors $B'_{pqrs} \cdot \gamma_4$ are found in a similar way, so that if $b'_{1234} = \mu_{1234}\, b'_4 + (1 - \mu_{1234})\, b'_3$, etc., the overall expression for the invariant becomes

$$\mathrm{Inv}_{3F} = \frac{(\mathbf{a}^T_{1234} F \mathbf{b}_{1234})(\mathbf{a}^T_{4526} F \mathbf{b}_{4526})}{(\mathbf{a}^T_{1245} F \mathbf{b}_{1245})(\mathbf{a}^T_{3426} F \mathbf{b}_{3426})}\; \frac{\lambda_{1245}(\lambda_{3426} - 1)}{\lambda_{4526}(\lambda_{1234} - 1)}\; \frac{\mu_{1245}(\mu_{3426} - 1)}{\mu_{4526}(\mu_{1234} - 1)}. \qquad (14.56)$$
In conclusion, given the coordinates of a set of six corresponding points in two image planes, where these six points are projections of arbitrary world points in general position, we can form 3D projective invariants, provided we have some estimate of $F$.
14.5.2 Projective Invariant of Points Using Three Uncalibrated Cameras

The technique used to form 3D projective invariants for two views can be straightforwardly extended to give expressions for invariants of three views. Considering four world points, $X_1, X_2, X_3, X_4$, or two lines, $X_1 \wedge X_2$ and $X_3 \wedge X_4$, projected onto three camera planes, we can write

$$X_1 \wedge X_2 = (A_0 \wedge L^A_{12}) \cap (B_0 \wedge L^B_{12}),$$
$$X_3 \wedge X_4 = (A_0 \wedge L^A_{34}) \cap (C_0 \wedge L^C_{34}).$$
Once again, we can combine the above expressions so that they give an equation for the 4-vector $X_1 \wedge X_2 \wedge X_3 \wedge X_4$:

$$X_1 \wedge X_2 \wedge X_3 \wedge X_4 = \bigl((A_0 \wedge L^A_{12}) \cap (B_0 \wedge L^B_{12})\bigr) \wedge \bigl((A_0 \wedge L^A_{34}) \cap (C_0 \wedge L^C_{34})\bigr)$$
$$= (A_0 \wedge A'_{1234}) \wedge \bigl((B_0 \wedge L^B_{12}) \cap (C_0 \wedge L^C_{34})\bigr). \qquad (14.57)$$
Then, by rewriting the lines $L^B_{12}$ and $L^C_{34}$ in terms of the line coordinates, we get $L^B_{12} = \sum_{j=1}^{3} l^B_{12,j} L^B_j$ and $L^C_{34} = \sum_{j=1}^{3} l^C_{34,j} L^C_j$. As shown in Chap. 9, the components of the trifocal tensor (which takes the place of the fundamental matrix for three views) can be written in geometric algebra as

$$\tilde{T}_{ijk} = \bigl[(A_0 \wedge A_i) \wedge \bigl((B_0 \wedge L^B_j) \cap (C_0 \wedge L^C_k)\bigr)\bigr], \qquad (14.58)$$
so that by using Eq. 14.57 we can derive

$$[X_1 \wedge X_2 \wedge X_3 \wedge X_4] = \sum_{i,j,k=1}^{3} \tilde{T}_{ijk}\, \alpha_{1234,i}\, l^B_{12,j}\, l^C_{34,k} = \tilde{T}(\boldsymbol{\alpha}_{1234}, L^B_{12}, L^C_{34}). \qquad (14.59)$$
The invariant $\mathrm{Inv}_3$ can then be expressed as

$$\mathrm{Inv}_{3T} = \frac{\tilde{T}(\boldsymbol{\alpha}_{1234}, L^B_{12}, L^C_{34})\, \tilde{T}(\boldsymbol{\alpha}_{4526}, L^B_{25}, L^C_{26})}{\tilde{T}(\boldsymbol{\alpha}_{1245}, L^B_{12}, L^C_{45})\, \tilde{T}(\boldsymbol{\alpha}_{3426}, L^B_{34}, L^C_{26})}. \qquad (14.60)$$
Note that the factorization must be done so that the same line factorizations occur in both the numerator and the denominator. We have thus developed an expression for invariants in three views that is a direct extension of the expression for invariants using two views. In calculating the above invariant from observed quantities, we note, as before, that some correction factors will be necessary: Eq. 14.60 is given in terms of $R^4$ quantities. Fortunately, this correction is quite straightforward. Extrapolating from the results of the previous section, we simply consider the $\alpha$ terms in Eq. 14.60 as unobservable quantities, whereas the line terms, such as $L^B_{12}, L^C_{34}$, are indeed observed quantities. As a result, the expression must be modified, using some of the coefficients computed in the previous section. Thus, for the four unique combinations of three cameras, the invariant equations can be expressed as

$$\mathrm{Inv}_{3T} = \frac{T(\mathbf{a}_{1234}, \mathbf{l}^B_{12}, \mathbf{l}^C_{34})\, T(\mathbf{a}_{4526}, \mathbf{l}^B_{25}, \mathbf{l}^C_{26})}{T(\mathbf{a}_{1245}, \mathbf{l}^B_{12}, \mathbf{l}^C_{45})\, T(\mathbf{a}_{3426}, \mathbf{l}^B_{34}, \mathbf{l}^C_{26})}\; \frac{\lambda_{1245}(\lambda_{3426} - 1)}{\lambda_{4526}(\lambda_{1234} - 1)}. \qquad (14.61)$$
14.5.3 Comparison of the Projective Invariants

This section presents simulations for the computation of invariants (implemented in Maple) using synthetic data, as well as computations using real images. The computation of the bilinearity matrix $F$ and the trifocal tensor $T$ was done using a linear method. We believe that for test purposes, this method is reliable. Four different sets of six points, $S_i = \{X_{i1}, X_{i2}, X_{i3}, X_{i4}, X_{i5}, X_{i6}\}$, where $i = 1, \ldots, 4$, were considered in the simulation, and the only three possible invariants were computed for each set, $\{I_{1,i}, I_{2,i}, I_{3,i}\}$. Then, the invariants of each set were represented as 3D vectors ($\mathbf{v}_i = [I_{1,i}, I_{2,i}, I_{3,i}]^T$). For the first group of images, we computed four of these vectors, corresponding to four different sets of six points, using two images for the $F$ case and three images for the $T$ case. For the second group of images, we computed the same four vectors, but using two new images for the $F$ case or three new images for the $T$ case. The comparison of the invariants was done using the Euclidean distance of the vectors,

$$d(\mathbf{v}_i, \mathbf{v}_j) = \frac{1}{2}\left\| \frac{\mathbf{v}_i}{\|\mathbf{v}_i\|} - \frac{\mathbf{v}_j}{\|\mathbf{v}_j\|} \right\|,$$

which is the same method used in [82].
Since in $d(\mathbf{v}_i, \mathbf{v}_j)$ we normalize the vectors $\mathbf{v}_i$ and $\mathbf{v}_j$, the distance $d(\mathbf{v}_i, \mathbf{v}_j)$ for any of them lies between 0 and 1, and the distance does not vary when $\mathbf{v}_i$ or $\mathbf{v}_j$ is multiplied by a non-zero constant. Figure 14.5 shows a comparison table where each $(i,j)$th entry represents the distance $d(\mathbf{v}_i, \mathbf{v}_j)$ between the invariants of set $S_i$, which are the points extracted from the first group of images, and those of set $S_j$,
Fig. 14.5 Distance matrices showing the performance of the invariants using $F$ and using $T$ under increasing Gaussian noise ($\sigma = 0.005, 0.015, 0.025$, and $0.04$)
the points from the second group of images. In the ideal case, the diagonal of the distance matrices should be zero, meaning that the values of the computed invariants remain constant regardless of which group of images was used. The entries off the diagonal compare vectors composed of different invariants ($\mathbf{v}_i = [I_{1,i}, I_{2,i}, I_{3,i}]^T$), which are not parallel. Accordingly, these entries should be larger than zero, and when the vectors differ greatly, $d(\mathbf{v}_i, \mathbf{v}_j)$ should be approximately 1. The figure clearly shows that the performance of the invariants based on trilinearities is much better than that of the invariants based on bilinearities: the diagonal values for $T$ are in general closer to zero than those for $F$, and the off-diagonal entries for $T$ are in general larger than those for $F$. In the case of real images, we used a sequence of images taken by a moving robot equipped with a binocular head. Figure 14.6 shows these images for the left and right eye, respectively. We took image couples, one from the left and one from the right, for the invariants using $F$, and two from the left and one from the right for the invariants using $T$. From the images, we selected thirty-eight points semiautomatically, and from these we chose six sets of points. In each set, the points are in general position. Three invariants of each set were computed, and comparison tables were constructed in the same manner as for the previous experiment (see Fig. 14.7). The data show once again that computing the invariants using a trilinear approach is much more robust than using a bilinear approach, a result which is also borne out in theory.
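The vector comparison used in these experiments can be sketched in a few lines. The normalization below is an assumed form, chosen so that the distance lies in $[0,1]$ and ignores nonzero scaling, as the text states; the vector values are illustrative:

```python
import numpy as np

def d(vi, vj):
    # half the distance between the unit vectors: lies in [0, 1] and is
    # unchanged when vi or vj is multiplied by a positive constant
    ui = vi / np.linalg.norm(vi)
    uj = vj / np.linalg.norm(vj)
    return 0.5 * np.linalg.norm(ui - uj)

v1 = np.array([0.8, 1.3, 2.1])   # illustrative invariant vectors
v2 = np.array([0.9, 1.2, 2.0])
print(d(v1, v2))                 # small: the two invariant sets agree
print(d(v1, 7.5 * v1))           # ~0: the overall scale is irrelevant
```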
Fig. 14.6 Image sequence taken during navigation by the binocular head of a mobile robot (left camera images are shown in upper row; right camera images in lower row)
Fig. 14.7 Distance matrices show the performance of the computed invariants using bilinearities (top) and trilinearities (bottom) for the image sequence
14.6 Visually Guided Grasping

This section presents a practical use of projective invariants using three views. The results will show that despite a certain noise sensitivity in the projective invariants, they can be used for various tasks regardless of the camera calibration or coordinate system. We will apply simple geometric rules using meet or join operations, invariants, and points at infinity to the task of grasping, as depicted in Fig. 14.8a. The grasping procedure uses only image points and consists basically of four steps.
14.6.1 Parallel Orienting

Let us assume that the 3D points of Fig. 14.8 are observed by three cameras $A, B, C$. The mapped points in the three cameras are $\{o_{Ai}\}, \{g_{Ai}\}$, $\{o_{Bi}\}, \{g_{Bi}\}$, and $\{o_{Ci}\}, \{g_{Ci}\}$. In the projective 3D space $P^3$, the three points at infinity $V_x, V_y, V_z$ for the orthogonal corners of the object can be computed as the meet of two parallel lines. Similarly, in the image planes, the points at infinity $v_x, v_y, v_z$ are also computed as the meet of the two projected parallel lines:

$$V_x = (O_1 \wedge O_2) \cap (O_5 \wedge O_6) \;\rightarrow\; v_{jx} = (o_{j1} \wedge o_{j2}) \cap (o_{j5} \wedge o_{j6}),$$
$$V_y = (O_1 \wedge O_5) \cap (O_2 \wedge O_6) \;\rightarrow\; v_{jy} = (o_{j1} \wedge o_{j5}) \cap (o_{j2} \wedge o_{j6}),$$
$$V_z = (O_1 \wedge O_4) \cap (O_2 \wedge O_3) \;\rightarrow\; v_{jz} = (o_{j1} \wedge o_{j4}) \cap (o_{j2} \wedge o_{j3}), \qquad (14.62)$$

where $j \in \{A, B, C\}$. Parallelism in the projective space $P^3$ can be checked in two ways. First, if the orthogonal edges of the grasper are parallel with the edges of the object, then
Fig. 14.8 Grasping an object: (a) arbitrary starting position, (b) parallel orienting, (c) centering, (d) grasping and holding
$$(G_1 \wedge G_8) \wedge V_x = 0, \quad (G_1 \wedge G_9) \wedge V_y = 0, \quad (G_1 \wedge G_2) \wedge V_z = 0. \qquad (14.63)$$

In this case, the conditions of Eq. 14.63, using the points obtained from a single camera, can be expressed as

$$[g_{i1}\, g_{i8}\, v_{ix}] = 0, \quad [g_{i1}\, g_{i9}\, v_{iy}] = 0, \quad [g_{i1}\, g_{i2}\, v_{iz}] = 0. \qquad (14.64)$$
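In image coordinates, the meets of Eq. 14.62 and the brackets of Eq. 14.64 reduce to cross products and $3\times 3$ determinants of homogeneous 3-vectors. A hedged sketch with made-up, fronto-parallel corner coordinates:

```python
import numpy as np

def meet(p, q):
    # in P^2 the cross product gives both the line through two points
    # and the intersection point of two lines
    return np.cross(p, q)

def bracket3(a, b, c):
    return np.linalg.det(np.column_stack((a, b, c)))

# image corners of two parallel object edges (illustrative values)
o1, o2 = np.array([0., 0., 1.]), np.array([4., 1., 1.])
o5, o6 = np.array([0., 2., 1.]), np.array([4., 3., 1.])
vx = meet(meet(o1, o2), meet(o5, o6))   # vanishing point v_x

# a grasper edge with the same direction passes the test [g1 g8 vx] = 0
g1, g8 = np.array([1., 5., 1.]), np.array([9., 7., 1.])
print(bracket3(g1, g8, vx))             # ~0: edge is aligned with v_x
```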
The second way to check parallelism in the projective space $P^3$ is to note whether the perpendicular planes of the grasper and those of the object are parallel. If they are, then

$$[G_1\, G_8\, O_1\, O_2] = 0, \quad [G_{15}\, G_{16}\, O_5\, O_8] = 0, \quad [G_{12}\, G_{13}\, O_3\, O_4] = 0. \qquad (14.65)$$

In this case, the conditions of Eq. 14.65 can be expressed in terms of image coordinates by using either the points obtained from two cameras (the bilinear constraint) or those obtained from three cameras (the trifocal tensor):

$$\mathbf{x}^{jT}_{g_1 g_8 o_1 o_2}\, F_{ij}\, \mathbf{x}^{i}_{g_1 g_8 o_1 o_2} = 0, \quad \mathbf{x}^{jT}_{g_{15} g_{16} o_5 o_8}\, F_{ij}\, \mathbf{x}^{i}_{g_{15} g_{16} o_5 o_8} = 0, \quad \mathbf{x}^{jT}_{g_{12} g_{13} o_3 o_4}\, F_{ij}\, \mathbf{x}^{i}_{g_{12} g_{13} o_3 o_4} = 0. \qquad (14.66)$$
$$T_{ijk}\, \mathbf{x}^{i}_{g_1 g_8 o_1 o_2}\, \mathbf{l}^{j}_{g_1 g_8}\, \mathbf{l}^{k}_{o_1 o_2} = 0, \quad T_{ijk}\, \mathbf{x}^{i}_{g_{15} g_{16} o_5 o_8}\, \mathbf{l}^{j}_{g_{15} g_{16}}\, \mathbf{l}^{k}_{o_5 o_8} = 0, \quad T_{ijk}\, \mathbf{x}^{i}_{g_{12} g_{13} o_3 o_4}\, \mathbf{l}^{j}_{g_{12} g_{13}}\, \mathbf{l}^{k}_{o_3 o_4} = 0. \qquad (14.67)$$
If the trinocular geometry is known, it is always more accurate to use Eqs. 14.67.
14.6.2 Centering

After an initial movement, the grasper should be parallel to and centered in front of the object (see Fig. 14.8b). The center points of the grasper and the object can be computed as follows:

$$C_o = (O_1 \wedge O_6) \cap (O_2 \wedge O_5), \quad C_g = (G_1 \wedge G_{16}) \cap (G_8 \wedge G_9). \qquad (14.68)$$
We can then check whether the line crossing through these center points meets the point at infinity $V_z$, which is the intersection of the parallel lines $O_1 \wedge O_4$ and $O_2 \wedge O_3$. For this, we use the incidence constraint that a point lies on a line:

$$C_o \wedge C_g \wedge V_z = 0. \qquad (14.69)$$
This equation, computed using the image points of a single camera, is given by

$$[c_{io}\, c_{ig}\, v_{iz}] = 0. \qquad (14.70)$$
14.6.3 Grasping

We can evaluate the exactitude of grasping when the plane of the grasper touches the plane of the object. This can be done by checking the following coplanarity condition:

$$[C_o\, C_g\, O_1\, O_2] = 0. \qquad (14.71)$$
Since we want to use image points, we can compute this bracket straightforwardly by using the points obtained from either two or three cameras, employing either the bilinear or the trilinear constraint, respectively:

$$\mathbf{x}^{jT}_{c_o c_g o_1 o_2}\, F_{ij}\, \mathbf{x}^{i}_{c_o c_g o_1 o_2} = 0, \quad T_{ijk}\, \mathbf{x}^{i}_{c_o c_g o_1 o_2}\, \mathbf{l}^{j}_{c_o c_g}\, \mathbf{l}^{k}_{o_1 o_2} = 0. \qquad (14.72)$$
If the epipolar or trinocular geometry is known, it is always more accurate to use Eq. 14.72.
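In coordinates, the bracket of Eq. 14.71 is simply a $4\times 4$ determinant of homogeneous 3D points, which vanishes exactly when the four points are coplanar. A sketch with illustrative stand-ins for $C_o$, $C_g$, $O_1$, $O_2$:

```python
import numpy as np

def bracket4(a, b, c, d):
    # [A B C D] = 0 iff the four homogeneous 3D points are coplanar
    return np.linalg.det(np.column_stack((a, b, c, d)))

O1 = np.array([0.0, 0.0, 0.0, 1.0])
O2 = np.array([1.0, 0.0, 0.0, 1.0])
Co = np.array([0.5, 1.0, 0.0, 1.0])       # object center, plane z = 0
Cg = np.array([0.5, 3.0, 0.0, 1.0])       # grasper center, same plane
print(abs(bracket4(Co, Cg, O1, O2)) < 1e-12)      # True: planes touch

Cg_off = np.array([0.5, 3.0, 0.4, 1.0])   # grasper still approaching
print(abs(bracket4(Co, Cg_off, O1, O2)) < 1e-12)  # False
```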
14.6.4 Holding the Object

The final step is to hold the object correctly (see Fig. 14.8d). This can be checked using the invariant in terms of the trifocal tensor given by Eq. 14.61. In this particular problem, an example of a perfect condition would be if the invariant had an approximate value of $\frac{3}{4}$, which would then change to perhaps $\frac{6}{8}$ or $\frac{5}{6}$ when the grasper is moved some distance away from the control points $X_2, X_3$. Note that the invariant elegantly relates volumes, indicating a particular relationship between the points of the grasper and those of the object.
14.7 Camera Self-Localization

We will now use Eq. 14.34 to compute the 3D coordinates of a moving, uncalibrated camera. For this problem, we first select as a projective basis five fixed points in 3D space, $X_1, X_2, X_3, X_4, X_5$, and we consider the unknown point $X_6$ to be the optical center of the moving camera (see Fig. 14.9). Assuming that the camera does not move on a plane, the projection of the optical center $X_6$ of the first camera position should correspond to the epipole in any of the subsequent views. We can now compute the moving optical center using points from either two cameras,

$$I_x^F = \frac{X_6}{W_6} = \frac{(\boldsymbol{\delta}^T_{2346} F \boldsymbol{\epsilon}_{2346})(\boldsymbol{\delta}^T_{1235} F \boldsymbol{\epsilon}_{1235})\, \lambda_{2345}\mu_{2345}\, \lambda_{1236}\mu_{1236}}{(\boldsymbol{\delta}^T_{2345} F \boldsymbol{\epsilon}_{2345})(\boldsymbol{\delta}^T_{1236} F \boldsymbol{\epsilon}_{1236})\, \lambda_{2346}\mu_{2346}\, \lambda_{1235}\mu_{1235}}, \qquad (14.73)$$
or three cameras,
Fig. 14.9 Computing the view centers of a moving camera
Fig. 14.10 Computing performance of any three view centers using F (higher spikes) and T (lower spikes); range of additive noise, 0–0.4 pixels
$$I_x^T = \frac{X_6}{W_6} = \frac{(T^{ABC}_{ijk}\, \alpha_{2346,i}\, l^B_{23,j}\, l^C_{46,k})(T^{ABC}_{mnp}\, \alpha_{1235,m}\, l^B_{12,n}\, l^C_{35,p})\, \lambda_{2345}\, \lambda_{1236}}{(T^{ABC}_{qrs}\, \alpha_{2345,q}\, l^B_{23,r}\, l^C_{45,s})(T^{ABC}_{tuv}\, \alpha_{1236,t}\, l^B_{12,u}\, l^C_{36,v})\, \lambda_{2346}\, \lambda_{1235}}. \qquad (14.74)$$
Similarly, by permuting the six points, as in Eq. 14.35, we compute $I_y^F, I_y^T$ and $I_z^F, I_z^T$. The compensating coefficients for the invariants $I_y$ and $I_z$ vary due to the permuted points. We also simulated the computation of the invariants under increasing noise. Figure 14.10 shows the deviation from the true optical center for three consecutive positions of a moving camera, using two views and three views. The figure demonstrates that trinocular computation renders more accurate results than binocular computation. The Euclidean coordinates of the optical centers are calculated by applying a transformation that relates the projective basis to its a priori known Euclidean basis.
14.8 Projective Depth

In a geometric sense, projective depth can be defined as the relation between the distance of a 3D point $X_i$ from the view center and the focal distance $f$, as depicted in Fig. 14.11.

Fig. 14.11 Geometric interpretation of projective depth

We can derive projective depth from a projective mapping of 3D points. According to the pinhole model explained in Chap. 9, the coordinates of any point in the image plane are obtained from the projection of the 3D point onto the three optical planes $A^1, A^2, A^3$. These planes are spanned by a trivector basis $\gamma_i, \gamma_j, \gamma_k$ and the coefficients $t_{ij}$. This projective mapping in matrix representation reads

$$\lambda \mathbf{x} = \lambda \begin{bmatrix} x \\ y \\ 1 \end{bmatrix} = \begin{bmatrix} t_{11} & t_{12} & t_{13} & t_{14} \\ t_{21} & t_{22} & t_{23} & t_{24} \\ t_{31} & t_{32} & t_{33} & t_{34} \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_x \\ r_{21} & r_{22} & r_{23} & t_y \\ r_{31} & r_{32} & r_{33} & t_z \\ 0 & 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix}, \qquad (14.75)$$
where the projective scale factor is called $\lambda$. Note that the projective mapping is further expressed in terms of $f$ and the rotation and translation components. Let us attach the world coordinates to the view center of the camera. The resultant projective mapping becomes

$$\lambda \mathbf{x} = \begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{bmatrix} X \\ Y \\ Z \\ 1 \end{bmatrix} \equiv P X. \qquad (14.76)$$
We can then straightforwardly compute

$$\lambda = Z. \qquad (14.77)$$
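A minimal check of Eqs. 14.75–14.77 (the focal length and point values are illustrative): with the world frame attached to the view center, the projective scale factor of a projected point equals its depth $Z$:

```python
import numpy as np

f = 2.0
P = np.hstack([np.diag([f, f, 1.0]), np.zeros((3, 1))])  # Eq. 14.76

X = np.array([1.0, -0.5, 4.0, 1.0])   # homogeneous world point
x = P @ X                              # lambda * (x, y, 1)
lam = x[2]
print(lam, X[2])                       # lambda equals Z
print(x / lam)                         # image point (x, y, 1)
```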
The method for computing the projective depth ($\lambda$) of a 3D point is simple using invariant theory, that is, using Eq. 14.34. For this computation, we select a basis system, taking four 3D points in general position, $X_1, X_2, X_3, X_5$, the optical center of the camera at the new position as the fourth point, $X_4$, and $X_6$ as the 3D point to be reconstructed. This process is shown in Fig. 14.12. Since we are using mapped points, we consider the epipole (the mapping of the current view center) to be the fourth point and the mapped sixth point to be the point with unknown depth. The other mapped basis points remain constant during the procedure.
Fig. 14.12 Computing the projective depths of n cameras
According to Eq. 14.34, the tensor-based expression for the computation of the third coordinate, or projective depth, of a point $X_j$ ($= X_6$) is given by

$$\lambda_j = \frac{Z_j}{W_j} = \frac{T(\mathbf{a}_{124j}, \mathbf{l}^B_{12}, \mathbf{l}^C_{4j})\, T(\mathbf{a}_{1235}, \mathbf{l}^B_{12}, \mathbf{l}^C_{35})\, \lambda_{1245}\, \lambda_{123j}}{T(\mathbf{a}_{1245}, \mathbf{l}^B_{12}, \mathbf{l}^C_{45})\, T(\mathbf{a}_{123j}, \mathbf{l}^B_{12}, \mathbf{l}^C_{3j})\, \lambda_{124j}\, \lambda_{1235}}. \qquad (14.78)$$
In this way, we can successively compute the projective depths $\lambda_{ij}$ of the $j$ points relative to the $i$th camera. We will use the $\lambda_{ij}$ in Sect. 14.9, in which we employ the join-image concept and singular value decomposition (SVD) for 3D reconstruction. Since this type of invariant can also be expressed in terms of the quadrifocal tensor [117], we are also able to compute projective depth based on four cameras.
14.9 Shape and Motion

The orthographic and paraperspective factorization method for shape and motion using the affine camera model was developed by Tomasi et al. [154, 186]. This method works for cameras viewing small and distant scenes, so that all scale factors of projective depth are $\lambda_{ij} = 1$. In the case of perspective images, the scale factors $\lambda_{ij}$ are unknown. According to Triggs [187], all $\lambda_{ij}$ satisfy a set of consistency reconstruction equations of the so-called join-image. One way to compute the $\lambda_{ij}$ is by using the epipolar constraint. In matrix representation, this is given by

$$F_{ik}\, \lambda_{ij} \mathbf{x}_{ij} = \mathbf{e}_{ik} \wedge \lambda_{kj} \mathbf{x}_{kj}, \qquad (14.79)$$
which, after computing an inner product with $\mathbf{e}_{ik} \wedge \mathbf{x}_{kj}$, gives the relation of projective depths for the $j$th point between cameras $i$ and $k$:

$$\lambda'_{kj} = \frac{\lambda_{kj}}{\lambda_{ij}} = \frac{(\mathbf{e}_{ik} \wedge \mathbf{x}_{kj}) \cdot (F_{ik}\, \mathbf{x}_{ij})}{\|\mathbf{e}_{ik} \wedge \mathbf{x}_{kj}\|^2}. \qquad (14.80)$$
Considering the $i$th camera as a reference, we can normalize $\lambda_{kj}$ for all $k$ cameras and use $\lambda'_{kj}$ instead. If this is not the case, we can normalize between neighboring images in a chained relationship [187]. In Sect. 14.8, we presented a better procedure for computing the $\lambda_{ij}$ involving three cameras. An extension of Eq. 14.80 in terms of the trifocal or quadrifocal tensor is, however, awkward and impractical.
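Eq. 14.80 can be verified numerically with two synthetic cameras. Assuming $P_i = [I\,|\,0]$ and $P_k = [R\,|\,\mathbf{t}]$ (so that $F_{ik} = [\mathbf{t}]_\times R$ and the epipole is $\mathbf{e}_{ik} = \mathbf{t}$, a standard construction rather than one taken from the text), the recovered depth ratio matches the true $\lambda_{kj}/\lambda_{ij}$:

```python
import numpy as np

def skew(t):
    return np.array([[0, -t[2], t[1]],
                     [t[2], 0, -t[0]],
                     [-t[1], t[0], 0]])

c, s = np.cos(0.1), np.sin(0.1)
R = np.array([[c, -s, 0], [s, c, 0], [0, 0, 1]])  # small rotation about z
t = np.array([1.0, 0.2, 0.1])
F = skew(t) @ R                                    # fundamental matrix
e = t                                              # epipole in camera k

Y = np.array([0.5, -0.3, 4.0])                     # point in camera-i frame
lam_i, lam_k = Y[2], (R @ Y + t)[2]                # true projective depths
x_i, x_k = Y / lam_i, (R @ Y + t) / lam_k          # normalized image points

num = np.cross(e, x_k) @ (F @ x_i)
ratio = num / np.linalg.norm(np.cross(e, x_k)) ** 2
print(ratio, lam_k / lam_i)                        # the two values agree
```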
14.9.1 The Join-Image

The join-image $J$ is nothing more than the intersections of optical rays and planes with points or lines in 3D projective space, as depicted in Fig. 14.13. The interrelated geometry can be linearly expressed by the fundamental matrix and the trifocal and quadrifocal tensors. The reader will find more information about these linear constraints in Chap. 9.
Fig. 14.13 Geometry of the join-image
In order to take into account the interrelated geometry, the projective reconstruction procedure should bring together all the data of the individual images in a geometrically coherent manner. We do this by considering the points $X_j$ for each $i$th camera,

$$\lambda_{ij} \mathbf{x}_{ij} = P_i X_j, \qquad (14.81)$$

as the $i$-row points of a matrix of rank 4. For $m$ cameras and $n$ points, the $3m \times n$ matrix $J$ of the join-image is given by
$$J = \begin{bmatrix} \lambda_{11}\mathbf{x}_{11} & \lambda_{12}\mathbf{x}_{12} & \lambda_{13}\mathbf{x}_{13} & \cdots & \lambda_{1n}\mathbf{x}_{1n} \\ \lambda_{21}\mathbf{x}_{21} & \lambda_{22}\mathbf{x}_{22} & \lambda_{23}\mathbf{x}_{23} & \cdots & \lambda_{2n}\mathbf{x}_{2n} \\ \lambda_{31}\mathbf{x}_{31} & \lambda_{32}\mathbf{x}_{32} & \lambda_{33}\mathbf{x}_{33} & \cdots & \lambda_{3n}\mathbf{x}_{3n} \\ \vdots & \vdots & \vdots & & \vdots \\ \lambda_{m1}\mathbf{x}_{m1} & \lambda_{m2}\mathbf{x}_{m2} & \lambda_{m3}\mathbf{x}_{m3} & \cdots & \lambda_{mn}\mathbf{x}_{mn} \end{bmatrix}. \qquad (14.82)$$
For the affine reconstruction procedure, the matrix is of rank 3. The matrix J of the join-image is therefore amenable to a singular value decomposition for the computation of shape and motion [154, 186].
14.9.2 The SVD Method

The application of SVD to $J$ gives

$$J_{3m \times n} = U_{3m \times r}\, S_{r \times r}\, V^T_{n \times r}, \qquad (14.83)$$

where the columns of the matrices $V^T_{n \times r}$ and $U_{3m \times r}$ constitute the orthonormal bases for the input (co-kernel) and output (range) spaces of $J$. In order to get a decomposition into motion and shape of the projected point structure, $S_{r \times r}$ can be absorbed into both matrices, $V^T_{n \times r}$ and $U_{3m \times r}$, as follows:

$$J_{3m \times n} = \Bigl( U_{3m \times r}\, S^{\frac{1}{2}}_{r \times r} \Bigr) \Bigl( S^{\frac{1}{2}}_{r \times r}\, V^T_{n \times r} \Bigr) = \begin{bmatrix} P_1 \\ P_2 \\ P_3 \\ \vdots \\ P_m \end{bmatrix}_{3m \times 4} (X_1\, X_2\, X_3 \cdots X_n)_{4 \times n}. \qquad (14.84)$$

This way of splitting $S_{r \times r}$ is not unique. Since the rank of $J$ is 4, we should use the four largest singular values for $S_{r \times r}$. The matrices $P_i$ correspond to the
Fig. 14.14 Reconstructed house using (a) noise-free observations and (b) noisy observations
projective mappings or motion from the projective space to the individual images, and $X_j$ represents the point structure or shape. We can test our approach using a simulation program written in Maple. Using the method described in Sect. 14.8, we first compute the projective depths of the points of a wire-frame house observed with nine cameras, and we then use SVD to obtain the house's shape and motion. The reconstructed house, after Euclidean readjustment for the presentation, is shown in Fig. 14.14. We note that the reconstruction preserves the original form of the model quite well. In the following section, we will show how to improve the shape of the reconstructed model by using the geometric expressions $\cap$ (meet) and $\wedge$ (join) from the algebra of incidence, along with particular tensor-based invariants.
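The join-image factorization of Eqs. 14.82–14.84 can be sketched directly with an SVD (synthetic cameras and points; in a real pipeline the $\lambda_{ij}$ would first be estimated as in Sect. 14.8, whereas here the projections $P_i X_j$ carry them implicitly):

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 4, 10                                   # cameras, points
Ps = [rng.standard_normal((3, 4)) for _ in range(m)]   # motion P_i
Xs = rng.standard_normal((4, n))                        # shape X_j

J = np.vstack([P @ Xs for P in Ps])            # 3m x n join-image matrix
print(np.linalg.matrix_rank(J))                # 4, as required

U, S, Vt = np.linalg.svd(J)
Sh = np.diag(np.sqrt(S[:4]))                   # absorb S into both factors
motion = U[:, :4] @ Sh                         # stacked P_i, up to a 4x4 H
shape = Sh @ Vt[:4]                            # X_j, up to the inverse of H
print(np.linalg.norm(J - motion @ shape))      # ~0: exact rank-4 recovery
```

Note that the factorization is only determined up to an invertible $4\times 4$ collineation $H$, which is why a Euclidean readjustment is needed for display.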
14.9.3 Completion of the 3D Shape Using Invariants

The projective structure can be improved in one of two ways: (1) by adding points in the images, expanding the join-image, and then applying the SVD procedure; or (2) after the reconstruction is done, by computing new or occluded 3D space points. Both approaches can use, on the one hand, geometric inference rules based on symmetries, or, on the other, concrete knowledge about the object. Using three real views of a similar model house with its rightmost lower corner missing (see Fig. 14.15b), we computed in each image the virtual image point of this 3D point. Then we reconstructed the scene, as shown in Fig. 14.15c. We also tried using geometric incidence operations to complete the house, employing space points, as depicted in Fig. 14.15d. The figures show that creating points in the images yields a better reconstruction of the occluded point. Note that in the reconstructed image we transformed the projective shape into a Euclidean one for the presentation of the results. We also used lines to connect the reconstructed points, but only so as to make the form of the house visible. Similarly, we used the same procedures to reconstruct the house using nine images; see Fig. 14.16a–d.
Fig. 14.15 3D reconstruction using three images: (a) one of the three images, (b) reconstructed incomplete house using three images, (c) extending the join-image, (d) completing in the 3D space
Fig. 14.16 3D reconstruction using nine images: (a) one of the nine images, (b) reconstructed incomplete house using nine images, (c) extending the join-image, (d) completing in the 3D space
The figure shows that the resulting reconstructed point is virtually the same in both cases, which allows us to conclude that for a limited number of views, the join-image procedure is preferable, while for the case of several images, an extension of the point structure in 3D space is preferable.
14.10 Omnidirectional Vision Landmark Identification Using Projective Invariants

In Chap. 9, we explained projective invariants using omnidirectional vision. In this section, we present a robust technique for landmark identification using the projective and permutation $p^2$-invariants. A visual landmark ($V_M$) is a set of visual sub-landmarks ($v$) in an image frame, where a sub-landmark is a quintuple formed with five coplanar points of $P^2$ in general position [44]. Each sub-landmark is represented by a point-permutation-invariant vector. Landmark identification consists of two distinct phases: the learning phase and the recognition phase.
14.10.1 Learning Phase

In the learning phase, potential landmarks are extracted and analyzed, and if they fulfill certain constraints, they are stored in memory. The extraction of features is performed using the Harris corner detector. Once the features have been extracted, we apply the algorithm of the previous section to project the features onto the sphere. With the features on the sphere, we are able to calculate their projective and $p^2$-invariants; using the features, we also start the formation of legal quintuples, that is, quintuples of points in $P^2$ that are in general position. These legal quintuples represent sub-landmark candidates; a set of validated sub-landmarks is used to build a landmark. The validations necessary to consider a sub-landmark legal are explained next.

Collinearity Test Although collinearity is preserved under perspective transformations, quasi-collinearity is not. Thus, three quasi-collinear points in one image could be collinear in another one; therefore, this fact has to be taken into account. Now, since the bracket (9.27) is not a reliable indicator of near singularity (i.e., quasi-collinearity) [71], we define instead, as in [24], the matrix whose columns contain the scalars of three vectors ($v_0, v_1, v_2 \in G_{4,1}$):

$$M = \begin{bmatrix} \dfrac{v_0 \cdot e_1}{v_0 \cdot e_3} & \dfrac{v_1 \cdot e_1}{v_1 \cdot e_3} & \dfrac{v_2 \cdot e_1}{v_2 \cdot e_3} \\[2mm] \dfrac{v_0 \cdot e_2}{v_0 \cdot e_3} & \dfrac{v_1 \cdot e_2}{v_1 \cdot e_3} & \dfrac{v_2 \cdot e_2}{v_2 \cdot e_3} \\[2mm] 1 & 1 & 1 \end{bmatrix}. \qquad (14.85)$$
The closeness of its smallest singular value to zero measures how collinear the three points are. The algorithm uses a threshold $t_c$ obtained experimentally: if the smallest singular value is less than the threshold, the triplet is rejected.

Coplanarity Test The quintuples that passed the collinearity test are examined to verify the coplanarity of their constituent points. Let $v$ and $v'$ be the same quintuple
observed in different image frames. If they are coplanar, then they must have the same projective and permutation $p^2$-invariants; thus, $v = v'$, where $v$ and $v'$ are the invariant vectors of the quintuples calculated in the two frames. This condition cannot be used directly, since numerical inaccuracies may be present due to noise or small errors in corner detection. Instead, we use

$$|v - v'| \le t_o, \qquad (14.86)$$

where $t_o$ is a threshold value.

Convex Hull Test A projective transformation between two images preserves the convex hull of a coplanar point set [81]. A pair of matched quintuples puts five points in correspondence. For the match to be correct, there is a necessary but not sufficient condition, namely, that the two convex hulls be in correspondence. This invariance has the following conditions [135]:
1. Corresponding points must lie either on or inside the convex hull.
2. The number of points on the convex hull must be the same.
3. For points lying on the convex hull, neighboring relations are preserved.
The convex hull can also be used to correct problems with the point correspondences. If the convex hull test detects two false point-pair correspondences, they are mutually exchanged. If more than two errors are detected, the quintuple is rejected.

Visual Landmark Construction To minimize the effect of numerical instabilities during recognition time, we select outliers [44]. To identify those outliers, we first calculate the mean vector $\bar{\mathbf{v}}$ [7] as follows:

$$\bar{\mathbf{v}} = \frac{1}{n} \sum_{i=0}^{n-1} \mathbf{v}_i, \qquad (14.87)$$

where the vector candidate $\mathbf{v}_i$ represents the $i$th sub-landmark candidate. Then the vectors are selected based on the distance of the candidate $\mathbf{v}_i$ with respect to the mean vector $\bar{\mathbf{v}}$:

$$d_i = |\mathbf{v}_i - \bar{\mathbf{v}}|. \qquad (14.88)$$

The vectors with $d_i$ greater than twice the mean of all the $d_i$'s are selected as sub-landmarks of a landmark $V_{M_j}$. If no such vector exists, then no landmark is added.
14.10.2 Recognition Phase

In the recognition phase, during navigation, a process similar to that of the learning phase is applied to find legal quintuples. Next, we compare each detected quintuple with the stored sub-landmarks. The comparison between projective and point-permutation invariant vectors is done using the Euclidean distance of the vectors.
14 Invariants Theory in Computer Vision and Omnidirectional Vision
The matched sub-landmark is used as a guide for the detection of other sublandmarks that will strengthen our landmark hypothesis. Once the hypothesis is verified, the landmark is considered as recognized.
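The comparison step can be sketched as a nearest-neighbour search in the invariant space; reusing the threshold t_o of Eq. 14.86 for acceptance is an assumption on our part:

```python
import numpy as np

def match_sublandmark(v_query, stored, t_o=0.8):
    """Compare a detected quintuple's invariant vector against the stored
    sub-landmarks using Euclidean distance. Returns (index, distance) of
    the best match, or None if no stored vector is within t_o."""
    stored = np.asarray(stored, dtype=float)
    d = np.linalg.norm(stored - np.asarray(v_query, dtype=float), axis=1)
    i = int(np.argmin(d))
    return (i, float(d[i])) if d[i] <= t_o else None
```

A successful match can then seed the search for the remaining sub-landmarks of the hypothesized landmark.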
14.10.3 Omnidirectional Vision and Invariants for Robot Navigation

Experiments have been carried out in an indoor corridor environment using a mobile robot equipped with an omnidirectional vision system (Fig. 14.17). In this section, we show the results obtained in these experiments. The threshold values have been set experimentally, in accordance with the landmark identification results. The values of the thresholds in the experiments were t_c = 0.0316 and t_o = 0.8.
Fig. 14.17 Mobile robot and omnidirectional vision system
14.10 Omnidirectional Vision Landmark Identification Using Projective Invariants
Fig. 14.18 Robot scenario and simulation of features extraction
14.10.4 Learning Phase

In this phase, we first extract the features using the Harris corner detector from two different frames F and F′, and we also find their correspondences. Once we have the features and their correspondences, we begin with the creation of legal quintuples (sub-landmark candidates) (Fig. 14.18). First, we project those features onto the sphere (Sect. 9.8.1). Then, for each quintuple, we apply the collinearity test (Sect. 14.10.1). If it passes the test, we calculate the projective and permutation p²-invariants of the quintuple (Sect. 9.8.1), and with them we check the coplanarity (Sect. 14.10.1). Next, we apply the convex hull test to the quintuples (Sect. 14.10.1). In Fig. 14.19 we show an example of four sub-landmarks forming a visual landmark. The projective and permutation p²-invariants of these sub-landmarks are

v0 = 2.0035e1 + 2.0038e2 + 2.0000e3 + 2.0039e4 + 2.0041e5,
v1 = 2.7340e1 + 2.3999e2 + 2.5529e3 + 2.7517e4 + 2.5122e5,
v2 = 2.5096e1 + 2.5040e2 + 2.7769e3 + 2.7752e4 + 2.3379e5,
v3 = 2.7767e1 + 2.7461e2 + 2.0196e3 + 2.0059e4 + 2.0040e5.

Finally, with the quintuples that pass these tests, we construct a visual landmark, as explained in Sect. 14.10.1.
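The convex hull test of Sect. 14.10.1 can be sketched as follows; the book gives no implementation, so the monotone-chain hull routine and the cyclic-order check for condition 3 are our illustrative choices:

```python
import numpy as np

def _cross2(o, a, b):
    """2D cross product (o->a) x (o->b); > 0 for a counter-clockwise turn."""
    return (a[0] - o[0]) * (b[1] - o[1]) - (a[1] - o[1]) * (b[0] - o[0])

def hull_indices(pts):
    """Indices of the points on the 2D convex hull, in counter-clockwise
    order (Andrew's monotone chain)."""
    order = sorted(range(len(pts)), key=lambda i: (pts[i][0], pts[i][1]))
    def half(seq):
        h = []
        for i in seq:
            while len(h) >= 2 and _cross2(pts[h[-2]], pts[h[-1]], pts[i]) <= 0:
                h.pop()
            h.append(i)
        return h[:-1]
    return half(order) + half(order[::-1])

def convex_hull_test(q1, q2):
    """Necessary test for a matched quintuple: same hull membership and
    size (conditions 1-2) and preserved neighbouring relations
    (condition 3, cyclic order up to rotation or reversal)."""
    q1, q2 = np.asarray(q1, float), np.asarray(q2, float)
    h1, h2 = hull_indices(q1), hull_indices(q2)
    if len(h1) != len(h2) or set(h1) != set(h2):
        return False
    n = len(h2)
    for seq in (h1, h1[::-1]):                 # orientation may flip
        if any((seq + seq)[i:i + n] == h2 for i in range(n)):
            return True
    return False
```

As in the text, this is only a necessary condition: quintuples that fail it are rejected, but passing it does not by itself guarantee a correct match.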
14.10.5 Recognition Phase Once the robot has passed the learning phase, it navigates through the corridor and applies the same procedure for the extraction of legal quintuples to find sub-landmarks. The extracted sub-landmarks are compared with the stored ones.
Fig. 14.19 Sequence of images acquired in the learning phase. The crosses show the extracted features used to build their respective p²-invariants. The images (from top to bottom) represent the sub-landmarks v0, …, v3, respectively
In Fig. 14.20 we show an example of the four sub-landmarks recognized. The values of those sub-landmarks are

v′0 = 2.0804e1 + 2.0892e2 + 2.0069e3 + 2.0096e4 + 2.0002e5,
v′1 = 2.5744e1 + 2.0611e2 + 2.3071e3 + 2.5270e4 + 2.2623e5,
v′2 = 2.5771e1 + 2.4412e2 + 2.3406e3 + 2.7491e4 + 2.7916e5,
v′3 = 2.7266e1 + 2.6606e2 + 2.0407e3 + 2.0138e4 + 2.0071e5.
14.10.6 Quantitative Results

To evaluate the proposed framework quantitatively, we set up an experiment consisting of 30 navigation trials to measure the sub-landmark recognition accuracy. The results are shown in Table 14.1, where the abbreviations mean: (SL) sub-landmark, (CR) correct recognitions, (MR) missed recognitions (when one sub-landmark was reported as another one), (FP) false positives (when a sub-landmark was reported erroneously), and (FN) false negatives (when a sub-landmark was missed). As we can observe, the algorithm achieves a 90.833% rate of correct landmark recognition.
Fig. 14.20 Sequence of images acquired in the recognition phase. Observe that the images were taken with the robot navigating in the opposite direction to the learning phase. The red crosses show the extracted features used to build their respective p²-invariants. The images (from top to bottom) represent the sub-landmarks v′0, …, v′3, respectively

Table 14.1 Landmark recognition results

SL     CR #    CR %      MR #   MR %    FP #   FP %    FN #   FN %
v0     25      83.3      2      6.67    2      6.67    3      10
v1     28      93.3      1      3.33    0      0       1      3.33
v2     27      90        1      3.33    1      3.33    2      6.67
v3     29      96.6      0      0       1      3.33    1      3.33
Mean   27.25   90.833    1      3.33    1      3.33    1.75   5.83
14.11 Conclusions In this chapter, we presented the applications of projective invariants using n-uncalibrated cameras. We first showed how projective invariants can be used to compute the view center of a moving camera. We also developed geometric rules for a task of visually guided grasping. Our purpose here was to motivate the reader to apply invariants from a geometric point of view and take advantage of the power of the algebra of incidence.
Next, using a trifocal-tensor-based projective invariant, we developed a method for the computation of projective depths, which in turn were used to initialize an SVD procedure for the projective reconstruction of shape and motion. Further, we applied the rules of the algebra of incidence to complete the reconstruction, for example, in the critical case of occluded points. The main contribution of this chapter is that we were able to demonstrate a simple way to unify current approaches for the computation and application of projective invariants using n uncalibrated cameras. We formulated Pascal's theorem as a projective invariant of conics. This invariant was then used to solve the camera calibration problem. Simple illustrations, such as camera self-localization and visually guided grasping, show the potential of the use of projective invariants, points at infinity, and geometric rules developed using the algebra of incidence. The most important application given is the projective reconstruction of shape and motion. Remarkably, the use of trifocal-tensor-based projective invariants allows for the computation of the projective depths required for the initialization of the SVD procedure to compute shape and motion. In this way, we were able to link the use of n-view-based projective invariants with SVD projective reconstruction methods. The bracket algebra involved in the computation of the presented projective invariants is noise-sensitive. We believe that computing as we did is a promising approach for high-level geometric reasoning, especially if better ways to cope with the noise can be found. One promising approach, which should be explored further, might be to express the bracket equations as polynomials and then to look for their Gröbner basis. This helps to calculate the exact number of real solutions that satisfy certain geometric constraints arising from physical conditions.
This chapter also presents advances in theoretical issues concerning omnidirectional vision. We define our computational model without reference to any coordinate system, using only the relationships between its geometric objects (i.e., the point, plane, and sphere). We have shown that we can define our model in a more general and simpler way than in the case of using matrices or tensors. This allows an easier implementation in more complex applications. As an interesting application, we showed how to recover projective invariants from a catadioptric image using the inverse projection of the UM and how to use them to construct the p²-invariants. As a real application of this technique, we presented a visual landmark identification system using projective and permutation p²-invariants for autonomous robot navigation.
Chapter 15
Registration of 3D Points Using GA and Tensor Voting
This chapter presents a noniterative algorithm that combines the power of expression of geometric algebra with the robustness of tensor voting to find the correspondences between two sets of 3D points with an underlying rigid transformation.
15.1 Problem Formulation

Using the geometric algebra of 3D space G_{3,0,0}, the rigid motion of a 3D point x = xe1 + ye2 + ze3 can be formulated as

x′ = R̃xR + t, (15.1)
where R is a rotor and t = tx e1 + ty e2 + tz e3. For simplicity, we will represent R as a rotor of the form

R = q0 + qx e23 + qy e31 + qz e12. (15.2)
By left-multiplication with R, the equation of rigid motion becomes

Rx′ = xR + Rt, (15.3)
(q0 + qx e23 + qy e31 + qz e12)(x′e1 + y′e2 + z′e3) = (xe1 + ye2 + ze3)(q0 + qx e23 + qy e31 + qz e12) + (q0 + qx e23 + qy e31 + qz e12)(tx e1 + ty e2 + tz e3).

Developing products, we get

q0 x′e1 + qx x′e231 + qy x′e3 − qz x′e2 + q0 y′e2 − qx y′e3 + qy y′e312 + qz y′e1 + q0 z′e3 + qx z′e2 − qy z′e1 + qz z′e123
= xq0 e1 + yq0 e2 + zq0 e3 + xqx e123 + yqx e3 − zqx e2 − xqy e3 + yqy e231 + zqy e1 + xqz e2 − yqz e1 + zqz e312
+ q0 tx e1 + qx tx e231 + qy tx e3 − qz tx e2 + q0 ty e2 − qx ty e3 + qy ty e312 + qz ty e1 + q0 tz e3 + qx tz e2 − qy tz e1 + qz tz e123. (15.4)
Rearranging terms according to their multivector basis, we obtain the following four equations:

e1:   q0 x′ + qz y′ − qy z′ = q0 x + qy z − qz y + q0 tx + qz ty − qy tz,
e2:   q0 y′ + qx z′ − qz x′ = q0 y + qz x − qx z + q0 ty + qx tz − qz tx,
e3:   q0 z′ + qy x′ − qx y′ = q0 z + qx y − qy x + q0 tz + qy tx − qx ty,
e123: qx x′ + qy y′ + qz z′ = qx x + qy y + qz z + qx tx + qy ty + qz tz.

These equations can be rearranged to express linear relationships in the joint difference and sum spaces:

e1:   q0(x′ − x) − qy(z + z′) + qz(y + y′) + (qy tz − q0 tx − qz ty) = 0, (15.5)
e2:   q0(y′ − y) + qx(z + z′) − qz(x + x′) + (qz tx − q0 ty − qx tz) = 0, (15.6)
e3:   q0(z′ − z) − qx(y + y′) + qy(x + x′) + (qx ty − q0 tz − qy tx) = 0, (15.7)
e123: qx(x′ − x) + qy(y′ − y) + qz(z′ − z) − (qx tx + qy ty + qz tz) = 0. (15.8)
These equations clearly represent four 3D planes in the entries of the rotor and the translator (the unknowns). Thus, in order to estimate the correspondences due to a rigid transformation, we can use a set of tentative correspondences to populate the joint spaces {(x′ − x), (z + z′), (y + y′)}, {(y′ − y), (z + z′), (x + x′)}, {(z′ − z), (y + y′), (x + x′)}, and {(x′ − x), (y′ − y), (z′ − z)}. If four planes appear in these spaces, then the points lying on them are related by a rigid transformation. However, we show in the following section that the first three planes, of Eqs. 15.5, 15.6, and 15.7, are related by a powerful geometric constraint. So it is enough to find the plane described by Eq. 15.8 and verify that it satisfies this constraint. We now show how this is done and how this geometric constraint helps in eliminating multiple matches too.
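A small numeric check of this construction, under assumed sample data: populating the difference joint space {(x′ − x), (y′ − y), (z′ − z)} for points under a single rigid motion and fitting a plane by SVD recovers, up to sign, the rotation axis as the plane normal (cf. Eq. 15.8):

```python
import numpy as np

def rodrigues(axis, angle):
    """Rotation matrix about a unit axis (Rodrigues' formula)."""
    a = np.asarray(axis, dtype=float)
    a = a / np.linalg.norm(a)
    K = np.array([[0.0, -a[2], a[1]],
                  [a[2], 0.0, -a[0]],
                  [-a[1], a[0], 0.0]])
    return np.eye(3) + np.sin(angle) * K + (1.0 - np.cos(angle)) * (K @ K)

rng = np.random.default_rng(0)
axis = np.array([1.0, 2.0, 2.0]) / 3.0          # unit rotation axis (assumed)
R, t = rodrigues(axis, 0.7), np.array([0.3, -0.1, 0.5])
X = rng.standard_normal((50, 3))                # points before the motion
Xp = X @ R.T + t                                # points after the motion
D = Xp - X                                      # joint-space tokens of Eq. 15.8
# Best-fit plane normal of the centred tokens via SVD
_, _, Vt = np.linalg.svd(D - D.mean(axis=0))
normal = Vt[-1]                                 # ~ +/- the rotation axis
```

All true correspondences satisfy a·(x′ − x) = a·t, so the tokens lie on one plane whose normal is the rotation axis; outliers would scatter off this plane.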
15.1.1 The Geometric Constraint

Let (xi, x′i) and (xj, x′j) be two pairs of points in correspondence through a rigid transformation. Then these points satisfy Eq. 15.5:

q0 dx − qy sz + qz sy + qk = 0, (15.9)
q0 d′x − qy s′z + qz s′y + qk = 0, (15.10)
where dx = x′i − xi, sz = z′i + zi, sy = y′i + yi; d′x = x′j − xj, s′z = z′j + zj, s′y = y′j + yj; and qk = qy tz − q0 tx − qz ty. If we subtract these equations, we get

q0 vx − qy vsz + qz vsy = 0, (15.11)
where vx = dx − d′x, vsz = sz − s′z, and vsy = sy − s′y. Using the definition of R, this equation can be rewritten as

k vx − ay vsz + az vsy = 0, (15.12)
where k = cos(θ/2)/sin(θ/2). Using a similar procedure for Eqs. 15.6 and 15.7, we end up with the following system of equations:

k vx + az vsy − ay vsz = 0, (15.13)
k vy + ax vsz − az vsx = 0, (15.14)
k vz + ay vsx − ax vsy = 0, (15.15)
where vy, vz, and vsx are defined accordingly. Note that we now have a system of equations depending on the unitary axis of rotation [ax, ay, az]. Since we can obtain the axis of rotation as the normal of the plane described by Eq. 15.8, we have only one unknown: k. These equations can be combined to yield the following three constraints:

vy(ay vsz − az vsy) − vx(az vsx − ax vsz) = 0, (15.16)
vz(ay vsz − az vsy) − vx(ax vsy − ay vsx) = 0, (15.17)
vz(az vsx − ax vsz) − vy(ax vsy − ay vsx) = 0. (15.18)
These equations depend only on the points themselves and on the plane spanned by them. Thus, if we populate the joint space described by Eq. 15.8 with a set of tentative correspondences and detect a plane in this space, we can verify whether this plane corresponds to an actual rigid transformation by checking that the points lying on it satisfy Eqs. 15.16, 15.17, and 15.18. Note that these constraints never become undefined, because the factor k was removed. So this test can always be applied to confirm or reject a plane that seems to represent a rigid transformation. Furthermore, since these constraints have been derived from the original plane equations (15.5), (15.6), and (15.7), they are, in a sense, expressing the requirement that these points lie simultaneously on all three planes. On the other hand, Eqs. 15.16, 15.17, and 15.18 have an interesting geometric interpretation: they are in fact expressing a double cross product that is only satisfied for true correspondences. To see this, note that if A = [ax, ay, az]^T, V = [vx, vy, vz]^T, and Vs = [vsx, vsy, vsz]^T, then Eqs. 15.16, 15.17, and 15.18 can be rewritten as a vector equation of the form

V × (A × Vs) = 0. (15.19)
This equation holds only due to an inherent symmetry that is present only for true correspondences; in other words, these equations can be used to reject false matches too. To prove this, first remember the well-known fact that any rigid motion in 3D is equivalent to a screw motion (a rotation about the screw axis followed by a translation along it). Hence, without loss of generality, we can consider the case where the screw axis is aligned with the z-axis. In this case, the screw motion consists of a rotation about z followed by a translation tz along it. Therefore,

vz = dz − d′z = z′i − zi − z′j + zj = (zi + tz) − zi − (zj + tz) + zj = 0. (15.20)
Fig. 15.1 (a) and (d) Two correspondences belonging to the same transformation and the geometry of the plane of rotation. (b) and (e) Two correspondences belonging to different transformations (different angle) and their corresponding geometry. (c) and (f) Multiple correspondence cases and their geometry
Also, note that since A = [0, 0, 1]^T, the first cross product of Eq. 15.19 is A × Vs = [−vsy, vsx, 0]^T; hence the vsz-component of Vs is irrelevant in this case and can be safely disregarded. Thus, we can analyze this problem in 2D by looking only at the x- and y-components of V and Vs. Accordingly, the difference and sum vectors will only have two components, di = [dx, dy]^T, dj = [d′x, d′y]^T, si = [sx, sy]^T, and sj = [s′x, s′y]^T. The situation is illustrated in Fig. 15.1a,d. Since the angle between xi and x′i is the same as the angle between xj and x′j, the parallelograms spanned by the sum and difference of these points (the dashed lines in Fig. 15.1d) are equal up to a scale factor. That is, there is a scale factor k such that k‖si‖ = ‖sj‖ and k‖di‖ = ‖dj‖, whence ‖si‖/‖di‖ = ‖sj‖/‖dj‖. In turn, this means that the triangle formed by the vectors di, dj, and V is proportional to the triangle formed by si, sj, and Vs. And since, by construction, si ⊥ di and sj ⊥ dj, then V ⊥ Vs. Now, let us return to Eq. 15.19. The cross product A × Vs has the effect of rotating Vs by 90° since A = [0, 0, 1]^T in this case. But since Vs ⊥ V, the vector A × Vs will be parallel to V and, hence, their cross product will always be 0, which is consistent with the analytic derivation.
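This derivation can be checked numerically. The following sketch (helper names are ours; the screw axis is chosen along z, as in the proof) evaluates the residual of Eq. 15.19 for a pair of true correspondences and for a pair with mismatched rotation angles:

```python
import numpy as np

def constraint_residual(xi, xpi, xj, xpj, axis):
    """Residual of the double cross product constraint of Eq. 15.19,
    V x (A x Vs) = 0, which vanishes only for true correspondences."""
    V = (xpi - xi) - (xpj - xj)     # difference of the difference vectors
    Vs = (xpi + xi) - (xpj + xj)    # difference of the sum vectors
    return np.linalg.norm(np.cross(V, np.cross(axis, Vs)))

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

A = np.array([0.0, 0.0, 1.0])            # screw axis aligned with z
t = np.array([0.2, -0.4, 0.7])
xi, xj = np.array([1.0, 0.0, 0.5]), np.array([0.0, 1.0, -0.2])
good = constraint_residual(xi, rot_z(0.9) @ xi + t,
                           xj, rot_z(0.9) @ xj + t, A)   # same motion
bad = constraint_residual(xi, rot_z(0.9) @ xi + t,
                          xj, rot_z(0.3) @ xj + t, A)    # different angle
```

For the true pair the residual is zero to machine precision, while the mismatched pair gives a clearly nonzero residual, illustrating how the constraint rejects false matches.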
This symmetry is broken if we have points that belong to different transformations, as shown in Fig. 15.1b,e (we assume the worst case, in which points xi and xj had the same translation applied but a different rotation angle θ; if a different translation is present, the planes of motion will be different for both points, breaking the symmetry). Note how V and Vs are no longer orthogonal (Fig. 15.1e). In a similar way, when we have multiple correspondences, that is, xi matches both x′i and x′j, i ≠ j (Fig. 15.1c), the symmetry is also broken and V is not orthogonal to Vs (see Fig. 15.1f). Hence, the constraint expressed by Eq. 15.19 can be used to reject multiple matches too. Following this procedure, we were able to cast the problem of finding the correspondences between two sets of 3D points related by a rigid transformation into the problem of finding a 3D plane, in a joint space, that satisfies three geometric constraints. In order to find a 3D plane in a set of points that may contain a large proportion of outliers, several methods can be used. We have decided to use tensor voting because it has proven to be quite robust and because it can be used to detect general surfaces too, which in turn enables us to easily extend our method to nonrigid motion estimation. We will not explain tensor voting in depth here; the reader can consult [134] for a general introduction to this subject.
15.2 Tensor Voting Tensor voting is a methodology for the extraction of dense or sparse features from n-dimensional data. Some of the features that can be detected with this methodology include lines, curves, points of junction, and surfaces. The tensor voting methodology is grounded in two elements: tensor calculus for data representation and tensor voting for data communication. Each input site propagates its information in a neighborhood (the information itself is encoded as a tensor and is defined by a predefined voting field). Each site collects the information cast there by its neighbors and analyzes it, building a saliency map for each feature type. Salient features are located at local extrema of these saliency maps, which can be extracted by nonmaximal suppression. For the present work, we found that sparse tensor voting was enough to solve the problem. Since we are only concerned with finding 3D planes, we limit our discussion to the detection of this type of feature. We refer the interested reader to [134] for a complete description of the methodology.
15.2.1 Tensor Representation in 3D In tensor voting, all points are represented as second-order symmetric tensors. To express a tensor S , we choose to take the associated quadratic form, and diagonalize
Fig. 15.2 Graphic representation of a second-order 3D symmetric tensor
it, leading to a representation based on the eigenvalues λ1, λ2, and λ3 and the eigenvectors e1, e2, and e3. Therefore, we can write the tensor S as
S = (e1 e2 e3) diag(λ1, λ2, λ3) (e1 e2 e3)^T. (15.21)
Thus, a symmetric tensor can be visualized as an ellipsoid where the eigenvectors correspond to the principal orthonormal directions of the ellipsoid and the eigenvalues encode the magnitude along each of the eigenvectors (see Fig. 15.2). For the rest of this chapter, we use the convention that the eigenvectors have been arranged so that λ1 > λ2 > λ3. In this scheme, points are encoded as ball tensors (i.e., tensors with eigenvalues λ1 = λ2 = λ3 = 1), curvels as plate tensors (i.e., tensors with λ1 = λ2 = 1 and λ3 = 0, tangent direction given by e3), and surfels as stick tensors (i.e., λ1 = 1, λ2 = λ3 = 0, normal direction given by e1). A ball tensor encodes complete uncertainty of direction; a plate tensor encodes uncertainty of direction in two axes, but complete certainty in the other one; and a stick tensor encodes absolute certainty of direction. Tensors that lie between these three extremes encode differing degrees of direction certainty. The point-ness of any given tensor is represented by λ3, the curve-ness by λ2 − λ3, and the surface-ness by λ1 − λ2. Also, note that a second-order tensor only encodes direction, but not orientation; that is, two vectors v and −v will be encoded as the same second-order tensor.
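The eigendecomposition and the three saliencies can be sketched as follows (the helper and dictionary key names are ours, not the book's):

```python
import numpy as np

def saliencies(S):
    """Decompose a second-order symmetric 3D tensor (Eq. 15.21) into its
    point-ness, curve-ness and surface-ness saliencies."""
    w, V = np.linalg.eigh(S)               # eigenvalues in ascending order
    l1, l2, l3 = w[2], w[1], w[0]
    return {"point": l3, "curve": l2 - l3, "surface": l1 - l2,
            "normal": V[:, 2]}             # e1: direction of largest eigenvalue

# A stick tensor (surfel) with normal n encodes full certainty of direction;
# a ball tensor encodes no preferred direction at all.
n = np.array([0.0, 0.0, 1.0])
stick = np.outer(n, n)                     # l1 = 1, l2 = l3 = 0
ball = np.eye(3)                           # l1 = l2 = l3 = 1
```

Note that `stick` is unchanged if n is replaced by −n, matching the direction-versus-orientation remark above.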
15.2.2 Voting Fields in 3D

We have just seen how the various types of input data are encoded in tensor voting. Now we describe how these tensors communicate among themselves. The input usually consists of a set of sparse points. These points are encoded as ball tensors if no information is available about their direction (i.e., identity matrices with λ1 = λ2 = λ3 = 1). If only a tangent is available, the points are encoded as plate tensors (i.e., tensors with λ1 = λ2 = 1 and λ3 = 0). Finally, if information about the normal of the point is given, it is encoded as a stick tensor (i.e., a tensor with only one nonzero eigenvalue: λ1 = 1, λ2 = λ3 = 0). Then, each encoded input point, or token, communicates with its neighbors using either a ball voting field (if no orientation is
Fig. 15.3 The osculating circle and the corresponding normals of the voter and votee
Fig. 15.4 The fundamental voting field. (a) Shape of the direction field when the voter p is located at the origin and its normal is parallel to the y-axis (i.e., np = y). (b) Strength at each location for the previous case. White denotes high strength, black denotes no strength. (c) 3D display of the strength field, where the strength has been mapped to the z coordinate
present), a plate voting field (if local tangents are available), or a stick voting field (when the normal is available). The voting fields themselves consist of various types of tensors ranging from stick to ball tensors. These voting fields have been derived from a 2D fundamental voting field that encodes the constraints of surface continuity and smoothness, among others. To see how the fundamental voting field was derived, suppose that we have a voter p with an associated normal np. At each votee site x surrounding the voter p, the direction nx of the fundamental field is determined by the normal at x of the osculating circle that passes through p and x and has normal np at p (see Fig. 15.3). The saliency decay function DF(s, κ, σ) of the fundamental field depends on the arc length s = θl/sin θ and the curvature κ = 2 sin θ/l between p and x (see Fig. 15.3) and is given by the following Gaussian function:

DF(s, κ, σ) = e^{−(s² + cκ²)/σ²}, (15.22)
where σ is a scale factor that determines the overall rate of attenuation and c is a constant that controls the decay with high curvature. Note that the strength of the field becomes negligible beyond a certain distance; in this way, each voting field has an effective neighborhood associated with it, given by σ. The shape of the fundamental field can be seen in Fig. 15.4. In this figure, the direction and strength fields are displayed separately. The direction field shows the eigenvectors with the largest associated eigenvalues for each tensor surrounding the voter (center). The strength
field shows the value of the largest eigenvalue around the voter: white denotes a strong vote, black denotes no intensity at all (a zero eigenvalue). Finally, both orientation and strength are encoded as a stick tensor. In other words, each site around this voting field, or votee, is represented as a stick tensor with varying strength and direction. Communication is performed by the addition of the stick tensor present at the votee and the tensor produced by the field at that site. To exemplify the voting process, imagine we are given an input point x and a voter located at the point p with associated normal np. The input point (votee) is first encoded as a ball tensor (λ1 = λ2 = λ3 = 1). Then the vote generated by p on x is computed. This vote is, in turn, a stick tensor. To compute this vote, Eq. 15.22 is used to compute the strength of the vote (λ1). The direction of the vote (e1) is computed through the osculating circle between the voter and the votee (using the voter's associated normal np). The other eigenvalues (λ2 and λ3) are set to zero and then the stick tensor is computed using Eq. 15.21. Finally, the resulting stick tensor vote is added, with an ordinary matrix addition, to the encoded ball tensor at x. Since, in general, the stick vote has only one nonzero eigenvalue, the resulting addition produces a non-identity matrix with one eigenvalue much larger than the others (λ1 > λ2 and λ1 > λ3). In other words, the ball tensor at x becomes an ellipsoid elongated in the direction given by the stick tensor. The larger the first eigenvalue of the voter is, the more pronounced this ellipsoid becomes. To speed things up, however, these calculations are not done in practice. Instead, the set of votes surrounding the stick voting field is precomputed and stored using a discrete sampling of the space. When the voting is performed, these precomputed votes are simply aligned with the voter's associated normal np and the actual vote is computed by linear interpolation.
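A direct (non-precomputed) stick vote along these lines might be sketched as below; the 45° cutoff and the default σ and c values are common conventions in tensor voting implementations, assumed here rather than taken from the text:

```python
import numpy as np

def stick_vote(p, n_p, x, sigma=1.0, c=0.1):
    """Stick tensor vote cast by voter p (unit normal n_p) at votee x,
    using the osculating-circle construction of Fig. 15.3 and the decay
    function of Eq. 15.22."""
    v = x - p
    l = np.linalg.norm(v)
    if l == 0.0:
        return np.outer(n_p, n_p)
    v_hat = v / l
    sin_t = float(np.dot(n_p, v_hat))       # sin of the angle above the tangent
    theta = np.arcsin(np.clip(sin_t, -1.0, 1.0))
    if abs(theta) > np.pi / 4:              # assumed cutoff: drop votes > 45 deg
        return np.zeros((3, 3))
    s = l * theta / np.sin(theta) if theta != 0.0 else l   # arc length
    kappa = 2.0 * np.sin(theta) / l                        # curvature
    decay = np.exp(-(s**2 + c * kappa**2) / sigma**2)      # Eq. 15.22
    n_x = 2.0 * sin_t * v_hat - n_p         # osculating-circle normal at x
    return decay * np.outer(n_x, n_x)
```

The vote is accumulated at the votee by plain matrix addition, exactly as described above for the ball-tensor-encoded input point.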
Note that the stick voting field (or fundamental voting field) can be used to detect surfaces. In order to detect joints, curves, and other features, different voting fields must be employed. These other 3D voting fields can be generated by rotating the fundamental voting field about the x-, y-, and z-axes, depending on the type of field we wish to generate. For example, if the voter is located at the origin and its normal is parallel to the y-axis, as in Fig. 15.4, then we can rotate this field about the y-axis to generate the 3D stick voting field, as shown in Fig. 15.5. In a more formal way, let us define the general rotation matrix R_θφψ, where θ, φ, and ψ stand for the angles of rotation about the x-, y-, and z-axes, respectively, and let VS(x) stand for the tensor vote cast by a stick voting field in 3D at site x. Then VS(x) can be defined as

VS(x) = ∫₀^π R_θφψ Vf(R_θφψ⁻¹ p) R_θφψ^T dφ   (θ = ψ = 0), (15.23)
where Vf(x) stands for the vote cast by the fundamental voting field in 2D at site x. The stick voting field can be used when the normals of the points are available. However, when no orientation is provided, we must use the ball voting field. This field is produced by rotating the stick field about all the axes and integrating the
Fig. 15.5 The stick voting field in 3D. (a) The direction of the 3D stick voting field when the voter is located at the origin and its normal is parallel to the y-axis. Only the e1 eigenvectors are shown at several positions. (b) The strength of this field. White denotes high strength, black denotes no strength
Fig. 15.6 The ball voting field. (a) Direction field when the voter is located at the origin. (b) Strength of this field. White denotes high strength, black denotes no strength. (c) 3D display of the strength field with the strength mapped to the z-axis
contributions at each site surrounding the voter. For example, for the field depicted in Fig. 15.4, the 2D ball voting field is generated by rotating this field about the z-axis (as shown in Fig. 15.6). This 2D ball voting field is further rotated about the y-axis to generate the 3D ball voting field. In other words, the 2D ball voting field Vb(x) at site x can be defined as

Vb(x) = ∫₀^π R_θφψ Vf(R_θφψ⁻¹ p) R_θφψ^T dψ   (θ = φ = 0), (15.24)
and the 3D ball voting field VB(x) can thus be further defined as

VB(x) = ∫₀^π R_θφψ Vb(R_θφψ⁻¹ p) R_θφψ^T dφ   (θ = ψ = 0), (15.25)

or, alternatively, as

VB(x) = ∫₀^π ∫₀^π R_θφψ Vf(R_θφψ⁻¹ p) R_θφψ^T dψ dφ   (θ = 0). (15.26)
Note that, by rotating the stick voting field about all axes and adding up all vote contributions, the shape of the votes in the ball voting field varies smoothly from nearly stick tensors at the edge (λ1 = 1, λ2 = λ3 = 0) to ball tensors near the center of the voter (λ1 = λ2 = λ3 = 1). Thus, this field consists of ellipsoid-type tensors of varying shape. This is the reason why this field is not simply a "radial stick tensor field" with stick tensors pointing radially away from the center. However, the added complexity of rotating the stick tensor voting field to generate this field does not impact the implementation. As discussed previously, in practice this field is precomputed at discrete intervals and linearly interpolated when necessary. Finally, the ball voting field can be used to infer the preferred orientation (normals) at each input point if no such information is present to begin with. After voting with the ball voting field, the eigensystem is computed at each input point, and the eigenvector with the greatest eigenvalue is taken as the preferred normal direction at that point. With the normals at each input point thus computed, a further stick voting step can be used to reinforce the points that seem to lie on a surface. Surface detection is precisely what we need in order to solve the original 3D point registration problem. We now describe the process of surface detection used in our approach.
15.2.3 Detection of 3D Surfaces

Now that we have defined the tensors used to encode the information and the voting fields, we can describe the process used to detect the presence of a 3D surface in a set of points. We limit our discussion of this process to the stages of tensor voting that are relevant to our work. We refer the interested reader to [134] for a full account of feature inference through tensor voting. As described previously, the input to our algorithm is a 3D space populated by a set of putative correspondences between two point sets. To avoid confusion, we refer to the points in the joint space simply as "tokens." Each of these tokens is encoded as a unitary ball tensor (i.e., with the identity matrix I_{3×3}). Then we place a ball voting field at each input token and cast votes to all its neighbors. Since the strength of the ball voting field becomes negligible after a certain distance (given by the free parameter σ in Eq. 15.22), we only need to cast votes to the tokens that lie within a small neighborhood about each input token. To cast a vote, we simply add the tensor present at the votee to the tensor produced by the ball voting field at that position. This process constitutes a sparse ball voting stage. Once this stage is finished, we can examine the eigensystem left at each token and thus extract the preferred normals at each site. The preferred normal direction is given by the eigenvector e1, and the saliency of this orientation is given by λ1 − λ2. After this step, each token has an associated normal. The next step consists of using the 3D stick voting field to cast votes to the neighbors so that the normals are reinforced. In order to cast a stick vote, the 3D stick voting field is first placed on the voter and oriented to match its normal. Once again, the extent of this field is limited by the parameter σ, so we only need to cast votes to the tokens that lie within a
small neighborhood of the voter. After the votes have been cast, the eigensystem at each token is computed to obtain the new normal orientation and strength at each site. This process constitutes a sparse stick voting stage. In ordinary tensor voting, the eigensystem at each token is used to compute different saliency maps: point-ness λ3, curve-ness λ2 − λ3, and surface-ness λ1 − λ2. Then, the derivative of these saliency maps is computed and nonmaximal suppression is used to locate the most salient features. After this step, the surfaces are polygonized using a marching cubes algorithm (or similar). However, our objective in this case was not the extraction of polygonized surfaces, but simply the location of the most salient surface. Hence, a simple thresholding technique was used instead. The token with the greatest saliency is located and the threshold is set to a small percentage of this saliency. Thus, for example, tokens with a small λ1 relative to this token are discarded. In a similar fashion, tokens with a small surface-ness (λ1 − λ2) with respect to this token are also deleted. After the sparse stick voting is performed, only the tokens that seem to belong to surfaces (i.e., λ1 is not small and λ1 − λ2 is high) cast votes to their neighbors to further reinforce the surface-ness of the tokens. Input tokens that do not belong to surfaces are discarded (set to 0_{3×3}). This process is repeated a fixed number of times with increasing values of σ in order to make the surface(s) grow. In this way, there is high confidence that the tokens that have not been discarded after the repeated application of the sparse stick voting stage belong to a surface.
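The thresholding on surface-ness just described can be sketched as follows; the sample tensors and the retained fraction are illustrative assumptions:

```python
import numpy as np

def surface_tokens(tensors, frac=0.1):
    """Keep only tokens whose surface-ness (l1 - l2) reaches `frac` of the
    most salient token's, mirroring the simple thresholding used in place
    of full dense extraction."""
    surf = []
    for S in tensors:
        w = np.linalg.eigvalsh(S)      # eigenvalues in ascending order
        surf.append(w[2] - w[1])       # surface-ness l1 - l2
    surf = np.asarray(surf)
    return surf >= frac * surf.max()

# Tokens on a surface accumulate strong, aligned stick votes; isolated
# outliers keep a nearly isotropic (ball-like) tensor.
n = np.array([0.0, 0.0, 1.0])
on_plane = 5.0 * np.outer(n, n) + 0.2 * np.eye(3)   # l1 - l2 = 5
outlier = np.eye(3)                                 # l1 - l2 = 0
mask = surface_tokens([on_plane, on_plane, outlier])
```

Tokens filtered out in one pass are excluded from voting in the next pass, so repeated application with growing σ lets the surviving surface grow, as described above.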
15.2.4 Estimation of 3D Correspondences

Given two sets of 3D points X₁ and X₂, we are expected to find the correspondences between these two sets assuming that a rigid transformation has taken place and that there is an unspecified number of outliers in each set. No other information is given. In the absence of better information, we populate the joint space (x′ − x), (y′ − y), (z′ − z) by matching all points from the first set with all the points from the second set. Note that this allows us to detect any motion regardless of its magnitude, but the number of outliers present in the joint space is multiplied 100-fold by this matching scheme. The tokens in the joint space thus populated are then processed with tensor voting in order to detect the most salient plane, as described in the previous section. The plane thus detected is further tested against the constraint given by Eq. 15.19. This constraint requires the specification of two different tokens. In practice, we use the token on the plane with the highest saliency and test it against the rest of the points on the plane. If any pair of tokens does not satisfy the constraint, we remove it from the plane. If not enough points remain after this pruning is completed, we reject the plane. Remember that the enforcement of this constraint also avoids the presence of false or multiple matches, so the output is a one-to-one set of correspondences. As can easily be noted, Eq. 15.8 collapses for the case of pure translation. However, in this case, all points that belong to a rigid transformation will tend to cluster
together in a single token in the joint space. This cluster is easy to detect after the sparse ball voting stage because these tokens will have a large absolute saliency value at that stage. If such a cluster is found, we stop the algorithm and produce the correspondences based on the tokens that were found clustered together. Following this simple procedure, we can detect any rigid transformation. Note, however, that using the simple matching scheme mentioned earlier, the number of outliers in the joint space is multiplied 100-fold. When the number of outliers in the real space is large (for example, on the order of 90%), this can complicate the detection of surfaces in the joint space. When such a situation arises, we have adopted a scheme where several rotation angles and axes are tested in a systematic fashion in order to make the detection process simpler. In this variation of the algorithm, the set X₁ is first rotated according to the current axis and angle to be tested, and the joint space specified by Eq. 15.8 is populated again. We then run the detection process using tensor voting. If a plane with enough support is found, we stop the algorithm and output the result. Otherwise, the next angle and rotation axis are tested until a solution is found or all the possibilities have been tested. The whole algorithm for the detection of correspondences between two 3D point sets under a single rigid transformation is sketched below. Finally, note that this variation of the algorithm is only needed when large numbers of outliers are present in the input, as stated previously.

Algorithm
1. Initialize the rotation angle α = 0° and the axis A = [0, 0, 1]ᵀ.
2. Rotate the set X₁ according to α and A. Populate the voting space with tokens generated from the candidate correspondences.
3. Initialize all tokens to ball tensors.
4. Perform sparse ball voting and extract the preferred normals.
5.
Check for the presence of a set of tokens clustered about a single point in space. If such a cluster is found, finish and output the corresponding translation detected.
6. Perform sparse stick voting using the preferred normals. Optionally, repeat this step a fixed number of times to eliminate outliers. After each iteration, increase the reach of the votes slightly, so as to make the plane grow.
7. Obtain the equation of the plane described by the tokens with the highest saliency. Enforce the constraint of Eq. 15.19 and delete the tokens that do not satisfy it.
8. If a satisfactory plane is found, output the correspondences. Otherwise, increment α and A, and repeat steps 2 to 7 until all angles and axes of rotation have been tested.

Finally, we recognize that the scheme proposed here is far from perfect. The exhaustive search over all angles and rotation axes can be quite time-consuming, and appears to be a little simplistic. Unfortunately, the density of the plane we are seeking varies with the angle of the rotation applied to the set of points. That is, the density of this plane is minimal (the plane spans the full voting space) when the rotation is 180°, and it becomes infinite when we have pure translation (all the points of the plane cluster in a single location in space). Hence, there does not seem to be
some type of heuristic or constraint we can apply to prune the search. An alternative is to use the other three search spaces described in Eqs. 15.5, 15.6, and 15.7 and perform tensor voting to detect these planes in order to improve the search method. This is a matter for future research. However, this disadvantage only matters if the magnitude of the transformation is unbounded. Algorithms like ICP require that the transformation be relatively small; if we impose the same limitation on our method, we do not need this exhaustive search, and our method works without the iterative scheme. Finally, we have shown in [161] that our algorithm has a worst-case complexity of O(n²), where n is the number of tokens in the voting space; the worst case never occurs in practice because it implies that each token casts a vote to every other token.
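As a concrete illustration of the joint space and of step 5 of the algorithm, the following sketch populates the space (x′ − x, y′ − y, z′ − z) with all candidate correspondences and detects a pure-translation cluster. The real algorithm uses the ball voting saliency; here plain vote counting on rounded coordinates is a simplified stand-in, and the function names and the tolerance are ours.

```python
import numpy as np
from itertools import product

def joint_space(X1, X2):
    """All candidate correspondences mapped to (x'-x, y'-y, z'-z)."""
    return np.array([q - p for p, q in product(X1, X2)])

def translation_cluster(tokens, tol=1e-6):
    """Step 5 of the algorithm: under pure translation, many tokens
    coincide at a single point of the joint space.  The saliency test
    of the real algorithm is replaced here by vote counting on
    coordinates rounded to the given tolerance."""
    keys, counts = np.unique(np.round(tokens / tol) * tol,
                             axis=0, return_counts=True)
    best = int(np.argmax(counts))
    return keys[best], int(counts[best])

X1 = np.array([[0.0, 0, 0], [1, 0, 0], [0, 2, 0], [3, 1, 4]])
t = np.array([0.5, -1.0, 2.0])
X2 = np.vstack([X1 + t, [[9.0, 9, 9]]])    # translated copy plus one outlier
trans, votes = translation_cluster(joint_space(X1, X2))
```

The four true matches pile up at the translation vector, while every other candidate pair lands at an isolated token.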
15.3 Experimental Analysis

When dealing with real data, the input points usually carry some amount of noise. This noise in turn affects the shape and thickness of the plane that has to be detected in the joint space. Thus, instead of producing an ideal plane in the joint space, noisy points yield a “fuzzy” plane with small variations over its surface. In spite of this, tensor voting can still be used successfully to detect these “fuzzy” surfaces. Also, the constraint given by Eq. 15.19 has to be relaxed in order to avoid rejecting points that, due to the noise, do not seem to be in correspondence. This relaxation is accomplished by checking that the equation yields a small absolute value instead of zero.
15.3.1 Correspondences Between 3D Points by Rigid Motion

We performed the following experiments. First, we followed the position of a robotic arm in 3D through a sequence of stereo pairs (Fig. 15.7 shows only the left images of the sequence). In this case, the problem of 3D reconstruction is considered already solved, and the input to our algorithm is the set of 3D points of this reconstruction. In practice, ordinary camera calibration and stereo matching (through cross-correlation) were performed to achieve the 3D reconstruction. The model of the object was picked by hand from the first reconstruction, and we then computed the position of the arm in the subsequent reconstructions using an optimized version of our algorithm (namely, it consisted of only two stages, sparse ball voting and sparse stick voting, with no iterations). Note that the sequence of images does not form a video; hence, the features cannot be tracked between successive frames due to the relatively large differences between the snapshots. After the motion was computed, the position of the arm in 3D was reprojected onto the images (drawn in white), as shown in Fig. 15.7.
Fig. 15.7 Sequence of (left) images from a stereo camera showing the position of the reprojected arm (in white lines). This is not a video
Fig. 15.8 (a)–(c) Sets to be aligned. (d)–(f) Sets after alignment. (g) Closeup of the surface of the model
In a second experiment, we made a reconstruction of a styrofoam model of a head using a stereo camera. The two reconstructions are shown in Fig. 15.8a–c, and the aligned sets can be seen in Fig. 15.8d–f. In this case, however, another optimization was used. Since the sets are close to each other and the points provide enough structure, we used tensor voting to compute the preferred normals at each site. The computation of the normals proceeds as in standard sparse tensor voting: first, we initialized each point to a ball tensor; then, we placed a ball voting field on each point and cast votes to all the neighbors; finally, the preferred normal at each site is obtained by computing the eigensystem at each point and selecting the eigenvector with the greatest eigenvalue. A closeup of the surface of the model
and some of the normals found by this method is shown in Fig. 15.8g. We used this information to prune the candidate matches to only those that shared a relatively similar orientation. Also, note that in this case there are nonrigid differences between the two sets. This can be noticed in places like the chin, where the alignment could not be made simply because the size of this section of the reconstruction differs slightly between the two sets (the overall height of the head is slightly larger in the second set). Hence, it is impossible to match all the points at the same time. However, in spite of this, our algorithm still yields a reasonable solution. In practice, two different surfaces are formed in the joint space: one corresponds to a rigid transformation that matches the forehead and the nose, and the other corresponds to the transformation that aligns the chins of both models. We have chosen to display the first solution, where the upper part of the head is correctly aligned; this solution also corresponds to the largest surface in the joint space. The main point of this experiment is to show that our algorithm still works even when the input cannot be perfectly aligned with a single rigid transformation. Finally, in our last experiment, we took a model of a Toyota car acquired with a laser range scanner and aligned it with a noisy reconstruction obtained with a stereo camera. The noisy target is shown in Fig. 15.9a, the model and the target are shown in Fig. 15.9b, and the final alignment in Fig. 15.9c. The procedure is the same as in the previous case. Again, since the data sets provided structure, we used it to our advantage by computing the preferred normals with tensor voting and pruning the candidate matches as described previously (Fig. 15.9d).
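The normal-based pruning used in the last two experiments can be sketched as follows. The function name and the threshold are ours, and the normals are assumed to have been computed beforehand with the sparse tensor voting described above.

```python
import numpy as np

def prune_by_normals(pairs, normals1, normals2, min_dot=0.9):
    """Keep only the candidate matches (i, j) whose surface normals
    share a relatively similar orientation.  The threshold min_dot is
    a free parameter of this sketch."""
    kept = []
    for i, j in pairs:
        # Orientation is unsigned, so compare |n1 . n2|.
        if abs(np.dot(normals1[i], normals2[j])) >= min_dot:
            kept.append((i, j))
    return kept

n1 = np.array([[0.0, 0.0, 1.0]])
n2 = np.array([[0.0, 0.0, -1.0], [1.0, 0.0, 0.0]])
matches = prune_by_normals([(0, 0), (0, 1)], n1, n2)
```

Pruning in this way shrinks the joint space before voting, which is what makes the optimized two-stage version of the algorithm practical.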
15.3.2 Multiple Overlapping Motions and Nonrigid Motion

Another advantage our method has over ICP and similar methods is the ability to detect multiple overlapping motions simultaneously. This is also true for the 3D case: each different motion simply produces another plane in the voting space. There are limitations to the motions that can be differentiated, though. A quick analysis of Eq. 15.8 reveals that if two different motions share the same axis of rotation and the same overall translation, then they will span the same 3D plane in the voting space. However, in these circumstances, it suffices to analyze the other three voting spaces (Eqs. 15.5, 15.6, and 15.7) to disambiguate this case. To illustrate this, we present a synthetic example where three overlapping motions with different axes of rotation, angles, and translations were generated in a 10 × 10 × 10 cube centered at the origin (see Fig. 15.10). Our algorithm is applied as sketched in Sect. 15.2.4. However, after the first plane was detected, we removed its tokens from the voting space and repeated the process until no more planes were found. This is, of course, the naive implementation of the solution; the algorithm can be modified to account for the presence of multiple planes. In that case, only the final stage, where the constraint from Eq. 15.19 is enforced, would be executed separately for each set of points.
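The naive multi-motion loop just described (detect the most salient plane, remove its tokens, repeat) can be sketched as below. `detect_plane` is a placeholder for the tensor-voting plane detector of Sect. 15.2; the toy detector used in the example simply groups tokens by height.

```python
import numpy as np

def detect_multiple_motions(tokens, detect_plane, min_support=5):
    """Detect the most salient plane, remove its tokens from the
    voting space, and repeat until no plane with enough support
    remains.  detect_plane returns indices into its input."""
    remaining = np.arange(len(tokens))
    motions = []
    while len(remaining) >= min_support:
        on_plane = detect_plane(tokens[remaining])
        if len(on_plane) < min_support:
            break
        motions.append(remaining[on_plane])
        remaining = np.delete(remaining, on_plane)
    return motions

def flat_plane_detector(tk):
    """Toy stand-in for the tensor-voting detector: pick the tokens
    sharing the most frequent (rounded) z value."""
    vals, counts = np.unique(np.round(tk[:, 2]), return_counts=True)
    z = vals[np.argmax(counts)]
    return np.where(np.abs(tk[:, 2] - z) < 0.5)[0]

# Two horizontal "motion planes" of 20 and 15 tokens.
tokens = np.vstack([
    np.c_[np.arange(20) % 5, np.arange(20) // 5, np.zeros(20)],
    np.c_[np.arange(15) % 5, np.arange(15) // 5, 5.0 * np.ones(15)],
])
motions = detect_multiple_motions(tokens, flat_plane_detector)
```

Each detected index set corresponds to one rigid motion, exactly as in Fig. 15.10b–d.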
Fig. 15.9 (a) Target for alignment, note the noisy surface. (b) Model displayed over the target. (c) Model and data after alignment. (d) Closeup of the surface of the model showing some of the normals computed with tensor voting
Fig. 15.10 (a) Three overlapping rigid motions in 3D. (b)–(d) The different motions as detected by our algorithm
15.3.3 Extension to Nonrigid Motion

While it can still be argued that, with some work, the Hough transform might also be used to detect the same plane we obtain through tensor voting, there is another
Fig. 15.11 (a) Nonrigid motion applied to a 3D plane. (b) and (c) The curved surface that was generated in the voting space from two different view points. (d) and (e) The resulting correspondences found with our algorithm seen from two different viewpoints
advantage to using the latter over the former: tensor voting enables us to find general surfaces. This means that we can also detect certain nonrigid motions, which produce nonplanar surfaces in the voting spaces. To illustrate this, we generated a synthetic plane and then applied a twist transformation to it (see Fig. 15.11a). This transformation produces a curved surface in the voting space (clearly visible in the center of Fig. 15.11b–c); a closeup of the surface is also presented in Fig. 15.12. The surface is easily detected using tensor voting, and the resulting correspondences, from two different viewpoints, can be seen in Fig. 15.11d–e. In order to detect this surface, we had to modify our algorithm as follows. The first two stages (sparse ball voting and sparse stick voting) are performed as usual. However, in the last stage, Eq. 15.19 was not enforced globally, but only locally around each active token. In other words, we enforced the presence of rigid transformations only at a local level. It must be remembered that Eq. 15.19 depends on two points; therefore, for each token that was verified, we used the closest active neighbor. We illustrate this in Fig. 15.12, where the token xᵢ is being verified using its closest neighbor, xⱼ. The normals of the tokens are also shown. A simpler version of this algorithm was also used to solve the correspondence problem in catadioptric images. In this case, the 2D images are mapped to a 3D sphere. In this space, the corresponding corners of the 2D images form a curved 3D surface that is easily detected using tensor voting. The resulting correspondences can be seen in Fig. 15.13.
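The local verification step can be sketched as follows. Since Eq. 15.19 involves chapter-specific quantities, the residual function below is a placeholder, and the function names and tolerance are ours; the real algorithm also restricts the search to active tokens.

```python
import numpy as np

def verify_locally(tokens, residual, tol=1e-3):
    """Verify every token against its closest neighbor only, as done
    for nonrigid motions.  residual(a, b) should return the value of
    the rigidity constraint (Eq. 15.19) for a token pair."""
    active = np.ones(len(tokens), dtype=bool)
    for i in range(len(tokens)):
        d = np.linalg.norm(tokens - tokens[i], axis=1)
        d[i] = np.inf                       # exclude the token itself
        j = int(np.argmin(d))
        if abs(residual(tokens[i], tokens[j])) > tol:
            active[i] = False
    return active

# Placeholder residual: difference in height, so a token far above the
# locally flat patch fails the check against its closest neighbor.
pts = np.array([[0.0, 0, 0], [1, 0, 0], [0, 1, 0], [1, 1, 0], [0, 0, 10]])
active = verify_locally(pts, lambda a, b: a[2] - b[2])
```

Only tokens whose closest pair passes the local test stay active, which is what allows the curved surface of Fig. 15.12 to survive.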
Fig. 15.12 A closeup of the surface corresponding to an elastic motion. The constraints of Eq. 15.19 are only verified locally between the closest point pairs. In this figure, token xᵢ is verified with its closest neighbor, xⱼ. Other pairs to be verified are also highlighted in the figure
Fig. 15.13 Solving the correspondence problem in catadioptric images via a spherical mapping. White circles were successfully registered onto the black and white circles
15.4 Conclusions

This chapter presented a noniterative algorithm that combines the expressive power of geometric algebra with the robustness of tensor voting to find the correspondences between two sets of 3D points with an underlying rigid transformation.
This algorithm was also shown to work with very large numbers of outliers in both sets. We have also used geometric algebra to derive a set of constraints that serves a double purpose: on the one hand, it lets us decide whether or not the current plane corresponds to a rigid motion; on the other hand, it allows us to reject multiple matches and enforce the uniqueness constraint. The algorithm does not require an initialization (though it can benefit from one), it works equally well for large and small motions, and it can easily be extended to account for multiple overlapping motions and even certain nonrigid transformations. It must be noted that our algorithm can detect multiple overlapping motions, whereas current solutions only work for one global motion. We have also shown that our algorithm can work with data sets that present small nonrigid deformations. In the unconstrained case, with a large number of outliers (83–90%), the algorithm can take several minutes to finish. However, in most real-life applications these extreme circumstances are not found, and a good initialization can be computed. When these conditions are met, our algorithm can be rather fast. Another aspect of our 3D algorithm is that we must rotate one of the sets of points in order to make the plane we are looking for in the voting space denser. We are currently exploring ways to make this more efficient. However, in spite of this apparent disadvantage, our algorithm works even without initialization, unlike algorithms such as ICP. Furthermore, our algorithm can be used to initialize subsequent refinement stages with ICP, thus solving the problem of providing a good initialization for that algorithm.
Chapter 16
Applications in Neuralcomputing
In this chapter, we present a series of experiments that demonstrate the capabilities of geometric neural networks. We show examples of learning a highly nonlinear mapping and of prediction. In the second part, we include experiments on multi-class classification, object recognition, and robot trajectory interpolation using CSVMs.
16.1 Experiments Using Geometric Feedforward Neural Networks

We begin with an analysis of the XOR problem by comparing a real-valued MLP with bivector-valued MLPs. In the second experiment, we used the encoder–decoder problem to analyze the performance of the geometric MLPs in different geometric algebras. In the third experiment, we used the Lorenz attractor to perform step-ahead prediction.
16.1.1 Learning a Highly Nonlinear Mapping

The power of using bivectors for learning is confirmed by the test with the XOR function. Figure 16.1 shows that the geometric nets GMLP₀,₂,₀ and GMLP₂,₀,₀ have a faster convergence rate than either the MLP or the P-QMLP (the quaternionic multilayer perceptron of Pearson [146], which uses the activation function given by Eq. 10.5). Figure 16.1 shows the MLP with two- and four-dimensional input vectors. Since the MLP(4), also working in 4D, cannot outperform the GMLP, it can be claimed that the better performance of the geometric neural network is due not to the higher-dimensional quaternionic inputs but rather to the algebraic advantages of the geometric neurons of the net.
Fig. 16.1 Learning XOR using the MLP(2), MLP(4), GMLP₀,₂,₀, GMLP₂,₀,₀, and P-QMLP
16.1.2 Encoder–Decoder Problem

The encoder–decoder problem is another interesting benchmark for analyzing the performance of three-layer neural networks. The training input patterns are equal to the output patterns: the neural network learns in its hidden neurons a compressed binary representation of the input vectors, in such a way that the net can decode it at the output layer. We tested real- and multivector-valued MLPs using sigmoid transfer functions in the hidden and output layers. Two different kinds of training sets, consisting of one input neuron and of multiple input neurons, were used (see Table 16.1). Since the sigmoids have asymptotic values of 0 and 1, the output training values were numbers near 0 or 1. Figure 16.2 shows the mean square error (MSE) during the training of the G₀₀ (a real-valued 8–8–8 MLP working in G₀,₀,₀) and the geometric MLPs: G₃₀ working in G₃,₀,₀, G₀₃ in G₀,₃,₀, and G₃₀₁⁺ in the degenerate algebra G⁺₃,₀,₁ (the algebra of the dual quaternions). For the one-input case, the multivector-valued MLP network is a three-layer network with one neuron in each layer, that is, a 1–1–1 network. Each neuron has the dimension of the geometric algebra used; for example, in the figure, G₀₃ corresponds to a neural network working in G₀,₃,₀ with eight-dimensional neurons. For the case of multiple input patterns, the network is a three-layer network with three input neurons, one hidden neuron, and one output neuron, that is, a 3–1–1 network. The training method used for all neural nets was the batch momentum
Table 16.1 Test of real- and multivector-valued MLPs using (top) one input and one output, and (middle) three inputs and one output. Each training pattern is an eight-dimensional vector with 0.97 in one position and 0.03 in the others, and each output pattern equals its input pattern:

Pattern 1: Input = Output = (0.97, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03)
Pattern 2: Input = Output = (0.03, 0.97, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03)
Pattern 3: Input = Output = (0.03, 0.03, 0.97, 0.03, 0.03, 0.03, 0.03, 0.03)
Pattern 4: Input = Output = (0.03, 0.03, 0.03, 0.97, 0.03, 0.03, 0.03, 0.03)
Pattern 5: Input = Output = (0.03, 0.03, 0.03, 0.03, 0.97, 0.03, 0.03, 0.03)
Pattern 6: Input = Output = (0.03, 0.03, 0.03, 0.03, 0.03, 0.97, 0.03, 0.03)
Pattern 7: Input = Output = (0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.97, 0.03)
Pattern 8: Input = Output = (0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.03, 0.97)

For the three-input case, the same near-one-hot patterns are used, with the input distributed over the three input neurons.
learning rule. We can see that in both experiments, the real-valued MLP exhibits the worst performance. Since the MLP has eight inputs and the multivector-valued networks have effectively the same number of inputs, the geometric MLPs are not being favored by a higher dimensional coding of the patterns. Thus, we can attribute the better performance of the multivector-valued MLPs solely to the benefits of the Clifford geometric products involved in the pattern processing through the layers of the neural network.
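The role of the geometric product in a multivector neuron can be illustrated with a quaternion neuron. This is a generic sketch of the idea (the Hamilton product replacing the real-valued weighted sum, with a component-wise sigmoid, a common choice for multivector networks), not the exact neuron model of the GMLPs above; all names are ours.

```python
import math

def qmul(a, b):
    """Hamilton (geometric) product of quaternions (w, x, y, z)."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (aw*bw - ax*bx - ay*by - az*bz,
            aw*bx + ax*bw + ay*bz - az*by,
            aw*by - ax*bz + ay*bw + az*bx,
            aw*bz + ax*by - ay*bx + az*bw)

def quaternion_neuron(w, x, theta):
    """y = f(w x + theta): quaternion weight and input combined with
    the geometric product, then a sigmoid applied component-wise."""
    s = qmul(w, x)
    return tuple(1.0 / (1.0 + math.exp(-(si + ti)))
                 for si, ti in zip(s, theta))
```

Unlike a real-valued weighted sum, the product w x mixes all four components of the input through the algebra, which is the kind of coupling the text credits for the better performance of the multivector-valued MLPs.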
Fig. 16.2 MSE for the encoder–decoder problem with (top) one input neuron and (bottom) three input neurons
16.1.3 Prediction

Let us show another application of a geometric multilayer perceptron: distinguishing the geometric information in a chaotic process. In this case, we used the well-known Lorenz attractor (σ = 3, r = 26.5, and b = 1), with initial conditions [0, 1, 0] and a sample rate of 0.02 s. A 3–12–3 MLP and a 1–4–1 GMLP₀,₂,₀ were trained on the interval 12–17 s to perform an 8-step-ahead prediction. The next 750
Fig. 16.3 (a) Training error; (b) prediction by the GMLP₀,₂,₀ and expected trend; (c) prediction by the MLP and expected trend
samples unseen during training were used for the test. Figure 16.3a shows the error during training; note that the GMLP₀,₂,₀ converges faster than the MLP. It is interesting to compare the prediction capability of the nets. Figure 16.3b,c shows that the GMLP₀,₂,₀ predicts better than the MLP. By analyzing the covariance parameters of the MLP [0.96815, 0.67420, 0.95675] and of the GMLP₀,₂,₀ [0.9727, 0.93588, 0.95797], we see that the MLP requires more time to acquire the geometry involved in the second variable, which is why its convergence is slower. As a result, the MLP loses its ability to predict well on the other side of the loop (see Fig. 16.3c). In contrast, the geometric net is able to capture the geometric characteristics of the attractor from an early stage, so it does not fail even when it has to predict on the other side of the loop.
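The training data for this experiment can be reproduced from the stated parameters (σ = 3, r = 26.5, b = 1), initial condition [0, 1, 0], and 0.02 s sampling. The choice of a fixed-step RK4 integrator is an assumption; the text only gives the parameters and the sample rate.

```python
def lorenz_series(n, sigma=3.0, r=26.5, b=1.0, dt=0.02, s0=(0.0, 1.0, 0.0)):
    """Sample the Lorenz system x' = sigma*(y - x), y' = x*(r - z) - y,
    z' = x*y - b*z with the parameters used in the text, integrated
    with fixed-step RK4 (integrator choice is an assumption)."""
    def f(s):
        x, y, z = s
        return (sigma * (y - x), x * (r - z) - y, x * y - b * z)

    out = [s0]
    for _ in range(n - 1):
        s = out[-1]
        k1 = f(s)
        k2 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k1)))
        k3 = f(tuple(si + 0.5 * dt * ki for si, ki in zip(s, k2)))
        k4 = f(tuple(si + dt * ki for si, ki in zip(s, k3)))
        out.append(tuple(si + dt / 6.0 * (a + 2 * c + 2 * d + e)
                         for si, a, c, d, e in zip(s, k1, k2, k3, k4)))
    return out

series = lorenz_series(1600)
```

At this rate, the 12–17 s training window corresponds to samples 600 to 850, with the following 750 samples held out for the test, as in the text.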
16.2 Experiments Using Clifford Support Vector Machines

In this section, we present three experiments. The first shows multi-class classification using a CSVM on a simulated example; here, we also discuss the number of variables computed by each approach and compare the training times of the CSVM against two approaches for multi-class classification using real-valued SVMs.
The second concerns multi-class object classification with two types of training data: phase (a), artificial data, and phase (b), real data obtained from a stereo vision system. We also compared the CSVM against MLPs (for multi-class classification). The third experiment presents a multi-class interpolation.
16.2.1 3D Spiral: Nonlinear Classification Problem

We extended the well-known 2D spiral problem to 3D space. This experiment tests whether the CSVM is able to separate five 1D manifolds embedded in R³. In this application, we used a quaternion-valued CSVM with the geometric algebra G₀,₂,₀.¹ This allows us to have quaternion inputs and outputs; therefore, with one output quaternion we can represent as many as 2⁴ = 16 classes. The functions were generated as follows:

f_i(t) = [x_i(t), y_i(t), z_i(t)] = [z_i cos(θ) sin(θ), z_i sin(θ) sin(θ), z_i cos(θ)], i = 1, ..., 5,

where, in Matlab code, θ = linspace(0, 2*pi, 32), z₁ = 4*linspace(0, 10, 32) + 1, z₂ = 4*linspace(0, 10, 32) + 10, z₃ = 4*linspace(0, 10, 32) + 20, z₄ = 4*linspace(0, 10, 32) + 30, and z₅ = 4*linspace(0, 10, 32) + 40. To depict these vectors, they were normalized by 10. In Fig. 16.4, one can see that the problem is highly nonlinearly separable. The CSVM uses for training 50 input quaternions from each of the five functions; since these have three coordinates, we simply use the bivector part of the quaternion, namely x_i = x_i(t) e₂e₃ + y_i(t) e₃e₁ + z_i(t) e₁e₂ ≡ [0, x_i(t), y_i(t), z_i(t)]. The CSVM used the kernel given by Eq. 10.53. Note that the CSVM indeed managed to separate the five classes.

Comparisons Using the 3D Spiral

According to [103], the most-used methods for multi-class classification are one-against-all [27], one-against-one [110], DAGSVM [153], and methods that solve the multi-class problem in one step, known as all-together methods [192]. Table 16.2 shows a comparison of the number of computed variables per approach, including the CSVM.
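The five spiral classes can be generated directly from the Matlab expressions above. This is a sketch in Python/numpy, assuming the three coordinates share the single angle parameter θ; the normalization by 10 follows the text.

```python
import numpy as np

def spiral_classes(n=32):
    """Generate the five 3D spiral classes of the experiment,
    with theta = linspace(0, 2*pi, n) and the radial profiles
    z_k = 4*linspace(0, 10, n) + offset for offsets 1, 10, 20, 30, 40."""
    theta = np.linspace(0.0, 2.0 * np.pi, n)
    classes = []
    for offset in (1.0, 10.0, 20.0, 30.0, 40.0):
        z = 4.0 * np.linspace(0.0, 10.0, n) + offset
        f = np.stack([z * np.cos(theta) * np.sin(theta),
                      z * np.sin(theta) * np.sin(theta),
                      z * np.cos(theta)], axis=1)
        classes.append(f / 10.0)           # normalized by 10, as in the text
    return classes

classes = spiral_classes()
```

Each generated point would then be encoded as the bivector part [0, x, y, z] of an input quaternion for the CSVM.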
The experiments shown in [103] indicate that “one-against-one and DAG methods are more suitable for practical use than the other methods”; we have chosen
¹ The dimension of this geometric algebra is 2² = 4.
Fig. 16.4 3D spiral with five classes. The marks represent the support multivectors found by the CSVM
Table 16.2 Number of variables per approach

Approach                                | NQP      | NVQP | TNV
CSVM                                    | 1        | D·N  | D·N
One-against-all                         | K        | N    | K·N
One-against-one                         | K(K−1)/2 | 2N/K | N(K−1)
DAGSVM                                  | K(K−1)/2 | 2N/K | N(K−1)
A method considering all data at once   | 1        | K·N  | K·N

NQP = number of quadratic problems to solve. NVQP = number of variables to compute per quadratic problem. TNV = total number of variables. D = training input data dimension. N = total number of training examples. K = number of classes.
to implement the one-against-one approach and the earliest implementation for SVM multi-class classification, the one-against-all approach, to compare them with our proposed CSVM. Table 16.3 shows three cases in which the CSVM was compared with three different approaches for multi-class classification; see [104, 106]. The comparisons were made using the 3D spiral toy example and the quaternion CSVM described in the last section. The number of classes was increased in each experiment; we started with K = 3 classes and 150 training inputs per class. Since the training inputs have three coordinates, we simply use the bivector part of the quaternion for the CSVM approach, namely x_i = x_i(t) e₂e₃ + y_i(t) e₃e₁ + z_i(t) e₁e₂ ≡ [0, x_i(t), y_i(t), z_i(t)]; therefore, the CSVM computes D·N = 3·150 = 450 variables. The one-against-all and one-against-one approaches compute 450 and 300 variables, respectively; however, the training times of the CSVM and the one-against-one
Table 16.3 Training time per approach (seconds)

Approach                              | K=3, N=150 | K=5, N=250   | K=16, N=800
CSVM, C = 1,000                       | 0.10 (450) | 2.12 (750)   | 22.83 (3,200)
One-against-all, (C, σ) = (1,000, 2³) | 0.18 (450) | 10.0 (1,250) | 152.5 (12,800)
One-against-one, (C, σ) = (1,000, 2²) | 0.11 (300) | 3.42 (1,000) | 42.73 (12,000)
DAGSVM, (C, σ) = (1,000, 2³)          | 0.11 (300) | 4.81 (1,000) | 48.55 (12,000)

The number of variables computed is given in parentheses. K = number of classes, N = number of training examples. The kernels K(x_i, x_j) = e^(−σ||x_i − x_j||) were used with σ taken from {2⁻¹, 2⁰, 2¹, 2², 2³} and costs C ∈ {1, 10, 100, 1,000, 10,000}; from these 5 × 5 = 25 combinations, the best result was selected for each approach.
approach are very similar in the first experiment. Note that when the number of classes increases, the performance of the CSVM is much better than that of the other approaches because the number of variables to compute is greatly reduced. For K = 16 and N = 800, the CSVM uses 3,200 variables; in contrast, the others require 12,000 or more, that is, four times as many. In general, we can see that the multi-class SVM approaches described in [118, 189, 192] focus on real-valued data without any assumption about intrinsic geometric characteristics, unlike ours. Their training methods are based on either a sequential optimization of k class functions or on minimizing the misclassification rate, which often involves a very demanding optimization. In contrast, the CSVM classifier extends the binary real-valued SVM in a natural manner to a MIMO CSVM without increasing the complexity of the optimization. Next, we improved the computational efficiency of all these algorithms; namely, we accelerated the computation of the Gram matrix by utilizing the “decomposition method” [104] and the “shrinking technique” [106]. We can see in Table 16.4 that the CSVM, using a quarter of the variables, is still faster, with around a quarter of the processing time of the other approaches. The classification performance of the four approaches is presented in Table 16.5; see [104, 106]. We used 50 vectors per class during training and 20 per class during testing. We can see that the CSVM has the best overall classification performance.
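The variable counts quoted above can be checked against the TNV column of Table 16.2 with a small helper. D is the input dimension used by the CSVM encoding (3 for the bivector encoding of the spiral data); the function name is ours.

```python
def total_variables(approach, K, N, D=3):
    """Total number of variables (TNV column of Table 16.2) for K
    classes, N training examples, and CSVM input dimension D."""
    return {
        "csvm": D * N,
        "one-against-all": K * N,
        "one-against-one": N * (K - 1),
        "dagsvm": N * (K - 1),
    }[approach]
```

For K = 3 and N = 150 this reproduces the 450/450/300 counts of the text, and for K = 16 and N = 800 the 12,800 and 12,000 figures of Table 16.3.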
16.2.2 Object Recognition

In this subsection, we present an application of the Clifford SVM to multi-class object classification. We use only one CSVM with a quaternion as input and a quaternion as output, which allows us to have up to 2⁴ = 16 classes. Basically, we pack into a feature quaternion one 3D point (which lies on the surface of the object) and the
Table 16.4 Training time per approach (seconds) using the acceleration techniques

Approach                               K=3, N=150    K=5, N=250     K=16, N=800
                                       (variables)   (variables)    (variables)
CSVM, C=1,000                          0.07 (450)    0.987 (750)    10.07 (3,200)
One-against-all, (C, γ)=(1,000, 2^3)   0.11 (450)    8.54 (1,250)   131.24 (12,800)
One-against-one, (C, γ)=(1,000, 2^2)   0.09 (300)    2.31 (1,000)   30.86 (12,000)
DAGSVM, (C, γ)=(1,000, 2^3)            0.10 (300)    3.98 (1,000)   38.88 (12,000)

K = number of classes, N = number of training examples. Kernels K(x_i, x_j) = exp(−γ‖x_i − x_j‖) with parameters γ taken from {2^{-1}, 2^0, 2^1, 2^2, 2^3} and costs C ∈ {1, 10, 100, 1,000, 10,000}. From these 5 × 5 = 25 combinations, the best result was selected for each approach.
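The model selection described in the table footnote — try every (γ, C) pair and keep the best — can be sketched as follows. The scoring function below is a stand-in (our assumption); in the book each combination trains the corresponding SVM and measures its accuracy.

```python
from itertools import product

# Sketch of the (gamma, C) grid search used for Tables 16.4-16.5: every
# combination is evaluated and the best-scoring one is kept. The score
# function here is a toy placeholder, not the book's SVM training.

def grid_search(gammas, costs, score):
    return max(product(gammas, costs), key=lambda gc: score(*gc))

gammas = [2**-1, 2**0, 2**1, 2**2, 2**3]
costs = [1, 10, 100, 1000, 10000]

# toy score peaking at gamma = 2, C = 1000
score = lambda g, c: -((g - 2)**2 + (c - 1000)**2 / 1e6)
print(grid_search(gammas, costs, score))  # (2, 1000)
```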
Table 16.5 Performance in training and test using the acceleration techniques

Approach                               K=3             K=5             K=16
                                       Ntrain=150      Ntrain=250      Ntrain=800
                                       Ntest=60        Ntest=100       Ntest=320
CSVM, C=1,000                          98.66 (95)      99.2 (98)       99.87 (99.68)
One-against-all, (C, γ)=(1,000, 2^3)   96.00 (90)      98.00 (96)      99.75 (99.06)
One-against-one, (C, γ)=(1,000, 2^3)   98.00 (95)      98.4 (99)       99.87 (99.375)
DAGSVM, (C, γ)=(1,000, 2^3)            97.33 (95)      98.4 (97)       99.87 (99.68)

Ntrain = number of training vectors, Ntest = number of test vectors, K = number of classes. Percent accuracy in training (first) and, in brackets, percent accuracy in test.
magnitude of the distance between this point and the point lying on the main axis of the object at the same level curve. Figure 16.5 depicts the four features taken from the object:

X_i = δ_i + x_i e_23 + y_i e_31 + z_i e_12 ≅ [δ_i, (x_i, y_i, z_i)^T].   (16.1)
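The packing of Eq. (16.1) can be sketched directly: the scalar part holds the magnitude δ_i and the bivector parts e_23, e_31, e_12 hold the 3D point. The function name and array layout below are our own choice for illustration; the book only fixes the algebraic packing.

```python
import numpy as np

# Sketch of the feature quaternion of Eq. (16.1): scalar part = magnitude,
# bivector parts = 3D point coordinates.

def feature_quaternion(delta, point):
    x, y, z = point
    return np.array([delta, x, y, z])  # [scalar, e23, e31, e12]

q = feature_quaternion(2.5, (0.1, 0.4, 0.9))
print(q)  # coefficients [2.5, 0.1, 0.4, 0.9]
```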
For each object, we trained the CSVM using a set of several feature quaternions obtained from different level curves; that is, each object is represented by several feature quaternions, not just one. For this reason, the order in which the feature quaternions are given to the CSVM is important: we begin to sample data
16 Applications in Neuralcomputing
Fig. 16.5 Geometric characteristics of two training objects. The magnitude δ_i and the 3D coordinates (x_i, y_i, z_i) build the feature vector [δ_i, (x_i, y_i, z_i)]
from the bottom to the top of the objects, and we give the training and test data to the CSVM maintaining this order. We processed the sequence of input quaternions one by one and accumulated their outputs with a counter that, after the sequence, determines which class fired most often and thus decides which class the object belongs to; see Fig. 16.6a. Note that this experiment is in any case a challenge for any object-recognition algorithm, because the feature signature is sparse. We will show later that, using this kind of feature vector, the CSVM's performance is superior to that of the MLP and of the real-valued-SVM-based approaches. Of course, if more effort is spent improving the quality of the feature signature, the CSVM's performance will increase accordingly. It is important to note that all the objects (synthetic and real) were preprocessed to have a common center at the same scale. Therefore, our learning process can be seen as centered and scale-invariant. Phase (a) Synthetic Data. In this experiment, we used training data obtained from synthetic objects; the training set is shown in Fig. 16.7. Note that we have six different objects, which means a six-class classification problem, and we solve it with only one CSVM, making use of its multi-output characteristic. In general,
Fig. 16.6 (a) After the outputs are obtained, they are accumulated by a counter to determine which class the object belongs to. (b) The lower line represents the CSVM, while the increasing line represents the real-valued SVM one-vs-all approach
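The winner-take-all voting of Fig. 16.6a can be sketched in a few lines. The per-quaternion classifier is abstracted here as a plain list of predicted labels (an assumption for illustration; in the book each label is a CSVM output).

```python
from collections import Counter

# Sketch of the counter in Fig. 16.6a: each feature quaternion of an object
# yields one class prediction, and the class that fires most often wins.

def winner_class(predictions):
    votes = Counter(predictions)
    return votes.most_common(1)[0][0]

# e.g. 5 of 7 feature quaternions voted for class 2
print(winner_class([2, 2, 0, 2, 1, 2, 2]))  # 2
```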
Fig. 16.7 (a)–(f) Training synthetic object set
for the "one versus all" approach, one needs n SVMs (one per class). In contrast, the CSVM needs only one machine because its quaternion output allows up to 16 class outputs; see Fig. 16.6b. For the input-data coding, we used the non-normalized 3D point, which is packed into the e_23, e_31, e_12 basis of the feature quaternion, while the magnitude was packed into the scalar part of the quaternion (see Eq. 16.1). Figure 16.8 shows the 3D points sampled from the objects. We compared
Fig. 16.8 (a)–(f) Sampling of the training synthetic object set
Table 16.6 Object-recognition performance in percent (%) during training using the acceleration techniques

Object   NTS   CSVM C=1,200   MLP     1-vs-all (a)   1-vs-1 (b)   DAGSVM (c)
C        86    93.02          48.83   87.2           90.69        90.69
S        84    89.28          46.42   89.28          (90.47)      (90.47)
F        84    85.71          40.47   83.33          84.52        83.33
W        86    91.86          46.51   90.69          91.86        (93.02)
D        80    93.75          50.00   87.5           91.25        90.00
C        84    86.90          48.80   82.14          83.33        84.52

C = cylinder, S = sphere, F = fountain, W = worm, D = diamond, C = cube. NTS = number of training vectors. Kernels K(x_i, x_j) = exp(−γ‖x_i − x_j‖) with parameters γ taken from {2^1, 2^2, 2^3, 2^4, 2^5} and costs C ∈ {150, 1,000, 1,100, 1,200, 1,400, 1,500, 10,000}. From these 8 × 5 = 40 combinations, the best result was selected for each approach. (a) (2^4, 1,500), (b) (2^3, 1,200), (c) (2^4, 1,400).
the performance of the following approaches: the CSVM, a 4-7-6 MLP, and the real-valued-SVM-based approaches one-against-one, one-against-all, and DAGSVM. The results in Tables 16.6 and 16.7 (see [104, 106]) show that the CSVM has better generalization and fewer training errors than the MLP and the real-valued-SVM-based approaches one-against-one, one-against-all, and DAGSVM. Note that all methods were sped up using the acceleration techniques [104, 106]; this extra procedure enhanced the accuracy of the CSVM for classification even further (see also Tables 16.3–16.4). The authors think that the MLP exhibits more training and generalization errors because the way we represent the objects (as feature quaternion
Table 16.7 Object-recognition accuracy in percent (%) during test using the acceleration techniques

Object   NTS   CSVM C=1,200   MLP     1-vs-all (a)   1-vs-1 (b)   DAGSVM (c)
C        52    94.23          80.76   90.38          (96.15)      96.15
S        66    87.87          45.45   83.33          (84.84)      86.36
F        66    90.90          51.51   83.33          86.36        84.84
W        66    89.39          57.57   86.36          83.33        86.36
D        58    93.10          55.17   93.10          93.10        93.10
C        66    92.42          46.96   89.39          90.90        89.39

C = cylinder, S = sphere, F = fountain, W = worm, D = diamond, C = cube. NTS = number of test vectors. K(x_i, x_j) = exp(−γ‖x_i − x_j‖), γ ∈ {2^1, 2^2, 2^3, 2^4, 2^5}, C ∈ {150, 1,000, 1,100, 1,200, 1,400, 1,500, 10,000}. (a) (2^4, 1,500), (b) (2^3, 1,200), (c) (2^4, 1,400).
Fig. 16.9 Stereo vision system and experiment environment
sets) makes the MLP get stuck in local minima very often during the learning phase, whereas the CSVM is guaranteed to find the optimal solution to the classification problem because it solves a convex quadratic problem with a global minimum. With respect to the real-valued-SVM-based approaches, the CSVM takes advantage of the Clifford product, which enhances the discriminatory power of the classifier itself, unlike the other approaches, which are based solely on inner products. Phase (b) Real Data. In this phase of the experiment, we obtained the training data using our robot "Geometer", shown in Fig. 16.9. We take two stereoscopic views of each object: one frontal view and one view rotated 180° (w.r.t. the frontal view). After that, we applied the Harris corner detector to each view to obtain the object's corners and then, with the stereo system, the 3D points (x_i, y_i, z_i) lying on the object surface, from which we calculate the magnitude δ_i for the quaternion equation (16.1). This process is illustrated in Fig. 16.10, and the whole training object set is shown in Fig. 16.11. We follow the method explained before and proceed as in phase a.1 (for synthetic objects), because we obtained better results than in phase a.2; that is, we take the non-normalized 3D point for the bivector basis e_23, e_31, e_12 of the feature quaternion in (16.1).
Fig. 16.10 (a) Frontal left and right views and their sampling views, (b) 180° rotated and sampling views. We use big white crosses for the depiction
Fig. 16.11 (a)–(f) Training real object set, stereo pair images. We include only the frontal views
After the training, we tested with a set of feature quaternions that the machine did not see during training, and we used the "winner take all" approach to decide which class the object belongs to. The results of training and test are shown in Table 16.8 and in Fig. 16.12. We trained the CSVM with an equal number of training data for each object, that is, 90 feature quaternions per object, but we tested with a different number of data for each object. Note that we have two pairs of objects that are very similar to each other; the first pair is composed of
Table 16.8 Experimental results using real data

Object                           Label          NTS   NES   CTS   %
Cube (Fig. 16.11a)               [1, 1, 1, 1]   90    50    38    76.00
Prism (Fig. 16.11b)              [1, 1, 1, 1]   90    43    32    74.42
Half sphere (Fig. 16.11c)        [1, 1, 1, 1]   90    44    29    65.90
Rock (Fig. 16.11d)               [1, 1, 1, 1]   90    75    63    84.00
Plastic bottle 1 (Fig. 16.11e)   [1, 1, 1, 1]   90    65    39    60.00
Plastic bottle 2 (Fig. 16.11f)   [1, 1, 1, 1]   90    67    41    61.20

NTS: number of training samples, NES: number of test samples, CTS: number of correctly classified test samples
Fig. 16.12 Robot takes the recognized object
the half sphere shown in Fig. 16.11c and the rock in Fig. 16.11d; in spite of their similarities, we obtained very good accuracy in the test phase for both objects: 65.9% for the half sphere and 84% for the rock. We think we obtained better results for the rock because this object has a lot of texture, which produces many corners; these in turn capture the irregularities better, so we have more test feature quaternions for the rock than for the half sphere (75 against 44, respectively). The second pair of similar objects is shown in Fig. 16.11e,f; these are two identical plastic juice bottles, but one of them (Fig. 16.11f) is burned, which distinguishes them and gives the CSVM enough distinguishing features to form two object classes, as shown in Table 16.8. We obtained 60% correctly classified test samples for the bottle in Fig. 16.11e against 61% for the burned bottle in Fig. 16.11f. The lower recognition rates for the last objects (Fig. 16.11c,e,f) occur because the CSVM mixes the classes somewhat, since the feature vectors are neither large nor discriminative enough.
Fig. 16.13 (a) and (b) Continuous curves of training output data for axes x and y (50 points). (c) 2D result of combining axes x and y (50 points). (d) 2D result of testing with 100 input data. (e) 2D result of testing with 400 input data. (f) Experiment environment
16.2.3 Multi-Case Interpolation A real-valued SVM can carry out regression and interpolation for multiple inputs and one real output. Remarkably, a Clifford-valued SVM can have multiple inputs and 2^n outputs for an n-dimensional space R^n. For regression we use 1.0 > ε > 0, where the diameter of the tube surrounding the optimal hyperplane is
2ε. For interpolation, we use ε = 0. We have chosen an interesting task in which we use a CSVM for interpolation in order to encode a certain kind of behavior we want a visually guided robot to perform. The robot should autonomously draw a complicated 2D pattern. This capability should be coded internally in long-term memory (LTM), so that the robot reacts immediately without the need for reasoning, similar to a skilled person, such as a tennis player or a tango dancer, who reacts in milliseconds with incredible precision to accomplish a very difficult task. For our purpose, we trained a CSVM offline using two real-valued functions. The CSVM used the geometric algebra G_3^+ (quaternion algebra). Two components of the quaternion input served as the two inputs, and two components of the quaternion output served as the two outputs. The first input u and first output x coded the relation x = a sin(3u) cos(u) for one axis. The second input v and second output y coded the relation y = a sin(3v) sin(v) for the other axis; see Fig. 16.13a,b. The 2D pattern can be drawn using the 50 points generated by the functions for x and y; see Fig. 16.13c. We tested whether the CSVM can interpolate well enough using 100 and 400 unseen input tuples {u, v}; see Fig. 16.13d,e, respectively. Once the CSVM was trained, we incorporated it as part of the LTM of the visually guided robot shown in Fig. 16.13f. To carry out its task, the robot called the CSVM for a sequence of input patterns. The robot was able to draw the desired 2D pattern, as seen in Fig. 16.14a–d. The reader should bear in mind that this experiment
Fig. 16.14 (a), (b), (c) Image sequence while robot is drawing. (d) Robot’s drawing. Result by testing with 400 input data
was designed using the equation of a standard function in order to have ground truth. Our algorithm, however, should also be able to learn 3D curves that do not have explicit equations.
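The training data for the drawing task above can be generated directly from the two stated functions. The parameter range [0, 2π) and the amplitude a = 5 below are our assumptions for illustration; the book states only the functions and the 50-point sampling.

```python
import math

# Sketch of the training data of Fig. 16.13: x = a*sin(3u)*cos(u) and
# y = a*sin(3v)*sin(v), sampled at 50 points (here u = v = t).

def pattern_points(a=5.0, n=50):
    pts = []
    for k in range(n):
        t = 2.0 * math.pi * k / n   # assumed parameter range
        x = a * math.sin(3 * t) * math.cos(t)
        y = a * math.sin(3 * t) * math.sin(t)
        pts.append((x, y))
    return pts

pts = pattern_points()
print(len(pts))   # 50
print(pts[0])     # (0.0, 0.0), since sin(0) = 0
```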
16.3 Conclusion This chapter generalizes real-valued MLPs and SVMs to Clifford-valued MLPs and SVMs, which are used for classification, regression, and interpolation. The CSVM accepts multiple multivector inputs and multivector outputs, like a MIMO architecture, which allows multi-class applications. We can use the CSVM over complex, quaternion, or hyper-complex numbers according to our needs. The application section presents experiments in pattern recognition and visually guided robotics that illustrate the power of the algorithms and help the reader understand the Clifford SVM and use it in various tasks of complex and quaternion signal and image processing, pattern recognition, and computer vision using high-dimensional geometric primitives. The extension of the real-valued SVM to the Clifford SVM appears promising, particularly in geometric computing and its applications such as graphics, augmented reality, robot vision, and humanoids.
Chapter 17
Neural Computing for 2D Contour and 3D Surface Reconstruction
In geometric algebra, there exist specific operators, named versors, to model rotations, translations, and dilations; they are called rotors, translators, and dilators, respectively. In general, a versor G is a multivector that can be expressed as the geometric product of nonsingular vectors:

G = ±v_1 v_2 ⋯ v_k.   (17.1)
In conformal geometric algebra, such operators are defined by (17.2), (17.3), and (17.4), R being the rotor, T the translator, and D the dilator:

R = e^{-(θ/2) b},   (17.2)
T = e^{(t e_∞)/2},   (17.3)
D = e^{(log(λ)/2) E},   (17.4)

where b is the bivector dual to the rotation axis, θ is the rotation angle, t ∈ E^3 is the translation vector, λ is the dilation factor, and E = e_∞ ∧ e_0. Such operators are applied to any entity of any dimension by multiplying the entity by the operator from the left and by the reverse of the operator from the right. Let X_i be any entity in CGA; then to rotate it we compute X'_1 = R X_1 R̃, to translate, X'_2 = T X_2 T̃, and to dilate, X'_3 = D X_3 D̃.
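The sandwich product X' = R X R̃ can be sketched for the rotation case using the quaternion representation of a 3D rotor. The rotor convention below (a unit quaternion for a rotation about the z axis) and all numbers are our own choices for illustration; the book applies the same left/right sandwich pattern to any CGA entity.

```python
import math

# Sketch of the sandwich product X' = R X R~ for a 3D rotation, with the
# rotor implemented as a unit quaternion and the point as a pure quaternion.

def qmul(a, b):
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return (w1*w2 - x1*x2 - y1*y2 - z1*z2,
            w1*x2 + x1*w2 + y1*z2 - z1*y2,
            w1*y2 - x1*z2 + y1*w2 + z1*x2,
            w1*z2 + x1*y2 - y1*x2 + z1*w2)

def rotate_about_z(point, theta):
    R = (math.cos(theta/2), 0.0, 0.0, math.sin(theta/2))  # rotor (quaternion)
    R_rev = (R[0], -R[1], -R[2], -R[3])                   # reverse R~
    X = (0.0,) + tuple(point)                             # pure quaternion
    _, x, y, z = qmul(qmul(R, X), R_rev)
    return (x, y, z)

x, y, z = rotate_about_z((1.0, 0.0, 0.0), math.pi / 2)  # 90 degrees about z
print(round(x, 6), round(y, 6), round(z, 6))  # 0.0 1.0 0.0
```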
17.1 Determining the Shape of an Object To determine the shape of an object, we can use a topographic mapping that uses selected points of interest along the contour of the object to fit a low-dimensional map to the high-dimensional manifold of this contour. This mapping is commonly achieved using self-organized neural networks such as Kohonen's self-organizing maps (SOM) or neural gas (NG) [136]; however, if we desire better topology preservation, we should not specify the number of neurons of the network a priori
Fig. 17.1 A block diagram of our approach: input image → compute the GGVF vector field → compute streamlines → determine the inputs to the net → train the GNG net in conformal geometric algebra → set of versors → apply the versors to a selected point to define the object shape → segmented image or 3D object
(as is required for SOM or NG, where the number of neurons and their neighborhood relations must be given), but allow the network to grow using an incremental training algorithm, as in the case of the growing neural gas (GNG) [63]. In this work, we follow the idea of growing neural networks and present an approach based on the GNG algorithm to determine the shape of objects by applying versors of the CGA, resulting in a model that is easy to handle in postprocessing stages; a scheme of our approach is shown in Fig. 17.1. The neural network has versors associated with its neurons, and its learning algorithm determines the parameters that best fit the input patterns, allowing us to obtain every point on the contour by interpolation of such versors. Additionally, we modify the acquisition of input patterns by adding a preprocessing stage that determines the inputs to the net; this is done by computing the generalized gradient vector flow (GGVF) and analyzing the streamlines followed by particles (points) placed on the vertices of the small squares/cubes obtained by subdividing the 2D/3D space. The streamline, or path, followed by a particle placed at coordinates x = (x, y, z) will be denoted as S(x). The information obtained with the GGVF is also used in the learning stage, as explained below.
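The streamline following S(x) described above can be sketched as simple particle advection: starting at a grid vertex, the particle repeatedly steps along the vector field until it converges. The toy field below, which pulls everything toward the origin, is our assumption for illustration; in the text the field is the GGVF of the image.

```python
# Sketch of streamline following: a particle placed at a grid vertex is
# advected through a vector field; its arrival point is a candidate input.

def field(p):
    x, y = p
    return (-0.5 * x, -0.5 * y)   # toy attractor at the origin (assumption)

def follow_streamline(p0, step=1.0, iters=100):
    x, y = p0
    for _ in range(iters):
        u, v = field((x, y))
        x, y = x + step * u, y + step * v
    return (x, y)

x, y = follow_streamline((8.0, -6.0))
print(x, y)  # converges near the origin
```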
17.1.1 Automatic Sample Selection Using GGVF In order to select the input patterns automatically, we use the GGVF [195], which is a dense vector field derived from the volumetric data by minimizing a certain energy functional in a variational framework. The minimization is achieved by solving linear partial differential equations that diffuse the gradient vectors computed from the volumetric data. To define the GGVF, the edge map is first defined as

f(x): Ω → R.   (17.5)

For a 2D image, it is defined as f(x, y) = |∇(G(x, y) ∗ I(x, y))|², where I(x, y) is the gray level of the image at pixel (x, y), G(x, y) is a 2D Gaussian function (for robustness in the presence of noise), ∗ denotes convolution, and ∇ is the gradient operator.
Fig. 17.2 Example of the dense vector field called GGVF (only representative samples of a grid are shown rather than the entire vector field). (a) Samples of the vector field for a 2D image, (b) samples of the vector field for volumetric data, (c) example of streamlines for particles arranged in a 32 × 32 grid according to the vector field shown in (a), (d) points selected as input patterns
With this edge map, the GGVF is defined to be the vector field v(x, y, z) = [u(x, y, z), v(x, y, z), w(x, y, z)] that minimizes the energy functional

E = ∫∫ g(|∇f|)|∇v|² + h(|∇f|)|v − ∇f|² dΩ,   (17.6)

where

g(|∇f|) = e^{−|∇f|/κ} and h(|∇f|) = 1 − g(|∇f|),   (17.7)
and κ is a coefficient. An example of such a dense vector field obtained on a 2D image is shown in Fig. 17.2a, while an example of the vector field for volumetric data is shown in Fig. 17.2b. In Fig. 17.2d, the points were selected according to Eq. 17.8. Observe the large capture range of the forces in the image. Due to this large capture range, particles (points) placed anywhere over the image can be guided to the contour of the object. The automatic selection of input patterns is done by analyzing the streamlines of points on a 3D grid topology defined over the volumetric data. That is, the algorithm follows the streamline of each point of the grid, which guides the point to the most evident contour of the object; then the algorithm selects the point x_ι where the streamline finds a peak in the edge map and takes its conformal representation, X = x + ½x²e_∞ + e_0, to build the input pattern set. In addition to X (the conformal position of the point), each input has the vector v = [u, v, w], which is the value of the GGVF at that pixel and will be used in the training stage as a parameter determining the amount of energy the input has to attract neurons. This information is used in the training stage together with the position x for learning the topology of the data. Summarizing, the input set I will be

I = {ι_k = (X_k, v_k) | x_ι ∈ S(x_0) and f(x_ι) = 1},   (17.8)

where X_ι is the conformal representation of x_ι; x_ι ∈ S(x_0) means that x_ι is on the path followed by a particle placed at x_0, and f(x_ι) is the value of the edge map at position x_ι (assuming it is binarized). As some streamlines can lead to the same point or to very close points, we can add constraints to avoid very close samples; one very simple restriction is that a candidate to be included in the input set must be at least at a fixed distance, d_thresh, from any other input.
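The conformal embedding X = x + ½x²e_∞ + e_0 used for the input patterns can be sketched with the point stored as its coefficients on the basis (e_1, e_2, e_3, e_∞, e_0). The array layout is our own choice; only the embedding formula comes from the text.

```python
import numpy as np

# Sketch of the conformal embedding of a 3D point:
# X = x + (1/2)|x|^2 e_inf + e_0, stored as [x1, x2, x3, e_inf, e_0].

def conformal_point(x):
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, [0.5 * x.dot(x), 1.0]])

X = conformal_point([1.0, 2.0, 2.0])
print(X)  # coefficients [1, 2, 2, 4.5, 1], since |x|^2 = 9
```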
Figure 17.2c shows the streamlines corresponding to the vector field shown in Fig. 17.2a, and the input patterns selected as described above are shown in Fig. 17.2d.
17.1.2 Learning the Shape Using Versors It is important to note that although we explain the algorithm using points, the versors can be applied to any entity in GA that we may have selected to model the object. The network starts with a minimum number of versors (neural units), and new units are inserted successively. The network is specified by
– A set of units (neurons) named N, where each n_l ∈ N has an associated versor M_{n_l}; each versor is the transformation that must be applied to a point to place it on the contour of the object. The set of transformations will ultimately describe the shape of the object.
– A set of connections between neurons defining the topological structure.
Also, take into account that
– There are two learning parameters, ε_w and ε_n, for the winner neuron and for its direct neighbors; these parameters remain constant throughout the process.
– Each neuron n_l is composed of its versor M_{n_l}, the signal counter sc_l, and the relative signal frequency rsf_l. The signal counter sc_l is incremented for neuron n_l every time it is the winner neuron. The relative signal frequency is defined as

rsf_l = sc_l / Σ_{∀n_j} sc_j.   (17.9)

This parameter will act as an indicator for inserting new neural units. With these elements, we define the learning algorithm of the GNG to find the versors that define the contour as follows:
1. Let P_0 be a fixed initial point over which the transformations will be applied. This point corresponds to the conformal representation of p_0, which can be a random point or the centroid defined by the inputs. The initial transformations are expressed as M = e^{(t e_∞)/2} in the conformal geometric algebra. The vector t is initially a random displacement.
2. Start with the minimal number of neurons, which have associated random motors M as well as a vector v_l = [u_l, v_l, w_l] whose magnitude is interpreted as the learning capacity of that neuron.
3.
Select one input ι from the input set I and find the winner neuron, that is, the neuron n_l whose versor M_l moves the point P_0 closest to that input:

M_win = min_{∀M} √((X_ι − M P_0 M̃)²).   (17.10)
4. Modify M_win and the versors M_l of all neighboring neurons in such a way that the modified M represents a transformation moving the point P_0 nearer to the input. Note that each motor is composed of a rotation R and a translation T. The rotation is computed as in (19.1), where θ is the angle between the actual position a and a' = a + v (v is the GGVF vector value at that position); the bivector dual to the rotation axis is computed as b = I_E b̂. The rotors and translators are defined as

R_win = e^{(ε_w φ ψ(v_ι, v_win) θ / 2) b},   (17.11)
T_win = e^{(t_win e_∞)/2},   (17.12)
t_win = ε_w φ ψ(v_ι, v_win)(x_ι − p_0),   (17.13)

for the winner neuron, and

R_n = e^{(ε_n φ ψ(v_ι, v_n) θ / 2) b},   (17.14)
T_n = e^{(t_n e_∞)/2},   (17.15)
t_n = ε_n φ ψ(v_ι, v_n)(x_ι − p_0),   (17.16)

for its direct neighbors, to obtain M = T R. Finally, the new motor is

M_l^new = M M_l^old.   (17.17)

Here φ is a function defining the amount a neuron can learn according to its distance from the winner (defined as in Eq. (17.18)), and ψ(v_ι, v_l) is defined as in (17.19):

φ = e^{−(M_win P_0 M̃_win − M_l P_0 M̃_l)² / 2},   (17.18)
ψ(v_ι, v_l) = ‖v_ι − v_l‖²,   (17.19)

which define a quantity of learning depending on the strength of the input ι to teach and the capacity of the neuron to learn, given in v_ι and v_l, respectively. Also, update

v_win^new = [u_win^new, v_win^new, w_win^new]^T,   (17.20)
v_n^new = [u_n^new, v_n^new, w_n^new]^T,   (17.21)

where u_win^new = u_win + ε_w(u_ι − u_win), v_win^new = v_win + ε_w(v_ι − v_win), w_win^new = w_win + ε_w(w_ι − w_win), u_n^new = u_n + ε_n(u_ι − u_n), v_n^new = v_n + ε_n(v_ι − v_n), and w_n^new = w_n + ε_n(w_ι − w_n).
5. After a certain number of iterations, determine the neuron with the highest value rsf_l. Then, if any of the direct neighbors of that neuron is at a distance larger than c_max, do:
– Determine the neighboring neurons n_i and n_j.
– Create a new neuron n_new between n_i and n_j whose associated M and v_l will be

M_{n_new} = (M_i + M_j)/2,   v_l^new = (v_i + v_j)/2.   (17.22)

The new unit will have the values sc_new = 0 and rsf_new = 0.
– Delete the old edge connecting n_i and n_j and create two new edges connecting n_new with n_i and n_j.
6. Repeat steps 3 to 5 if the stopping criterion is not met. The stopping criterion is reached when a maximum number of neurons is attained, or when the learning capacity of the neurons approaches zero (falls below a threshold c_min); whichever happens first stops the learning process. Upon training the network, we find the set of M defining positions on a trajectory; these positions minimize the error measured as the average distance between X_ι and the result of M_ι P_0 M̃_ι:

E = (Σ_{∀ι} √((M_ι P_0 M̃_ι − X_ι)²)) / N,   (17.23)

where M_ι moves P_0 closest to input X_ι, and N is the number of inputs.
17.2 Experiments Figure 17.3 shows the result when the algorithm is applied to a magnetic resonance image (MRI); the goal is to obtain the shape of the ventricle. Figure 17.3a shows the original brain image and the region of interest (ROI); Fig. 17.3b shows the computed vector field for the ROI; Fig. 17.3c shows the streamlines in the ROI defined for particles placed on the vertices of a 32 × 32 grid; Fig. 17.3d shows the initial shape as defined by the two initial random motors M_a, M_b; Fig. 17.3e shows the final shape obtained; and, finally, Fig. 17.3f shows the original image with the segmented object. Figure 17.4 shows an image demonstrating that our approach can also be used for automated visual inspection tasks; the reader can observe that this image contains a very blurred object. This image is for the inspection of hard-disk head sliders. Figure 17.4a shows the original image and the region of interest (ROI); Fig. 17.4b shows the computed vector field of the ROI; Fig. 17.4c shows the streamlines defined for particles placed on the vertices of a 32 × 32 grid; Fig. 17.4d shows the
Fig. 17.3 (a) Original image and region of interest (ROI), (b) zoom of the dense vector field of the ROI, (c) zoom of the streamlines in ROI, (d) inputs and initial shape, (e) final shape defined according to the 54 estimated motors, (f) image segmented according to the results
Fig. 17.4 Application in visual inspection tasks: (a) Original image and the region of interest (ROI), (b) zoom of the dense vector field of the ROI, (c) zoom of the streamlines in ROI, (d) Inputs and initial shape according to the two initial random transformations Ma and Mb , (e) final shape defined according to the 15 estimated motors (original image with the segmented object)
inputs selected according to the streamlines and the initial shape as defined by the two initial random motors M_a, M_b; Fig. 17.4e shows the final shape obtained, overlapped with the original image, showing that the algorithm gives good results when used for segmentation.
Fig. 17.5 Result obtained when using the active contour approach to segment the object in the same image as in Fig. 17.4. (a) Initialization of snake inside the object, (b) final result obtained with initialization showed in (a), (c) initialization of snake outside the object, (d) final result obtained with initialization showed in (c), (e) initialization of snake over the contour, (f) final result obtained with initialization shown in (e)
Figure 17.5 shows the application of the GGVF-snakes algorithm to the same problem. It is important to note that although the approaches of Figs. 17.4 and 17.5 both use GGVF information to find the shape of an object, the estimated final shape is better with the neural approach than with active contours; the second approach (see Fig. 17.5) fails to segment the object whether the snake is initialized inside, outside, or over the contour of interest. Additionally, expressing the shape as a set of motors gives us a model better suited for further applications that may require deforming the model, especially if the model is based not on points but on other GA entities, because we do not need to change the motors (remember that they are applied in the same way to any entity). The proposed algorithm was applied to different sets of medical images. Figure 17.6 shows some images of these sets. The first row of each figure shows the original image and the region of interest, while the second row shows the result of the proposed approach. Table 17.1 shows the errors obtained with our approach with and without the GGVF information. We can observe that the inclusion of the GGVF information improves the approximation of the surface. To compare our algorithm, we used the GNG with and without the GGVF information, as well as a growing version of SOM (GSOM), also with and without the GGVF information. These algorithms were applied to a set of 2D medical images (some obtained with computed tomography (CT) and some with magnetic resonance (MR)). Figure 17.7a shows the average errors when the GSOM stops for different examples: segmenting a ventricle, a blurred object, a free-form curve, and a column disk. Note that with the GGVF information, the error is reduced. This means that using
Fig. 17.6 First row (upper row): original image and the region of interest. Second row: result of segmentation

Table 17.1 Errors obtained by the algorithm with and without the GGVF information. ε₁: error without GGVF; ε₂: error with GGVF

Example           ε₁     ε₂      Example          ε₁     ε₂
Ventricle 1       3.29   2.51    Eye 1            7.63   6.8
Eye 2             3.43   2.98    Column disk 1    4.65   4.1
Tumor 1           3.41   2.85    Tumor 2          2.95   2.41
Free-form curve   2.84   1.97    Column disk 2    2.9    2.5
the GGVF information, as we propose, allows a better approximation of the object shape to be obtained. Figure 17.7b shows the average errors obtained for several examples but using the GNG with and without the GGVF information. Note that, again, the GGVF contributes to obtaining a better approximation of the object’s surface. Also note that the average errors obtained with the GNG algorithm are smaller than the errors obtained with the GSOM, as can be seen in Fig. 17.7c, and that both are improved with GGVF information, although GNG gives better results. It is necessary to mention that the whole process is quick enough; in fact, the computational time required for all the images shown in this work took only a few seconds. The computation of the GGVF is the most time-consuming task in the algorithm, but it only takes about 3 s for 64 64 images, 20 s for 256 256 images, and 110 s for 512 512 images. This is the reason why we decided not to compute it for the whole image, but for a selected region of interest. The same criterion was applied to 3D examples. Figure 17.8a shows the patient head with the tumor whose surface we need to approximate; Fig. 17.8b shows the vectors of the dense GGVF on a 3D grid arrangement of size 32 32 16; Fig. 17.8c shows the inputs determined by GGVF and edge map, and also shows the initialization of the net GNG; Fig. 17.8d–f shows some stages of the adaptation process, while the net determining the set of transfor-
17 Neural Computing for 2D Contour and 3D Surface Reconstruction
Fig. 17.7 (a) Average errors for different examples using the GSOM algorithm with and without GGVF information, (b) average errors for different examples using the GNG algorithm with and without GGVF information, (c) comparison between the errors obtained with GSOM and GNG with and without using GGVF information. Note that both are improved with the GGVF information, although GNG gives better results
mations M (Fig. 17.8f is the final shape after training has finished, with a total of 170 versors M associated with 170 neural units). Figure 17.9 shows another 3D example, corresponding to a pear, whose surface is well approximated. Figure 17.9b shows the inputs and the initialization of the net with nine neural units (the topology of the net is defined as a sort of pyramid around the centroid of the input points); Fig. 17.9c shows the result after the net has reached the maximum number of neurons, which was fixed at 300; finally, Fig. 17.9d shows the minimization of the error according to (17.23). Another useful application of the algorithm using the gradient information of the GGVF during the training of the GNG neural net in the geometric algebra framework is the transformation of one model obtained at time t1 into another obtained at time t2 (a kind of morphing of 3D surfaces).
Fig. 17.8 The algorithm for a 3D object's shape determination. (a) 3D model of the patient's head containing a tumor in the marked region, (b) vectors of the dense GGVF on a 3D grid arrangement of 32×32×16, (c) inputs determined by GGVF and edge map and the initialization of the net GNG, (d)–(e) two stages during the learning, (f) final shape after training has finished with a total of 170 versors M (associated with 170 neural units)
Fig. 17.9 3D object shape definition for the case of a pear. (a) Inputs to the net selected using GGVF and streamlines, (b) inputs and the initialization of the net with nine neural units, (c) result after the net has reached the maximum number of neurons (300 neurons), (d) error measurement using Eq. 17.23
Figure 17.10a shows the initial shape, which will be transformed into the one shown in Fig. 17.10b; Fig. 17.10c–f shows some stages during the process. Note that Fig. 17.10f looks like Fig. 17.10b, as expected. In the case shown in Fig. 17.11, we have one 3D model with an irregular shape that will be transformed into a shape similar to a pear; Fig. 17.11a shows the initial
Fig. 17.10 The 3D surface shown in (a) is transformed into the one shown in (b). Different stages during the evolution are shown in (c) to (f), where (f) is the final shape (that is, the final shape of a) after finishing the evolution of the net, which should look like (b)
Fig. 17.11 The 3D surface shown in (a) is transformed into the one shown in (b). Different stages during the evolution are shown in (c) to (f), where (f) is the final shape (that is, the final shape of a) after finishing the evolution of the net, which should look like (b)
shape that will be transformed into the one shown in Fig. 17.11b; Fig. 17.11c–f shows some stages during the process. Again, the resulting volume looks like the one expected (Fig. 17.11b). To illustrate the application of the presented algorithm in cases having models based on entities other than points, in Fig. 17.12 we show models based on spheres [160]. The goal is the same: morphing the models shown in Fig. 17.12a,d into the ones shown in Fig. 17.12b,e, respectively. The results are shown in Fig. 17.12c,f.
Fig. 17.12 The 3D models based on spheres shown in (a) and (d) are transformed into the ones shown in (b) and (e), respectively, resulting in the models shown in (c) and (f), respectively
17.3 Conclusion

In this chapter we showed how to incorporate geometric algebra techniques into an artificial neural network approach for approximating 2D contours and 3D surfaces. In addition, we demonstrated the use of the dense vector field named the generalized gradient vector flow (GGVF) not only to select the inputs to the GNG neural network, but also as a parameter guiding its learning process. This network was used to find a set of transformations expressed in the conformal geometric algebra framework, which move a point by means of a versor along the contour of an object, in this way defining the shape of the object. This has the advantage that versors of the conformal geometric algebra can be used to transform any entity in exactly the same way: multiplying the entity from the left by M and from the right by M̃. Some experiments show the application of the proposed method in medical image processing and also in automated visual inspection tasks. The results obtained show that by incorporating the GGVF information, we can automatically get the set of inputs to the net, and we also improve its performance. For the 3D case, we presented two different applications: surface approximation and the transformation of a model at time t1 into another at time t2, obtaining good results even with models based on the spheres of the conformal geometric algebra.
Part VI
Applications II: Robotics and Medical Robotics
Chapter 18
Rigid Motion Estimation Using Line Observations
18.1 Introduction

This chapter is dedicated to the estimation of 3D Euclidean transformations using motor algebra. Two illustrations of estimation procedures are given: the first uses a batch approach for the estimation of the unknown 3D transformation between the coordinate reference systems of a robot neck, or arm, and of a digital camera. This problem is called the hand–eye problem, and it is solved using a motion-of-lines model. The second illustration uses a recursive estimation method based on Kalman filter techniques. After introducing the motor-extended Kalman filter (MEKF), we estimate 3D rigid motion using the MEKF and 3D lines gained by a visual robot system. These two approaches show that the task of estimating 3D rigid motion is easier using motor algebra because the motion-of-lines model is linear.
18.2 Batch Estimation Using SVD Techniques

In this section, we illustrate the use of motor algebra for solving an exemplary task of visually guided robotics: the so-called hand–eye calibration problem. This kind of task arises in visually guided robotics, where cameras are attached to robot arms or mounted on a vehicle and have to be directed toward a goal. On the one hand, the cameras capture visual cues in the 3D visual space and employ their own world reference coordinate system. On the other hand, the robot arm or vehicle moves relative to a reference coordinate system. If we compute the intrinsic and extrinsic parameters through the camera's movements, we can find the geometric relationship between the camera position and the world coordinates. The position of the robot arm or vehicle, on the other hand, is always known, owing to the angular positions of the step motors of the device, which are permanently controlled by the computer. The problem of hand–eye calibration arises when we try to determine the group transformation between the reference coordinates of the mechanical device and the coordinate frame of the camera.
Fig. 18.1 Abstraction of the hand–eye system (labels in the figure: pan, tilt, rotate, body–eye transform)
An abstraction of the geometry of a camera mounted on a robot arm is depicted in Fig. 18.1. The classical way to describe the hand–eye problem in mathematical terms is by using transformation matrices. This problem was originally formulated by Shiu and Ahmad [174] and Tsai and Lenz [188] as a matrix equation of a Euclidean transformation:

A X = X B,   (18.1)

where the matrices A = A_1 A_2^{-1} and B = B_1 B_2^{-1} express the elimination of the transformation between the hand base and the world. From Eq. 18.1, a matrix equation and a vector equation can be derived by splitting the Euclidean transformation into its rotation and translation components:

R_A R_X = R_X R_B,   (18.2)

(R_A − I) t_X = R_X t_B − t_A.   (18.3)
In the literature, we find a variety of methods to estimate the rotation matrix R_X from Eq. 18.2 (see the survey by Wang [191]), most of which first estimate the rotation matrix decoupled from the translation component. Tsai and Lenz showed that to solve the problem at least two motions are required, with rotations having
no parallel axes [188]. The most relevant approaches employ either the axis and the angle of rotation [174, 188], quaternions [38], dual quaternions [75], or a canonical matrix representation [121]. Horaud and Dornaika [100] were the first to apply a nonlinear method to compute R_X and t_X simultaneously. In their work, they showed the instability of the computation of the matrices A_i, given the projective matrices M_i = S A_i = [S R_{A_i} | S t_{A_i}], where S is the matrix of the camera-intrinsic parameters. Let us assume that the matrix of intrinsic parameters S remains constant during the camera motion and that one extrinsic calibration A_2 is known. By introducing N_i = S R_{A_i} and n_i = S t_{A_i}, and replacing X = A_2 Y, we get the hand–eye unknown Y. Thus, Eq. 18.1 can be reformulated as

A_2^{-1} A_1 Y = Y B.   (18.4)

Now, if A_2^{-1} A_1 is written as a function of the projection parameters, it is possible to obtain an expression fully independent of the intrinsic parameters S, that is,

A_2^{-1} A_1 = [ N_2^{-1} N_1    N_2^{-1}(n_1 − n_2) ]
               [ 0_3^T           1                   ]
             = [ R       t ]
               [ 0_3^T   1 ].   (18.5)
Taking into consideration the selected matrices and relations, this result allows us to reconsider the formulation of the hand–eye problem employing the standard Eq. 18.1, which can then be solved using all the known methods, including the one presented in this chapter. Another relevant contribution to the hand–eye problem was made by Chen [36]. Employing a geometric point of view, he formulated the hand–eye calibration problem using screw theory. Using this method, Chen discovered that the hand–eye transformation is fully independent of the pitch and the angle of the camera and hand motions. The unknown transformation simply depends upon the parameters that relate to the screw axis lines of the hand motion and of the camera motion. In this chapter, we use a unitary motor that is completely isomorphic with the unit screw, and we present an algorithm, different from the one used by Horaud and Dornaika [100], to compute rotation and translation simultaneously in a linear manner.
18.2.1 Solving AX = XB Using Motor Algebra

In terms of motors, the system equation (18.1) can be expressed as

M_A M_X = M_X M_B,   (18.6)

or as

M_A = M_X M_B M̃_X,   (18.7)
where M_A = A + I A', M_B = B + I B', and M_X = R + I R'. Next, we simplify this equation to show the motor relation between the motor line axis of the camera, L_A, and the motor line axis of the hand, L_B. We can isolate the scalar part of M_A by using the grade operator ⟨M_A⟩_0 ≡ ⟨M_A⟩, according to Eq. 3.94, and we can then use the previous equation to get

⟨M_A⟩ = ½ (M_X M_B M̃_X + M_X M̃_B M̃_X)
      = M_X ½ (M_B + M̃_B) M̃_X = M_X ⟨M_B⟩ M̃_X
      = M_X M̃_X ⟨M_B⟩ = ⟨M_B⟩.   (18.8)

Thus, we are able to equalize the scalar parts of M_A and M_B, and then, using Eq. 3.76, we obtain

cos((θ_A + I d_A)/2) = cos((θ_B + I d_B)/2),   (18.9)

which may also be written by separating the real and dual terms:

cos(θ_A/2) = cos(θ_B/2),
d_A sin(θ_A/2) = d_B sin(θ_B/2).   (18.10)

In this way, we are sure that only the bivector terms of M_A and M_B will contribute to the computation of the unknown M_X. Finally, using Eq. 3.76, the hand–eye equation reduces to

sin((θ_A + I d_A)/2) L_A = M_X sin((θ_B + I d_B)/2) L_B M̃_X
                         = sin((θ_B + I d_B)/2) M_X L_B M̃_X.   (18.11)
As long as θ_A and θ_B remain strictly between 0° and 360°, the sine factors are equal and nonzero, so they can be cancelled to get the simplified expression

L_A = M_X L_B M̃_X,   (18.12)
which shows that in this kind of problem formulation, the rotation and pitch of M A and M B are always equal throughout all the hand movements and, therefore, can be ignored in the computations. It will suffice to focus on only the rotation axes of the involved motors, that is, Eq. 18.12 is reduced to just the motion of the line axis of the hand LB toward the line axis of the camera LA . This characteristic, known
Fig. 18.2 The hand–eye system as the motion of related axes lines
as the congruence theorem, has been pointed out by Chen [36]. However, thanks to the use of motor algebra, the proof of this theorem has now been reduced to a single step, shown by Eq. 18.8. This simplification of the hand–eye problem is depicted in Fig. 18.2. Since the hand–eye problem pertains to the motion of lines, Eq. 4.18 can be used to estimate the unknown 3D transformation that relates the screw line axes of both the hand and the camera:

L_A = a + I a' = R b R̃ + I (R b R̃' + R b' R̃ + R' b R̃),   (18.13)
where a, a', b, and b' are spanned by bivectors. Separating real and dual terms,

a = R b R̃,
a' = R b R̃' + R b' R̃ + R' b R̃,   (18.14)

and multiplying from the right with the rotor R, using the relation R̃ R' + R̃' R = 0, we get the following multivector relationships:

a R − R b = 0,
(a' R − R b') + (a R' − R' b) = 0.   (18.15)
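The line-motion model of Eq. 18.13 can be checked numerically by representing the motor as a pair of quaternions (R, R'). A minimal NumPy sketch (the function names and the [w, x, y, z] storage convention are ours):

```python
import numpy as np

def qmul(p, q):
    """Hamilton product of quaternions stored as [w, x, y, z]."""
    w1, x1, y1, z1 = p
    w2, x2, y2, z2 = q
    return np.array([w1*w2 - x1*x2 - y1*y2 - z1*z2,
                     w1*x2 + x1*w2 + y1*z2 - z1*y2,
                     w1*y2 - x1*z2 + y1*w2 + z1*x2,
                     w1*z2 + x1*y2 - y1*x2 + z1*w2])

def qconj(q):
    """Quaternion conjugate (reversion of a rotor)."""
    return q * np.array([1.0, -1.0, -1.0, -1.0])

def motor_on_line(R, Rp, b, bp):
    """L_A = M L_B M~ for a motor M = R + I R' acting on the line
    L_B = b + I b', with b, b' as pure quaternions (Eq. 18.13):
        a  = R b R~,
        a' = R b R~' + R b' R~ + R' b R~."""
    a = qmul(qmul(R, b), qconj(R))
    ap = (qmul(qmul(R, b), qconj(Rp))
          + qmul(qmul(R, bp), qconj(R))
          + qmul(qmul(Rp, b), qconj(R)))
    return a, ap
```

For a motor built from a rotation R and a translation t via R' = ½ t R, a line through the origin with direction b maps to the direction R b R̃ and the moment t × (R b R̃), as expected.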
These equations can be expressed in matrix form, in terms of the scalar and bivector parts and the cross product, as follows:

[ a − b      [a + b]×      0_{3×1}   0_{3×3}  ] [ R  ]
[ a' − b'    [a' + b']×    a − b     [a + b]× ] [ R' ] = 0,   (18.16)

where the matrix, which we will call D, is a 6×8 matrix, and the vector of unknowns (R, R')^T is eight-dimensional. The notation [a + b]× stands for the vector cross product expressed as an antisymmetric matrix. Recall that we have two constraints on the unknowns, so that the result is a unit motor with the properties
where the matrix, which we will call D, is a 68 matrix, and the vector of unknowns .R; R 0 /T is eight-dimensional. The notation Œa C b stands for the vector crossproduct as an antisymmetric matrix. Recall that we have two constraints on the unknowns, so that the result is a unit motor with the properties eD1 RR
and
RR 0 D 0:
(18.17)
So far, we have six equations and two constraints. However, because the unit bivectors a and b are perpendicular to the bivectors a' and b', respectively, we conclude that two of the equations are necessarily redundant. This is not a surprise, really, because we already know that at least two lines are required to estimate 3D motion from their correspondences [167]. The result is that at least two motions of the hand–eye system are required in order to compute two lines from the involved screws. Chen [36] clearly noted this fact and analyzed the uniqueness of the problem. He proved geometrically that even in the case of two parallel rotation-axis lines, it is still possible to compute all the parameters up to the pitch [12].
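For concreteness, the 6×8 block of Eq. 18.16 for a single line correspondence can be assembled numerically as follows (a NumPy sketch; `skew` and `D_block` are our names, and the unknown 8-vector is ordered as the scalar and vector parts of R followed by those of R'):

```python
import numpy as np

def skew(v):
    """Antisymmetric matrix [v]_x such that skew(v) @ w = v x w."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def D_block(a, ap, b, bp):
    """6x8 coefficient block of Eq. 18.16 for one line correspondence
    (a, a') <-> (b, b'), acting on the unknown (R, R')."""
    top = np.hstack([(a - b).reshape(3, 1), skew(a + b),
                     np.zeros((3, 1)), np.zeros((3, 3))])
    bot = np.hstack([(ap - bp).reshape(3, 1), skew(ap + bp),
                     (a - b).reshape(3, 1), skew(a + b)])
    return np.vstack([top, bot])
```

Stacking such blocks for n motions yields the matrix C of Eq. 18.18; for noise-free data generated by a true motor, each block annihilates the true (R, R') vector.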
18.2.2 Estimation of the Hand–Eye Motor Using SVD

We reduced the hand–eye problem to Eq. 18.16, which depends only on 3D bivectors. Now, for the estimation of the unknown rigid motion, we employ the singular value decomposition (SVD) method [158], which is a vector approach for finding singular values. We are able to use this method since we are dealing only with bivectors. If we were dealing with a simultaneous estimation of multivectors of different grades, we would, of course, have to extend the SVD method to encompass a multivector concept.
Let us consider that n ≥ 2 motions are available. Employing SVD, we build the following 6n × 8 matrix:

C = [ D_1^T  D_2^T  …  D_n^T ]^T.   (18.18)
Because 3D motion has six degrees of freedom, in the case of noise-free data, the greatest possible rank of this matrix is 6. Let us analyze this matrix in more detail. In the case of noise-free data, for which the equations were derived using purely geometric and algebraic arguments, we would expect the null space to contain at least the solution (R, R'). The solution (0_{4×1}, R) ("pure rotation") is the trivial one, and thus we are able to reaffirm that the matrix indeed has a maximum rank of 6. In the particular case where all the b-axis lines are mutually parallel, one degree of freedom remains undetermined; in this case, the matrix will be of rank 5. For the solution of Eq. 18.18, we use the SVD method. This procedure decomposes the matrix C into three matrices, as follows: C = U Σ V^T. The columns of the U and V matrices correspond to the left and right singular vectors, respectively, and Σ is a diagonal matrix of singular values. Since the rank of the matrix C is 6, the last two right singular vectors, v_7 and v_8, correspond to the two vanishing singular values that span the null space of C. For convenience, these will now be expressed in terms of two 4×1 vectors each: v_7^T = (u_1^T, v_1^T) and v_8^T = (u_2^T, v_2^T). Since (R, R')^T is a null vector of C, specifically C (R, R')^T = 0, it must be expressible as a linear combination of v_7 and v_8, as follows:

[ R  ]       [ u_1 ]       [ u_2 ]
[ R' ]  = α  [ v_1 ]  + β  [ v_2 ].
Now, taking into account the two constraints imposed by Eq. 18.17, we obtain two quadratic equations in α and β:

α² u_1^T u_1 + 2αβ u_1^T u_2 + β² u_2^T u_2 = 1,   (18.19)

α² u_1^T v_1 + αβ (u_1^T v_2 + u_2^T v_1) + β² u_2^T v_2 = 0.   (18.20)

If we take into consideration that α ≠ 0 and β ≠ 0, and without loss of generality we assume that u_1^T v_1 ≠ 0, we can set μ = α/β and substitute into Eq. 18.20 to obtain two solutions for μ. We then substitute the relation α = μβ into Eq. 18.19 to obtain the following quadratic expression:

β² (μ² u_1^T u_1 + 2μ u_1^T u_2 + u_2^T u_2) = 1,   (18.21)
which yields two solutions for β of opposite sign (this sign variation is simply an effect of the sign invariance of the solution). Either term, (R, R')^T or (−R, −R')^T, will satisfy the motion equations and the involved constraints.
Since the equation also contains μ squared, we should also consider the viability of the other two solutions. Here, we can see that the second solution for μ causes the factor on the left-hand side of Eq. 18.21 to vanish. This corresponds to the solution (0_{4×1}, R) and clearly does not satisfy the first constraint of Eq. 18.17. The algorithm, then, can be summarized as the following procedure:

1. Consider n hand motions (b_i, b'_i) and their corresponding camera motions (a_i, a'_i). Check that their scalar terms are equal (Chen's invariance theorem). By extracting the line directions and moments of the screw axis lines, construct the matrix C as in Eq. 18.18.
2. Apply the SVD procedure to C and check that two singular values are almost equal to zero. In the case of noisy data, take the two right singular vectors v_7 and v_8 associated with the two smallest singular values.
3. Set the coefficients of the quadratic α² u_1^T v_1 + αβ (u_1^T v_2 + u_2^T v_1) + β² u_2^T v_2 = 0, and solve it, finding two solutions for μ = α/β.
4. Use these two values of μ in the equation β² (μ² u_1^T u_1 + 2μ u_1^T u_2 + u_2^T u_2) = 1, select the value of μ giving the largest left-hand factor, and compute β and then α = μβ.
5. The final solution is α v_7 + β v_8.
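Steps 3–5 above can be sketched as follows (NumPy; the function name is ours, we assume v_7 and v_8 have already been extracted from the SVD of C, and real roots are assumed, as in the noise-free analysis):

```python
import numpy as np

def solve_alpha_beta(v7, v8):
    """Recover the motor (R, R') = alpha*v7 + beta*v8 from the two
    null-space vectors of C (steps 3-5 of the procedure)."""
    u1, w1 = v7[:4], v7[4:]
    u2, w2 = v8[:4], v8[4:]
    # Quadratic in mu = alpha/beta obtained from Eq. 18.20
    mus = np.real(np.roots([u1 @ w1, u1 @ w2 + u2 @ w1, u2 @ w2]))
    # Evaluate the factor of Eq. 18.21 and keep the root that maximizes it
    fac = lambda mu: mu**2 * (u1 @ u1) + 2 * mu * (u1 @ u2) + u2 @ u2
    mu = max(mus, key=fac)
    beta = 1.0 / np.sqrt(fac(mu))   # Eq. 18.21
    alpha = mu * beta
    return alpha * v7 + beta * v8
```

The rejected root is exactly the one that sends the factor of Eq. 18.21 to zero, i.e., the trivial pure-rotation solution.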
18.3 Experimental Results

In this section, we test this algorithm and compare its performance with a two-step algorithm similar to the one introduced by Chou and Kamel [38]. These authors estimated the quaternion rotation, q, directly from the equation a q = q b, and then they computed the rotation matrix R_X and solved for the translation component t_X using the vector Eq. 18.3. The experiments were carried out using a computer simulation. First, n hand motions (R_b, t_b) were created and Gaussian noise with a relative standard deviation of 1% was added so as to simulate the inaccuracy of angle readings. To simulate the hand–eye scenario, we generated camera motions (R_a, t_a), similarly adding Gaussian noise of varying standard deviation. In this case, the noise was added as an absolute value to the rotation-axis direction and as a relative value to both the angle and the translation. In order to compute the estimated rotor R̂ and the translation component t̂ between the hand and the camera, the algorithm was run 1,000 times for each value of added noise. The quantification of both algorithms was done according to the root mean square (RMS) of the absolute errors in the rotation unit rotor, ‖R − R̂‖, and the RMS of the relative errors in the translation, ‖t − t̂‖/‖t‖. For the first test, a set of 20 hand motions was prepared using different rotation axes, large rotation angles, and a translation that varied from 10 to 20 mm. Figure 18.3 compares the results of our algorithm, labeled MOTOR, and those of the two-step algorithm, labeled SEPARATE.
Fig. 18.3 Behavior of the proposed algorithm (MOTOR) and of a two-step algorithm (SEPARATE) with noise variation. ER stands for error in rotation and RET for relative error in translation; both are plotted against the relative noise standard deviation in the measurements
The upper graph shows the RMS rotation error, and the lower graph shows the RMS relative translation error. Our algorithm substantially outperforms the two-step algorithm. By computing the rotation simultaneously with the translation, we obtained a much better estimation of the rotation than was the case with a separate computation.
Fig. 18.4 Performance of both algorithms in the absence of translation. ER (upper) and RET (lower) are plotted against the relative noise standard deviation in the measurements
In the second test, we wanted to explore the estimation performance of both algorithms with zero translation. As expected, in this context the behavior of both algorithms is almost the same (see Fig. 18.4). This effect is easy to explain if we consider Eq. 18.16. Since the translation is zero, the dual parts of the measurements (a', b') become zero. Therefore, the lower-left block of the matrix in Eq. 18.16 vanishes, which forces the separate computation of R and R'.
Fig. 18.5 Errors in rotation (upper) and translation (lower) as a function of the number of hand and camera motions
In the last experiment, we were interested in the performance of both algorithms when the noise level is kept constant and the number of motions is gradually increased. Generally speaking, one would expect much better estimation with the use of a greater number of hand and camera motions. The noise level was kept at 5%, and the number of motions varied from 2 to 20. Figure 18.5 shows that from the fourth motion onward our algorithm had a superior performance.
18.4 Discussion

This chapter proposes Clifford, or geometric, algebra for computations in visually guided robotics. In looking for suitable ways of representing the motion of geometric primitives, it turns out that the algebra of motors is well suited to express 3D kinematics, since its use allows us to linearize the nonlinear 3D rigid motion transformation. In the literature, it has been shown that the invariance of the angle and the pitch of the screws of the camera and hand helps to reduce the complexity of the hand–eye calibration problem. We used this fundamental idea to simplify the hand–eye problem to a problem of the motion of lines. For this case, we used the algebra of motors, which is well suited for problems involving the algebra of lines. The resulting simplified parameterization of the problem enabled us to establish a linear homogeneous system for calculating the motor parameters. By computing the null space using SVD and considering a few constraints on dual rotors, we were able to devise a simple algorithm, one that obviates the need for nonlinear computations. In this work, it can be seen that the algebraic structure of the resulting equations helps us better understand the performance of the algorithm. The next section of this chapter is devoted to the application of Kalman filter techniques to the estimation of 3D rigid motion using 3D line observations. First, we outline Kalman filter techniques and describe the MEKF in detail; then we apply these techniques to 3D rigid motion estimation using observed lines captured by a robotic visual system.
18.5 Recursive Estimation Using Kalman Filter Techniques

The Kalman filter is a linear recursive algorithm that is unbiased and of minimum variance. It is employed to optimally estimate the unknown state of a linear dynamic system using noisy data taken at discrete real-time intervals. The extended Kalman filter (EKF) approach modifies the standard Kalman filter (used for linear systems) in order to treat noisy nonlinear systems. It starts with an initial guess, then updates the predicted state continually with new measurements. Unfortunately, if disturbances are so large that linearization is inadequate to describe the system, the filter will not converge to a reasonable estimate. First, we give a brief outline of the Kalman filter and of the EKF; then, using this background, we explain the rotor and motor EKFs. For a more complete explanation, the reader is referred to [180].
18.5.1 The Kalman Filter

Let us describe a dynamic system using a linear difference state equation, as follows:

X_i = Φ_{i/i−1} X_{i−1} + W_i.   (18.22)
The state of the system at t_i is given by the n-dimensional vector X_i. The term Φ_{i/i−1} is an n × n transition matrix, and W_i is the random error with known first- and second-order characteristics:

E[W_i] = 0,   i = 0, 1, …,   (18.23)
E[W_i W_j^T] = Q_i δ_ij.   (18.24)
Here δ_ij is the Kronecker delta function, and the matrix Q_i is assumed to be non-negative definite. Suppose that at each time t_i there is an m-dimensional measurement vector Z_i available that is linearly related to the state and corrupted by the additive noise V_i. This is the so-called observation equation:

Z_i = H_i X_i + V_i,   (18.25)
where H_i is a known m × n observation matrix and the vector V_i is a random error with known statistics:

E[V_i] = 0,   i = 0, 1, …,   (18.26)
E[V_i V_j^T] = C_i δ_ij,   (18.27)
where the matrix C_i is assumed to be non-negative definite [180]. Further, assume that the random processes W_i and V_i are uncorrelated, that is, for each i, j,

E[W_i V_j^T] = O,   (18.28)

where O is the zero matrix. Given the preceding models (18.22) and (18.25), we shall determine an estimate X̂_i of the state at t_i that is a linear combination of an estimate X̂_{i−1} at t_{i−1} and the data Z_i measured at time t_i. By defining an unknown n × m gain matrix K_i, the estimate X̂_i is given by

X̂_i = Φ_{i/i−1} X̂_{i−1} + K_i [Z_i − H_i Φ_{i/i−1} X̂_{i−1}].   (18.29)
The matrix K_i is determined so that the estimate has minimal variance; that is, X̂_i is chosen so as to minimize its mean squared error:

E_MIN = { E[(X̂_i − X_i)^T (X̂_i − X_i)] }_MIN.   (18.30)

Equation 18.30 is equivalent to the minimization of the trace of the state error covariance matrix P_i, that is,

E_MIN = { trace P_i }_MIN = { trace E[(X̂_i − X_i)(X̂_i − X_i)^T] }_MIN.   (18.31)
By substituting Eq. 18.25 into Eq. 18.29, and then substituting Eqs. 18.29 and 18.22 into Eq. 18.31, it can be shown that the trace of the matrix P_i is minimized by choosing the following optimal gain matrix K_i:

K_i = P_{i/i−1} H_i^T (H_i P_{i/i−1} H_i^T + C_i)^{-1},   (18.32)

where P_{i/i−1} is the error covariance matrix

P_{i/i−1} = Φ_{i/i−1} P_{i−1} Φ_{i/i−1}^T + Q_i   (18.33)

of the predicted state

X̂_{i/i−1} = Φ_{i/i−1} X̂_{i−1}.   (18.34)

With this optimal gain matrix K_i, the matrix P_i reduces to

P_i = P_{i/i−1} − K_i H_i P_{i/i−1} = (I − K_i H_i) P_{i/i−1}.   (18.35)
Equations 18.29, 18.33, 18.32, and 18.35 constitute the Kalman filter equations for the system model (18.22) and the measurement model (18.25). From Eq. 18.32, we see that as the measurement error covariance matrix C_i approaches zero, the gain matrix K_i weights the residual (the second term of Eq. 18.29) more heavily:

lim_{C_i → O} K_i = H_i^{-1}.   (18.36)

On the other hand, as the estimated state error covariance P_{i/i−1} approaches zero, the gain K_i weights the residual less heavily:

lim_{P_{i/i−1} → O} K_i = O.   (18.37)

Another way of thinking about the control of the gain by K_i is the following: as the measurement error covariance matrix C_i approaches zero, the actual measurement Z_i is "trusted" more and more, while the predicted state Φ_{i/i−1} X̂_{i−1} is trusted less and less. Conversely, as the estimated state error covariance P_{i/i−1} approaches zero, the actual measurement Z_i is trusted less and less, while the predicted state Φ_{i/i−1} X̂_{i−1} (the dynamic model) is trusted more and more.
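The predict/update cycle defined by Eqs. 18.29 and 18.32–18.35 can be written compactly as follows (a NumPy sketch; the function name and interface are ours):

```python
import numpy as np

def kalman_step(x_prev, P_prev, z, Phi, H, Q, C):
    """One predict/update cycle of the linear Kalman filter."""
    # Predict: Eqs. 18.34 and 18.33
    x_pred = Phi @ x_prev
    P_pred = Phi @ P_prev @ Phi.T + Q
    # Optimal gain: Eq. 18.32
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + C)
    # Update with the measurement residual: Eqs. 18.29 and 18.35
    x = x_pred + K @ (z - H @ x_pred)
    P = (np.eye(len(x_prev)) - K @ H) @ P_pred
    return x, P
```

Iterating this step on repeated measurements of a constant state illustrates the gain behavior of Eqs. 18.36–18.37: as P shrinks, the filter trusts the dynamic model more and the measurements less.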
18.5.2 The Extended Kalman Filter

As described previously, the Kalman filter addresses the general problem of trying to estimate the state X_i of a discrete-time controlled process that is governed by a linear stochastic difference equation. But what happens if the process and/or the
relation between the measurement and the state is nonlinear? Some of the most interesting and successful applications of Kalman filtering are concerned with just these types of situations. A Kalman filter that linearizes about the current predicted state XO i= i 1 and measurement Z i is called an extended Kalman filter, or EKF. In computer vision, the measurement model is usually described by a nonlinear observation equation f i .Z 0;i ; X i / D 0, where the parameter Z 0;i is the accurate measurement. In practice, such a measurement is affected by random errors. We assume that the measurement system is disturbed by additive white noise Z i D Z 0;i C V i ;
(18.38)
where the statistics of the noise $V_i$ are given by Eqs. 18.26 and 18.27. To apply the Kalman filter technique, we must expand the nonlinear observation equation into a first-order Taylor series about $(Z_i, \hat{X}_{i/i-1})$:

$$f_i(Z_{0,i}, X_i) = f_i(Z_i, \hat{X}_{i/i-1}) + \frac{\partial f_i(Z_i, \hat{X}_{i/i-1})}{\partial Z_{0,i}}(Z_{0,i} - Z_i) + \frac{\partial f_i(Z_i, \hat{X}_{i/i-1})}{\partial X_i}(X_i - \hat{X}_{i/i-1}) + O^2 = 0. \qquad (18.39)$$

By ignoring the second-order term $O^2$, the linearized measurement equation (18.39) becomes

$$Y_i = H_i X_i + N_i, \qquad (18.40)$$

where $Y_i$ is the new measurement vector, $N_i$ is the noise vector of the new measurement, and $H_i$ is the linearized transformation matrix. The components of Eq. 18.40 are given by

$$Y_i = -f_i(Z_i, \hat{X}_{i/i-1}) + \frac{\partial f_i(Z_i, \hat{X}_{i/i-1})}{\partial X_i}\hat{X}_{i/i-1},$$

$$H_i = \frac{\partial f_i(Z_i, \hat{X}_{i/i-1})}{\partial X_i},$$

$$N_i = \frac{\partial f_i(Z_i, \hat{X}_{i/i-1})}{\partial Z_{0,i}}(Z_{0,i} - Z_i),$$

$$E[N_i] = 0, \qquad E[N_i N_i^T] = C_{i/i-1} = \frac{\partial f_i(Z_i, \hat{X}_{i/i-1})}{\partial Z_{0,i}}\, C_i\, \left(\frac{\partial f_i(Z_i, \hat{X}_{i/i-1})}{\partial Z_{0,i}}\right)^T,$$

where $C_i$ is given by the statistics of the measurement (18.27). The linearized equation (18.40) is a general form for a nonlinear model; we use this form for our particular nonlinear measurement model in Sect. 18.6.
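As a concrete illustration of the linearization in Eqs. 18.39 and 18.40, the sketch below differentiates an implicit observation equation $f(Z_0, X) = 0$ numerically. This is not the book's code; the toy observation model and the function names (`numerical_jacobian`, `linearize`) are our own illustrative choices, and NumPy is assumed.

```python
import numpy as np

def numerical_jacobian(f, x, eps=1e-6):
    """Forward-difference Jacobian of f at x."""
    fx = np.atleast_1d(f(x))
    J = np.zeros((fx.size, x.size))
    for k in range(x.size):
        dx = np.zeros_like(x)
        dx[k] = eps
        J[:, k] = (np.atleast_1d(f(x + dx)) - fx) / eps
    return J

def linearize(f, z, x_pred):
    """Linearize the implicit observation f(Z0, X) = 0 about (z, x_pred),
    returning H = df/dX, the new measurement vector Y, and G = df/dZ0,
    the matrix that shapes the new noise N (Eqs. 18.39-18.40)."""
    H = numerical_jacobian(lambda x: f(z, x), x_pred)       # df/dX
    G = numerical_jacobian(lambda zz: f(zz, x_pred), z)     # df/dZ0
    Y = -np.atleast_1d(f(z, x_pred)) + H @ x_pred           # Y = -f + H x_pred
    return H, Y, G

# toy nonlinear model: the observed point z lies on a circle whose radius is the state
f = lambda z, x: np.array([z[0]**2 + z[1]**2 - x[0]**2])
H, Y, G = linearize(f, np.array([3.0, 4.0]), np.array([5.0]))
```

With this toy model, $G C_i G^T$ would then give the covariance $C_{i/i-1}$ of the new noise term, exactly as in the display above.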
18 Rigid Motion Estimation Using Line Observations
18.5.3 The Rotor-Extended Kalman Filter

This section describes an EKF algorithm to estimate rotors or quaternions. In the static case, the measurements of points free of error satisfy the conditions

$$R_{i+1} = R_i, \qquad (18.41)$$

$$p'_{0,i+1} = \mathcal{R}(R_{i+1})\, p_{0,i+1}, \qquad (18.42)$$

where $\{p_{0,i}\}$ and $\{p'_{0,i}\}$ are the sets of points before and after rotation, respectively, and $R_i$ is the rotation quaternion for the $i$th pair of points $p_{0,i}$ and $p'_{0,i}$. $\mathcal{R}(R)$ is the matrix representation of $R$,

$$\mathcal{R}(R) = \begin{pmatrix} r_1^2 + r_2^2 - r_3^2 - r_4^2 & 2(r_2 r_3 + r_1 r_4) & 2(r_2 r_4 - r_1 r_3) \\ 2(r_2 r_3 - r_1 r_4) & r_1^2 - r_2^2 + r_3^2 - r_4^2 & 2(r_4 r_3 + r_1 r_2) \\ 2(r_2 r_4 + r_1 r_3) & 2(r_4 r_3 - r_1 r_2) & r_1^2 - r_2^2 - r_3^2 + r_4^2 \end{pmatrix}, \qquad (18.43)$$

where $r_j$, $j = 1, 2, 3, 4$, are the four components of $R$, which satisfy the condition

$$\| R \| = 1. \qquad (18.44)$$
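The matrix of Eq. 18.43 can be written down directly. A minimal sketch in Python/NumPy follows; the component ordering $r = (r_1, r_2, r_3, r_4)$ with $r_1$ the scalar part follows the text, while the function name is ours:

```python
import numpy as np

def rotation_matrix(r):
    """Matrix representation R(R) of a unit quaternion r = (r1, r2, r3, r4),
    r1 being the scalar part, laid out as in Eq. 18.43."""
    r1, r2, r3, r4 = r
    return np.array([
        [r1*r1 + r2*r2 - r3*r3 - r4*r4, 2*(r2*r3 + r1*r4),             2*(r2*r4 - r1*r3)],
        [2*(r2*r3 - r1*r4),             r1*r1 - r2*r2 + r3*r3 - r4*r4, 2*(r4*r3 + r1*r2)],
        [2*(r2*r4 + r1*r3),             2*(r4*r3 - r1*r2),             r1*r1 - r2*r2 - r3*r3 + r4*r4],
    ])

# for any quaternion satisfying Eq. 18.44 the result is a proper rotation:
# orthogonal with determinant +1
r = np.array([1.0, 2.0, 3.0, 4.0])
r /= np.linalg.norm(r)
M = rotation_matrix(r)
```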
Let us assume that the measurements $\{p_{i+1}\}$ and $\{p'_{i+1}\}$ of $\{p_{0,i+1}\}$ and $\{p'_{0,i+1}\}$ are corrupted by noise $\{n_{i+1}\}$ and $\{n'_{i+1}\}$, respectively, such that

$$p_{i+1} = p_{0,i+1} + n_{i+1}, \qquad (18.45)$$

$$p'_{i+1} = p'_{0,i+1} + n'_{i+1}. \qquad (18.46)$$

Here, it is assumed that the noise vectors $\{n_{i+1}\}$ and $\{n'_{i+1}\}$ have zero mean and that their covariance matrices $\{C_{i+1}\}$ and $\{C'_{i+1}\}$ are known. We rewrite Eq. 18.42 as a function $f_{i+1}$ of the variables $(p_{0,i+1}, p'_{0,i+1}, R_{i+1})$:

$$f_{i+1}(p_{0,i+1}, p'_{0,i+1}, R_{i+1}) = p'_{0,i+1} - \mathcal{R}(R_{i+1})\, p_{0,i+1} = 0. \qquad (18.47)$$
Expanding this equation about $(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i})$ in terms of a first-order Taylor series, we get

$$f_{i+1}(p_{0,i+1}, p'_{0,i+1}, R_{i+1}) = f_{i+1}(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i}) + \frac{\partial f_{i+1}(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i})}{\partial p'_{0,i+1}}(p'_{0,i+1} - p'_{i+1}) + \frac{\partial f_{i+1}(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i})}{\partial p_{0,i+1}}(p_{0,i+1} - p_{i+1}) + \frac{\partial f_{i+1}(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i})}{\partial R_{i+1}}(R_{i+1} - \hat{R}_{i+1/i}) + O^2 = 0, \qquad (18.48)$$

where the second-order term $O^2$ can be omitted. The partial derivatives are

$$\frac{\partial f_{i+1}(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i})}{\partial p'_{0,i+1}} = 1, \qquad (18.49)$$

$$\frac{\partial f_{i+1}(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i})}{\partial p_{0,i+1}} = -\mathcal{R}(\hat{R}_{i+1/i}), \qquad (18.50)$$

$$\frac{\partial f_{i+1}(p_{i+1}, p'_{i+1}, \hat{R}_{i+1/i})}{\partial R_{i+1}} = -\frac{\partial \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1}}{\partial R_{i+1}}. \qquad (18.51)$$

In order to compute expression (18.51), we utilize the component vectors

$$p_{i+1} \simeq (p_1 \;\; p_2 \;\; p_3)^T, \qquad (18.52)$$

$$R_{i+1} \simeq (r_1 \;\; r_2 \;\; r_3 \;\; r_4)^T, \qquad (18.53)$$

$$\hat{R}_{i+1/i} \simeq (\hat{r}_1 \;\; \hat{r}_2 \;\; \hat{r}_3 \;\; \hat{r}_4)^T. \qquad (18.54)$$
Thus, we can write

$$\mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1} = \begin{pmatrix} \hat{r}_1^2 + \hat{r}_2^2 - \hat{r}_3^2 - \hat{r}_4^2 & 2(\hat{r}_2\hat{r}_3 + \hat{r}_1\hat{r}_4) & 2(\hat{r}_2\hat{r}_4 - \hat{r}_1\hat{r}_3) \\ 2(\hat{r}_2\hat{r}_3 - \hat{r}_1\hat{r}_4) & \hat{r}_1^2 - \hat{r}_2^2 + \hat{r}_3^2 - \hat{r}_4^2 & 2(\hat{r}_4\hat{r}_3 + \hat{r}_1\hat{r}_2) \\ 2(\hat{r}_2\hat{r}_4 + \hat{r}_1\hat{r}_3) & 2(\hat{r}_4\hat{r}_3 - \hat{r}_1\hat{r}_2) & \hat{r}_1^2 - \hat{r}_2^2 - \hat{r}_3^2 + \hat{r}_4^2 \end{pmatrix}\begin{pmatrix} p_1 \\ p_2 \\ p_3 \end{pmatrix} = (\hat{p}'_1 \;\; \hat{p}'_2 \;\; \hat{p}'_3)^T = \hat{p}'_{i+1/i}.$$

The derivative of the vector $\hat{p}'_{i+1/i}$ with respect to the vector $R_{i+1}$ is of the form

$$\frac{\partial \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1}}{\partial R_{i+1}} = \frac{\partial \hat{p}'_{i+1/i}}{\partial R_{i+1}} = \begin{pmatrix} \frac{\partial \hat{p}'_1}{\partial r_1} & \frac{\partial \hat{p}'_1}{\partial r_2} & \frac{\partial \hat{p}'_1}{\partial r_3} & \frac{\partial \hat{p}'_1}{\partial r_4} \\ \frac{\partial \hat{p}'_2}{\partial r_1} & \frac{\partial \hat{p}'_2}{\partial r_2} & \frac{\partial \hat{p}'_2}{\partial r_3} & \frac{\partial \hat{p}'_2}{\partial r_4} \\ \frac{\partial \hat{p}'_3}{\partial r_1} & \frac{\partial \hat{p}'_3}{\partial r_2} & \frac{\partial \hat{p}'_3}{\partial r_3} & \frac{\partial \hat{p}'_3}{\partial r_4} \end{pmatrix}. \qquad (18.55)$$

Now, defining the $3 \times 4$ matrix $H_{i+1/i} = \frac{\partial \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1}}{\partial R_{i+1}}$, we can write the previous equation as
476
18 Rigid Motion Estimation Using Line Observations
$$H_{i+1/i} \stackrel{\mathrm{def}}{=} \frac{\partial \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1}}{\partial R_{i+1}} = \begin{pmatrix} h_1 & h_2 & h_3 & h_4 \\ h_4 & -h_3 & h_2 & -h_1 \\ -h_3 & -h_4 & h_1 & h_2 \end{pmatrix}, \qquad (18.56)$$

where

$$h_1 = 2(\hat{r}_1 p_1 + \hat{r}_4 p_2 - \hat{r}_3 p_3), \qquad h_2 = 2(\hat{r}_2 p_1 + \hat{r}_3 p_2 + \hat{r}_4 p_3),$$

$$h_3 = 2(-\hat{r}_3 p_1 + \hat{r}_2 p_2 - \hat{r}_1 p_3), \qquad h_4 = 2(-\hat{r}_4 p_1 + \hat{r}_1 p_2 + \hat{r}_2 p_3).$$

By first substituting (18.56) into (18.51), and then substituting the resulting equation, along with Eqs. 18.49 and 18.50, into Eq. 18.48, we get an expression in which the second-order terms can be omitted:

$$0 = p'_{i+1} - \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1} + (p'_{0,i+1} - p'_{i+1}) - \mathcal{R}(\hat{R}_{i+1/i})(p_{0,i+1} - p_{i+1}) - H_{i+1/i}(R_{i+1} - \hat{R}_{i+1/i}). \qquad (18.57)$$

This equation can then be further rearranged as

$$p'_{i+1} - \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1} + H_{i+1/i}\hat{R}_{i+1/i} = H_{i+1/i} R_{i+1} + (p'_{i+1} - p'_{0,i+1}) - \mathcal{R}(\hat{R}_{i+1/i})(p_{i+1} - p_{0,i+1}). \qquad (18.58)$$

The terms of this equation are now identifiable as the measurement $z_{i+1}$ and the noise of the measurement $n_{i+1/i}$:

$$z_{i+1} = p'_{i+1} - \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1} + H_{i+1/i}\hat{R}_{i+1/i}, \qquad (18.59)$$

$$n_{i+1/i} = (p'_{i+1} - p'_{0,i+1}) - \mathcal{R}(\hat{R}_{i+1/i})(p_{i+1} - p_{0,i+1}) = n'_{i+1} - \mathcal{R}(\hat{R}_{i+1/i})\, n_{i+1}. \qquad (18.60)$$

Using these variables, we can finally write the first-order linearized measurement equation more compactly:

$$z_{i+1} = H_{i+1/i} R_{i+1} + n_{i+1/i}. \qquad (18.61)$$

Here, $n_{i+1/i}$ represents zero-mean noise, with covariance given by

$$C_{i+1/i} = C'_{i+1} + \mathcal{R}(\hat{R}_{i+1/i})\, C_{i+1}\, \mathcal{R}^T(\hat{R}_{i+1/i}), \qquad (18.62)$$
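The closed form of Eq. 18.56 can be cross-checked against a numerical Jacobian of $\mathcal{R}(R)p$ with respect to the quaternion components. A sketch assuming NumPy; the helper names are ours:

```python
import numpy as np

def rotation_matrix(r):
    # quaternion (scalar-first) -> 3x3 matrix, layout of Eq. 18.43
    r1, r2, r3, r4 = r
    return np.array([
        [r1*r1 + r2*r2 - r3*r3 - r4*r4, 2*(r2*r3 + r1*r4),             2*(r2*r4 - r1*r3)],
        [2*(r2*r3 - r1*r4),             r1*r1 - r2*r2 + r3*r3 - r4*r4, 2*(r4*r3 + r1*r2)],
        [2*(r2*r4 + r1*r3),             2*(r4*r3 - r1*r2),             r1*r1 - r2*r2 - r3*r3 + r4*r4]])

def H_matrix(r_hat, p):
    """Analytic 3x4 Jacobian d(R(r)p)/dr of Eq. 18.56."""
    r1, r2, r3, r4 = r_hat
    p1, p2, p3 = p
    h1 = 2*( r1*p1 + r4*p2 - r3*p3)
    h2 = 2*( r2*p1 + r3*p2 + r4*p3)
    h3 = 2*(-r3*p1 + r2*p2 - r1*p3)
    h4 = 2*(-r4*p1 + r1*p2 + r2*p3)
    return np.array([[ h1,  h2, h3,  h4],
                     [ h4, -h3, h2, -h1],
                     [-h3, -h4, h1,  h2]])

r = np.array([0.5, -0.5, 0.5, 0.5])
p = np.array([1.0, -2.0, 3.0])
H = H_matrix(r, p)

# finite-difference check of the analytic Jacobian
eps = 1e-6
H_num = np.column_stack([
    (rotation_matrix(r + eps*np.eye(4)[k]) @ p - rotation_matrix(r) @ p) / eps
    for k in range(4)])
```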
where $C'_{i+1}$ and $C_{i+1}$ are the known covariance matrices of the noise $n'_{i+1}$ and $n_{i+1}$, respectively.

Rotation Estimation

Next, we describe the procedure for estimating a rotation expressed as a rotor $R$. At the beginning of the iteration, at step $i = 0$, the initial
state $\hat{R}_0$ and an initial estimation error covariance matrix $P_0$ are given. According to Eq. 18.41, given the estimate $\hat{R}_i$ and the covariance matrix $P_i$ at step $i$, it is reasonable to predict the next step $i+1$ in this way:

$$\hat{R}_{i+1/i} = \hat{R}_i, \qquad (18.63)$$

$$P_{i+1/i} = P_i. \qquad (18.64)$$

Taking into account the measurements $p_{i+1}$, $p'_{i+1}$ and the predicted state $\hat{R}_{i+1/i}$, the new measurement $z_{i+1}$ of Eq. 18.59 can be computed straightforwardly. Then the Kalman gain matrix and the estimate $\hat{R}_{i+1}$ at step $i+1$ are computed using the EKF, as follows:

$$K_{i+1} = P_{i+1/i} H_{i+1/i}^T\left(H_{i+1/i} P_{i+1/i} H_{i+1/i}^T + C_{i+1/i}\right)^{-1}, \qquad (18.65)$$

$$\hat{R}_{i+1} = \hat{R}_{i+1/i} + K_{i+1}\left(z_{i+1} - H_{i+1/i}\hat{R}_{i+1/i}\right) = \hat{R}_{i+1/i} + K_{i+1}\left(p'_{i+1} - \mathcal{R}(\hat{R}_{i+1/i})\, p_{i+1}\right). \qquad (18.66)$$

Prudently, $\hat{R}_{i+1}$ must be renormalized to satisfy the constraint $\|R\| = 1$:

$$\hat{R}_{i+1} = \frac{\hat{R}_{i+1}}{\|\hat{R}_{i+1}\|}. \qquad (18.67)$$

The update of the error covariance matrix follows, according to

$$P_{i+1} = (I - K_{i+1} H_{i+1})\, P_{i+1/i}\, (I - K_{i+1} H_{i+1})^T + K_{i+1} C_{i+1} K_{i+1}^T. \qquad (18.68)$$

Note that $H_{i+1}$ and $C_{i+1}$ are recalculated using the current estimate $\hat{R}_{i+1}$. As far as the implementation of the filter is concerned, the measurement error covariance matrix $C_i$ might be measured prior to the operation of the filter.
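Eqs. 18.63 through 18.68 assemble into a compact update step. Below is a sketch of one full rotor-EKF cycle under simplifying assumptions: noise-free synthetic measurements, identity initial covariance, and the innovation covariance of Eq. 18.62 reused in the Joseph-form covariance update. All function names are ours, and NumPy is assumed.

```python
import numpy as np

def rotation_matrix(r):
    # quaternion (scalar-first) -> 3x3 matrix, layout of Eq. 18.43
    r1, r2, r3, r4 = r
    return np.array([
        [r1*r1 + r2*r2 - r3*r3 - r4*r4, 2*(r2*r3 + r1*r4),             2*(r2*r4 - r1*r3)],
        [2*(r2*r3 - r1*r4),             r1*r1 - r2*r2 + r3*r3 - r4*r4, 2*(r4*r3 + r1*r2)],
        [2*(r2*r4 + r1*r3),             2*(r4*r3 - r1*r2),             r1*r1 - r2*r2 - r3*r3 + r4*r4]])

def H_matrix(r_hat, p):
    # analytic 3x4 Jacobian of R(r)p with respect to r, Eq. 18.56
    r1, r2, r3, r4 = r_hat
    p1, p2, p3 = p
    h1 = 2*( r1*p1 + r4*p2 - r3*p3)
    h2 = 2*( r2*p1 + r3*p2 + r4*p3)
    h3 = 2*(-r3*p1 + r2*p2 - r1*p3)
    h4 = 2*(-r4*p1 + r1*p2 + r2*p3)
    return np.array([[h1, h2, h3, h4], [h4, -h3, h2, -h1], [-h3, -h4, h1, h2]])

def rekf_step(r_hat, P, p, p_prime, C, C_prime):
    r_pred, P_pred = r_hat, P                      # prediction, Eqs. 18.63-18.64
    Rm = rotation_matrix(r_pred)
    H = H_matrix(r_pred, p)
    S = C_prime + Rm @ C @ Rm.T                    # innovation covariance, Eq. 18.62
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + S)   # gain, Eq. 18.65
    r_new = r_pred + K @ (p_prime - Rm @ p)        # state update, Eq. 18.66
    r_new = r_new / np.linalg.norm(r_new)          # constraint ||R|| = 1, Eq. 18.67
    ImKH = np.eye(4) - K @ H
    P_new = ImKH @ P_pred @ ImKH.T + K @ S @ K.T   # Joseph-form variant of Eq. 18.68
    return r_new, P_new

# synthetic, noise-free run: true rotation of 30 degrees about the z-axis
th = np.deg2rad(30.0)
r_true = np.array([np.cos(th/2), 0.0, 0.0, np.sin(th/2)])
pts = [np.array([1.0, 0.0, 0.0]), np.array([0.0, 1.0, 0.0]), np.array([0.3, 0.2, 1.0])]
r_hat, P = np.array([1.0, 0.0, 0.0, 0.0]), np.eye(4)
C = C_prime = 1e-4 * np.eye(3)
for k in range(60):
    p = pts[k % 3]
    r_hat, P = rekf_step(r_hat, P, p, rotation_matrix(r_true) @ p, C, C_prime)
```

With exact measurements the estimate should approach the true rotor (up to the usual quaternion sign ambiguity) within a few iterations.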
18.6 The Motor-Extended Kalman Filter

The use of the EKF in the motor algebra framework gives us the simultaneous estimation of translation and rotation. According to the literature, there are only batch methods for the simultaneous estimation of these components [11, 100]. The motor-extended Kalman filter (MEKF) turns out to be a natural extension of the rotor-extended Kalman filter, thanks to the multivector concept of geometric algebra. First, let us define the noisy motion equation using lines in the motor algebra framework $\mathcal{G}^+_{3,0,1}$. The geometric features we consider for the measurements are 3D observed lines ($L_1$, $L_2$, ..., $L_n$, $n \geq 2$) belonging to an object moving in 3D space. The rigid motion parameters between any pairing of consecutive time
instants ($t_0, t_1, t_2, \ldots, t_N$) are described compactly by the motor $M_i$. According to Eq. 4.17, the motion of any line of the object is modeled by

$$L_i = M_i L_{i-1} \widetilde{M}_i. \qquad (18.69)$$

If the change of the parameters of the line in motion between the time instants $t_{i-1}$ and $t_i$ is described in terms of the motor velocity $V_{i/i-1}$,

$$L_i = V_{i/i-1} L_{i-1} \widetilde{V}_{i/i-1}, \qquad (18.70)$$

we can then express the recursive motion equation of the line in general as follows:

$$L_i = M_i L_{i-1} \widetilde{M}_i = (V_{i/i-1} M_{i-1})\, L_{i-1}\, (\widetilde{M}_{i-1}\widetilde{V}_{i/i-1}). \qquad (18.71)$$

Thus, we obtain the ideal dynamic motion model in terms of the motors:

$$M_i = V_{i/i-1} M_{i-1}. \qquad (18.72)$$

For example, suppose the motion is a screw motion with a rotation of constant angular velocity $\omega$ about the axis of a known line ($L_s = \bar{n} + I\, t_c \wedge \bar{n}$) and with constant translation velocity $v_s$ along the axis line. If the data sampling is done at equidistant time intervals, then the time instants can be represented by integers, so the motor equation reads

$$V_{i/i-1} = V = (1 + I v_s/2)\left(\cos(\omega/2) + \sin(\omega/2)\, L_s\right). \qquad (18.73)$$

Since in real applications the relation between $M_{i-1}$ and $M_i$ is known only approximately, the real dynamic model of the noisy 3D motion is given by

$$M_i = V_{i/i-1,Ml}\, M_{i-1} + W_i, \qquad (18.74)$$

where the statistics of $W_i$ are given by Eqs. 18.23 and 18.24. Note that $V_{i/i-1,Ml}$ denotes the left-multiplication matrix of the motor $V_{i/i-1}$.
18.6.1 Representation of the Line Motion Model in Linear Algebra

The line motion model presented in the previous section uses the geometric algebra $\mathcal{G}^+_{3,0,1}$. Because the EKF computer algorithm is implemented using the techniques of linear algebra, we should also formulate the line motion model $L' = ML\widetilde{M}$ within the framework of linear algebra. Let us start by considering rotor relationships. The multiplication of two rotors $U$ and $V$ in the geometric algebra $\mathcal{G}^+_{3,0,1}$ may be represented by

$$W = UV = (u_0 + \boldsymbol{u})(v_0 + \boldsymbol{v}) = u_0 v_0 + \boldsymbol{u}\cdot\boldsymbol{v} + u_0\boldsymbol{v} + v_0\boldsymbol{u} + \boldsymbol{u}\wedge\boldsymbol{v}. \qquad (18.75)$$

Multiplication of these two rotors in linear algebra is represented by

$$W = U_{Rl} V = V_{Rr} U, \qquad (18.76)$$

where $U = (u_0\; u_1\; u_2\; u_3)^T$, $V = (v_0\; v_1\; v_2\; v_3)^T$, and

$$U_{Rl} = \begin{pmatrix} u_0 & -u_1 & -u_2 & -u_3 \\ u_1 & u_0 & -u_3 & u_2 \\ u_2 & u_3 & u_0 & -u_1 \\ u_3 & -u_2 & u_1 & u_0 \end{pmatrix}, \qquad (18.77)$$

$$V_{Rr} = \begin{pmatrix} v_0 & -v_1 & -v_2 & -v_3 \\ v_1 & v_0 & v_3 & -v_2 \\ v_2 & -v_3 & v_0 & v_1 \\ v_3 & v_2 & -v_1 & v_0 \end{pmatrix}. \qquad (18.78)$$
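The defining property $W = U_{Rl}V = V_{Rr}U$ of Eqs. 18.76 through 18.78 is easy to verify numerically. A sketch assuming NumPy and the scalar-first quaternion product as the rotor product; the function names are ours:

```python
import numpy as np

def quat_mul(u, v):
    """Rotor product represented as a scalar-first quaternion product."""
    u0, u1, u2, u3 = u
    v0, v1, v2, v3 = v
    return np.array([u0*v0 - u1*v1 - u2*v2 - u3*v3,
                     u0*v1 + u1*v0 + u2*v3 - u3*v2,
                     u0*v2 - u1*v3 + u2*v0 + u3*v1,
                     u0*v3 + u1*v2 - u2*v1 + u3*v0])

def left_matrix(u):
    """U_Rl of Eq. 18.77."""
    u0, u1, u2, u3 = u
    return np.array([[u0, -u1, -u2, -u3],
                     [u1,  u0, -u3,  u2],
                     [u2,  u3,  u0, -u1],
                     [u3, -u2,  u1,  u0]])

def right_matrix(v):
    """V_Rr of Eq. 18.78."""
    v0, v1, v2, v3 = v
    return np.array([[v0, -v1, -v2, -v3],
                     [v1,  v0,  v3, -v2],
                     [v2, -v3,  v0,  v1],
                     [v3,  v2, -v1,  v0]])

U = np.array([1.0, 2.0, -1.0, 0.5])
V = np.array([0.3, -0.7, 2.0, 1.0])
```

The sign pattern shown in Eqs. 18.77 and 18.78 is exactly the one that makes both matrix products reproduce the rotor product.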
We call $U_{Rl}$ the "left-multiplication matrix of the rotor $U$," and $V_{Rr}$ the "right-multiplication matrix of the rotor $V$." Multiplying the motors $S = U + I U'$ and $T = V + I V'$ in the geometric algebra $\mathcal{G}^+_{3,0,1}$ gives

$$Q = ST = (U + IU')(V + IV') = UV + I(UV' + U'V), \qquad (18.79)$$

where $U$, $U'$, $V$, and $V'$ are all expressed in the form of rotors (the cross term vanishes because $I^2 = 0$). Multiplication of these two motors in terms of matrices gives

$$Q = S_{Ml} T = T_{Mr} S, \qquad (18.80)$$

where

$$S = (u_0\; u_1\; u_2\; u_3\; u'_0\; u'_1\; u'_2\; u'_3)^T, \qquad T = (v_0\; v_1\; v_2\; v_3\; v'_0\; v'_1\; v'_2\; v'_3)^T,$$

$$S_{Ml} = \begin{pmatrix} U_{Rl} & 0_{4\times4} \\ U'_{Rl} & U_{Rl} \end{pmatrix}, \qquad T_{Mr} = \begin{pmatrix} V_{Rr} & 0_{4\times4} \\ V'_{Rr} & V_{Rr} \end{pmatrix}. \qquad (18.81)$$

In this case, we call $S_{Ml}$ the "left-multiplication matrix of the motor $S$," and $T_{Mr}$ the "right-multiplication matrix of the motor $T$." To convert the line motion model of Eq. 4.17 to matrix algebra, we can treat the real and dual components $n$, $m$, $n'$, and $m'$ of the lines $L$ and $L'$ as rotors with zero scalar part. By multiplying both sides of Eq. 4.17 from the right by $M$, we get

$$L'M - ML = 0. \qquad (18.82)$$
This results in the following linear motion equation:

$$\left(L'_{Ml} - L_{Mr}\right) M = A_M M = 0. \qquad (18.83)$$

This matrix representation of the line motion was suggested in [47]. The constraints of Eqs. 3.16 and 3.89, respectively, now read

$$R^T R = 1, \qquad (18.84)$$

$$R^T R' = 0, \qquad (18.85)$$

with $R = (r_0\; r_1\; r_2\; r_3)^T$, $R' = (r'_0\; r'_1\; r'_2\; r'_3)^T$, and $M = R + I R'$. These properties will be used in the implementation of the MEKF algorithm in the next section.
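The block matrices of Eq. 18.81 and the linear motion equation 18.83 can be verified numerically with dual quaternions as a stand-in for motors (exploiting $I^2 = 0$). A sketch assuming NumPy and scalar-first quaternions; all function names are ours:

```python
import numpy as np

def quat_mul(u, v):
    u0, u1, u2, u3 = u
    v0, v1, v2, v3 = v
    return np.array([u0*v0 - u1*v1 - u2*v2 - u3*v3,
                     u0*v1 + u1*v0 + u2*v3 - u3*v2,
                     u0*v2 - u1*v3 + u2*v0 + u3*v1,
                     u0*v3 + u1*v2 - u2*v1 + u3*v0])

def quat_conj(q):
    return np.array([q[0], -q[1], -q[2], -q[3]])

def left_matrix(u):
    u0, u1, u2, u3 = u
    return np.array([[u0, -u1, -u2, -u3], [u1, u0, -u3, u2],
                     [u2, u3, u0, -u1], [u3, -u2, u1, u0]])

def right_matrix(v):
    v0, v1, v2, v3 = v
    return np.array([[v0, -v1, -v2, -v3], [v1, v0, v3, -v2],
                     [v2, -v3, v0, v1], [v3, v2, -v1, v0]])

def motor_left(S):   # S_Ml of Eq. 18.81
    Z = np.zeros((4, 4))
    return np.block([[left_matrix(S[:4]), Z], [left_matrix(S[4:]), left_matrix(S[:4])]])

def motor_right(T):  # T_Mr of Eq. 18.81
    Z = np.zeros((4, 4))
    return np.block([[right_matrix(T[:4]), Z], [right_matrix(T[4:]), right_matrix(T[:4])]])

def motor_mul(S, T):  # Q = ST = UV + I(UV' + U'V), Eq. 18.79
    return np.concatenate([quat_mul(S[:4], T[:4]),
                           quat_mul(S[:4], T[4:]) + quat_mul(S[4:], T[:4])])

# a unit motor M = R + I R' built from a rotation and a pure-quaternion translation
th = np.deg2rad(50.0)
axis = np.array([1.0, 2.0, 2.0]) / 3.0
R = np.concatenate([[np.cos(th/2)], np.sin(th/2) * axis])
t = np.array([0.0, 0.4, -0.2, 0.3])
M = np.concatenate([R, 0.5 * quat_mul(t, R)])

# a line L = n + I m with zero scalar parts: unit direction n, moment m = p x n
n = np.array([0.0, 0.0, 0.6, 0.8])
p = np.array([1.0, -1.0, 0.5])
m = np.concatenate([[0.0], np.cross(p, n[1:])])
L = np.concatenate([n, m])

# line motion L' = M L M~ (the reverse conjugates both rotor parts)
M_rev = np.concatenate([quat_conj(M[:4]), quat_conj(M[4:])])
L_prime = motor_mul(motor_mul(M, L), M_rev)

# linear motion equation (18.83): A_M M = 0
A_M = motor_left(L_prime) - motor_right(L)
```

Because $M$ is a unit motor, $\widetilde{M}M$ is the identity, so $L'M = ML$ holds exactly and $M$ lies in the kernel of $A_M$.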
18.6.2 Linearization of the Measurement Model

Considering Eq. 18.83, we can easily see that the relation between the measurement $A_M$ and the state $M$ is, unfortunately, nonlinear; that is why we must linearize it. Assume that the measurement $A_{M_i}$ represents the true data $A_{M_{0,i}}$ contaminated by measurement noise $N_{A_M,i}$ with zero mean and known covariance matrix $C_{A_M,i}$. Then,

$$A_{M_i} = A_{M_{0,i}} + N_{A_M,i}. \qquad (18.86)$$

Supposing that the predicted state of $M_i$ is $\hat{M}_{i/i-1}$, then, in analogy to Eq. 18.48, we can define a function $f_{M,i}$ of the variables $(A_{M_{0,i}}, M_i)$ as follows:

$$f_{M,i}(A_{M_{0,i}}, M_i) = A_{M_{0,i}} M_i = 0. \qquad (18.87)$$

This equation can then be expanded into a first-order Taylor series about the predicted state $(A_{M_i}, \hat{M}_{i/i-1})$:

$$f_{M,i}(A_{M_{0,i}}, M_i) = f_{M,i}(A_{M_i}, \hat{M}_{i/i-1}) + \frac{\partial f_{M,i}(A_{M_i}, \hat{M}_{i/i-1})}{\partial M_i}(M_i - \hat{M}_{i/i-1}) + \frac{\partial f_{M,i}(A_{M_i}, \hat{M}_{i/i-1})}{\partial A_{M_{0,i}}}(A_{M_{0,i}} - A_{M_i}) + O^2 = 0. \qquad (18.88)$$

Now, naming the components

$$\frac{\partial f_{M,i}(A_{M_i}, \hat{M}_{i/i-1})}{\partial M_i} = A_{M_i}, \qquad (18.89)$$

$$\frac{\partial f_{M,i}(A_{M_i}, \hat{M}_{i/i-1})}{\partial A_{M_{0,i}}} = \hat{M}_{i/i-1}, \qquad (18.90)$$

then omitting the second-order terms $O^2$ and, finally, taking into account Eq. 18.86, we obtain

$$A_{M_i}\hat{M}_{i/i-1} + A_{M_i}(M_i - \hat{M}_{i/i-1}) + (A_{M_{0,i}} - A_{M_i})\hat{M}_{i/i-1} = A_{M_i}\hat{M}_{i/i-1} + A_{M_i}(M_i - \hat{M}_{i/i-1}) - N_{A_M,i}\hat{M}_{i/i-1} = 0, \qquad (18.91)$$

or

$$A_{M_i} M_i - N_{A_M,i}\hat{M}_{i/i-1} = 0. \qquad (18.92)$$

As a result, we can claim that the measurement equation for the MEKF at step $i$ is given by

$$Z_i = A_{M_i} M_i - N_{A_M,i}\hat{M}_{i/i-1} = H_i M_i + N_{Z,i} = 0, \qquad (18.93)$$

where we call $H_i = A_{M_i}$ and $N_{Z,i} = -N_{A_M,i}\hat{M}_{i/i-1}$. The covariance matrix of $N_{Z,i}$ is $C_i$.
18.6.3 Enforcing a Geometric Constraint

In order to estimate the motor state, we assume first that at the beginning, at step $i = 0$, the initial state $\hat{M}_{1/0}$ and the initial error covariance matrix of the estimate, $P_{1/0}$, are given. Now, according to Eqs. 18.29, 18.33, 18.32, and 18.35, the estimation equation of the motor state is given by

$$\hat{M}_i = \Phi_{i/i-1}\hat{M}_{i-1} + K_i\left(Z_i - H_i\Phi_{i/i-1}\hat{M}_{i-1}\right) = V_{i/i-1,Ml}\hat{M}_{i-1} + K_i\left(Z_i - H_i V_{i/i-1,Ml}\hat{M}_{i-1}\right) = (\hat{R}_i^T \;\; \hat{R}'^T_i)^T, \qquad (18.94)$$

where the optimal Kalman gain matrix $K_i$ is computed according to the formula

$$K_i = P_{i/i-1} H_i^T\left(H_i P_{i/i-1} H_i^T + C_i\right)^{-1}, \qquad (18.95)$$

where

$$P_{i/i-1} = \Phi_{i/i-1} P_{i-1} \Phi_{i/i-1}^T + Q_i \qquad (18.96)$$
and the error covariance matrix for step $i$ is updated as

$$P_i = P_{i/i-1} - K_i H_i P_{i/i-1} = (I - K_i H_i)\, P_{i/i-1}. \qquad (18.97)$$

Now, $\hat{M}_i$ consists of two four-dimensional vectors, $\hat{R}_i$ and $\hat{R}'_i$, that must be adjusted to fulfill the constraints of Eqs. 18.84 and 18.85. In the case of Eq. 18.84, the modification is done simply by taking the unit rotor

$$\hat{R}_i = \frac{\hat{R}_i}{\|\hat{R}_i\|}. \qquad (18.98)$$

It is not that simple to satisfy the constraint of Eq. 18.85, however. The constraint

$$R'^T R = 0 \qquad (18.99)$$

tells us that $R$ must be orthogonal to the dual rotor $R'$, and this holds up to a scalar (see the role of this scalar in Eq. 18.104). Unfortunately, in practice, the rotor estimate $\hat{R}_i$ is usually not orthogonal to the estimated dual rotor $\hat{R}'_i$. Figure 18.6 suggests how to enforce this geometric constraint by modifying the orientation of $\hat{R}'_i$. To do so, we first consider the cosine of the angle $\varphi$ between the estimates $\hat{R}_i$ and $\hat{R}'_i$:

$$\cos(\varphi) = \frac{\hat{R}'^T_i \hat{R}_i}{\|\hat{R}'_i\|\,\|\hat{R}_i\|}. \qquad (18.100)$$

This expression can be simplified by using Eq. 18.98 and the unit rotor $\hat{R}_i$:

$$\cos(\varphi) = \frac{\hat{R}'^T_i \hat{R}_i}{\|\hat{R}'_i\|}. \qquad (18.101)$$

Fig. 18.6 Constraint of orthogonality according to $\hat{R}'^T \hat{R} = 0$

Then we consider the deviation from the ideal orthogonal $\hat{R}'_i$:

$$\delta\hat{R}'_i = \|\hat{R}'_i\|\cos(\varphi)\,\hat{R}_i = \left(\hat{R}'^T_i \hat{R}_i\right)\hat{R}_i. \qquad (18.102)$$

Finally, it is a straightforward matter to compute the ideal orthogonal $\hat{R}'_i$: first we compute the unit rotor $R_u$ orthogonal to $\hat{R}_i$,

$$R_u = \frac{\hat{R}'_i - \delta\hat{R}'_i}{\|\hat{R}'_i - \delta\hat{R}'_i\|} = \frac{\hat{R}'_i - (\hat{R}'^T_i \hat{R}_i)\hat{R}_i}{\|\hat{R}'_i - (\hat{R}'^T_i \hat{R}_i)\hat{R}_i\|}, \qquad (18.103)$$

and then we multiply it by the norm of the estimated $\hat{R}'_i$:

$$\hat{R}'_i = \|\hat{R}'_i\|\, R_u. \qquad (18.104)$$

In other words, we have rotated the estimated $\hat{R}'_i$ until it is orthogonal to $\hat{R}_i$.
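Eqs. 18.98 through 18.104 amount to a Gram-Schmidt-style re-orthogonalization of the dual part that preserves its norm. A sketch assuming NumPy; the function name is ours:

```python
import numpy as np

def enforce_motor_constraints(R, Rp):
    """Normalize R (Eq. 18.98) and rotate R' until it is orthogonal to R
    while preserving its norm (Eqs. 18.102-18.104)."""
    R_hat = R / np.linalg.norm(R)                     # (18.98)
    delta = (Rp @ R_hat) * R_hat                      # deviation, (18.102)
    Ru = (Rp - delta) / np.linalg.norm(Rp - delta)    # unit rotor orthogonal to R_hat, (18.103)
    Rp_hat = np.linalg.norm(Rp) * Ru                  # rescale, (18.104)
    return R_hat, Rp_hat

R = np.array([0.9, 0.1, -0.3, 0.2])
Rp = np.array([0.2, -0.5, 0.1, 0.4])
R_hat, Rp_hat = enforce_motor_constraints(R, Rp)
```

After the call, the pair satisfies both constraints of Eqs. 18.84 and 18.85.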
18.6.4 Operation of the MEKF Algorithm

The processing of information by the MEKF can be explained simply by considering the block diagrams presented in Fig. 18.7a,b. In general, during the MEKF cycle, represented by Fig. 18.7a, the updating phase uses a prediction and a corrected input measurement to produce a new estimate, which in turn is modified using a geometric constraint. This cycle continues for each new measurement indefinitely, but the MEKF should stabilize to the proper estimated states after only a few iterations. The Kalman gain, $K_i$, can be calculated before the actual estimation is carried out, since it does not depend on the measurement $Z_i$. The computation cycle for $K_i$, illustrated in Fig. 18.7b, proceeds as follows:

Step 1. Given $P_{i-1/i-1} = P_{i-1}$, $Q_i$, and $\Phi_{i/i-1}$, compute $P_{i/i-1}$ using Eq. 18.96.

Step 2. Substitute $P_{i/i-1}$, $H_i$, and $C_i$ into Eq. 18.95 to obtain $K_i$, which will be used in step 3 of the MEKF algorithm.

Step 3. Substitute $P_{i/i-1}$, $K_i$, and $H_i$ into Eq. 18.97 to determine $P_i$, which is stored until the time of the next measurement, when the cycle is repeated.
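The three-step gain cycle can be written down directly. A sketch assuming NumPy, with arbitrary illustrative $8\times8$ matrices standing in for $\Phi_{i/i-1}$, $H_i = A_{M_i}$, and $C_i$ (the function name is ours):

```python
import numpy as np

def gain_cycle(P_prev, Q, Phi, H, C):
    # Step 1: predicted covariance, Eq. 18.96
    P_pred = Phi @ P_prev @ Phi.T + Q
    # Step 2: Kalman gain, Eq. 18.95
    K = P_pred @ H.T @ np.linalg.inv(H @ P_pred @ H.T + C)
    # Step 3: updated covariance, Eq. 18.97
    P = (np.eye(P_pred.shape[0]) - K @ H) @ P_pred
    return K, P

rng = np.random.default_rng(0)
Phi = np.eye(8)                     # static motor model as illustration
Q = 1e-6 * np.eye(8)
H = rng.standard_normal((8, 8))     # stand-in for the linearized A_M
C = 1e-2 * np.eye(8)
K, P = gain_cycle(np.eye(8), Q, Phi, H, C)
```

The updated covariance stays symmetric positive semidefinite, and its trace shrinks relative to the prediction, reflecting the information gained from the measurement.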
Fig. 18.7 MEKF operation: (a) cycles of estimation and updating and (b) Kalman gain computation

Now the MEKF algorithm illustrated in Fig. 18.8 will be explained. For the initialization of the MEKF, we can use for the initial time instant $i$ the known values of $\hat{M}_{i-1}$ and $P_{i-1}$, or, if we do not know them, we can simply choose the trivial values $\hat{M}_{i-1} = [1\;0\;0\;0\;0\;0\;0\;0]^T$ and $P_{i-1} = I_{8\times8}$. After the initialization, the MEKF procedure seeks to determine $\hat{M}_i$ for some future instant of time. The computation of the MEKF proceeds as follows:

Step 1 (Prediction). The estimate $\hat{M}_{i-1}$ is propagated forward by premultiplying it by the discrete system model matrix $\Phi_{i/i-1}$. This gives the predicted estimate $\hat{M}_{i/i-1} = \Phi_{i/i-1}\hat{M}_{i-1}$. Then the measurement model $H_i = A_{M_i}$ is linearized.

Step 2 (Estimation). $\Phi_{i/i-1}\hat{M}_{i-1}$ is premultiplied by $H_i = A_{M_i}$, giving the estimated input measurement $\hat{Z}_i = H_i\Phi_{i/i-1}\hat{M}_{i-1}$, which is then subtracted from the actual measurement $Z_i$ to obtain the measured residual, or error, $e'_i = Z_i - H_i\Phi_{i/i-1}\hat{M}_{i-1}$.

Step 3 (Correction). The error $e'_i$ is premultiplied by the matrix $K_i$, and the result is added to $\hat{M}_{i/i-1}$ to give the current estimate $\hat{M}_i$ (see Eq. 18.94).

Step 4 (Modification). The component $\hat{R}'_i$ of the estimate $\hat{M}_i = (\hat{R}_i^T, \hat{R}'^T_i)^T$ is modified using the geometric constraint according to Eq. 18.104. The final estimate, $\hat{M}_i = (\hat{R}_i^T, \hat{R}'^T_i)^T$, is then stored until the next measurement is made, at which time the cycle is repeated.

Fig. 18.8 Representation of the MEKF algorithm
The MEKF runs recursively, completing these cycles until time instant $N$. It is worth mentioning that Kalman filter implementations are sensitive to numerical instability, but several techniques, such as square-root filtering and the so-called UD factorization [131], are available to overcome such problems.
18.6.5 Estimation of the Relative Positioning of a Robot End-Effector

In this experiment, we applied the MEKF algorithm to estimate the relative motion between the end-joint of a Staubli RX90 robot arm and a 3D line belonging to a rigid object. The 3D line parameters were recovered during the arm movement using a stereo vision system. This approach can be used for several kinds of industrial applications, including maneuvering and grasping. The physical setup of the experiment is shown in Fig. 18.9. The system looks at a pair of lines lying on the floor and moves in 3D space, always keeping those lines in its field of view. The main task is to estimate automatically the relative motion between the floor lines and the system's end-joint. The visual system consists of two gray-scale 640 x 480 CCD cameras fastened to the last joint of the robot arm. The Staubli robot arm has six joints that can be controlled by six variables ($x$, $y$, $z$, roll, pitch, and yaw). The coordinates ($x$, $y$, $z$) describe the position of the tool coordinate system, $T$, of the end-joint with respect to the global coordinate system, $W$, of the robot. The orientation of the end-joint is described by the variables (roll, pitch, yaw) in terms of Euler angles. Since in practice we require three views in order to reconstruct a 3D line, we create a virtual third view by slightly moving the two-camera stereo system, as illustrated in Fig. 18.10. The movement of the robot arm is controlled by the relative position and orientation between the tool coordinate system $T$ and the base system $W$. The camera calibration procedure obtains the projective transformation matrix $P$, which relates the visual space to the image plane up to a scale factor. The coordinate system of the camera $C$ at the end-joint is related to the tool system frame $T$ via a certain transformation $X$, which was
Fig. 18.9 Physical setup of the experiment
Fig. 18.10 Relationship between the tool system T and the camera system C
computed using a hand–eye calibration procedure [112]. When the tool system $T$ is transformed from $T_1$ to $T_2$ by the transformation $\mathbf{T}$, the camera system $C$ is transformed from $C_1$ to $C_2$ by a certain transformation $\mathbf{C}$:

$$\mathbf{C} = \mathbf{X}\mathbf{T}\mathbf{X}^{-1}. \qquad (18.105)$$

Since the motion of the robot arm specified by the transformation $\mathbf{T}$ is known, we can compare it with the relative motion $\mathbf{C}$. The relative motion of the frame $W$ and the frame $C$ is a screw motion with constant angular velocity $\omega = \pi/90$ and constant translation velocity $v_s = 0.2$. The axis line, $L_s$, is parallel to the $z$-axis of the system $C$, and one point on this axis is given by the coordinates $(1.5, 0, 0)$. In the motor algebra $\mathcal{G}^+_{3,0,1}$, the axis line is given by

$$L_s = e_{12} + I(1.5\, e_{23}) \wedge (e_{12}) = e_{12} + I\, 1.5\, e_{31}, \qquad (18.106)$$

and the motor $V$ is calculated as follows:

$$V = (1 + I v_s/2)\left(\cos(\omega/2) + \sin(\omega/2)\, L_s\right) = 0.9994 + 0.0349\, e_{12} + I\left(-0.0035 + 0.0523\, e_{31} + 0.0999\, e_{12}\right). \qquad (18.107)$$

According to Eq. 18.72, the motor $M_i$, expressed in terms of linear algebra, is given by

$$M_i = V_{Ml}\, M_{i-1}, \qquad (18.108)$$

initialized with $M_0 = (1\; 0\; 0\; 0\; 0\; 0\; 0\; 0)^T$. The reconstructed 3D lines listed in Table 18.1 were used to estimate the relative motion between the end-joint and the object on the floor using the MEKF algorithm. The procedure followed in the experiment is summarized below. The algorithm for motion estimation runs online recursively, following steps 3 to 6.
Table 18.1 Reconstructed 3D lines

Time  Line  A point on the line      Direction
0     1     (0.000, 3.087, 2.327)    (0.345, 0.937, 0.027)
0     2     (0.556, 0.000, 2.250)    (0.941, 0.336, 0.023)
1     1     (1.125, 0.000, 2.027)    (0.404, 0.914, 0.013)
1     2     (0.701, 0.000, 2.049)    (0.915, 0.401, 0.029)
2     1     (1.111, 0.000, 1.82)     (0.462, 0.886, 0.017)
2     2     (0.794, 0.000, 1.83)     (0.880, 0.471, 0.055)
...
14    1     (0.018, 0.000, 0.648)    (0.971, 0.236, 0.036)
14    2     (1.103, 0.000, 0.538)    (0.241, 0.965, 0.103)
15    1     (0.680, 0.000, 0.753)    (0.986, 0.159, 0.025)
15    2     (0.000, 6.341, 0.783)    (0.171, 0.985, 0.003)
Fig. 18.11 Stereo triplet of the sample object at time i = 0, with its edge images overlapped by extracted 2D lines

Step 1. Camera calibration to obtain P for each camera.
Step 2. Hand–eye calibration to obtain X for each camera.
Step 3. Robot arm movement; images are taken at a constant sample rate (see Figs. 18.11 and 18.12).
Step 4. Extraction of 2D lines from the images using the Hough transform (see Figs. 18.11 and 18.12).
Step 5. 3D line reconstruction using matched lines of three images (see Table 18.1).
Step 6. Estimation of the motion using 3D line observations and the MEKF algorithm (see Fig. 18.13).
Figure 18.13 presents the eight estimated motion parameters for fifteen instants of time. We can see clearly that after only four observations the MEKF algorithm starts to follow the ground truth of the eight parameters almost perfectly. This real experiment, together with the previous simulation, confirms that the MEKF algorithm is an appropriate tool for the estimation of screw transformations using line observations.
Fig. 18.12 Stereo triplet of a sample object at time i = 4, with its edge images overlapped by extracted 2D lines

Fig. 18.13 Motor parameters ($r_1, \ldots, r_4$ and $r'_1, \ldots, r'_4$) estimated by the MEKF for the visually guided robot system, plotted against the given motion (ground truth) over time
18.7 Conclusion

In this chapter, we modeled the motion of lines in 4D space using the motor algebra. This kind of modeling linearizes the 3D Euclidean rigid motion transformation. The model of the motion of lines using motors is very appealing for the design of an EKF for motion estimation. For the design of the filter, we started with the rotor-extended Kalman filter. As a natural extension of that process, we then laid the theoretical foundations for the MEKF. The MEKF algorithm has the virtue that it can estimate rotation and translation transformations simultaneously. Since most recursive algorithms in the literature compute translation and rotation separately, we can claim that this is one of the most important advantages of the MEKF. Additionally, by modeling the lines in motor algebra we were able to linearize the nonlinear measurement model, thereby avoiding the problem of singularities. The dynamic motion model using motors as states makes it possible to formulate and compute effectively the screw motion of a line as a minimal rigid entity. In the MEKF algorithm, we modified the estimation step so that a certain geometric constraint is satisfied, which makes the estimation converge faster to a proper motor state. Tests with simulated data [19] confirmed that tuning the MEKF parameters substantially improves the MEKF's capabilities. We also presented a real application of visually guided robot manipulation. The system was efficiently calibrated using controlled robot movements and an effective hand–eye calibration method. The recovery of the parameters of the 3D lines was carried out using a stereo vision system together with the Hough transform and shape filtering for line matching. During robot maneuvering, the MEKF algorithm efficiently estimated the relative motion between the end-joint and a 3D object.
These experiments confirmed that the new MEKF algorithm is an attractive online method for estimating screw motions using line observations.
Chapter 19
Tracker Endoscope Calibration and Body-Sensors’ Calibration
In general, when the sensors are mounted on a robot arm, one can use the hand–eye calibration algorithm to calibrate them. In this chapter, we present the calibration of an endoscopic camera with respect to a tracking system and the case of a mobile robot for which one has to calibrate the robot’s sensors with respect to the robot’s global coordinate system.
19.1 Camera Device Calibration

This section presents an algorithm in the conformal geometric algebra framework that computes the transformation (rotation and translation) relating the coordinate system of an endoscopic camera to the coordinate system of the markers placed on it. Such markers are placed in order to track the camera's position in real time using an optical tracking system. The problem is an adaptation of the so-called hand–eye calibration, well known in robotics; however, we call it the tracker endoscope calibration. In this way, we can relate the preoperative data (the 3D model and the planning of the surgery) with the data acquired in real time during the surgery by means of the optical tracking system and the endoscope.
19.1.1 Rigid Body Motion in CGA

In CGA, rotations are represented by rotors, which are defined as

$$R = e^{\frac{1}{2}b\theta} = \cos\frac{\theta}{2} + b\sin\frac{\theta}{2}, \qquad (19.1)$$

where $b$ is the bivector dual to the rotation axis, and $\theta$ is the rotation angle. The rotation of an entity $X$ is carried out by multiplying it from the left by the rotor $R$ and from the right by the reversion of the rotor $\widetilde{R}$: $X' = RX\widetilde{R}$. Translation is carried out by the so-called translator:

$$T = e^{\frac{e_\infty t}{2}} = 1 + \frac{e_\infty t}{2}, \qquad (19.2)$$

where $t \in \langle \mathcal{G}_3 \rangle_1$ is the translation vector. Translators are applied in a similar way to rotors: $X' = TX\widetilde{T}$.

To express rigid body transformations, rotors and translators are applied consecutively. The result is called a motor:

$$M = TR. \qquad (19.3)$$
Such an operator is applied to any entity of any dimension by multiplying the entity by the operator from the left and by the reverse of the operator from the right: $X' = MX\widetilde{M}$. The motor $M$ is a special multivector of even grade. To see its components, let us carry out the multiplication of $T$ and $R$:

$$M = TR = \left(1 + \frac{e_\infty t}{2}\right)\left(\cos\frac{\theta}{2} + b\sin\frac{\theta}{2}\right) = \cos\frac{\theta}{2} + b\sin\frac{\theta}{2} + \frac{1}{2}e_\infty t\left(\cos\frac{\theta}{2} + b\sin\frac{\theta}{2}\right) = R + R'. \qquad (19.4)$$

Since the multiplication of a vector $t \in \langle\mathcal{G}_3\rangle_1$ by a bivector $b \in \langle\mathcal{G}_3\rangle_2$ results in a multivector of the form $\lambda_1 e_1 + \lambda_2 e_2 + \lambda_3 e_3 + \lambda_4 e_{123}$, and since $t\cos\frac{\theta}{2} \in \langle\mathcal{G}_3\rangle_1$, we can rewrite (19.4) as

$$M = \cos\frac{\theta}{2} + b\sin\frac{\theta}{2} + \frac{1}{2}e_\infty\left(t\cos\frac{\theta}{2} + tb\sin\frac{\theta}{2}\right) = \cos\frac{\theta}{2} + b\sin\frac{\theta}{2} + e_\infty\left(t' + \mu e_{123}\right) = \cos\frac{\theta}{2} + b\sin\frac{\theta}{2} + e_\infty t' + \mu e_\infty e_{123}, \qquad (19.5)$$

where $t' \in \langle\mathcal{G}_3\rangle_1$ and $\mu = \frac{1}{2}\lambda_4$. Note that $e_\infty t'$ is a bivector with components $e_\infty e_1$, $e_\infty e_2$, $e_\infty e_3$. If we take only the bivectorial parts of the motor $M$, we obtain

$$\langle M \rangle_2 = \langle R \rangle_2 + \langle R' \rangle_2 = m + m' = \sin\frac{\theta}{2}\, b + e_\infty t'. \qquad (19.6)$$

Therefore, if we express the vector $t'$ in terms of its dual bivector, $t' = t'' I_E$, we can rewrite (19.6) as

$$\langle M \rangle_2 = b' I_E + e_\infty t'' I_E. \qquad (19.7)$$

If we consider the representation of lines,

$$L = (a - b) I_E + e_\infty (a \wedge b) I_E = n I_E + e_\infty m I_E, \qquad (19.8)$$

we observe that the bivectorial part of the motor $M$ is in fact a line, which corresponds to the screw axis about which the rotation and translation of the object are carried out.
19.1.2 Hand–Eye Calibration in CGA

The formulation of the hand–eye calibration problem given in Chap. 18 reads

$$M_A M_X = M_X M_B, \qquad (19.9)$$

where $M_A = A + A'$, $M_B = B + B'$, and $M_X = R + R'$ (Sect. 19.1.1). Chapter 18 shows that the angle and pitch of the gripper (the endoscope in our case) are equal to the angle and pitch of the camera (they remain invariant under coordinate transformations, which is known as the screw congruence theorem); therefore, the problem can be solved using only the lines defined by the motors:

$$L_A = a + a' = M_X L_B \widetilde{M}_X = (R + R')(b + b')(\widetilde{R} + \widetilde{R}') = Rb\widetilde{R} + e_\infty\left(Rb\widetilde{R}' + Rb'\widetilde{R} + R'b\widetilde{R}\right), \qquad (19.10)$$

where $a$, $a'$, $b$, $b'$ are bivectors (as in Eq. 19.6). By separating the real part and the part multiplied by $e_\infty$, we have

$$a = Rb\widetilde{R}, \qquad (19.11)$$

$$a' = Rb\widetilde{R}' + Rb'\widetilde{R} + R'b\widetilde{R}. \qquad (19.12)$$

Multiplying from the right by $R$ and using the relation $\widetilde{R}R' + \widetilde{R}'R = 0$, the following relationships are obtained:

$$aR - Rb = 0, \qquad (19.13)$$

$$(a'R - Rb') + (aR' - R'b) = 0, \qquad (19.14)$$

which can be expressed in matrix form as

$$\begin{pmatrix} \mathbf{a} - \mathbf{b} & [\mathbf{a} + \mathbf{b}] & 0_{3\times1} & 0_{3\times3} \\ \mathbf{a}' - \mathbf{b}' & [\mathbf{a}' + \mathbf{b}'] & \mathbf{a} - \mathbf{b} & [\mathbf{a} + \mathbf{b}] \end{pmatrix}\begin{pmatrix} R \\ R' \end{pmatrix} = 0. \qquad (19.15)$$

We call this $6\times8$ matrix $D$; the unknown vector $(R^T, R'^T)^T$ is 8-dimensional. The notation $[\mathbf{u}]$ represents the skew-symmetric matrix formed from the vector $\mathbf{u}$, which is given by

$$[\mathbf{u}] = \begin{pmatrix} 0 & -u_3 & u_2 \\ u_3 & 0 & -u_1 \\ -u_2 & u_1 & 0 \end{pmatrix}. \qquad (19.16)$$

The matrix $D$ is composed only of bivectors (blades of any other grade are not included); therefore, we can use the SVD method to find $(R^T, R'^T)^T$ as the kernel of $D$.
We call this 6 8 matrix D; the unknown vector ŒR; R 0 T is 8-dimensional. The notation Œu represents the skew-symmetric matrix formed with the vector u, which is given by 2 3 0 u3 u2 uO D 4 u3 (19.16) 0 u1 5 : u2 u1 0 The matrix D is composed only of bivectors (blades of any other grade are not included); therefore, we can use the SVD method to find ŒR R 0 T as the kernel of D. Considering that we have n 2 movements, the following matrix is built: C D D1T
D2T
D3T
D4T
:::
T
(19.17)
in order to apply the SVD method and to find the solution for ŒR; R 0 T . Since the range of the matrix C is at most 6, the last right two singular vectors, v7 and v8 , correspond to the two singular values whose value is zero or near zero, and such vectors expand the null space of C . Therefore, as ŒR; R 0 T is a null vector of C , we can express it as a linear combination of v7 and v8 . If we express these vectors in terms of two vectors of 4D, v7 D .u1 ; v1 /T and v8 D .u2 ; v2 /T , this linear combination can be expressed as
R R0
D˛
u1 u2 C ˇ : v01 v02
(19.18)
Taking into account the constraints eD1 RR
and
f0 R D 0; e 0CR RR
(19.19)
we obtain the following quadratic equations in terms of ˛ and ˇ ˛ 2 uT1 u1 C 2˛ˇuT1 u2 C ˇ 2 uT2 u2 D 1; ˛ 2 uT1 v1 C ˛ˇ uT1 v2 C uT2 v1 C ˇ 2 uT2 v2 D 0:
(19.20) (19.21)
In order to solve these equations, we make the change of variable μ = α/β in (19.21) and obtain two solutions for μ. Going back to (19.20) and substituting α = μβ, we obtain

β² (μ² u1ᵀu1 + 2μ u1ᵀu2 + u2ᵀu2) = 1, (19.22)

which yields two solutions for β.
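The null-space step above can be checked numerically. The following is a minimal illustrative sketch (not the book's implementation); `solve_alpha_beta` is a hypothetical helper, with `u1, v1, u2, v2` the 4D halves of the singular vectors v7 and v8:

```python
import numpy as np

def solve_alpha_beta(u1, v1, u2, v2):
    """Recover alpha, beta from v7 = (u1, v1), v8 = (u2, v2) using the
    constraints (19.20)-(19.21). Hypothetical helper for illustration."""
    # (19.21) in mu = alpha/beta: mu^2 (u1.v1) + mu (u1.v2 + u2.v1) + u2.v2 = 0
    mus = np.roots([u1 @ v1, u1 @ v2 + u2 @ v1, u2 @ v2])
    # keep the root maximizing mu^2 u1.u1 + 2 mu u1.u2 + u2.u2 (cf. step 5 below)
    best = max((float(np.real(m)) for m in mus),
               key=lambda m: m**2 * (u1 @ u1) + 2*m*(u1 @ u2) + (u2 @ u2))
    s = best**2 * (u1 @ u1) + 2*best*(u1 @ u2) + (u2 @ u2)
    beta = 1.0 / np.sqrt(s)      # from (19.22)
    alpha = best * beta
    return alpha, beta
```

The returned pair satisfies both quadratic constraints, so α v7 + β v8 is a unit-rotor candidate for (R; R′).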
19.1.3 Tracker Endoscope Calibration

A surgery scenario is shown in Fig. 19.1, where the reader can see that there is a (rigid) transformation M_Bg between the calibration grid and the Polaris system.
19.1 Camera Device Calibration
Fig. 19.1 The problem of calibration between the endoscopic camera and the optical tracking system (Polaris)
Such a transformation will be used to validate the results of the endoscope-tracking system calibration method. The transformations involved in the problem are expressed as motors of the CGA, M = TR. They and the coordinate systems involved are as follows:

– Op: Reference frame of the Polaris system.
– Ome: Reference frame of the spherical markers.
– Oce: Reference frame of the endoscopic camera.
– Ogc: Reference frame of the calibration grid.
– M_Ai: Transformation from Ogc to Oce in the ith movement of the camera. This transformation is given by the extrinsic parameters of the camera [197].
– M_Bi: Transformation from Ome to Op. It is obtained by the optical tracking system.
– M_X: Transformation from Oce to Ome. It is obtained with the endoscope-tracking calibration algorithm.
– M_Bg: Transformation from Ogc to Op. It is obtained directly from the Polaris system by placing spherical markers on the plane where the calibration grid lies.
The procedure is summarized as follows: 1. Given n movements of the endoscopic camera (we move it freely by hand to arbitrary positions), M Bi , and their corresponding movements, M Ai , verify if their scalar parts are equal (screw congruence theorem with motors). 2. For the movements that fulfill the previous requirement, extract the directions and moments of the lines LAi and LBi defined by the motors. Build the matrix C as in (19.17).
3. Apply SVD to matrix C. Take the right singular vectors v7 and v8 corresponding to the two singular values nearest to zero (a threshold is applied because of the noise).
4. Compute the coefficients for (19.21) and find the two solutions of μ.
5. For both values of μ, compute the value of μ² u1ᵀu1 + 2μ u1ᵀu2 + u2ᵀu2 and choose the one that gives the bigger value. Then compute α and β. The final solution is α v7 + β v8.

To validate the accuracy of the estimated transformation M_X, we use the calibration grid used to calibrate the endoscopic camera by Zhang's method [197]. Let X_g be the set of points corresponding to the corners of the calibration grid, referred to the coordinate system Ogc. These coordinates are expressed in millimeters, according to the size of each square in the calibration grid, whose sides are 1.25 mm in length in our case. These points are projected to the camera along two paths: the first applies M_Ai and then the matrix of intrinsic parameters K, resulting in points x_Ai; the second applies the transformations M_Bg, M̃_Bi, and M̃_X, resulting in points x_Bi. A linear correction x′_Bi = x_Bi + (c_Ai − c_Bi) is applied to these points, where c is the centroid of each point set. The transformation M_X was obtained with the method described above.

1. Taking the points X_g in the grid reference frame, apply the transformation M_Ai to express them in the camera's reference frame. Let X_Ai be the resulting points.
2. Project the points X_Ai to the image plane using

x_Ai = K [R_MA | t_MA] X_Ai. (19.23)

These points should be projected on the corners of the squares in the calibration grid on the image.

3. Taking the points X_g, apply the transformations M_Bg, M̃_Bi, and M̃_X. Let X_MBiX be the resulting points.
4. Project the points X_MBiX onto the image plane using

x_Bi = K [R_MBiX | t_MBiX] X_MBiX. (19.24)

In the ideal case (without noise), the projected points x_Ai should match the projected points x_Bi. However, as a result of noise in the Polaris readings or in the estimation of the transformations, a small linear displacement between x_Ai and x_Bi is possible (see Fig. 19.2a). We can measure the error between the two projections as

ε = (Σ_{i=1}^{n} (x_Ai − x_Bi)) / n. (19.25)

5. In order to correct the displacement, the centroid of each point set is calculated: c_Ai and c_Bi. Then the points x_Bi are displaced in such a way that the centroids match:

x′_Bi = x_Bi + (c_Ai − c_Bi). (19.26)
Fig. 19.2 Result of the projection and linear correction
After the displacement, the average error is calculated as

ε′ = (Σ_{i=1}^{n} (x_Ai − x′_Bi)) / n. (19.27)
Figure 19.2b shows the result after applying the correction to the points shown in Fig. 19.2a. The experiments were carried out using an endoscope from Karl Storz Endoscopy, which consists of a TeleCam SL II camera and an endoscope with a view angle of zero degrees. The movements were done by hand to arbitrary positions in space. We tested the algorithm by computing the transformation M_X using a different number of motions each time. To measure the error between the projections, we take the Euclidean distance between each point x_Ai and its corresponding point x′_Bi: Error = |x_Ai − x′_Bi|. This error is a distance measured in pixels. In each motion, the image of the grid contains approximately 154 points.
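The validation loop (project along both paths, shift by the centroid difference, re-measure) can be sketched as follows. This is an illustrative helper, not the book's code; the error here is taken as a mean Euclidean distance in pixels, and the two point arrays are assumed to be in corresponding order:

```python
import numpy as np

def centroid_corrected_error(xA, xB):
    """xA, xB: (n, 2) arrays of points projected along the two paths.
    Returns the mean pixel error before and after the centroid
    shift of (19.26)."""
    err = np.linalg.norm(xA - xB, axis=1).mean()            # cf. (19.25)
    xB_corr = xB + (xA.mean(axis=0) - xB.mean(axis=0))      # (19.26)
    err_corr = np.linalg.norm(xA - xB_corr, axis=1).mean()  # cf. (19.27)
    return err, err_corr
```

If the residual between the two projections were a pure translation, the corrected error would drop to zero; what remains after the shift measures the non-translational part of the calibration error.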
19.2 Body-Sensor Calibration

The body–eye calibration problem is closely related to hand–eye calibration, where in the latter we proceed as follows: given rigid displacements between the movements of a robot arm and the movements measured by a camera mounted on the arm, the unknown rigid transformation between the coordinate system of the arm and that of the camera has to be computed. As shown in Chap. 18, this unknown transformation remains fixed through all the movements. Homogeneous transformation matrices are the usual way to formulate the hand–eye calibration problem. We denote by X the transformation from the coordinate system of the camera to the gripper, by B_i the transformation matrix from the robot base to the gripper, and by A_i the transformation matrix that describes the rigid motions of the camera from one position to another. The well-known equation

A X = X B (19.28)
was solved in a linear manner using dual quaternions or motors in [11, 47], where A, B, and X are homogeneous transformation matrices of the form

[ R   t ]
[ 0ᵀ  1 ].  (19.29)

For the case where the sensors are mounted on a mobile robot, one extends the hand–eye calibration method in a certain way to multiple sensors. Thus, from now on we will call this kind of problem body–eye calibration.
19.2.1 Body–Eye Calibration

In the body–eye calibration problem, there are not as many degrees of freedom as in the case of a robotic arm; furthermore, the rotation axes of the mobile robot are always parallel, which helps to simplify the problem. To solve the problem we use motors of the conformal geometric algebra; this representation allows us to solve the problem straightforwardly and linearly. An abstraction of the setup is given in Fig. 19.3. Next we show the process to find the unknown transformation. First, we rewrite (19.28) for the body–eye problem in terms of motors:

M_A M = M M_B, (19.30)

or

M_A = M M_B M̃. (19.31)

Fig. 19.3 Robot movements in order to calibrate a sensor mounted on it
This equation can be rewritten in terms of rotors and translators as follows:

M = T R = (1 + ½ e∞ t) R = R + ½ e∞ t R = R + e∞ r, (19.32)

where ½ t R is equal to

½ t R = ½ (α1 e1 + α2 e2)(cos(θ/2) − sin(θ/2) e12)
      = ½ (α1 e1 + α2 e2) cos(θ/2) − ½ (α1 e1 + α2 e2) e12 sin(θ/2)
      = ½ ((α1 e1 + α2 e2) cos(θ/2) − (α1 e2 − α2 e1) sin(θ/2))
      = ½ ((α1 cos(θ/2) + α2 sin(θ/2)) e1 + (α2 cos(θ/2) − α1 sin(θ/2)) e2).
Note that the product ½ t R is a vector, which is represented by r (r = ½ t R). The motor can also be written as

M = R + e∞ r = cos(θ/2) − sin(θ/2) e12 + e∞ r = cos(θ/2) + L, (19.33)

where the line L = −sin(θ/2) e12 + e∞ r represents the screw axis of the motor (Fig. 19.4). The inverse of the motor can be computed with

M̃ = (T R)~ = (R + e∞ r)~ = R̃ + (e∞ r)~ = R̃ − e∞ r. (19.34)

The motor M and its inverse M̃ satisfy the following identity,

M M̃ = M̃ M = 1, (19.35)

which can be rewritten using (19.33) and (19.34) as

M M̃ = (R + e∞ r)(R̃ − e∞ r)
     = R R̃ − R e∞ r + e∞ r R̃
     = R R̃ + e∞ (r R̃ − R r). (19.36)

Since we know that R R̃ = 1, we get the following identity:

r R̃ − R r = R̃ r − r R = 0. (19.37)
Fig. 19.4 Motor in space and its corresponding axis line
Now we can rewrite (19.31) as

R_A + e∞ r_A = (R + e∞ r)(R_B + e∞ r_B)(R̃ − e∞ r)
             = R R_B R̃ + R e∞ r_B R̃ + e∞ r R_B R̃ − R R_B e∞ r, (19.38)

from which we get

R_A = R R_B R̃, (19.39)

which is equal to

cos(θ_A/2) + sin(θ_A/2) n_A = R (cos(θ_B/2) + sin(θ_B/2) n_B) R̃
 = cos(θ_B/2) R R̃ + sin(θ_B/2) R n_B R̃
 = cos(θ_B/2) + sin(θ_B/2) R n_B R̃
 = cos(θ_B/2) + sin(θ_B/2) n_B. (19.40)

Equating the scalar and bivector parts of both sides of the equation, we get the following equations:

cos(θ_A/2) = cos(θ_B/2),
sin(θ_A/2) n_A = sin(θ_B/2) n_B. (19.41)
19.2.2 Algorithm Simplification

Taking this fact into account, we can be sure that only the bivector parts of M_A and M_B will contribute to the computation of the unknown M_X. Using (19.33), we can define the body–eye calibration problem as

L_A = M L_B M̃. (19.42)

Therefore, the angle and pitch of the motors M_A and M_B are always equal through all the movements. Thus, it suffices to regard the rotation axes (L_A, L_B) of the involved motors (M_A, M_B). This is also known as the screw congruence theorem [36]. However, thanks to the use of conformal geometric algebra, the proof of this theorem reduces to the single step shown in (19.40). Since the body–eye problem is a matter of motion of lines, we define two of them as

L_A = n_A + e∞ m_A,
L_B = n_B + e∞ m_B. (19.43)
Substituting the above equations in (19.42), we get

n_A + e∞ m_A = (R + e∞ r)(n_B + e∞ m_B)(R̃ − e∞ r)
             = R n_B R̃ + R e∞ m_B R̃ + e∞ r n_B R̃ − R n_B e∞ r.

Equating both sides of the equation, we get

n_A = R n_B R̃, (19.44)
m_A = R m_B R̃ + r n_B R̃ − R n_B r. (19.45)

Multiplying on the right by R gives

n_A R = R n_B, (19.46)
m_A R = R m_B + r n_B − R n_B r R. (19.47)

Using (19.37) and (19.46), we get

m_A R = R m_B + r n_B − n_A r, (19.48)

and thus we can write

n_A R − R n_B = 0, (19.49)
(m_A R − R m_B) + (n_A r − r n_B) = 0. (19.50)

Equation 19.49 can be written as

n_A R − n_B R = (n_A − n_B) R = 0. (19.51)
Since in this case the rotation axes are parallel, n_B and n_A are equal; thus, (19.49) vanishes identically. Therefore, only (19.50) is useful when solving the problem. Equation 19.50 can be written as a matrix–vector equation,

[ α2 − α1   −(β1 + β2)   0   −2 ]
[ β2 − β1    α2 + α1     2    0 ] (R; r) = 0. (19.52)

This will be written as

A x = 0. (19.53)
In (19.52), the scalars α1, β1 and α2, β2 represent the components of the moments m_B and m_A, respectively; that is,

m_B = α1 e1 + β1 e2, (19.54)
m_A = α2 e1 + β2 e2. (19.55)
Note that the matrix contributes two rows per motion; thus, we need at least two motions to compute the four unknowns. Stacking couples of rows for more than two motions, we get an overdetermined equation system that can be solved using SVD. That is, we compute the SVD A = U S Vᵀ, and then we choose the right singular vector corresponding to the smallest singular value as the solution. Let v = α1 e1 + α2 e2 + α3 e3 + α4 e4 be such a vector; then we define a motor as

M′ = α1 − α2 e12 + e∞ (α3 e1 + α4 e2). (19.56)

Normalizing this motor, we get

M = M′ / √(M′ M̃′), (19.57)

which is the solution motor. The calibration algorithm is defined as follows:

1. Carry out several movements with the robot (Fig. 19.3).
2. For each movement, build the motor M_A = T_A R_A representing the robot movement. The parameters of the motor are estimated using odometry.
3. For each motor M_A, build a corresponding motor M_B = T_B R_B representing the sensor motion.
4. Extract the line representing the axis of the motor (L_A, L_B) for each pair of motors M_A and M_B; this can be done with L_A = ⟨M_A⟩₂ and L_B = ⟨M_B⟩₂ (Fig. 19.4).
5. Using the moments of the lines L_A and L_B, m_B = α1 e1 + β1 e2 and m_A = α2 e1 + β2 e2, add two rows to the matrix A.
6. For each further movement of the robot, go to step 2.
7. Compute the SVD of the matrix A; then choose the right singular vector corresponding to the smallest singular value as the solution. Let v = α1 e1 + α2 e2 + α3 e3 + α4 e4 be such a vector. Then we define the motor M′ = α1 − α2 e12 + e∞(α3 e1 + α4 e2).
8. Finally, the solution is found by normalizing the motor M′, that is, M = M′ / √(M′ M̃′).
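Steps 7–8 reduce to a null-space computation followed by a rotor-norm normalization. A minimal numerical sketch (a hypothetical helper, assuming the stacked matrix A has already been built from the line moments as in (19.52)):

```python
import numpy as np

def solve_motor(A):
    """A: stacked (2n, 4) matrix built row-pair by row-pair from the
    line moments, cf. (19.52). Returns v = (a1, a2, a3, a4) normalized
    so that the motor M = a1 - a2 e12 + e_inf(a3 e1 + a4 e2) satisfies
    M M~ = 1; since e_inf squares to zero and by (19.37), the scalar
    part of M M~ is a1^2 + a2^2."""
    _, _, Vt = np.linalg.svd(A)
    v = Vt[-1]          # right singular vector of the smallest singular value
    return v / np.sqrt(v[0]**2 + v[1]**2)
```

The division implements (19.57): only the rotor part of the motor contributes to the scalar of M M̃, so dividing by √(a1² + a2²) normalizes the whole motor.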
19.3 Conclusions

Currently, the use of endoscopes in surgery is common. It provides the surgeon with very useful information, but to achieve this, camera calibration is needed. Additionally, if it is necessary to display the endoscopic image in a special frame, relating it to a virtual model of the patient, then a second calibration procedure is essential. Such a procedure finds the transformation between the camera and the tracking system used. That is the problem we deal with in the first part of the chapter, which is addressed as a hand–eye calibration problem in the geometric algebra framework.

In general, when the sensors are mounted on a robot arm, one can use the hand–eye calibration algorithm to calibrate them. However, for the case of a mobile robot, one has to calibrate the sensors with respect to the robot's global coordinate system, so that the robot is able to fuse the data of different sensors. The second part of the chapter presents a calibration algorithm for the standard sensors mounted on a mobile robot. Once the sensors are calibrated with respect to the robot platform, the robot can autonomously build 3D maps, navigate, and relocalize itself. These tasks are presented in Chap. 21.
Chapter 20
Tracking, Grasping, and Object Manipulation
In this chapter, we utilize conformal geometric algebra to develop concepts and computer algorithms in the domain of robot vision. We present an interesting application of fuzzy logic and conformal geometric algebra for grasping using the Barrett hand, together with real-time algorithms for a real scenario of perception, approach, and action that handle a number of real grasping and manipulation tasks.
20.1 Tracking

First, we show an example using our formulation of the Jacobian: the control of a binocular robot head based on a pan-tilt unit (Fig. 20.1). The head has just two axes of movement, but the position of the focus of attention has three degrees of freedom, related to the joint angles by Eq. 12.1 of direct kinematics. Now the axis lines are needed; they can be computed from the extrinsic parameters of the camera calibration. In this example, the axis lines have the following equations:

L1 = e31, (20.1)
L2 = e12 + d1 e1 ∧ e∞, (20.2)
L3 = e1 ∧ e∞, (20.3)

where d1 is the height of the tilt axis above the coordinate system. Note that L1 crosses the origin and its direction is e2. Also, the axis L3 is a line at infinity that represents the translation of the focus of attention in front of the cameras. Since M_i = e^{−(1/2) q_i L_i} and M̃_i = e^{(1/2) q_i L_i}, the position of the end-effector is computed as

x′_p = M1 M2 M3 x_p M̃3 M̃2 M̃1, (20.4)
L′3 = M1 M2 L3 M̃2 M̃1, (20.5)
L′2 = M1 L2 M̃1, (20.6)
L′1 = L1. (20.7)
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9_20, © Springer-Verlag London Limited 2010
Fig. 20.1 Pan-tilt unit
Now, using the equation of differential kinematics (12.15), the state-variable representation of the system is written as follows:

ẋ′_p = [x′_p · L′1   x′_p · L′2   x′_p · L′3] (u1; u2; u3)ᵀ,
y = x′_p, (20.8)

where the position of the focus of attention x_p at t = 0 is given by the conformal mapping of x_pe = d3 e1 + (d1 + d2) e2, the line L′_i is the current position of L_i, and u_i is the velocity of joint i of the system.
20.1.1 Exact Linearization via Feedback

Now the following state feedback control law is chosen in order to obtain a new linear and controllable system. Figure 20.2 shows the plant, and Fig. 20.3 shows the new system, adding a block that linearizes the system as follows:

(u1; u2; u3)ᵀ = [x′_p · L′1   x′_p · L′2   x′_p · L′3]⁻¹ (v1; v2; v3)ᵀ, (20.9)

where V = (v1, v2, v3)ᵀ is the new input to the linear system; then we rewrite the equations of the system as follows:
Fig. 20.2 Block diagram of the Pan-tilt unit
Fig. 20.3 Block diagram of the new system Fig. 20.4 Block diagram of closed-loop control
ẋ′_p = V,
y = x′_p. (20.10)
The problem of following a constant reference x_t is solved by computing the error between the end-effector position x′_p and the target position x_t as e_r = (x′_p ∧ x_t) · e∞. The control law is then given as

V = k e_r, (20.11)

where k is the control gain. Figure 20.4 shows the new system with closed-loop control. This error is small if the control system is accomplishing its task. It is mapped to an error in the joint space using the inverse Jacobian:

U = J⁻¹ V. (20.12)
Computing the Jacobian J = [x′_p · L′1   x′_p · L′2   x′_p · L′3], we have

j1 = x′_p · L1, (20.13)
j2 = x′_p · (M1 L2 M̃1), (20.14)
j3 = x′_p · (M1 M2 L3 M̃2 M̃1). (20.15)

Once we have the Jacobian, it is easy to compute the joint velocities using Cramer's rule:

(u1; u2; u3)ᵀ = (j1 ∧ j2 ∧ j3)⁻¹ (V ∧ j2 ∧ j3; j1 ∧ V ∧ j3; j1 ∧ j2 ∧ V)ᵀ, (20.16)
(u1; u2; u3)ᵀ = ( ((x′_p ∧ x_t) · e∞) ∧ (x′_p · L′2) ∧ (x′_p · L′3) ;
                 (x′_p · L′1) ∧ ((x′_p ∧ x_t) · e∞) ∧ (x′_p · L′3) ;
                 (x′_p · L′1) ∧ (x′_p · L′2) ∧ ((x′_p ∧ x_t) · e∞) )ᵀ
               / ((x′_p · L′1) ∧ (x′_p · L′2) ∧ (x′_p · L′3)). (20.17)

This is possible because j1 ∧ j2 ∧ j3 = det(J) I_e. Finally, we have the joint updates dq_i, which tend to reduce these errors.
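The wedge-product ratios of (20.16)–(20.17) are exactly determinant ratios, so the same solve can be sketched in matrix form (an illustrative helper, not the book's implementation):

```python
import numpy as np

def joint_velocities(J, V):
    """Solve J u = V for the three joint rates via Cramer's rule:
    u_i = det(J with column i replaced by V) / det(J),
    the matrix analogue of the wedge-product ratios in (20.16)."""
    detJ = np.linalg.det(J)
    u = np.empty(3)
    for i in range(3):
        Ji = J.copy()
        Ji[:, i] = V          # replace column i by the input V
        u[i] = np.linalg.det(Ji) / detJ
    return u
```

For a well-conditioned 3 × 3 Jacobian this agrees with a direct linear solve; Cramer's rule is used here only because it mirrors the wedge-product formulation.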
20.1.2 Visual Jacobian

A point in the image is given by s = (x, y)ᵀ, whereas a 3D point is represented as X. The relationship between ṡ and Ẋ is called the visual Jacobian. Considering a camera in a general position, its projection matrix is represented by the planes π1, π2, and π3 (see more details in [81]):

P = (π1; π2; π3). (20.18)

The point X is projected in the image onto the point

s = ( (π1 · X) / (π3 · X) ; (π2 · X) / (π3 · X) ). (20.19)

To simplify the explanation, the variable x is introduced, together with its time derivative ẋ:

x = (π1 · X; π2 · X; π3 · X),   ẋ = (π1 · Ẋ; π2 · Ẋ; π3 · Ẋ). (20.20)

Now s1 is given by s1 = x1/x3, and its derivative reads

ṡ1 = ẋ1 (1/x3) − x1 (ẋ3/x3²), (20.21)
ṡ1 = (x3 ẋ1 − x1 ẋ3) / x3². (20.22)
By substituting x and ẋ in Eq. 20.22, one obtains

ṡ1 = λ [(π3 · X) π1 − (π1 · X) π3] · Ẋ, (20.23)
ṡ1 = λ [X · (π3 ∧ π1)] · Ẋ, (20.24)

where λ = 1/x3². Carrying out the same steps for s2, it is possible to write the equation

ṡ = λ X · (π1 ∧ π3; π2 ∧ π3) Ẋ. (20.25)

Geometrically, π1 ∧ π3 represents the line of intersection of the planes π1 and π3. Denoting the intersection lines by Lx and Ly,

Lx = π1 ∧ π3, (20.26)
Ly = π2 ∧ π3. (20.27)
It is possible to rewrite Eq. 20.25 as

ṡ = λ (X · Lx; X · Ly) Ẋ. (20.28)

In order to close the loop between perception and action, the relationship between the velocities of points in the image and the velocities of the joints of the pan-tilt unit is computed (see [176]). Combining the equation of differential kinematics (20.8) with the visual Jacobian (20.28), it is possible to write the new expression

ṡ = λ [ (X′ · L′x) · (X′ · L′1)   (X′ · L′x) · (X′ · L′2) ;
        (X′ · L′y) · (X′ · L′1)   (X′ · L′y) · (X′ · L′2) ] q̇. (20.29)

We can write a similar expression using the differential kinematics of the Barrett hand. Equation 20.29 is very useful for designing a control law to track an object or to grasp it.
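The quotient-rule derivation (20.21)–(20.23) can be verified numerically. In matrix form, with π1, π2, π3 the rows of the 3 × 4 projection matrix, the image velocity of a moving homogeneous point X is given by the 2 × 4 matrix below (an illustrative sketch, not the book's code):

```python
import numpy as np

def visual_jacobian_rows(P, X):
    """P: 3x4 projection matrix with row planes pi1, pi2, pi3;
    X: homogeneous 3D point (4,). Returns the 2x4 matrix mapping the
    homogeneous point velocity Xdot to the image velocity sdot,
    following (20.23): sdot_1 = lam [(pi3.X) pi1 - (pi1.X) pi3] . Xdot."""
    pi1, pi2, pi3 = P
    lam = 1.0 / (pi3 @ X) ** 2
    Jx = lam * ((pi3 @ X) * pi1 - (pi1 @ X) * pi3)
    Jy = lam * ((pi3 @ X) * pi2 - (pi2 @ X) * pi3)
    return np.vstack([Jx, Jy])
```

A finite-difference check of the projection s = (π1·X/π3·X, π2·X/π3·X) against this matrix confirms the quotient-rule expansion.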
20.1.3 Exact Linearization via Feedback

Now the following state feedback control law is chosen in order to obtain a new linear and controllable system:

(u1; u2)ᵀ = [ (X′ · L′x) · (X′ · L′1)   (X′ · L′x) · (X′ · L′2) ;
              (X′ · L′y) · (X′ · L′1)   (X′ · L′y) · (X′ · L′2) ]⁻¹ (v1; v2)ᵀ, (20.30)

where V = (v1, v2)ᵀ is the new input to the linear system; then we rewrite the equations of the system:

ṡ′_p = V,
y = s′_p. (20.31)
20.1.4 Experimental Results

In this experiment, the binocular head should smoothly track a target. Figure 20.5 shows the 3D coordinates of the focus of attention, and Fig. 20.6 shows examples of the image sequence. The curves of the 3D object trajectory are very rough; however, the control rule manages to keep the trajectory of the pan-tilt unit smooth. In this experiment, the coordinate system is at the center of the camera. The principal planes of the camera are then given by

π1 = fx e1 + xo e3, (20.32)
π2 = fy e2 + yo e3, (20.33)
π3 = e3, (20.34)
Fig. 20.5 x-, y-, and z-coordinates of the focus of attention
Fig. 20.6 Sequence of ball tracking
Fig. 20.7 Sequence of circuit board tracking
where fx, fy, xo, and yo are the camera parameters. Using these planes, we compute the lines Lx and Ly from the known axes of the pan-tilt unit:

L1 = e23 + d1 e2 ∧ e∞, (20.35)
L2 = e12 + d2 e2 ∧ e∞. (20.36)
Note that the tilt axis is called L1 and the pan axis L2, because the coordinate system is related to the camera. Also, L′2 is a function of the tilt angle, L′2 = M1 L2 M̃1, with M1 = cos(tilt) + sin(tilt) L1. In this experiment, a point on the circuit board was selected and tracked using the Lucas–Kanade tracking (LKT) algorithm. Its displacement in the image was transformed into velocities of the pan-tilt joints using the visual–mechanical Jacobian of Eq. 20.29. As a result, we can see in Fig. 20.7 a sequence of pictures captured by the robot. In these images, the position of the board does not change while the background changes continuously.
20.2 Barrett Hand Direct Kinematics

Direct kinematics involves the computation of the position and orientation of the end-effector given the parameters of the joints. It can be computed easily given the lines of the screw axes. In order to explain the kinematics of the Barrett hand, we show the kinematics of one finger. In this example, we assume that the finger is totally extended. Note that such a hypothetical position is not reachable in normal operation, but it simplifies the explanation. We start by denoting some points on the finger that help to describe the finger position:
x1o = Aw e1 + A1 e2 + Dw e3, (20.37)
x2o = Aw e1 + (A1 + A2) e2 + Dw e3, (20.38)
x3o = Aw e1 + (A1 + A2 + A3) e2 + Dw e3. (20.39)

The points x1o, x2o, and x3o describe the position of each joint and the end of the finger in Euclidean space (see Fig. 20.8). Having defined these points, it is quite simple to calculate the axes that will be used as the motors' axes:

L1o = Aw (e2 ∧ e∞) + e12, (20.40)
L2o = (x^c_1o ∧ e1 ∧ e∞) I_c, (20.41)
L3o = (x^c_2o ∧ e1 ∧ e∞) I_c. (20.42)

Fig. 20.8 Barrett hand in hypothetical position
When the hand is initialized, the fingers move to the home position, which is Φ2 = 2.46° in joint two and Φ3 = 50° in joint three. In order to move the finger from the hypothetical position to its home position, the appropriate transformations need to be computed:

M2o = cos(Φ2/2) − sin(Φ2/2) L2o, (20.43)
M3o = cos(Φ3/2) − sin(Φ3/2) L3o. (20.44)

Having obtained these transformations, we then apply them to those points and lines that have to be moved:

x2 = M2o x2o M̃2o, (20.45)
x3 = M2o M3o x3o M̃3o M̃2o, (20.46)
L3 = M2o L3o M̃2o. (20.47)

The point x1 = x1o is not affected by the transformation, nor are the lines L1 = L1o and L2 = L2o (see Fig. 20.9). Since the rotation angles of the two axes L2 and L3 are related, we use fractions of the angle q1 to specify these rotation angles. The motors of each joint are computed using (2/35) q4 to rotate around L1, (1/125) q1 around L2, and (1/375) q1 around L3. The angle coefficients were taken from the Barrett hand user manual:

M1 = cos(q4/35) + sin(q4/35) L1, (20.48)
M2 = cos(q1/250) − sin(q1/250) L2, (20.49)
M3 = cos(q1/750) − sin(q1/750) L3. (20.50)
Fig. 20.9 Barrett hand at home position
The position of each point is related to the angles q1 and q4 as follows:

x′1 = M1 x1 M̃1, (20.51)
x′2 = M1 M2 x2 M̃2 M̃1, (20.52)
x′3 = M1 M2 M3 x3 M̃3 M̃2 M̃1, (20.53)
L′3 = M1 M2 L3 M̃2 M̃1, (20.54)
L′2 = M1 L2 M̃1. (20.55)

Since we already know x′3, L′1, L′2, and L′3, we can calculate the speed of the end of the finger using Eq. 12.15 as follows:

Ẋ′3 = X′3 · ( (2/35) L′1 q̇4 + (1/125) L′2 q̇1 + (1/375) L′3 q̇1 ). (20.56)
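The same joint chain can be evaluated with ordinary homogeneous transforms in place of motors. The sketch below is an illustration under the assumption that each joint axis is supplied as a point and a direction (names like `axis_rotation` and `finger_tip` are hypothetical); the angle ratios 2q4/35, q1/125, q1/375 follow (20.48)–(20.50):

```python
import numpy as np

def axis_rotation(p, d, theta):
    """4x4 homogeneous rotation by theta about the line through point p
    with unit direction d (Rodrigues formula)."""
    d = d / np.linalg.norm(d)
    K = np.array([[0, -d[2], d[1]],
                  [d[2], 0, -d[0]],
                  [-d[1], d[0], 0]])
    R = np.eye(3) + np.sin(theta) * K + (1 - np.cos(theta)) * (K @ K)
    T = np.eye(4)
    T[:3, :3] = R
    T[:3, 3] = p - R @ p          # shift so the axis stays fixed
    return T

def finger_tip(x3, joints, q1, q4):
    """x3: fingertip point; joints: three (point, direction) axis pairs
    after the home transformation. Chains the three joint rotations with
    the Barrett angle ratios of (20.48)-(20.50)."""
    thetas = [2 * q4 / 35, q1 / 125, q1 / 375]
    M = np.eye(4)
    for (p, d), th in zip(joints, thetas):
        M = M @ axis_rotation(p, d, th)
    return (M @ np.append(x3, 1.0))[:3]
```

This mirrors the sandwich products of (20.51)–(20.53): each motor becomes one homogeneous transform, composed left to right from the base joint outward.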
20.3 Pose Estimation

The pose estimation problem consists of determining the 3D position and orientation of an object. To estimate these parameters, the object in the image is segmented, and the mathematical model of the object is projected onto the image to be compared with the segmented one. The difference between them is used to change the position and orientation of the model in order to reduce this difference. Figure 20.10 shows the closed-loop process (segmentation, comparison, and control) used to estimate the pose of the object. Each step of the process is now explained.
Fig. 20.10 Steps of the pose estimation algorithm
20.3.1 Segmentation

Figure 20.11 shows the diagram of the evolution of the segmentation. Note that the object changes its color in the image; this is the result of the algorithm, and it means that the center and boundaries of the object are known. The user chooses the object by double-clicking on a pixel of the object in the image. This pixel is not necessarily at the center of the object. Suppose that the pixel has coordinates (x, y). The RGB color of this pixel is mapped to hue-saturation-intensity (HSI) and memorized as C_{x,y}. The neighborhood of this pixel is explored, looking for similar pixels along the directions shown in Fig. 20.11. These correspond to the upper side, i ∈ {x−k, …, x+k}:

C_{i,y−k−1} ≈ C_{x,y} ∧ ((C_{i,y−k} = Red) ∨ (C_{i−1,y−k−1} = Red)) ⇒ C_{i,y−k−1} = Red; (20.57)

the right side, j ∈ {y−k−1, …, y+k}:

C_{x+k+1,j} ≈ C_{x,y} ∧ ((C_{x+k,j} = Red) ∨ (C_{x+k+1,j−1} = Red)) ⇒ C_{x+k+1,j} = Red; (20.58)

the lower side, i ∈ {x+k+1, …, x−k}:

C_{i,y+k+1} ≈ C_{x,y} ∧ ((C_{i,y+k} = Red) ∨ (C_{i+1,y+k+1} = Red)) ⇒ C_{i,y+k+1} = Red; (20.59)

and the left side, j ∈ {y+k+1, …, y−k−1}:

C_{x−k−1,j} ≈ C_{x,y} ∧ ((C_{x−k,j} = Red) ∨ (C_{x−k−1,j+1} = Red)) ⇒ C_{x−k−1,j} = Red. (20.60)
Fig. 20.11 Sequence of the segmentation algorithm
Fig. 20.12 Sequence of the segmentation algorithm
For example, Fig. 20.12 shows the segmentation of a rhombus. In the implementation, the number of computations can be reduced by using flags to memorize the value of C_{i−1,y−k−1} on the upper side, C_{x+k+1,j−1} on the right side, etc. This algorithm labels the selected object in red.
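A simpler stand-in for the ring-wise scan of (20.57)–(20.60) is a plain breadth-first region growing from the clicked pixel. This sketch is not the book's algorithm (which expands ring by ring with the directional rules above); it uses a queue and a color-distance threshold instead:

```python
import numpy as np
from collections import deque

def grow_region(img, seed, tol):
    """img: (h, w, 3) color image; seed: (row, col) of the clicked pixel;
    tol: color-distance threshold. Returns a boolean mask of the pixels
    connected to the seed with a similar color."""
    h, w = img.shape[:2]
    seed_color = img[seed].astype(float)
    mask = np.zeros((h, w), dtype=bool)
    mask[seed] = True
    q = deque([seed])
    while q:
        y, x = q.popleft()
        for dy, dx in ((1, 0), (-1, 0), (0, 1), (0, -1)):
            ny, nx = y + dy, x + dx
            if (0 <= ny < h and 0 <= nx < w and not mask[ny, nx]
                    and np.linalg.norm(img[ny, nx] - seed_color) <= tol):
                mask[ny, nx] = True
                q.append((ny, nx))
    return mask
```

The resulting mask plays the role of the red-labeled region: its centroid and boundary feed the comparison step of the pose-estimation loop.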
20.3.2 Object Projection

Considering that with cameras we can only see the surface of the observed objects, we treat them in this work as two-dimensional surfaces embedded in 3D space, described by the following function (Table 20.1):

H(s, t) = h_x(s, t) e1 + h_y(s, t) e2 + h_z(s, t) e3, (20.61)

where s and t are real parameters in the range [0, 1]. Such a parametrization allows us to work with different objects like points, conics, quadrics, or even more complex real objects like cups, glasses, etc. Computing the center of mass of the segmented object in each image (left and right) and retro-projecting the object lines onto the left and right images, one can estimate the 3D position of the object. This is a useful approximation for the initialization of the algorithm. Next, we must consider that the object is in a general position, which means that the object is rotated and translated with respect to the coordinate frame attached to the center of the left camera (see Fig. 20.13):

H′(s, t) = T R H(s, t) R̃ T̃, (20.62)
Table 20.1 Functions of some objects

Particle:  H = 3e1 + 4e2 + 5e3
Cylinder:  H = cos(t) e1 + sin(t) e2 + s e3
Plane:     H = t e1 + s e2 + (3s + 4t + 2) e3
Fig. 20.13 Mathematical model of the object
where

R = cos(θ/2) − sin(θ/2) l = e^{−(θ/2) l}, (20.63)
T = 1 + ½ a e∞ = e^{(1/2) a e∞}, (20.64)

where l is the rotation axis, θ the rotation angle, and a a translation vector. We need to estimate these transformations in order to know the object's pose. To estimate these parameters, we project the known mathematical model of the object, H′(s, t), onto the camera's image. This is possible because after calibration we know the intrinsic parameters of the camera:

x = (π1 · H′(s, t)) / (π3 · H′(s, t)), (20.65)
y = (π2 · H′(s, t)) / (π3 · H′(s, t)). (20.66)

Note that the projective depth or translation (a in the motor expression) is initialized with the result of the triangulation with respect to the mass centers. The image of the mathematically projected model is compared with the image of the segmented object (see Fig. 20.13). If we find a match between them, the mathematical object is placed in the same position and orientation as the real object. Otherwise, we follow a descent gradient-based algorithm to rotate and translate the mathematical model in order to reduce the error between them. This algorithm runs very fast.
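The projection step (20.62)–(20.66) can be written compactly in matrix form for a cloud of sampled model points. This sketch uses a rotation matrix and translation vector in place of the motor, and `project_model` is a hypothetical helper:

```python
import numpy as np

def project_model(H_pts, K, R, t):
    """Rigidly transform sampled model points (n, 3) by (R, t), the matrix
    analogue of H'(s,t) in (20.62), and project them with the intrinsic
    matrix K as in (20.65)-(20.66). Returns (n, 2) pixel coordinates."""
    Xc = H_pts @ R.T + t        # H'(s, t) = R H(s, t) + t
    x = Xc @ K.T                # homogeneous image coordinates
    return x[:, :2] / x[:, 2:3] # pinhole division by the third row

# sample the cylinder of Table 20.1: H = cos(t) e1 + sin(t) e2 + s e3
s, th = np.meshgrid(np.linspace(0, 1, 5), np.linspace(0, 2 * np.pi, 8))
H_pts = np.stack([np.cos(th), np.sin(th), s], axis=-1).reshape(-1, 3)
```

Comparing the projected point cloud against the segmented region gives the residual that the gradient-descent pose refinement then drives toward zero.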
Fig. 20.14 Pose estimation of a disk with a fixed camera
Fig. 20.15 Pose estimation of a pot
Figure 20.14 shows the pose-estimation result. In this case, we have a maximum error of 0.4° in the orientation estimate and a maximum error of 5 mm in the estimated position of the object. The problem becomes more difficult when the stereoscopic system is moving; Fig. 20.15 shows how well the stereo system tracks the object. If we want to know the real object's position with respect to the world coordinate system, we must of course know the camera's extrinsic parameters. Figure 20.16 illustrates the object's position and orientation with respect to the robot's hand. The upper row of this figure shows an augmented-reality position sequence of the object, demonstrating that we can add the mathematical object to the real image. Furthermore, the second row of the same figure shows the virtual-reality pose-estimation result.
20.4 Grasping Objects

There are many styles of grasping; however, we take into account only three principal styles. Note also that for each style of grasping there are many possible solutions; for another approach, see [26].
Fig. 20.16 Object presented in augmented and virtual reality
20.4.1 First Style of Grasping

Since our objective is to grasp objects with the Barrett hand, we must consider that it has only three fingers, so the problem consists of finding three grasping points by which the system can be held in equilibrium. This means that the sum of the forces equals zero, as well as the sum of the moments. We know the surface of the object, and so we can compute its normal vector at each point using

N(s, t) = ( ∂H(s, t)/∂s ∧ ∂H(s, t)/∂t ) I_e. (20.67)

For surfaces with low friction, the force F has to be close to its projection onto the normal (F ≈ F_n). To maintain equilibrium, the sum of the forces must be zero, Σ_{i=1}^{3} ||F_n|| N(s_i, t_i) = 0 (Fig. 20.17). This restricts the points on the surface at which the forces can be applied. The number of such points is further reduced if we consider forces of equal magnitude over the object:

Σ_{i=1}^{3} N(s_i, t_i) = 0. (20.68)

Additionally, in order to maintain the system's equilibrium, the sum of the moments must equal zero:

Σ_{i=1}^{3} H(s_i, t_i) ∧ N(s_i, t_i) = 0. (20.69)
Fig. 20.17 Object with normal vectors
The points on the surface at maximum and minimum distance from the object's center of mass fulfill H(s,t) \wedge N(s,t) = 0; the normal vector at such points passes through the center of mass (C_m) and does not produce any moment. Before determining the external and internal points, we must compute the center of mass as

C_m = \int_0^1 \int_0^1 H(s,t) \, ds \, dt.   (20.70)
Once C_m is calculated, we can establish the next restriction:

(H(s,t) - C_m) \wedge N(s,t) = 0.   (20.71)
The values s and t satisfying (20.71) form a subspace of critical points H(s,t) on the surface (maxima, minima, or inflection points). The constraint that the three forces be equal is difficult to fulfill, because it implies that the three points must be symmetric with respect to the center of mass. When such points are not present, we can relax the constraint and demand that only two forces be equal in order to fulfill the hand's kinematic equations. Then the normals N(s_1, t_1) and N(s_2, t_2) must be symmetric with respect to N(s_3, t_3):

N(s_3, t_3) N(s_1, t_1) N(s_3, t_3)^{-1} = N(s_2, t_2).   (20.72)
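For unit vectors, the sandwich product in (20.72) reduces to the Euclidean reflection 2(n \cdot a)n - a, so the symmetry condition can be checked numerically; the configuration below is a hypothetical example, not one taken from the book:

```python
import numpy as np

def sandwich(n, a):
    # For unit vectors, the geometric-algebra product n a n^{-1} reduces to
    # the reflection of a in the line spanned by n: 2 (n . a) n - a.
    n = n / np.linalg.norm(n)
    return 2 * np.dot(n, a) * n - a

# Two normals placed symmetrically about n3 (illustrative configuration).
n3 = np.array([0.0, 0.0, 1.0])
n1 = np.array([np.sin(0.3), 0.0, np.cos(0.3)])
n2 = np.array([-np.sin(0.3), 0.0, np.cos(0.3)])

# Eq. (20.72): N(s3,t3) N(s1,t1) N(s3,t3)^{-1} = N(s2,t2)
print(np.allclose(sandwich(n3, n1), n2))
```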
20.4.2 Second Style of Grasping

In the first style of grasping, three points of contact were considered. In this section we take into account a greater number of contact points, which yields a style of grasping that holds objects more securely. To increase the number of contact points, the base of the hand is taken into account. Since the object is described by the equation H(s,t), it is possible to compute a plane \pi_b that cuts the object in the middle. This is done using object points referred to the principal axis, L_p, and linear regression (see Fig. 20.18). One selects only the points at locations whose normals are parallel to the plane \pi_b:

N(s,t) \wedge \pi_b \approx 0.   (20.73)
Now we choose three points of the object that are separated from one another by 25 mm, to generate a plane on the object (see the top plane in Fig. 20.18). In this style of grasping, the position of the hand relative to the object is trivial, because we just need to align the center of these points with the center of the hand's base. The orientation corresponds to the normal of the plane, \pi_1 = x_1 \wedge x_2 \wedge x_3 \wedge e_\infty.
20.4.3 Third Style of Grasping

In this style of grasping, the forces F_1, F_2, and F_3 do not intersect the center of mass. They are canceled by symmetry because the forces are parallel:

N(s_3, t_3) F_3 + N(s_1, t_1) F_1 + N(s_2, t_2) F_2 = 0.   (20.74)
Also, the force magnitudes F_1, F_2, and F_3 lie in the plane \pi_b and are orthogonal to the principal axis L_p (\pi_b = L_p \cdot N(s,t)), as can be seen in Fig. 20.19. A new restriction is then added to reduce the subspace of solutions:

F_3 = 2F_1 = 2F_2,   (20.75)
N(s_1, t_1) = N(s_2, t_2) = -N(s_3, t_3).   (20.76)

Fig. 20.18 Planes of the object
Fig. 20.19 Forces of grasping
Fig. 20.20 Simulation and result of the grasping
Finally, let us consider three points x_1, x_2, and x_3 lying on the parallels depicted in Fig. 20.19. The directed distance between the parallels through x_1 and x_2 must equal 50 mm, and that between x_1, x_2 and x_3 must equal 25 mm. We then search exhaustively for the three points by varying s_i and t_i. Figure 20.20 shows the simulation and the result of this grasping algorithm. The position of the object relative to the hand must be computed using one coordinate frame in the object and another frame in the hand.
20.5 Target Pose

Once the three grasping points (P_1 = H(s_1,t_1), P_2 = H(s_2,t_2), and P_3 = H(s_3,t_3)) are calculated, it is simple to determine the joint angles for each finger. To determine the spread angle (q_4 = \beta), we use

\cos\beta = \frac{(P_1 - C_m) \cdot (C_m - P_3)}{|P_1 - C_m| \, |C_m - P_3|}.   (20.77)
To calculate each of the finger angles, we determine its elongation as

x'_3 \cdot e_2 = |P_1 - C_m| - \frac{A_w}{\sin\beta} - A_1,   (20.78)
x'_3 \cdot e_2 = |P_2 - C_m| - \frac{A_w}{\sin\beta} - A_1,   (20.79)
x'_3 \cdot e_2 = |P_3 - C_m| + h - A_1,   (20.80)

where x'_3 \cdot e_2 determines the opening distance of the finger:

x'_3 \cdot e_2 = (M_2 M_3 x_3 \tilde{M}_3 \tilde{M}_2) \cdot e_2,   (20.81)
x'_3 \cdot e_2 = A_1 + A_2 \cos\left(\frac{q}{125} + \Phi_2\right) + A_3 \cos\left(\frac{q}{375} + \Phi_2 + \Phi_3\right),   (20.82)

where e_2 is a unit vector pointing at the second finger joint (see Fig. 20.21). Solving for the angle q, we obtain the opening angle for each finger. These angles are computed offline for each style of grasping of each object; they are the target in the velocity control of the hand.
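The offline solution of (20.82) for q can be sketched as a one-dimensional root search; the link constants and offsets below are placeholders, not the real Barrett-hand values:

```python
import math

# Hypothetical link constants A1, A2, A3 and offsets PHI2, PHI3; substitute
# the real Barrett-hand values in practice.
A1, A2, A3 = 25.0, 70.0, 56.0
PHI2, PHI3 = 0.0, 0.7

def opening(q):
    # Eq. (20.82): opening distance of the finger as a function of the motor
    # angle q (1/125 and 1/375 are the joint-coupling factors).
    return (A1 + A2 * math.cos(q / 125.0 + PHI2)
               + A3 * math.cos(q / 375.0 + PHI2 + PHI3))

def solve_q(target, lo=0.0, hi=300.0):
    # Bisection: opening(q) is monotonically decreasing on [lo, hi] for
    # these constants, so a unique q matches any reachable target distance.
    for _ in range(200):
        mid = 0.5 * (lo + hi)
        if opening(mid) > target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

q = solve_q(opening(120.0))
print(abs(q - 120.0) < 1e-6)
```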
Fig. 20.21 Object’s position relative to the hand
20.5.1 Object Pose

We must find the transformation M that places the hand so that each fingertip coincides with the corresponding contact point. For simplicity, the transformation M is divided into three transformations (M_1, M_2, M_3). With the same purpose, we label the fingertips as X_1, X_2, and X_3, and the contact points as P_1, P_2, and P_3. The first transformation, M_1, is the translation between the object and the hand, which equals the directed distance between the centers of the circles Z_h = X_1 \wedge X_2 \wedge X_3 and Z_o = P_1 \wedge P_2 \wedge P_3; it can be calculated as

M_1 = e^{\frac{1}{2}\left(\frac{Z_h}{Z_h \wedge e_\infty} - \frac{Z_o}{Z_o \wedge e_\infty}\right) \wedge e_\infty}.   (20.83)
The second transformation aligns the planes \pi_h = Z_h \wedge e_\infty = X_1 \wedge X_2 \wedge X_3 \wedge e_\infty and \pi_o = Z_o \wedge e_\infty, which are generated by the new points of the hand and the object; it is calculated as M_2 = e^{\frac{1}{2} \pi_h \wedge \pi_o}. The third transformation makes the points overlap and can be calculated using the planes \pi_1 = Z_o \wedge X_3 \wedge e_\infty and \pi_2 = Z_o \wedge P_3 \wedge e_\infty, which are generated by the circle's axis and any of the points: M_3 = e^{\frac{1}{2} \pi_1 \wedge \pi_2}. These transformations also define the pose of the object with respect to the hand. They are computed offline in order to know the target position and orientation of the object with respect to the hand, and are used to design a control law for visually guided grasping.
20.6 Visually Guided Grasping

Once the target position and orientation of the object are known for each style of grasping, together with the hand's posture (joint angles), it is possible to write a control law using this information and the differential kinematics of the hand that allows one to grasp the object under visual guidance. Basically, the control algorithm takes the pose of the object estimated as shown in Sect. 20.3 and compares it with each of the target poses computed in Sect. 20.5 in order to choose the closest pose as the target. In this way, the style of grasping is chosen automatically. Once the style of grasping is chosen and the target poses are known, the error \epsilon between the estimated and the computed pose is used to compute the desired joint angles of the hand:

\alpha_d = \alpha_t e^{-\epsilon^2} + (1 - e^{-\epsilon^2}) \alpha_a,   (20.84)
where \alpha_d is the desired angle of the finger, \alpha_t is the target angle computed in Sect. 20.5, and \alpha_a is the actual angle of the finger. Now the error between the desired
Fig. 20.22 Visually guided grasping
position and the actual position is used to compute the new joint angle using the equation of differential kinematics of the Barrett hand given in Sect. 20.2.
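The blending law (20.84) is simple to implement; the sketch below assumes \epsilon denotes the pose error used in the exponent:

```python
import math

def desired_angle(alpha_t, alpha_a, err):
    # Eq. (20.84): blend between the actual and the target joint angle;
    # the weight w tends to 1 (pure target) as the pose error goes to zero,
    # and to 0 (keep the current angle) as the error grows.
    w = math.exp(-err ** 2)
    return alpha_t * w + (1.0 - w) * alpha_a

# Zero error: command the target; huge error: keep the actual angle.
print(desired_angle(90.0, 10.0, 0.0))    # -> 90.0
print(desired_angle(90.0, 10.0, 100.0))  # -> 10.0
```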
20.6.1 Results

Next, we show the results of combining the algorithms of pose estimation, visual control, and grasping into a new algorithm for visually guided grasping. In Fig. 20.22, a sequence of images of the grasping is presented. While the bottle is carried by the hand, the fingers search for possible grasping points. If we change the object or its pose, the algorithm computes a new type of grasping. Figure 20.23 shows a sequence of images in which the object's pose changes.
20.7 Fuzzy Logic and Conformal Geometric Algebra for Grasping

A closed-loop system is used to adjust the output variable of the process to a reference input. To accomplish this, it is necessary to feed back the output variable of the process to a controller; see Fig. 20.24.
Fig. 20.23 Changing the object’s pose
Fig. 20.24 A system with fuzzy controller
20.7.1 Mamdani Fuzzy System

A Mamdani fuzzy system [145] is characterized by the extensive use of geometric forms and heuristic methods to solve problems in real time. Such a system comprises the following elements. Fuzzification: the input variables are fuzzified by
Fig. 20.25 Fuzzy inference diagram for a Mamdani system
Fig. 20.26 BH in a resting position
the membership functions, which can be trapezoids, triangles, Gaussians, and others. Rule base: an "if-then" rule set in which the human knowledge about the problem to be solved is stored; for example, if an object is very big, then open the hand very much. Inference mechanism: responsible for processing the fuzzy rules of interest; for a given reference input, many rules may fire. Defuzzification: this can be realized by many methods, including centroid, bisector, SoM, MoM, LoM, and Med. Geom. Figure 20.25 shows the fuzzy inference diagram for a Mamdani system (MS).
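A minimal Mamdani pipeline (fuzzification with triangular memberships, min/max inference, centroid defuzzification) can be sketched as follows; the one-input rule base is a toy illustration, not the 49-rule table of Sect. 20.7.3:

```python
import numpy as np

def tri(x, a, b, c):
    # Triangular membership with feet a, c and peak b (a < b < c).
    return np.clip(np.minimum((x - a) / (b - a), (c - x) / (c - b)), 0.0, 1.0)

# Toy one-input Mamdani system (illustrative labels only):
#   if size is SMALL then angle is LARGE
#   if size is LARGE then angle is SMALL
y = np.linspace(0.0, 140.0, 1401)             # output universe: finger angle
angle_small = tri(y, -1.0, 0.0, 70.0)
angle_large = tri(y, 70.0, 140.0, 141.0)

def infer(size):
    w_small = float(tri(size, -1.0, 0.0, 5.0))   # rule firing strengths
    w_large = float(tri(size, 0.0, 5.0, 6.0))
    # Mamdani inference: clip each consequent by its rule strength,
    # aggregate with max, then defuzzify with the centroid method.
    agg = np.maximum(np.minimum(w_small, angle_large),
                     np.minimum(w_large, angle_small))
    return float(np.sum(y * agg) / np.sum(agg))

print(infer(1.0) > infer(4.0))   # smaller object -> larger finger angle
```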
20.7.2 Direct Kinematics of the Barrett Hand

Figure 20.26 shows the Barrett hand (BH) [4] in a resting position. The BH has four motors: one for each finger (F1, F2, F3), each finger having an operation interval of [0°, 140°], and a fourth motor for the simultaneous spread of F1 and F2, with an operation interval of [0°, 180°]. Figure 20.27 shows the calculation of the direct kinematics of F1 using its axis lines. The strategy consists of translating the parts of F1 to the origin, using p_x(A_1, 0, 0), p_y(A_2, 0, 0), p_z(A_3, D_3, 0). We can
Fig. 20.27 Representation of F1 using lines
obtain the point p_k, which corresponds to the fingertip of F1. We first compute the point related to p_x: P_x = p_x + \frac{p_x^2}{2} e_\infty + e_0, t_1 = A_W e_2 + D_W e_3, T_1 = e^{-\frac{1}{2} e_\infty t_1}, and R_1 = e^{-\frac{1}{2} \theta_{11} e_1 e_2}. The equation of point p_2 reads

p_2 = T_1 R_1 P_x \tilde{R}_1 \tilde{T}_1.

The other fingers have an identical geometry: P_y = p_y + \frac{p_y^2}{2} e_\infty + e_0, t_2 = p_2, T_2 = e^{-\frac{1}{2} e_\infty t_2}, R_2 = e^{-\frac{1}{2}(\theta_{12} + \Phi_2) e_2 e_3}. The equation of point p_3 then reads

p_3 = T_2 R_1 R_2 P_y \tilde{R}_2 \tilde{R}_1 \tilde{T}_2,

with P_z = p_z + \frac{p_z^2}{2} e_\infty + e_0, t_3 = p_3, T_3 = e^{-\frac{1}{2} e_\infty t_3}, R_3 = e^{-\frac{1}{2}(\theta_{13} + \Phi_3) e_2 e_3}, where \Delta\theta_{13} = \theta_{13} - \theta_{12}. Finally, for the point p_k we write

p_k = T_3 R_1 R_2 R_3 P_z \tilde{R}_3 \tilde{R}_2 \tilde{R}_1 \tilde{T}_3.
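The kinematic chain above can be sketched with homogeneous matrices standing in for the translators and rotors (R_1 acts in the e_1 e_2 plane, i.e. about e_3; R_2 and R_3 act in the e_2 e_3 plane, i.e. about e_1); all link constants and joint angles below are hypothetical placeholders:

```python
import numpy as np

# Hypothetical constants (mm / rad); substitute the real Barrett-hand values.
AW, DW = 25.0, 84.0
A1, A2, A3, D3 = 25.0, 70.0, 50.0, 9.5
PHI2, PHI3 = 0.0, 0.7
th11, th12, th13 = 0.5, 0.3, 0.2          # joint angles of F1

def rot(axis, th):
    # Homogeneous rotation standing in for the rotors: 'z' for the e1e2
    # plane (R1), 'x' for the e2e3 plane (R2, R3).
    c, s = np.cos(th), np.sin(th)
    R = np.eye(4)
    if axis == 'z':
        R[:2, :2] = [[c, -s], [s, c]]
    else:
        R[1:3, 1:3] = [[c, -s], [s, c]]
    return R

def trans(t):
    T = np.eye(4)
    T[:3, 3] = t
    return T

px = np.array([A1, 0.0, 0.0, 1.0])
py = np.array([A2, 0.0, 0.0, 1.0])
pz = np.array([A3, D3, 0.0, 1.0])

t1 = np.array([0.0, AW, DW])                                    # t1 = AW e2 + DW e3
p2 = trans(t1) @ rot('z', th11) @ px                            # p2 = T1 R1 Px ...
p3 = trans(p2[:3]) @ rot('z', th11) @ rot('x', th12 + PHI2) @ py
pk = (trans(p3[:3]) @ rot('z', th11) @ rot('x', th12 + PHI2)
      @ rot('x', th13 + PHI3) @ pz)                             # fingertip of F1
print(pk[:3])
```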
20.7.3 Fuzzy Grasping of Objects

Figure 20.28 shows the algorithm used for grasping objects using FL and CGA. We number the stages in which we use FL and/or CGA, as follows: (1) CGA; (2) FL; (3) FL; (4) CGA, direct kinematics; (5) CGA, direct kinematics; (6) FL; (7) FL; (8) direct kinematics; (9) CGA. The first control stage consists of grasping the object or approaching it; it is based on the height and width of the object. To obtain the output angle of the fingers, we use the 49 fuzzy rules shown in Table 20.2.
First control stage
1. Object recognition: obtain the width and height of the object (stereo system).
2. Open the spread of the BH according to the object type (spherical, cubic, etc.).
3. Perform the first stage of grasping (grasp the object or approach it), using the 49 fuzzy rules based on the width and height.
4. Calculate the separation distance of each fingertip toward the point of the goal plane for grasping, using the stereo vision system.
Second control stage
5. While finger_i (i = 1, ..., 3) has not reached the point of the goal plane, do:
6. Apply the fuzzy rules of the second control stage.
7. Using the preferred defuzzification method, close finger_i of the BH.
8. Calculate the position of finger_i in space, using direct kinematics.
9. Calculate the distance to reach the point or the goal plane. End.
Fig. 20.28 Algorithm for grasping objects

Table 20.2 Fuzzy rules for the 1st control stage

        VT  T   VS  S   M   L   VL
  VT    VL  L   M   M   L   VS  VT
  T     VL  L   M   M   L   VS  VT
  VS    VL  L   L   L   M   VS  VT
  S     VL  L   L   M   M   VS  VT
  M     VL  L   L   M   M   S   T
  L     VL  L   L   M   M   S   T
  VL    VL  L   L   M   L   S   T
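Table 20.2 can be encoded directly as a lookup, which also makes the example rule quoted in the text easy to check:

```python
# Table 20.2 as a lookup: rows are the object height, columns the object
# width, entries the output finger angle (linguistic labels).
LABELS = ['VT', 'T', 'VS', 'S', 'M', 'L', 'VL']
RULES = {
    'VT': ['VL', 'L', 'M', 'M', 'L', 'VS', 'VT'],
    'T':  ['VL', 'L', 'M', 'M', 'L', 'VS', 'VT'],
    'VS': ['VL', 'L', 'L', 'L', 'M', 'VS', 'VT'],
    'S':  ['VL', 'L', 'L', 'M', 'M', 'VS', 'VT'],
    'M':  ['VL', 'L', 'L', 'M', 'M', 'S', 'T'],
    'L':  ['VL', 'L', 'L', 'M', 'M', 'S', 'T'],
    'VL': ['VL', 'L', 'L', 'M', 'L', 'S', 'T'],
}

def output_angle(height, width):
    return RULES[height][LABELS.index(width)]

print(output_angle('VT', 'VT'))   # -> 'VL', the example rule in the text
```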
The fuzzy variables height and width of the object are shown in the rows and columns, respectively. The abbreviations used are VT, very tiny; T, tiny; VS, very small; S, small; M, medium; L, large; VL, very large. An example of these rules is: if the height is VT and the width is VT, then the output angle for all fingers is VL. The second control stage has five fuzzy rules, based on the separation distance of
Fig. 20.29 Example of execution of the algorithm (a–d). Stereo vision system (c)
the fingertips of each finger to its goal plane or point. These rules are: if separation is VL or L, then the output angle is M; if separation is M, then the output angle is S; if separation is VS or S, then the output angle is VS; if separation is T, then the output angle is T; if separation is VT, then the output angle is VT. The sphere was calculated with only four points on it; its 3D position and size were calculated using CGA. For the cube, four points were taken to calculate the width, height, and depth. The 3D points are obtained using a stereo vision system; see Fig. 20.29c. Figure 20.29a–d shows the execution of the algorithm of Fig. 20.28 in virtual reality and in real time.
20.8 Conclusion

This chapter applies a powerful geometric language to the development of theory, concepts, and computer algorithms in the domain of robot vision. As opposed to current approaches, all the computations are done in the conformal geometric framework, and there is no need to change the mathematical framework. To show the power of this flexible mathematical language, we developed real-time algorithms for a real scenario of perception, approach, and action that handles a number of real robot-manipulation tasks. The chapter shows a feasible grasping strategy based on the mathematical models of the object and of the manipulator. We compute the 3D poses of an object and of the robot hand needed to close the loop among perception, approach, and action. A control law with visual feedback has been developed to grasp objects using the mechanical and visual Jacobian matrices in terms of the line axes of the Barrett hand and the principal optical planes of the camera. We also show an interesting application of fuzzy logic and conformal geometric algebra for grasping with the Barrett hand; for the fuzzy control of the hand, we implemented a Mamdani fuzzy system. Our applications in a real scenario prove the efficiency of algorithms developed using conformal geometric algebra. By simplifying the geometric algorithms at the symbolic level and accelerating their execution with cost-effective hardware, one should be able to reduce the execution time substantially.
Chapter 21
3D Maps, Navigation, and Relocalization
This chapter presents applications of body-eye calibration algorithms using motors of the conformal geometric algebra. A scan-matching algorithm, based on such algorithms, aligns the scans by representing the scan points as lines. We then show a path-following procedure that also uses conformal geometric algebra techniques to estimate the geometric error for a control law. The chapter extends the applications of body-sensor calibration to the case of stereo vision and a laser scanner. These sensors are used for building 3D maps and tackling the relocalization problem. For the relocalization, we resort to an approach based on the Hough transform, where the desired position is searched for in the line Hough space.
21.1 Map Building

One of the most important tasks in mobile robotics is tracking the robot pose. To solve this problem, we use a laser sensor mounted on the robot and a scan-matching algorithm to estimate the robot pose from laser scans. A laser scan is a set of points corresponding to the intersection between the laser light and the surrounding objects. Each measurement is represented in a polar coordinate system whose origin is the laser origin; the magnitude of the measurement represents the distance between the object and the sensor (Fig. 21.1).
21.1.1 Matching Laser Readings

Given a reference scan S_ref, a new scan S_new, and a rough estimate (\theta_0, x_0, y_0) of the relative displacement of the sensor between the scans, the objective of the matching problem is to estimate the real displacement (\theta, x, y) between them (Fig. 21.2). As we saw in Eq. 6.30, the inner product of two points X and Y represents the Euclidean distance between the points x and y, that is,

\|x - y\| = \sqrt{-2 (X \cdot Y)}.   (21.1)

E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9_21, © Springer-Verlag London Limited 2010
Fig. 21.1 (a) Laser readings in polar coordinates. (b) Laser readings in rectangular coordinates
Fig. 21.2 (a) This figure shows a pair of laser readings that must be aligned. (b) This figure shows the lines passing through the laser reading points; these lines are orthogonal to the plane where the points lie. They can be interpreted as the motor axes of the body-eye calibration problem, and thus we can use the same algorithm to solve the problem
For speed, we use the result of the product X \cdot Y directly. However, we should take into account the negative sign in (6.30); thus, instead of looking for a minimum, we look for a maximum to find the point closest to another point. The closest point can be defined as

\text{closest}(X, Y_i) = \max_i \, X \cdot Y_i \quad \text{for } i \in \{1, \ldots, N\}.   (21.2)
Once the correspondence between the points is known, we proceed to compute the rigid transformation between them. To solve this problem, we utilize the motor-estimation algorithm of Sect. 19.2.2 used in the body-sensor calibration problem. We only need to express the points as lines orthogonal to the plane where they lie, as if they were the axes of the motors in the body-sensor calibration problem. These lines can be found easily with

L_i = X_i \wedge e_3 \wedge e_\infty \quad \text{for } i \in \{1, \ldots, N\}.   (21.3)
Fig. 21.3 Alignment of two laser scan references and new laser scan using the scan-matching algorithm: (a) after 5 iterations, (b) after 10 iterations, (c) after 15 iterations, (d) after 20 iterations
In Fig. 21.2, we show the scans to be matched and the lines used to do it. The algorithm to match two scans X_i and Y_j is shown below; it can be initialized with odometry. Figure 21.3 shows some iterations of the alignment of the scans.

Scan-matching algorithm
1. For each point X_i, estimate the closest point among the points Y_j using (21.2).
2. For each pair of corresponding points (X_i, Y_j), compute the lines L_i and L_j using (21.3).
3. From each pair of lines, extract their moments m_1 = \alpha_1 e_1 + \beta_1 e_2 and m_2 = \alpha_2 e_1 + \beta_2 e_2.
4. With the elements \alpha_1, \beta_1, \alpha_2, and \beta_2, add two rows to the matrix A of Eq. 19.52.
5. Compute the SVD of A, A = U S V^T, and choose the solution vector with the smallest singular value, say v = \alpha_1 e_1 + \alpha_2 e_2 + \alpha_3 e_3 + \alpha_4 e_4.
6. With the vector v, compute the motor M' = \alpha_1 + \alpha_2 e_{12} + e_\infty(\alpha_3 e_1 + \alpha_4 e_2).
7. Normalize the motor M' to get the solution motor, that is, M = M' / \sqrt{M' \tilde{M}'}.
8. Update the points Y_j with the new motor, Y_j = M Y_j \tilde{M} (for j \in 1, \ldots, M).
9. If error > threshold and n < max iterations, go to step 1.
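The structure of the scan-matching loop can be sketched as follows; here the motor estimation of steps 2-7 is replaced by an equivalent closed-form rigid fit (the Kabsch/SVD method) on Euclidean points, so this illustrates the iteration, not the motor algebra itself:

```python
import numpy as np

def scan_match(ref, new, iters=20):
    """Align `new` to `ref` (both (N, 2) arrays): nearest-point
    correspondences (step 1) followed by a closed-form rigid fit."""
    R, t = np.eye(2), np.zeros(2)
    cur = new.copy()
    for _ in range(iters):
        # Step 1: for each current point, find the closest reference point.
        d = np.linalg.norm(cur[:, None, :] - ref[None, :, :], axis=2)
        matched = ref[np.argmin(d, axis=1)]
        # Rigid fit: centre both clouds, SVD of the covariance matrix.
        mc, mm = cur.mean(0), matched.mean(0)
        U, _, Vt = np.linalg.svd((cur - mc).T @ (matched - mm))
        Ri = (U @ Vt).T
        if np.linalg.det(Ri) < 0:          # guard against reflections
            Vt[-1] *= -1.0
            Ri = (U @ Vt).T
        ti = mm - Ri @ mc
        cur = cur @ Ri.T + ti              # step 8: update the points
        R, t = Ri @ R, Ri @ t + ti         # accumulate the transformation
    return R, t

# Synthetic check: displace a scan slightly and recover the motion.
rng = np.random.default_rng(0)
ref = rng.uniform(-5.0, 5.0, (200, 2))
th = 0.05
R_true = np.array([[np.cos(th), -np.sin(th)], [np.sin(th), np.cos(th)]])
new = ref @ R_true.T + np.array([0.1, -0.05])
R, t = scan_match(ref, new)
print(np.linalg.norm(new @ R.T + t - ref, axis=1).max() < 0.05)
```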
21.1.2 Map Building

Once we are able to match a pair of scans, we can build a global map. The global map is created using the motors computed by the scan-matching algorithm. Let M_0, M_1, \ldots, M_n be the motors obtained from the scan matching that align the pairs of scans 0, 1, \ldots, n, respectively; we define the global motor as

M = M_n M_{n-1} \cdots M_0.   (21.4)
This motor is applied to the n-scan in order to add it to the global map. In Fig. 21.4, we show how the matched scans are added to the global map.
21.1.3 Line Map

Once we have a 2D map, we proceed to build a line map, that is, a map represented by a set of line segments. Each line segment represents a wall of the 3D world,
Fig. 21.4 Alignment of consecutive laser scans in order to build a global map
that is, each line segment is the projection of a wall onto the floor. There are three main problems in extracting lines from the map: How many lines are there? Which points belong to which line? And, given the points that belong to a line, how do we estimate the line model parameters? For the third problem, we use a common fitting algorithm, total least squares, since it has been used extensively in the literature [99]. To solve the first two problems, we use the RANSAC (random sample consensus) algorithm [84]. RANSAC is an algorithm for the robust fitting of models in the presence of data outliers; see the algorithm below. Figure 21.5 shows the resulting line map; observe that the map is now represented with line segments rather than points.

RANSAC algorithm
1. Initial: a set of N points.
2. Repeat:
3. Choose a sample of two points uniformly at random.
4. Compute the distances of the other points to the line through the sample.
5. Construct the inlier set.
Fig. 21.5 Linear map built with matched laser scans
6. If there are enough inliers, recompute the line parameters, store the line, and remove the inliers from the set.
7. Repeat until the maximum number of iterations is reached or too few points remain.
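The RANSAC steps above, with a total-least-squares refit on the inliers, can be sketched as:

```python
import numpy as np

def ransac_lines(points, iters=200, tol=0.05, min_inliers=20, seed=0):
    """Extract line models a*x + b*y + c = 0 from a 2-D point set,
    following the numbered steps above."""
    rng = np.random.default_rng(seed)
    pts = points.copy()
    lines = []
    while len(pts) >= min_inliers:
        best = None
        for _ in range(iters):
            p, q = pts[rng.choice(len(pts), 2, replace=False)]
            n = np.array([-(q - p)[1], (q - p)[0]])   # line normal
            norm = np.linalg.norm(n)
            if norm == 0.0:
                continue
            n = n / norm
            inliers = np.abs((pts - p) @ n) < tol     # point-to-line distance
            if best is None or inliers.sum() > best.sum():
                best = inliers
        if best is None or best.sum() < min_inliers:
            break
        # Total least squares on the inliers: the line normal is the
        # minor right singular vector of the centred point set.
        sel = pts[best]
        mean = sel.mean(0)
        _, _, Vt = np.linalg.svd(sel - mean)
        normal = Vt[1]
        lines.append((normal[0], normal[1], -normal @ mean))
        pts = pts[~best]                              # remove the inliers
    return lines

# Two noisy wall segments plus a few outliers.
rng = np.random.default_rng(1)
x = np.linspace(0.0, 5.0, 60)
wall1 = np.c_[x, 0.0 * x] + rng.normal(0, 0.01, (60, 2))
wall2 = np.c_[0.0 * x, x] + rng.normal(0, 0.01, (60, 2))
noise = rng.uniform(-5.0, 5.0, (10, 2))
lines = ransac_lines(np.vstack([wall1, wall2, noise]))
print(len(lines))
```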
21.1.4 3D Map Building

The line model can be used as a floor plan for the 3D map, where each line segment represents a wall (Fig. 21.6). Each line segment L_i describes a plane, that is,

\Pi_i = L_i \wedge e_3.   (21.5)
Once we have the walls of the 3D map, we proceed to add the texture. The texture is obtained from catadioptric images taken at the same time as the scans. Since the laser and the catadioptric camera are calibrated, we can project sections of the catadioptric image onto the planes to add the texture. Since the mirror distorts the image, we must apply a processing step to generate the texture correctly. This process consists of generating a perspective image from the catadioptric image, using the inverse point projection (Sect. 9.7.3). Figure 21.7 shows a catadioptric image and a perspective image generated from it. Once the perspective image is generated, it is projected onto its corresponding plane. Let S be a sphere representing an omnidirectional camera, and let X_s be a point on the sphere computed with the inverse point projection. To find the
Fig. 21.6 3D map built with the linear map
Fig. 21.7 (a) Omnidirectional image, (b) perspective image generated from the omnidirectional image
Fig. 21.8 (a) 3D map with texture; the texture was generated using the omnidirectional vision system, (b) zoom of the 3D map
projection of the point onto the plane, we first compute the line that passes through the center of the sphere and the point X_s with

L = S \wedge X_s \wedge e_\infty.   (21.6)

Then, we compute the projection of the point X_s onto the plane \Pi as

X_p = \Pi \cdot L.   (21.7)
The computations are applied to all the points that will be added to the texture. A 3D map with texture is shown in Fig. 21.8a.
21.2 Navigation

The mobile robot must know its position and heading to carry out a navigation task. Global localization is used to find the mobile robot's initial position. Local localization is used to track the mobile robot's position over time once the initial position is known.
21.2.1 Localization

To solve the local localization problem, the robot estimates its ego-motion using pairs of readings and the scan-matching algorithm (see Sect. 21.1.1). Each scan matching gives us a motor that aligns two scans; therefore, for n matchings we have n motors, M_1, M_2, \ldots, M_n. The actual pose of the laser sensor is given by the concatenation of the n motors and the initial motor M_0 obtained in the global localization. Thus, the motor representing the actual pose of the laser is defined as

M_L = M_n M_{n-1} \cdots M_0.   (21.8)

The actual pose of the robot can be found using the calibration motor M_x between the laser and the robot as

M_r = M_x M_L.   (21.9)

The center of the robot can be found with

X_o = M e_0 \tilde{M},   (21.10)

and the line representing its heading is

L_h = M e_1 E \tilde{M}.   (21.11)
With this information the robot can be localized on the map; Fig. 21.9 shows the localization of the robot using only odometry and using the scan-matching algorithm.
21.2.2 Adding Objects to the 3D Map Sometimes when the robot navigates, it can be necessary to add objects from the real world to the built map. The representation of the objects can be simple or detailed. In the simple representation, we use a sphere to indicate the position of an object on the map. Also, we store images of the object in order to know what represents the sphere. The figure shows the simulation of the built map, the pose of the robot, and the position of the object.
Fig. 21.9 Parts (a) and (b) show the global map before and after the scan matching with respect to it, respectively. Parts (c) and (d) show the scan matching of the current scan with respect to a scan from the global map
21.2.3 Path Following

The path-following task runs as follows: given a map trajectory, the robot should follow it. To achieve this, the robot must control its heading. The path is defined as a set of points on the map. The points are sampled close to each other and interpolated to follow the curve smoothly. The control system designed for the path-following task is based on a feedback control system encompassing a self-localization module and a control module
Fig. 21.10 Flowchart of the control system for the path-following task
(Fig. 21.10). The control module computes the commanding signals for the robot, namely the linear and angular velocities, based on the error distance obtained by comparing the current pose of the robot, estimated by the self-localization module, with the desired pose. The kinematic model of our robot is described by

\dot{x} = v \cos\theta, \quad \dot{y} = v \sin\theta, \quad \dot{\theta} = \omega,   (21.12)
where the two velocity control inputs are the linear velocity, v, and the angular velocity, \omega. The mobile robot and the path to be followed, denoted by P, are represented in Fig. 21.11, where X_r is the orthogonal projection of the robot point X onto P. The line L_h represents the heading of the robot, and the line L_r represents the tangent to the path at X_r. The signed distance between the points X_r and X is denoted by d, and the orientation error is denoted by \tilde\theta. The orientation error \tilde\theta can be computed with the motor

M_h = 1 + L_r L_h,   (21.13)

and the lines L_h and L_r, in the following way:

\tilde\theta = \operatorname{sign}((M_h \cdot e_{12}) \langle M_h \rangle_0) \arccos(L_h \cdot L_r).   (21.14)
To compute the signed distance between the points X_o and X_r, we use

d = s_d \sqrt{-2 \, X_o \cdot X_r},   (21.15)

where s_d is defined as

s_d = \begin{cases} 1, & \text{if } X_o \cdot P_1 \geq 0, \\ -1, & \text{otherwise}. \end{cases}   (21.16)
Fig. 21.11 This figure shows the robot and the path to be followed
Fig. 21.12 Signed distance of the robot to the path
The point P_1 is obtained from the point pair Z, which is computed from the intersection of the circle C_r and the plane e_3 (Fig. 21.12). The circle C_r is defined using the line L_r and the point X_o with

C_r = X_o \cdot L_r.   (21.17)

Then, the point pair Z is computed with

Z = C_r \wedge e_3.   (21.18)
Fig. 21.13 This figure shows how the curvature of the path can be computed using the circle Ck
Finally, the point P_1 can be computed with

P_{1,2} = \frac{Z \pm |Z|}{Z \cdot e_\infty}.   (21.19)
In order to compute the path curvature c, we use the circle (Fig. 21.13)

C_k = X_{r-1} \wedge X_r \wedge X_{r+1}.   (21.20)

The circle radius can be computed with

r_k = \sqrt{\frac{C_k^2}{(C_k \wedge e_\infty)^2}}.   (21.21)
Using the center point X_k of the circle C_k, we compute the line that passes through the points X_r and X_k, that is,

L_k = X_r \wedge X_k \wedge e_\infty.   (21.22)

The motor that transforms the line L_k into the line L_r is

M_k = 1 + L_r L_k;   (21.23)

this motor yields the curvature sign

s_k = \operatorname{sign}((M_k \cdot e_{12}) \langle M_k \rangle_0).   (21.24)

Finally, the path curvature is defined as

c = s_k / r_k.   (21.25)
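A Euclidean stand-in for Eqs. 21.20-21.25 computes the signed curvature from three consecutive path points via the circumradius:

```python
import numpy as np

def path_curvature(p_prev, p, p_next):
    # Signed curvature of the circle through three consecutive path points:
    # the magnitude is 1/r_k from the circumradius, the sign comes from the
    # turn direction (2-D cross product), mirroring s_k of Eq. (21.24).
    a = np.linalg.norm(p - p_prev)
    b = np.linalg.norm(p_next - p)
    c = np.linalg.norm(p_next - p_prev)
    cross = ((p - p_prev)[0] * (p_next - p)[1]
             - (p - p_prev)[1] * (p_next - p)[0])
    area = 0.5 * abs(cross)
    if area == 0.0:
        return 0.0                    # collinear points: straight segment
    r = a * b * c / (4.0 * area)      # circumradius of the three points
    return np.sign(cross) / r

# Three points on the unit circle, traversed counter-clockwise: c = +1.
pts = [np.array([np.cos(t), np.sin(t)]) for t in (0.0, 0.4, 0.8)]
print(path_curvature(*pts))
```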
The controller used to generate the robot’s angular velocity was proposed by Canudas et al. in [31] for path following, and was shown to be stable: Q Q sin./ v cos./c ! D 3 jvjQ 2 v d ; C 1 cd Q
(21.26)
where 2 and 3 are constants to be tuned, and c denotes the path curvature. The control gains are selected as 2 D ˛ 2 ;
(21.27)
3 D 2˛ ;
(21.28)
p where is usually set to 1= 2, meaning a small overshoot, and ˛ is left free to specify faster or slower systems, that is, shorter or longer settling times. To describe more intuitively the control law (21.26), we analyze two particular important cases, namely, the cases where the reference path is a circle or straight line. – When the reference path is a straight line, the curvature c D 0: therefore, the control is reduced to the simple case of encompassing only two terms. One term is proportional to the heading error and the other to the distance error. – When the path is a circle, the curvature is constant. When the robot is close to the steady state, that is, the heading and distance errors are close to zero, then the third term is constant. According to intuition, the robot must have a constant angular velocity to describe a circle. The path-following task was tested in the global map. The path was generated by hand giving some control points and interpolating between them lines or circles. Figure 21.14 has a sequence of images showing the path following-task. Figure 21.15 shows some pictures of the robot in the path-following task.
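The control law (21.26) with the gains (21.27)-(21.28) can be sketched directly; the two limit cases discussed above are easy to verify:

```python
import math

def angular_velocity(v, d, theta_err, c, alpha=1.0, zeta=1.0 / math.sqrt(2.0)):
    # Path-following control law (21.26) with the gains (21.27)-(21.28).
    k2 = alpha ** 2
    k3 = 2.0 * zeta * alpha
    sinc = math.sin(theta_err) / theta_err if theta_err != 0.0 else 1.0
    return (-k3 * abs(v) * theta_err
            - k2 * v * sinc * d
            + v * math.cos(theta_err) * c / (1.0 - c * d))

# Straight-line path (c = 0), robot exactly on the path: no correction.
print(angular_velocity(v=0.5, d=0.0, theta_err=0.0, c=0.0))   # -> 0.0
# Circular path, zero errors: constant omega = v * c, as argued in the text.
print(abs(angular_velocity(v=0.5, d=0.0, theta_err=0.0, c=0.2) - 0.1) < 1e-12)
```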
21.3 3D Map Building Using Laser and Stereo Vision

In this section, we present an interesting use of conformal geometric algebra to tackle the problems of 3D map building and relocalization, using a robot called Geometer that is equipped with a laser rangefinder and a stereo camera system mounted on a pan-tilt head. These input devices have their own coordinate systems, so we apply the hand-eye calibration method (see Sect. 19.1.2) to relate the center coordinates of each device to a global robot coordinate system. While Geometer moves, exploring new areas and acquiring data from the laser rangefinder and the stereo camera system, two maps are built simultaneously: one with local coordinates L (according to the current reading of the robot) and the other with global coordinates G (according to the initial position of exploration). The encoders help us estimate the actual position of the mobile robot,
Fig. 21.14 The first image shows the robot initial pose in the map and the path to be followed
but this estimate has errors due to wheel friction. Therefore, the pose of the robot, its rotation angle and translation, is calculated by

\theta = \theta_o + \theta_{error},   (21.29)
T = T_o + T_{error},   (21.30)

where \theta_o and T_o are the rotation angle and the translation vector given by the odometer, respectively, and \theta_{error} and T_{error} are the correction values generated by
Fig. 21.15 Sequence of images showing the robot following the path
the comparison of the current laser reading (line segments in the local map) with the prior reading (line segments in the global map). Using the line perpendicular to the (x, y) plane as the rotation axis and Eq. 21.29, and adding a third fixed coordinate to Eq. 21.30, we can apply these values in a translator, T_pos, and a rotor, R_pos, to represent the movement of the robot in the environment. In the next section, we explain how to model the data from the input devices in the 3D environment.
21.3.1 Laser Rangefinder

To extract line segments from range points, we use the recursive line-splitting method of [175], a fast divide-and-conquer algorithm [143]. Each endpoint of a line segment is mapped to CGA to obtain the pair-of-points entity (see Table 6.1) and stored in a local map L. Since the endpoints are 2D points, we set the third coordinate in $V^3$ to 0 to fix the points in that plane. Now the local map L represents every line segment as a pair of points in CGA, and we can apply any transformation to it (rotation, translation). While the map is being built, the collected data are stored with respect to the initial position. The readings taken from the laser rangefinder replace the current local map at every new robot position in the environment. When a new local map L is acquired, it is mapped to the global coordinate system using Eq. 21.34 to perform line matching. Here we use a property of the sphere to match line segments (pairs of points): given two spheres $s_1$ and $s_2$, we compute the product

$$(s_1 \wedge s_2)^2 \;\begin{cases} < 0 & \text{if } s_1 \text{ and } s_2 \text{ intersect},\\ = 0 & \text{if } s_1 \text{ and } s_2 \text{ are tangent},\\ > 0 & \text{if } s_1 \text{ and } s_2 \text{ do not intersect},\end{cases}$$

and to get a sphere from a pair of points we use

$$S_{PP} = \frac{PP}{PP \wedge e_\infty}. \qquad (21.31)$$
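The sign test above can be checked numerically without a full CGA implementation: for normalized dual spheres, $s \cdot s = r^2$ and $s_1 \cdot s_2 = \tfrac{1}{2}(r_1^2 + r_2^2 - d^2)$, so $(s_1 \wedge s_2)^2 = (s_1 \cdot s_2)^2 - s_1^2 s_2^2$. A minimal sketch (the function name is ours, not from the book):

```python
import numpy as np

def wedge_squared(c1, r1, c2, r2):
    """(s1 ^ s2)^2 for two dual spheres given by Euclidean center and radius,
    using the vector identity (a ^ b)^2 = (a . b)^2 - a^2 b^2 together with
    s . s = r^2 and s1 . s2 = (r1^2 + r2^2 - d^2) / 2."""
    d2 = float(np.sum((np.asarray(c1, float) - np.asarray(c2, float)) ** 2))
    dot = 0.5 * (r1 ** 2 + r2 ** 2 - d2)
    return dot ** 2 - (r1 ** 2) * (r2 ** 2)
```

The sign then classifies the pair of spheres exactly as in the case analysis above.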
Figure 21.16 shows the possible combinations of states of the line segments described below:

(a) Both spheres intersect and have the same direction.
(b) One sphere is inside the other but has the same direction.
(c) Both spheres intersect but the segments do not have the same direction.
(d) One sphere is inside the other but the segments do not have the same direction.
(e) Both spheres do not intersect.

When we find the matching lines, we merge both maps and correct the angle and displacement of the lines by comparing the local and global maps. This little error
Fig. 21.16 Merging lines using spheres. (a) to (e) show five alternatives
is caused by the odometry sensor. Then we update the actual position of the robot using Eqs. 21.29 and 21.30. We can express a motor that maps any entity taken from the laser coordinate system to the global coordinate system. We take the laser's center of coordinates, define a motor $M_{lsr}$ that represents the rotation and translation from the center of the global coordinate system to the laser's center, and develop

$$M_{cl} = R_{pos}\, M_{lsr}\, \widetilde{R}_{pos}, \qquad (21.32)$$
$$M_{pos} = T_{pos}\, R_{pos}, \qquad (21.33)$$
$$M_{lu} = M_{cl}\, M_{pos}, \qquad (21.34)$$
where Eq. 21.32 is the translation-and-rotation motor toward the laser's center, Eq. 21.33 is the movement of the robot, and Eq. 21.34 is the motor that takes us to the source of the laser sensor in the global coordinate system (Fig. 21.17). Using Eq. 21.34, any geometric entity (point, line, circle) recorded with the laser rangefinder sensor can be moved easily to the global coordinate system using the form

$$x' = M_{lu}\, x\, \widetilde{M}_{lu}. \qquad (21.35)$$
Fig. 21.17 Coordinate systems of the stereo pan-tilt unit and the laser rangefinder
Fig. 21.18 3D map using virtual walls
As we are dealing with a 3D real world and the laser rangefinder only delivers a planar measurement, we can add a virtual wall (Fig. 21.18) to the shapes from the laser rangefinder to get a 3D visual sense of the walls inside the virtual world. If a new laser rangefinder is mounted on the mobile robot, or if the laser rangefinder is moved to another place on the mobile robot, it is easy to obtain the new motor that maps the data from the laser rangefinder to the global map: we only update the motor $M_{lsr}$ that represents the rotation and translation from the center of the global coordinate system to the laser's center, and recalculate Eq. 21.34.
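Since a motor encodes a rotation plus a translation and motors compose by the geometric product, the chaining of the robot pose with the laser-mount motor can be sketched with ordinary 4x4 homogeneous matrices (a stand-in for the CGA motors; the mounting numbers below are made up for illustration):

```python
import numpy as np

def rot_z(theta):
    """Homogeneous rotation about the z axis (stand-in for a rotor)."""
    c, s = np.cos(theta), np.sin(theta)
    M = np.eye(4)
    M[:2, :2] = [[c, -s], [s, c]]
    return M

def trans(t):
    """Homogeneous translation (stand-in for a translator)."""
    M = np.eye(4)
    M[:3, 3] = t
    return M

# Robot pose from odometry (hypothetical values) and laser mounting offset.
M_pos = trans([2.0, 1.0, 0.0]) @ rot_z(np.pi / 2)
M_cl = trans([0.3, 0.0, 0.2])   # laser 0.3 m ahead of, 0.2 m above, the robot center
M_lu = M_pos @ M_cl             # matrix analogue of composing the two motors

p_laser = np.array([1.0, 0.0, 0.0, 1.0])  # a point 1 m in front of the laser
p_global = M_lu @ p_laser                 # analogue of x' = M_lu x ~M_lu
```

Updating the mount transform and recomputing the product is all that is needed when the laser is moved, exactly as described above for $M_{lsr}$.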
21.3.2 Stereo Camera System with Pan-Tilt Unit

The pan-tilt unit has two degrees of freedom, which can be expressed as two rotations, one for pan and the other for tilt. These rotations can be modeled using rotors.
Let $R_{pan}$ be the rotor for the pan movement and $R_{tilt}$ the rotor for the tilt movement. Applying these rotors with the geometric product, we can model the whole pan-tilt system. The stereo camera system has its coordinate center on the left camera (with the right camera viewing in front). We apply the method of hand–eye calibration [11] to get the axes of the pan-tilt unit and their intersection (or the closest point between the rotation axes); we build a translation from this intersection to the source of the stereo camera system. This translation is performed using a translator $T_{eye}$. With all this information, we develop a motor that maps any entity taken from the stereo camera system to the global coordinate system as

$$T_{ap} = R_{pos}\, T_{axis}\, \widetilde{R}_{pos}, \qquad (21.36)$$
$$R_{pt} = R_{pos}\, R_{pan}\, R_{tilt}, \qquad (21.37)$$
$$T_{opt} = R_{pt}\, T_{eye}\, \widetilde{R}_{pt}, \qquad (21.38)$$
$$M_{mpt} = T_{pos}\, R_{pt}, \qquad (21.39)$$
$$M_{su} = T_{opt}\, T_{ap}\, M_{mpt}, \qquad (21.40)$$
where Eq. 21.36 is the translation to the point with minimum distance to the axis of the pan-tilt system, taking into account the rotation of the robot position; Eq. 21.37 is the rotor combining all the rotations, of the robot position as well as of the pan-tilt system; Eq. 21.38 is the translation to the left camera of the stereo camera system, taking into account all the system's movements; Eq. 21.39 is the movement motor of the robot together with that of the pan-tilt system; and Eq. 21.40 is the complete movement motor of the robot. Any point captured by the cameras, at any angle of the pan-tilt unit and any position of the robot, can be mapped from the stereo camera system to the global coordinate system using the form

$$x' = M_{su}\, x\, \widetilde{M}_{su}. \qquad (21.41)$$
Using the CGA, we can capture all the entities shown in Table 6.1 in their OPNS form. By capturing the 3D objects through their representative points, we can represent points, line segments (pairs of points), lines, circles, planes, and spheres in the frame of the stereo camera system and then take them to the global coordinate system using (21.41).
21.4 Relocation Using Lines and the Hough Transform

Relocating a robot in an environment that has already been captured is one of the problems that arises once the robot has finished the full map of its environment and has been moved to an arbitrary place within it. The goal is to relocate the mobile robot within the previously captured map. This is known as the "kidnapping problem." The map that has been made has geometric information about the surrounding
environment. From this information we use only the records obtained by the laser rangefinder sensor, as lines are less sensitive to noise. The Hough transform [102] is a robust and effective method to identify the location and orientation of lines. The transform is the parametrization of a line from the $(x,y)$ plane (the Cartesian plane) to the $(\theta,\rho)$ plane (the Hough domain). The line segments of the map are transformed to the Hough domain, defining the transformation in the domain $\theta \in [0, 2\pi)$, so every line segment in $(x,y)$ corresponds to a point $(\theta,\rho)$. This gives a line one characteristic: if it varies only in its angle $\theta$, it keeps the value of $\rho$ constant. So given a previously captured map G (global map) and a newly captured map L (new local map), the difference between them is an angle $\Delta\theta$ and a displacement $\Delta x$ and $\Delta y$ that affects the value of $\rho$:

$$\Delta\theta = \theta_G - \theta_L, \qquad (21.42)$$
$$\theta_G' = \Delta\theta + \theta_L, \qquad (21.43)$$
$$\sum (\theta_G' - \theta_L) = 0. \qquad (21.44)$$
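The $(\theta,\rho)$ parameters can be computed directly from a segment's endpoints; a small sketch (a helper of ours, normalizing $\rho \geq 0$ and $\theta \in [0, 2\pi)$):

```python
import numpy as np

def segment_to_hough(p1, p2):
    """Map a 2D line segment to (theta, rho): theta is the direction of the
    line's normal, rho the distance from the origin to the line, with
    rho >= 0 and theta in [0, 2*pi)."""
    (x1, y1), (x2, y2) = p1, p2
    nx, ny = y2 - y1, x1 - x2                  # a normal to the segment
    theta = np.arctan2(ny, nx) % (2 * np.pi)
    rho = x1 * np.cos(theta) + y1 * np.sin(theta)
    if rho < 0:                                # keep rho non-negative
        rho, theta = -rho, (theta + np.pi) % (2 * np.pi)
    return theta, rho
```

A pure rotation of the map changes only $\theta$, leaving $\rho$ fixed, which is exactly the characteristic exploited above.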
The difference of an angle in the Hough domain is defined as follows:

$$\Delta(\theta_a, \theta_b) = \begin{cases} \theta_a - \theta_b - 2\pi & \text{if case 1},\\ \theta_a - \theta_b + 2\pi & \text{if case 2},\\ \theta_a - \theta_b & \text{otherwise}, \end{cases} \qquad (21.45)$$

where case 1 holds if $\theta_a > \theta_b$ with $\theta_a$ close to $2\pi$ and $\theta_b$ close to 0, and case 2 holds if $\theta_a < \theta_b$ with $\theta_b$ close to $2\pi$ and $\theta_a$ close to 0. This allows us to compare any point with another whose angle is near 0 or $2\pi$. The relocation follows these steps:

– Scan the actual environment using the laser rangefinder, extract the line segments, map them to the Hough domain, and store them in L (L only keeps $(\theta,\rho)$ for each line).
– Compute the difference of each element of L with each element of G (using Eq. 21.45 for the angles) and store it in $\Delta_{(\theta,\rho)}$, giving a twist and a displacement; this step can be seen as the difference between the actual map and the previously captured map:

$$\Delta_{(\theta,\rho)} = G - L. \qquad (21.46)$$

– Now build a new global map by adding each element of $\Delta_{(\theta,\rho)}$ to an element $l_i \in L$ and store it in $G_i'$, as shown in Eq. 21.47:

$$G_i' = \Delta_{(\theta,\rho)} + L_i, \qquad (21.47)$$

which gives a displacement of the actual map close to the global map.
– Now the angle has been shifted in $G_i'$, so we subtract $\rho_{G_i'}$ from $\rho_G$ and get a displacement error $\epsilon_i$. The goal is to reduce this error using

$$\sum \bigl(\rho_{G_i'} - \rho_L\bigr) = 0. \qquad (21.48)$$

– Let V be a zero vote matrix of dimension $|G| \times |L|$, whose votes are given if the error of the displacement is less than a threshold,

$$\epsilon_i < (\epsilon_\theta, \epsilon_\rho), \qquad (21.49)$$

where $\epsilon_\theta$ and $\epsilon_\rho$ are the thresholds for the angle $\theta$ and for $\rho$, respectively.

Repeat the last three steps for each line in L. Finally, when all the lines have been displaced and voted, extract the maximum value per column of V; the row position corresponds to a line in G, and this gives the line correspondence (if the value is null, there is no match). Now we know which lines match, and with these data we can move and rotate the robot to the right place on the map according to the samples taken. Use each matching line to get the average of the angles. With this angle, build a new rotor to turn the robot and the local map L (line segments) in the new environment. Now we have the orientation of the mobile robot, and only the displacement is missing. We can find the displacements $\Delta x$ and $\Delta y$ using the closest point to the origin on the matching lines to generate a translation vector. The closest point to the origin on a line in CGA can be calculated by

$$p = (L\, E)\,\bigl((e_+ L)\, I_E\bigr); \qquad (21.50)$$

as we have line segments (pairs of points in CGA), we only need to apply the wedge operator with the point at infinity, as shown in Eq. 21.51,

$$L = PP \wedge e_\infty, \qquad (21.51)$$
to get the line in CGA and evaluate Eq. 21.50. With the translation vector, we build a translator and apply it to the local map and to the mobile robot. Now that the robot is located at the correct place on the map, it can continue with the navigation of the environment. Figure 21.19 shows the relocation evolution: (a) the robot in its initial position taking a sample of the environment; (b) the line segments generated from the actual environment; (c) the previous map loaded to perform matching (here the mobile robot has been displaced and turned to a random place); (d) the robot located and placed at the correct position on the map (it has relocated itself in the previously captured environment).
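The wrap-around angle difference of Eq. 21.45 and the voting scheme above can be sketched as follows (a simplified version of the steps; the names and default thresholds are ours):

```python
import numpy as np

def angle_diff(a, b):
    """Eq. (21.45): angle difference wrapped so angles near 0 and 2*pi
    are treated as neighbours."""
    d = a - b
    if d > np.pi:
        d -= 2 * np.pi
    elif d < -np.pi:
        d += 2 * np.pi
    return d

def match_lines(G, L, eps_theta=0.05, eps_rho=0.05):
    """Vote matrix over |G| x |L|: each pairing (g, l) proposes a shift
    (delta_theta, delta_rho); every line of L that lands on some line of G
    under that shift adds a vote. The column-wise argmax of V then gives
    the line correspondences."""
    V = np.zeros((len(G), len(L)), dtype=int)
    for i, (tg, rg) in enumerate(G):
        for j, (tl, rl) in enumerate(L):
            dt, dr = angle_diff(tg, tl), rg - rl
            for tl2, rl2 in L:
                t_s, r_s = (tl2 + dt) % (2 * np.pi), rl2 + dr
                if any(abs(angle_diff(tg2, t_s)) < eps_theta
                       and abs(rg2 - r_s) < eps_rho for tg2, rg2 in G):
                    V[i, j] += 1
    return {j: int(np.argmax(V[:, j])) for j in range(len(L))}
```

The winning shift is the one under which the whole local map lands on the global map, which is what makes the voting robust to individual spurious lines.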
Fig. 21.19 (a) to (d) four steps in relocation

Fig. 21.20 Geometer registering the environment, merging laser and stereo data
21.5 Experiments

Figure 21.18 shows a virtual scenario using a laser rangefinder simulator and a real 3D object captured with the stereo camera. Here we can see that the 3D objects are in the place where the robot sampled them. We can also delimit the environment using virtual walls, which help to give the feeling of an indoor environment. Figure 21.20 shows the creation of a real, small 3D map merging the data registered by the laser rangefinder and the 3D shapes recorded by the stereo camera system. Here we can see that the laser lines (line segments) sit together with the rest of the 3D objects in the right pose.
All the maps are stored using only CGA entities, which helps reduce storage space. The map holds all the information about the objects inside the environment and can be incremented, rather than merely continued from the place where the last sample was taken. We can also apply any transformation, such as a rotation or translation, to continue building the map.
21.6 Conclusions

We presented a complete theory for the solution of the body–eye calibration problem using motors of the conformal geometric algebra. The algorithm builds a matrix that can be solved using the SVD, from which we obtain the solution motor. We also presented a scan-matching algorithm based on the body–eye calibration algorithm, which aligns the scans by representing the scan points as lines, which can be thought of as the screw axes of the body–eye calibration. We also presented a path-following task that uses the conformal geometric algebra to estimate the geometric error for the control law. The chapter extended the applications of body-sensor calibration to the cases of stereo vision and the laser scanner. Such sensors are used for building 3D maps and tackling the relocalization problem. For the relocalization, we resort to an approach based on the Hough transform, where the desired position is searched in the line Hough space. The experiments with a real robot validate our method. Our approach can be of great use for mobile robots or upper-body humanoids installed on moving platforms.
Chapter 22
Modeling and Registration of Medical Data
22.1 Background

In medical image analysis, the availability of 3D models is of great interest to physicians because it allows them to have a better understanding of the situation, and such models are relatively easy to build. However, in special situations (such as surgical procedures), some structures (such as the brain or tumors) suffer a (nonrigid) transformation, and the initial model must be corrected to reflect the actual shape of the object. In the literature, we find the union-of-spheres algorithm [160], which uses spheres to build 3D models of objects and to align or transform them over time. In our approach we also use spheres, but we follow the ideas of the marching cubes algorithm to develop an alternative method, which has the advantage of reducing the number of primitives needed; we call our method marching spheres.
22.1.1 Union of Spheres

This algorithm was proposed in [160] and can be summarized as follows:

– Given a set of boundary points (borders of the 3D volumetric data), calculate the Delaunay tetrahedrization (DT).
– Compute the circumscribing sphere of each tetrahedron.
– Verify from the original data which spheres are inside the defined object.
– Simplify the dense sphere representation by clustering and eliminating redundant or nonsignificant spheres.

This algorithm has a worst-case complexity of $O(n^2)$ in both time and number of primitives, where n is the number of boundary points. However, in [160] it is noted that the highest number of primitives observed in experiments was approximately 4n. This number, although good, could be computationally heavy for a large n. To register models based on spheres, the authors first match a sufficient number of spheres from the union-of-spheres representation of one object to that of the other, and then from the matches find the most likely transformation using some method such as least squares.
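The second step, computing the circumscribing sphere of a tetrahedron, reduces to a small linear system; a sketch (the Delaunay tetrahedrization itself, e.g. via scipy.spatial.Delaunay, is not shown):

```python
import numpy as np

def circumsphere(tet):
    """Circumscribing sphere of a tetrahedron (four 3D vertices): the center
    c satisfies 2 (p_i - p_0) . c = |p_i|^2 - |p_0|^2 for i = 1, 2, 3."""
    p = np.asarray(tet, dtype=float)
    A = 2.0 * (p[1:] - p[0])
    b = np.sum(p[1:] ** 2, axis=1) - np.sum(p[0] ** 2)
    c = np.linalg.solve(A, b)
    return c, np.linalg.norm(p[0] - c)
```

Every vertex of the tetrahedron is equidistant from the returned center, which is what the linear system enforces.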
22.1.2 The Marching Cubes Algorithm

The basis of the marching cubes algorithm is to subdivide the space into a series of small cubes. The algorithm then moves (or 'marches') through each of the cubes, testing the corner points to determine whether or not they are on/inside the surface, and replacing the cube with an appropriate set of polygons. The resulting set of polygons is a surface that approximates the original one. To explain the algorithm, let us look at a two-dimensional equivalent. Suppose we want to approximate a 2D shape like the one in Fig. 22.1a; we create a grid of squares (the equivalent of the cubes in the 3D version of the algorithm). The first task is to determine which of the corners of these squares are inside (or on) the shape (see Fig. 22.1b). Then we insert new vertices positioned halfway between each pair of inside and outside corners connected by an edge of the grid (see Fig. 22.1c). Finally, we join the inserted vertices using lines and obtain an approximation to the original surface (Fig. 22.1d). Obviously, the finer the grid, the better the approximation to the shape.
Fig. 22.1 The 2D equivalent of marching cubes. (a) The shape we want to represent; (b) determining which corners of the grid are inside the shape; (c) new vertices inserted; (d) approximation to the surface by joining the vertices
Fig. 22.2 The basic marching cubes algorithm and the order of vertices
For the 3D case, suppose we have a set of m images (slices) containing the shape information of the object we want to represent. We create a set of logical cubes, each cube formed from eight pixels: four from slice k and four from slice k+1. The algorithm determines how the cube intersects the surface and then moves (or marches) to the next cube. At each cube, we assign a 1 to a cube vertex if the vertex is inside (or on) the surface, and a 0 if it is outside. Since there are eight vertices in each cube and two states for each one, there are $2^8 = 256$ ways the surface can intersect the cube. In [123], it is shown how these 256 cases can be reduced to only 15 for clarity of explanation; permutations of these 15 cases using complementary and rotational symmetry reproduce the 256 cases. With this number of basic cases, it is easy to create predefined polygon sets for making the appropriate surface approximation. Before we explain our proposed method for volume representation and registration based on spheres, let us introduce the basic concepts of geometric algebra and conformal geometric algebra, because we will use the representation of spheres in these algebras for such tasks.
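The 256-case enumeration is just a bitmask over the eight corner states; a minimal sketch of computing the case index (the lookup table of polygon sets is omitted):

```python
def cube_index(inside):
    """Pack the inside/outside states of a cube's 8 corners (booleans)
    into one of the 2**8 = 256 marching cubes case indices."""
    assert len(inside) == 8
    idx = 0
    for bit, flag in enumerate(inside):
        if flag:
            idx |= 1 << bit
    return idx
```

The index is then used to look up which of the 15 basic configurations (up to symmetry) applies to the cube.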
22.2 Segmentation

Before we can proceed to model the objects, we need to segment them from the images. Segmentation techniques can be categorized in three classes [64, 141]: (a) thresholding, (b) region-based, and (c) boundary-based. Due to the advantages and disadvantages of each technique, many segmentation methods are based on the integration of information obtained by two techniques: region and boundary information. Some of them embed the integration in the region detection, while others integrate the information after both processes are completed. Embedded integration can be described as integration through the definition of new parameters or decision criteria for the segmentation. Postprocessing integration is performed after both techniques (boundary- and region-based) have been used to process the image. A different approach is the use of dynamic contours (snakes). Within each category for integration of the information, we have a great variety of methods; some of them work better in some cases, some need user initialization, and some are more sensitive to noise.
Due to the fact that we are dealing with medical images, we also need another important strategy: texture segmentation. Textural properties of the image can be extracted using statistical features, spatial frequency models, etc. A texture operator describes the texture in an area of the image. So, if we apply a texture operator over the whole image, we obtain a new "texture feature image," in which the texture of a neighborhood around each pixel is described. In most cases, a single operator does not provide enough information about the texture, and a set of operators needs to be used. This results in a set of "texture feature images" that jointly describe the texture around each pixel. The main methods for texture segmentation are Laws' texture energy filters, co-occurrence matrices, random fields, frequency-domain methods, and perceptive texture features [35, 185]. Simple segmentation techniques such as region growing, split and merge, or boundary segmentation cannot be used alone to segment tomographic images, due to the complexity of the computer tomographic images of the brain. For this reason, we decided not only to combine boundary and region information (as is typically done) but to integrate information obtained from texture segmentation methods with boundary information and embed it in a region-growing strategy. A block diagram of our proposed approach is shown in Fig. 22.3. To obtain the texture information, we use Laws' texture energy masks. Laws' texture energy measures are a set of filters designed to identify specific primitive features such as spots, edges, and ripples in a local region. Laws' masks are obtained from three basic vectors, which correspond to a Gaussian and its first and second derivatives:

$$L3 = [1\;\; 2\;\; 1], \qquad E3 = [-1\;\; 0\;\; 1], \qquad S3 = [-1\;\; 2\;\; -1]. \qquad (22.1)$$

Fig. 22.3 Block diagram of our proposed approach to segment tumors in computer tomographic images.
Fig. 22.4 (a) to (d) four energy texture masks used to characterize each pixel over the whole image (these masks are convolved with the image)
Convolution of these three vectors with themselves and with one another generates five 5×1 vectors:

$$L5 = [1\; 4\; 6\; 4\; 1], \quad E5 = [-1\; -2\; 0\; 2\; 1], \quad S5 = [-1\; 0\; 2\; 0\; -1],$$
$$W5 = [-1\; 2\; 0\; -2\; 1], \quad R5 = [1\; -4\; 6\; -4\; 1], \qquad (22.2)$$

which identify certain types of features: level, edge, spot, wave, and ripple. Multiplying these five vectors by themselves and by one another produces a set of 25 unique 5×5 masks (see some examples in Fig. 22.4). Convolving these masks with an image, we obtain the so-called texture images, which contain the response of each pixel in the image to the convolved mask. Our objective is to build a characteristic vector for each pixel in the image, so for each mask result at each pixel we take the absolute value and set the corresponding position of the vector to 1 or 0, depending on whether the value is greater than zero or equal to zero, respectively. The characteristic vector has k+1 values; the first only identifies whether the pixel corresponds to the background (value set to zero) or to the patient's head (value set to one; the patient's head could be skin, bone, brain, etc.), while the other values are determined according to the set of masks used. If we name the characteristic vector $V_{xy}$ and identify each of its coordinates as $V_{xy}[i]$, the procedure can be summarized as follows:

1. For each pixel $p_{ij}$ in the CT image: if $p_{ij} \in$ background, then $V[1] = 0$; else $V[1] = 1$.
2. Convolve the image with each of the texture masks to obtain a total of k "characteristic texture images."
3. For each position $(x,y)$ in the image and for each result of the convolution of the masks, take the absolute value $absval_{xy} = \|val_{ij}\|$. If $absval_{xy} > 0$, then $V_{xy}[i] = 1$; else $V_{xy}[i] = 0$, where $i = 2, \ldots, k+1$ corresponds to the k different masks.

As a result, each structure (tissue, bone, skin, background) in the medical images has a typical vector; an example using only four masks is shown in Fig. 22.5. Obviously, the more masks used, the better the object will be characterized.
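The 25 masks of Eq. 22.2 are just the pairwise outer products of the five vectors; a short sketch:

```python
import numpy as np

# The five 5x1 Laws vectors: level, edge, spot, wave, ripple
L5 = np.array([1, 4, 6, 4, 1])
E5 = np.array([-1, -2, 0, 2, 1])
S5 = np.array([-1, 0, 2, 0, -1])
W5 = np.array([-1, 2, 0, -2, 1])
R5 = np.array([1, -4, 6, -4, 1])

# Outer products of every ordered pair give the 25 unique 5x5 masks
masks = [np.outer(a, b) for a in (L5, E5, S5, W5, R5)
                        for b in (L5, E5, S5, W5, R5)]
```

Convolving each mask with the image (e.g. with scipy.ndimage.convolve) yields the texture feature images from which the characteristic vectors are built.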
It is important to note that not all the pixels belonging to the same object (i.e., the tumor) have the desired vector because of variations in values of neighboring pixels, but a good
Fig. 22.5 Characteristic vectors for the main structures present in a computer tomographic image: background [00000], bone [10011], skin [01110], tumor [10110], brain [11110]
enough quantity of them do. So we can use the pixels having the characteristic vector of the object we want to extract as seed points in a region-growing scheme. The growing criterion (how to add neighboring pixels to the region) and the stopping criterion are as follows: compute the mean, $\mu_{seeds}$, and standard deviation, $\sigma_{seeds}$, of the pixels fixed as seeds; then, each neighboring pixel is added to the region according to

$$\text{If } \sum_{i=1}^{k+1} (V_{xy} \oplus V_{seed}) = 0 \;\text{ or }\; \Bigl[\sum_{i=1}^{k+1} (V_{xy} \oplus V_{seed}) = 1 \text{ and } I(x,y) = \mu_{seeds} \pm 2\sigma_{seeds}\Bigr], \text{ then } (x,y) \in R_t, \qquad (22.3)$$
where $V_{xy} \oplus V_{seed}$ acts like an XOR operator ($a \oplus b = 1$ if and only if $a \neq b$), $\sum_{i=1}^{k+1} (V_{xy} \oplus V_{seed})$ acts as a counter of the entries in which $V_{xy}$ and $V_{seed}$ differ (to belong to a specific region, they must differ in at most one element), and $R_t$ is the tumor's region. Example: suppose we use only four masks; let $V_{xy} = [0\,1\,0\,1\,1]$ and $V_{seed} = [0\,1\,1\,1\,1]$; then $V_{xy} \oplus V_{seed} = [0\,0\,1\,0\,0]$, therefore $\sum_{i=1}^{k+1} (V_{xy} \oplus V_{seed}) = 1$ and the point $(x,y)$ is included in the region. The region grows in all directions, but when a boundary pixel is found, the growth in that direction is stopped. Note that the boundary information helps to avoid the inclusion of pixels outside the object boundary; if we used only the texture information, wrong pixels could be added to the region, but the boundary information reduces that risk. Figure 22.6 shows an example of the process just explained: Fig. 22.6a shows the original image; Fig. 22.6b shows the seed points; Fig. 22.6c shows the final result, highlighting the segmented tumor in the original image.
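The growing test of Eq. 22.3 is easy to state in code; a sketch (the function name and argument names are ours):

```python
import numpy as np

def belongs_to_region(v_xy, v_seed, intensity, mu_seeds, sigma_seeds):
    """Eq. (22.3): a pixel joins the region if its characteristic vector
    matches the seed vector exactly, or differs in exactly one entry while
    its intensity lies within two standard deviations of the seed mean."""
    diff = int(np.sum(np.asarray(v_xy) != np.asarray(v_seed)))  # XOR count
    if diff == 0:
        return True
    return diff == 1 and abs(intensity - mu_seeds) <= 2 * sigma_seeds
```

With the example vectors above (one differing entry), the pixel is accepted as long as its intensity stays within the two-sigma band of the seeds.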
Fig. 22.6 Results for the segmentation of tumor in CT images. (a) One of the original images, (b) seed points, (c) result of segmentation
Fig. 22.7 After segmentation, 3D models are constructed: different views of the 3D model of the brain segmented from a set of 200 MR images
The overall process takes only a few seconds per image and could be used to segment any of the objects, but in our case we focus our attention on the extraction of the tumor. After that, the next step is to model the volumetric data by some method. An example is shown in Fig. 22.7; in Sect. 22.1 we reviewed a method that uses spheres as the basic entities for modeling, and in Sect. 22.3 we present our proposed approach.
22.3 Marching Spheres

Once we have segmented the object we are interested in, the next step is to model it in 3D space. In our approach to object modeling, we use spheres as the basic entities, and we follow the ideas of the marching cubes algorithm. First, let us look at the 2D case. We want to approximate the shape shown in Fig. 22.1a. The process is:

– Make a grid over the image and determine the points inside (or on) and outside the surface (as in Fig. 22.1a,b).
– Draw circles centered at each inside corner that is connected to a corner outside the surface.
Fig. 22.8 The 2D equivalent of the modified marching cubes obtaining a representation based on circles
– Finally, draw circles centered at vertices that are connected with two vertices inside the surface.
– As a result, we obtain an approximation to the original surface (Fig. 22.8). Obviously, the finer the grid, the better the approximation.

For the 3D case, given a set of m slices:

– Divide the space into logical cubes (each cube contains eight vertices, four from slice k and four from slice k+1).
– Determine which vertices of each cube are inside (or on) and outside the surface.
– Define the number of spheres of each cube according to Fig. 22.9, taking the indices of the cube's corners as in Fig. 22.2. The magnitude of the radius is $r_{p_i} = d/2$ for the smallest spheres $S^j_{p_i}$, $r_{m_i} = \sqrt{2}\,d/2$ for the medium-size spheres $S^j_{m_i}$, and $r_{g_i} = d$ for the biggest spheres $S^j_{g_i}$.

Note that we use the same 15 basic cases as the marching cubes algorithm, because the total of 256 cases can be obtained from this basis. Also note that instead of triangles we define spheres, and that our goal is not to have a good rendering algorithm, but a representation of the volumetric data based on spheres, which, as we said before, can be very useful in the process of object registration.
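The first 2D placement rule can be sketched in a few lines (a simplified version: circles of radius d/2 at inside vertices with an outside 4-neighbour; the second rule and the three 3D radii are omitted):

```python
import numpy as np

def marching_circles(inside, d=1.0):
    """2D analogue of marching spheres: place a circle of radius d/2 at
    every grid vertex that is inside the shape and has at least one
    4-neighbour outside it."""
    inside = np.asarray(inside, dtype=bool)
    rows, cols = inside.shape
    circles = []
    for i in range(rows):
        for j in range(cols):
            if not inside[i, j]:
                continue
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ni, nj = i + di, j + dj
                if 0 <= ni < rows and 0 <= nj < cols and not inside[ni, nj]:
                    circles.append((j * d, i * d, d / 2))
                    break
    return circles
```

Only boundary vertices produce primitives, which is what keeps the sphere count low compared with tetrahedrization-based approaches.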
22.3.1 Experimental Results for Modeling

To test the algorithm, we used several image sets: 26 images of a human skull containing balloons to simulate the brain, 36 images of a real patient, 16 images of a real tumor in a patient, and other CT images. The first step is to segment the brain (tumor) from the rest of the structures. Once the brain (tumor) is properly segmented, the images are binarized to make the process of determining whether each pixel
Fig. 22.9 The 15 basic cases of surface-intersecting cubes, numbered from 1 to 15 starting at the upper left corner. The spheres are defined as $S^j_{p_i} = c_{p_i} + \frac{1}{2}(c_{p_i}^2 - r_{p_i}^2)e + e_0$, $S^j_{m_i} = c_{m_i} + \frac{1}{2}(c_{m_i}^2 - r_{m_i}^2)e + e_0$, and $S^j_{g_i} = c_{g_i} + \frac{1}{2}(c_{g_i}^2 - r_{g_i}^2)e + e_0$
Fig. 22.10 Simulated brain: (a) original of one CT slide, (b) segmented object and its approximation by circles according to the steps described in Sect. 22.3, (c) zoom of (b)
is inside or outside the surface easier. The Canny edge detector is used to obtain the boundary points of each slice; in this way, we can compare the number of such points (which we call n) with the number of spheres obtained, which also gives a comparison with the number of primitives obtained by the union-of-spheres algorithm. Figure 22.10 shows one CT image of the skull with balloons, the segmented object, and the approximation of the surface by circles using the 2D version of our approach. Figures 22.11 and 22.12 show the CT of a real patient and a segmented tumor. Figure 22.13a shows the results for the 3D case, modeling the brain of a human head model (balloons) extracted from CT images, while Fig. 22.13b shows the 3D model of the tumor from the real patient (extracted from 16 CT images). Table 22.1 compares the results of the union of spheres and our approach for the case of the brain being modeled. The first row shows the worst
Fig. 22.11 Real patient: (a) original of one CT slide, (b) zoom of the segmented brain, (c) its approximation by circles
Fig. 22.12 Real patient: (a) original of one CT slide, (b) zoom of the segmented tumor, (c) tumor approximation by circles
Fig. 22.13 Approximation of shape of three-dimensional objects with marching spheres: (a) approximation of the brain structure extracted from CT images (synthetic data), (b) approximation of the tumor extracted from real patient data
Table 22.1 Comparison between the number of spheres using an approach based on Delaunay tetrahedrization (DT) and our approach based on the marching cubes algorithm (marching spheres); n is the number of boundary points, and d is the distance between vertices in the logical cubes

n/d       No. of spheres, DT approach   No. of spheres, marching spheres
3370/1    13480                         11866
3370/3    8642                          2602
10329/2   25072                         8362
2641/1    6412                          5008
case with both approaches; the second row shows the number of spheres with improvements in both algorithms (the reduction of spheres in the DT approach is done by grouping spheres into a single one that contains the others, while in our approach the reduction is done using a displacement of d = 3). The number of boundary points was n = 3370 in both cases. The reduction in the number of primitives obtained with our approach, while keeping the representation clear enough, is evident.
22.4 Registration of Two Models

In [40], a comparison between a version of the popular iterated closest point (ICP) algorithm [197] and the thin-plate spline robust point matching (TPS-RPM) algorithm for nonrigid registration is presented. TPS-RPM performed better because it can avoid getting trapped in local minima, and it aligns the models even in the presence of a large number of outliers. However, the authors used only sets of 2D and 3D points. Now we have spheres modeling the object, and these spheres have not only different centers but also different radii. This fact prompted us to look at the representation of spheres in the conformal geometric algebra (CGA), which is a geometric algebra of five dimensions. In the next section, we explain how to register two models based on spheres using their representation in CGA.
22.4.1 Sphere Matching

The registration problem appears frequently in computer vision and medical image processing. Suppose we have two point sets, one of which results from a transformation of the other, but we know neither the transformation nor the correspondences between the points. In such a situation, we need an algorithm that finds these two unknowns as well as possible. If, in addition, the transformation is nonrigid, the complexity increases enormously. In the variety of registration
algorithms existing today, we find some that assume knowledge of one of these unknowns and solve for the other; but there are two that solve for both: iterated closest point (ICP) and thin-plate spline robust point matching (TPS-RPM). It has been shown [16] that TPS-RPM gives better results than ICP; therefore, we adapt this algorithm for the alignment of models based on spheres. It is important to note that previous works deal only with 2D and 3D points, whereas now we have spheres in the 5D space of the CGA. The notation is as follows: $V = \{S_j^I\}$, $j = 1, 2, \ldots, k$, is the set of spheres of the first model, or model at time $t_1$; $X = \{S_i^F\}$, $i = 1, 2, \ldots, n$, is the set of spheres of the second model, or model at time $t_2$ (expected model); $U = \{S_i^E\}$, $i = 1, 2, \ldots, n$, is the set of spheres resulting after the algorithm finishes (estimated set). The superindex denotes whether the sphere S belongs to the initial set (I), the expected set (F), or the estimated set (E). To solve the correspondence problem means finding the correspondence matrix Z (Eq. 22.4), which indicates which sphere of the set V is transformed into which sphere of the set X by some transformation f. Thus, if we correctly estimate the transformation f and apply it to V ($U = f(V)$), then we should obtain the set X. To deal with outliers, the matrix Z is extended as in (22.4); the inner part of Z (of size $k \times n$) defines the correspondences:

$$Z = \begin{array}{c|cccccc|c}
z_{ji} & S_1^F & S_2^F & S_3^F & S_4^F & \cdots & S_n^F & \text{outlier}^F_{n+1}\\
\hline
S_1^I & 1 & 0 & 0 & 0 & \cdots & 0 & 0\\
S_2^I & 0 & 1 & 0 & 0 & \cdots & 0 & \vdots\\
S_3^I & 0 & 0 & 1 & 0 & \cdots & 0 & \vdots\\
\vdots & \vdots & \vdots & \vdots & \vdots & \cdots & \vdots & \vdots\\
S_k^I & 0 & 0 & 0 & 0 & \cdots & 1 & 0\\
\text{outlier}^I_{k+1} & 0 & 0 & 0 & 1 & \cdots & 0 &
\end{array} \qquad (22.4)$$
However, dealing with outliers through binary correspondence matrices can be very cumbersome [40]. For this reason, TPS-RPM uses two techniques to solve the correspondence problem:

Soft assign: The basic idea is to allow the matrix of correspondences M to take continuous values in the interval [0,1]; this "fuzzy" correspondence improves gradually during the optimization, without jumps in the space of permutations of binary matrices.

Deterministic annealing: A technique used to control the fuzzy correspondences by means of an entropy term in the cost function (called the energy function), introducing a temperature parameter T that is reduced at each stage of the optimization, beginning at a value T_0 and decreasing by a factor r until some final temperature is reached.

Using these two techniques, the problem is to minimize the function

    E(M, f) = \sum_{i=1}^{n} \sum_{j=1}^{k} m_{ji} \left| S_i^F - f(S_j^I) \right|^2 + \lambda |Lf|^2 + T \sum_{i=1}^{n} \sum_{j=1}^{k} m_{ji} \log m_{ji} - \zeta \sum_{i=1}^{n} \sum_{j=1}^{k} m_{ji},    (22.5)
where the m_{ji} satisfy \sum_{i=1}^{n+1} m_{ji} = 1 for j \in \{1, 2, \ldots, k+1\}, \sum_{j=1}^{k+1} m_{ji} = 1 for i \in \{1, 2, \ldots, n+1\}, and m_{ji} \in [0,1]. The parameter \lambda is reduced in an annealing scheme, \lambda_i = \lambda_{init} T. The basic idea of this heuristic is that more global and rigid transformations should be favored first, with large values of \lambda. We follow the next two steps.

Update the correspondences: for the spheres S_j^I, j = 1, 2, \ldots, k, and S_i^F, i = 1, 2, \ldots, n, modify m_{ji}:

    m_{ji} = \frac{1}{T} e^{-|S_i^F - f(S_j^I)|^2 / T};    (22.6)

for the outliers j = k+1 and i = 1, 2, \ldots, n,

    m_{k+1,i} = \frac{1}{T_0} e^{-|S_i^F - f(S_{k+1}^I)|^2 / T_0};    (22.7)

and for the outliers j = 1, 2, \ldots, k and i = n+1,

    m_{j,n+1} = \frac{1}{T_0} e^{-|S_{n+1}^F - f(S_j^I)|^2 / T_0}.    (22.8)

Update the transformation: to update the centers, we use

    E_{tps}(D, W) = \|Y_c - V_c D - \Phi_c W\|^2 + \lambda_1 \, \mathrm{tr}(W^T \Phi_c W) + \lambda_2 \, \mathrm{tr}([D - I]^T [D - I]),    (22.9)

where V_c is the matrix given by V_c = \{(S_j^I \wedge E) \cdot E\}, j = 1, 2, \ldots, k; Y_c is interpreted as the new estimated position of the spheres; \Phi_c is constructed from \phi_{ab} = (S_b^I \wedge E) \cdot E - (S_a^I \wedge E) \cdot E, so that \Phi_c contains information about the structure of the sets; and D, W represent the rigid and nonrigid deformations, respectively, affecting the centers of the spheres. They are obtained by solving

    f((S_a^I \wedge E) \cdot E \mid D, W) = ((S_a^I \wedge E) \cdot E) D + \phi((S_a^I \wedge E) \cdot E) W.    (22.10)

The radii are updated using a dilator, which is defined as

    D_\rho = e^{\frac{\log(\rho) \wedge E}{2}} = e^{\frac{\log(\rho_{j\max}/\rho_{i\max}) \wedge E}{2}},    (22.11)

where \rho_{j\max} and \rho_{i\max} are the radii of the best-matching pair of spheres according to the matrix M, that is, the index i of the initial set and j of the expected set where the ith row of M attains its maximum value \max(m_{ij}). Dilators are applied as

    S_j^{I'} = D_\rho \, S_j^I \, \widetilde{D}_\rho.    (22.12)
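The dual update above is easy to prototype. The following sketch is ours, not the chapter's implementation: Euclidean point centers stand in for the conformal spheres, a pure translation stands in for the thin-plate spline f, and the outlier "spheres" S_{k+1}^I and S_{n+1}^F are approximated by set centroids. The alternating row/column normalization enforces the sum constraints on M stated after (22.5).

```python
import numpy as np

def soft_assign(U, X, T, T0):
    """Correspondence update in the spirit of (22.6)-(22.8): Gaussian
    weights (1/T) exp(-d^2/T), an extra outlier row and column, then
    alternating row/column normalization (softassign)."""
    k, n = len(U), len(X)
    M = np.empty((k + 1, n + 1))
    d2 = ((U[:, None, :] - X[None, :, :]) ** 2).sum(-1)  # |S_i^F - f(S_j^I)|^2
    M[:k, :n] = np.exp(-d2 / T) / T
    # outlier entries: distances to the opposite set's centroid stand in
    # for the outlier spheres of the text (an assumption of this sketch)
    M[k, :n] = np.exp(-((X - U.mean(0)) ** 2).sum(-1) / T0) / T0
    M[:k, n] = np.exp(-((U - X.mean(0)) ** 2).sum(-1) / T0) / T0
    M[k, n] = 1e-12
    for _ in range(30):  # approach the row/column sum constraints of (22.5)
        M[:k] /= M[:k].sum(1, keepdims=True)
        M[:, :n] /= M[:, :n].sum(0, keepdims=True)
    return M[:k, :n]

def register_translation(V, X, T_init=0.5, r=0.9, T_final=0.01):
    """Deterministic-annealing loop; a weighted translation update
    stands in for the TPS transformation update of (22.9)."""
    t = np.zeros(V.shape[1])
    T = T_init
    while T > T_final:
        M = soft_assign(V + t, X, T, T_init)
        w = M.sum(1, keepdims=True)
        Y = M @ X / w                               # soft target, one per source
        t += ((Y - (V + t)) * w).sum(0) / w.sum()   # weighted translation fit
        T *= r                                      # cooling: T <- r T
    return t, M

rng = np.random.default_rng(0)
V = rng.normal(size=(8, 3))
true_t = np.array([0.3, -0.2, 0.5])
X = V + true_t          # expected set: a translated copy of the initial set
t, M = register_translation(V, X)
```

On this synthetic data, t recovers the translation and the row-wise maxima of M give the correct correspondences; with the CGA machinery of the chapter, the same loop would carry spheres and a TPS deformation instead.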
22 Modeling and Registration of Medical Data
The annealing process helps to control the dual update process. The parameter T is initialized at T_{init} and gradually reduced as T_{i+1} = r T_i, where r is the annealing rate, until some final temperature is reached. The parameters \lambda_1 and \lambda_2 follow an annealing scheme, as mentioned earlier.
22.4.2 Experimental Results for Registration
Figures 22.14 and 22.15 show two examples of registering sets of spheres. Figures 22.14a and 22.15a show the initial set, or representation at time t_1; Figs. 22.14b and 22.15b show the deformed or expected set, or representation at time t_2. These two representations should be registered. Figure 22.14c shows the result of the registration process. Note that researchers usually use TPS-RPM with 2D or 3D vectors because the method cannot go beyond those dimensions; in contrast, using conformal geometric algebra we have a homogeneous representation that preserves isometries and uses the sphere as the basic entity. Note that the algorithm adjusted the radii as expected (this is not possible using only 3D vectors). We carried out further experiments; Table 22.2 shows the average errors between the centers of corresponding spheres, measured between the expected and resulting sets.
Fig. 22.14 Registration of models based on spheres: (a) initial set (or representation at time t_1), (b) expected set (or representation at time t_2), (c) result of the deformation of the initial model to match the second one
22.4 Registration of Two Models
Fig. 22.15 Registration of models based on spheres for a tumor: (a) initial shape of the tumor (representation at time t_1), (b) expected shape of the tumor (representation at time t_2), (c) result after registration (transformed initial model)

Table 22.2 Average errors measured as the distance (in voxels) between the centers of corresponding spheres, calculated with the expected and resulting sets according to (22.13)

    Spheres in initial set   Spheres in expected set   Average error (22.13)
    60                       60                        12.86
    98                       98                        0.72
    500                      500                       2.80
    159                      143                       18.04
    309                      280                       15.63
    340                      340                       4.65
This error is measured as

    P = \frac{\sqrt{\sum_{i,j} c(i,j) \, (S_i^F - S_j^E)^2}}{N},    (22.13)

where N is the number of pairs of corresponding spheres, and c(i,j) = 1 if S_i^F corresponds to S_j^E, or c(i,j) = 0 if they do not correspond.
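Equation (22.13) is a one-liner once hard correspondences have been extracted from M. In the sketch below (ours, with spheres reduced to their center coordinates), c is the binary correspondence matrix:

```python
import numpy as np

def registration_error(S_F, S_E, c):
    """Average error of (22.13): square root of the summed squared
    differences over corresponding pairs, divided by the number N of
    pairs; c[i, j] = 1 iff sphere i of the expected set corresponds
    to sphere j of the estimated set."""
    N = int(c.sum())
    diff2 = ((S_F[:, None, :] - S_E[None, :, :]) ** 2).sum(-1)
    return np.sqrt((c * diff2).sum()) / N

S_F = np.array([[0.0, 0.0], [1.0, 0.0]])   # expected centers
S_E = np.array([[0.0, 0.0], [1.0, 0.0]])   # estimated centers
c = np.eye(2)                              # identity correspondences
err = registration_error(S_F, S_E, c)      # perfect match -> 0.0
```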
22.5 Conclusions
This chapter first presented an approach for medical image segmentation that combines texture and boundary information and embeds it in a region-growing scheme, with the advantage of integrating all the information in a single, simple process. We also showed how to obtain a representation of volumetric data using spheres. Our approach, called marching spheres, is based on the ideas of the marching cubes algorithm, but it is not intended for rendering or real-time display; rather, it reduces the number of primitives modeling the volumetric data. This reduction improves the registration process in the sense that fewer primitives are used, which reduces the registration errors. The chapter also showed how to represent these primitives as spheres in the conformal geometric algebra, that is, as five-dimensional vectors that can be used naturally with the principles of TPS-RPM. The experimental results are promising.
Part VII
Appendix
Chapter 23
Clifford Algebras and Related Algebras
23.1 Clifford Algebras
Clifford algebras were created and classified by William K. Clifford (1878-1882) [41-43], when he presented a new multiplication rule for vectors in Grassmann's exterior algebra \wedge R [74]. In the special case of the Clifford algebra Cl_3 for R^3, the mathematical system embodied Hamilton's quaternions [78]. In Chap. 1, Sect. 1.2, we explain a further geometric interpretation of the Clifford algebras, developed by David Hestenes and called geometric algebra. Throughout this book we have utilized geometric algebra.
23.1.1 Basic Properties
A Clifford algebra is a unital associative algebra that contains, and is generated by, a vector space V equipped with a quadratic form Q. The Clifford algebra Cl(V, Q) is generated by V subject to the condition

    v^2 = Q(v) \quad \text{for all } v \in V.    (23.1)

This is the fundamental Clifford identity. If the characteristic of the ground field F is not 2, this condition can be rewritten in the form

    uv + vu = 2 \langle u, v \rangle \quad \text{for all } u, v \in V,    (23.2)

where \langle u, v \rangle = \frac{1}{2}(Q(u+v) - Q(u) - Q(v)) is the symmetric bilinear form associated with Q. Regarding the quadratic form Q, one notices that Clifford algebras are closely related to exterior algebras: if Q = 0, then the Clifford algebra Cl(V, Q) is just the exterior algebra \wedge(V). Since the Clifford product includes the extra information of the quadratic form Q, the Clifford product is fundamentally richer than the exterior product.
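The identity (23.1), and the anticommutation it forces on orthogonal vectors, can be verified computationally. The following minimal implementation is ours (the bitmask blade representation is a standard programming device, not the book's notation): it multiplies basis blades of Cl(p, q) and checks v^2 = Q(v).

```python
from itertools import product

def blade_product(a, b, metric):
    """Product of two basis blades given as bit masks over the
    generators e_1..e_n; metric[i] is the square of e_{i+1} (+1 or -1).
    Returns (result_mask, sign)."""
    sign, t = 1, a >> 1
    while t:                        # transpositions to bring b's factors past a's
        sign *= (-1) ** bin(t & b).count("1")
        t >>= 1
    for i, m in enumerate(metric):  # contract repeated generators: e_i e_i = metric[i]
        if (a & b) >> i & 1:
            sign *= m
    return a ^ b, sign

def gp(x, y, metric):
    """Geometric product of multivectors stored as {mask: coeff} dicts."""
    out = {}
    for (a, ca), (b, cb) in product(x.items(), y.items()):
        mask, s = blade_product(a, b, metric)
        out[mask] = out.get(mask, 0) + s * ca * cb
    return {m: c for m, c in out.items() if c != 0}

metric = [1, 1, 1]                          # Cl(3,0)
v = {0b001: 2, 0b010: 3}                    # v = 2 e1 + 3 e2
v2 = gp(v, v, metric)                       # v^2 = Q(v) = 4 + 9 -> {0: 13}
e12 = gp({0b001: 1}, {0b010: 1}, metric)
e21 = gp({0b010: 1}, {0b001: 1}, metric)    # e2 e1 = -e1 e2
```

For v = 2e_1 + 3e_2 the cross terms cancel and v^2 = 4 + 9 = 13 = Q(v), while the generators anticommute as expected.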
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9_23, © Springer-Verlag London Limited 2010
23.1.2 Definitions and Existence
A pair (A, \alpha) consisting of an R-algebra A and a homomorphism of R-modules \alpha: V \to A is compatible with V if

    \alpha(v)^2 = Q(v) 1_A \quad \text{for all } v \in V.    (23.3)

Applied to v + u, this equation implies that

    \alpha(v)\alpha(u) + \alpha(u)\alpha(v) = 2 \langle u, v \rangle \quad \text{for all } u, v \in V.    (23.4)

In this regard, it can be said that a Clifford algebra Cl(V, Q) is universal with respect to such pairs (A, \alpha). Thus, a Clifford algebra of V is a pair Cl(V, Q) = (Cl(V, Q), \iota) such that
(i) Cl(V, Q) is an R-algebra;
(ii) \iota: V \to Cl(V, Q) is an R-module map that satisfies \iota(v)^2 = Q(v) 1 and \iota(u)\iota(v) + \iota(v)\iota(u) = 2 \langle u, v \rangle for all u, v \in V;
(iii) if (A, \alpha) is any pair compatible with V, then there is a unique R-algebra homomorphism \varphi: Cl(V, Q) \to A such that the diagram of Fig. 23.1 commutes.
A Clifford algebra as described above always exists, and it can be constructed as follows: begin with the most general algebra that contains V, namely the tensor algebra T(V), and then enforce the fundamental identity by taking a suitable quotient. One takes the two-sided ideal I_Q in T(V) generated by all elements of the form

    v \otimes v - Q(v) 1 \quad \text{for all } v \in V,    (23.5)

and defines Cl(V, Q) as the quotient

    Cl(V, Q) = T(V)/I_Q.    (23.6)

Fig. 23.1 Diagram for the definition of existence
23.1.3 Real and Complex Clifford Algebras
The most important Clifford algebras are those equipped with a nondegenerate quadratic form over the real and complex vector spaces. Every nondegenerate quadratic form on a finite real vector space of dimension n is equivalent to the standard diagonal form

    Q(v) = v_1^2 + \cdots + v_p^2 - v_{p+1}^2 - \cdots - v_{p+q}^2,    (23.7)

where p + q = n is the dimension of the space V. The pair of integers (p, q) is known as the signature of the quadratic form Q(v). The real vector space associated with this quadratic form is often denoted by R^{p,q}, and the Clifford algebra on R^{p,q} is denoted by Cl_{p,q}(R). The notation Cl_n(R) means either Cl_{n,0}(R) or Cl_{0,n}(R), depending on the author's preference for positive definite or negative definite spaces. The standard orthonormal basis \{e_i\} for R^{p,q} consists of n = p + q mutually orthogonal unit vectors, of which p square to +1 and q square to -1. Thus, the Clifford algebra Cl_{p,q}(R) has p basis vectors that square to +1 and q that square to -1. The orthonormal basis fulfills

    \langle e_i, e_j \rangle = 0, \quad i \neq j.    (23.8)

The fundamental Clifford identity of Eq. 23.1 implies that for an orthonormal basis

    e_i e_j = -e_j e_i, \quad i \neq j.    (23.9)

In terms of these unit vectors, the so-called unit pseudoscalar in Cl_{p,q}(R) is defined as

    I_n = e_1 e_2 \cdots e_n.    (23.10)

Its square is given by

    I_n^2 = (-1)^{n(n-1)/2} (-1)^q = (-1)^{(p-q)(p-q-1)/2} = \begin{cases} +1, & p - q \equiv 0, 1 \pmod 4, \\ -1, & p - q \equiv 2, 3 \pmod 4. \end{cases}    (23.11)

Next we derive the basis of a Clifford algebra in terms of the basis e_1, e_2, \ldots, e_n of the underlying vector space R^{p,q}. Using the canonical order of the power set P(\{1, \ldots, n\}), we first define the index set

    \mathcal{B} := \{ \{b_1, \ldots, b_k\} \in P(\{1, \ldots, n\}) \mid 1 \le b_1 < \cdots < b_k \le n \}.    (23.12)

Then, by defining for all B \in \mathcal{B}

    e_B = e_{b_1} e_{b_2} \cdots e_{b_k},    (23.13)
one obtains the expected basis \{e_B \mid B \in \mathcal{B}\} of the entire Clifford algebra Cl_{p,q}(R). Here the empty product (for k = 0) is defined as the multiplicative identity element. Note that each element x \in Cl_{p,q}(R) can then be written as

    x = \sum_{B \in \mathcal{B}} x_B e_B.    (23.14)

For every k \in \{0, \ldots, n\}, the set \{e_B \mid B \in \mathcal{B}, |B| = k\} spans a linear subspace of Cl_{p,q}(R). Each e_B is called a blade of grade k. An element of this linear subspace is called a k-vector, which in turn is a linear combination of blades e_B of grade k. For k running from 0 to 3, a k-vector is usually called a scalar, vector, bivector, or trivector, respectively. All even k-vectors form the even part Cl(V, Q)^+ of the Clifford algebra Cl(V, Q). Note that Cl(V, Q)^+ is a subalgebra, whereas the part Cl(V, Q)^- formed by all odd k-vectors is not. For a given k, there are \binom{n}{k} basis elements, so that the total dimension of Cl_{p,q}(R) is

    \dim Cl(V, Q) = \sum_{k=0}^{n} \binom{n}{k} = 2^n.    (23.15)
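Both the pseudoscalar sign (23.11) and the dimension count (23.15) are purely combinatorial facts and can be checked mechanically (a small script of ours):

```python
from math import comb

def pseudoscalar_square(p, q):
    """Sign of I_n^2 in Cl_{p,q}(R): reversing e_1...e_n past itself
    costs n(n-1)/2 transpositions, and q generators square to -1."""
    n = p + q
    return (-1) ** (n * (n - 1) // 2) * (-1) ** q

for p in range(6):
    for q in range(6):
        n = p + q
        # (23.15): the 2^n basis blades split into binomial-sized grades
        assert sum(comb(n, k) for k in range(n + 1)) == 2 ** n
        # (23.11): the sign of I_n^2 depends only on p - q
        assert pseudoscalar_square(p, q) == (-1) ** ((p - q) * (p - q - 1) // 2)
```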
The center of an algebra consists of those elements that commute with all the elements of the algebra. As an example, the center Cen(Cl_3) = R \oplus \wedge^3 R^3 (scalars and pseudoscalars) of Cl_3 is isomorphic to C. The algebra Cl_{0,0}(R) is isomorphic to R; it has only scalars and no basis vectors. Cl_{0,1}(R) is a two-dimensional algebra generated by a single basis vector that squares to -1 and is therefore isomorphic to C (the field of complex numbers). The four-dimensional algebra Cl_{0,2}(R) is spanned by \{1, e_1, e_2, I_2 = e_1 e_2\}; the latter three basis elements square to -1 and all anticommute. This algebra is isomorphic to Hamilton's quaternion algebra H. The eight-dimensional algebra Cl_{0,3}(R), spanned by \{1, e_1, e_2, e_3, e_2 e_3, e_3 e_1, e_1 e_2, I_3 = e_1 e_2 e_3\}, is isomorphic to the Clifford biquaternions (a direct sum of two copies of H). The Clifford algebras on complex spaces, denoted Cl_n(C), are also of great interest. Every nondegenerate quadratic form on an n-dimensional complex space is equivalent to the standard diagonal form

    Q(z) = z_1^2 + z_2^2 + \cdots + z_n^2.    (23.16)

The algebra Cl_n(C) may be obtained by complexification of the algebra Cl_{p,q}(R) as follows:

    Cl_n(C) \cong Cl_{p,q}(R) \otimes C \cong Cl(C^{p+q}, Q \otimes C),    (23.17)

where Q is the real quadratic form of signature (p, q). Note in this equation that the complexification does not depend on the signature. One easily computes Cl_0(C) = C, Cl_1(C) = C \oplus C, and Cl_2(C) = M_2(C). Each algebra Cl_{p,q}(R) or Cl_n(C) is isomorphic to a matrix algebra over R, C, or H, or to a direct sum of two such algebras.
23.1.4 Involutions
Given the linear map v \mapsto -v for all v \in V, which preserves the quadratic form Q(v), and taking into account the universal property of Clifford algebras, this mapping can be extended to an algebra automorphism

    \alpha: Cl(V, Q) \to Cl(V, Q).    (23.18)

Since \alpha squares to the identity, it is called an involution. There are also two other important anti-automorphisms of Clifford algebras. To elucidate these involutions, consider the anti-automorphism of the tensor algebra that reverses the order in all products:

    v_1 \otimes v_2 \otimes \cdots \otimes v_k \mapsto v_k \otimes \cdots \otimes v_2 \otimes v_1.    (23.19)

Since the ideal I_Q of Eq. 23.6 is invariant under this reversion, the operation descends to an anti-automorphism of Cl(V, Q), called the transpose or reversal operation and denoted by x^t. The transpose of a product is (xy)^t = y^t x^t. The second anti-automorphism is the composite of \alpha and the transpose:

    \hat{x} = \alpha(x^t) = \alpha(x)^t.    (23.20)

These three operations depend only on the degree modulo 4; that is, if x is a k-blade, then

    \alpha(x) = \pm x, \quad x^t = \pm x, \quad \hat{x} = \pm x,    (23.21)

where the signs are given in Table 23.1.
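The three sign patterns of (23.21) reduce to simple parities in the grade k (cf. Table 23.1); a one-line check of ours:

```python
# signs picked up by a k-blade under the three (anti-)automorphisms
alpha = lambda k: (-1) ** k                   # main involution alpha(x)
rev   = lambda k: (-1) ** (k * (k - 1) // 2)  # reversion x^t
conj  = lambda k: (-1) ** (k * (k + 1) // 2)  # conjugation, alpha(x^t)

table = [(k, alpha(k), rev(k), conj(k)) for k in range(4)]
# reproduces Table 23.1:
# [(0, 1, 1, 1), (1, -1, 1, -1), (2, 1, -1, -1), (3, -1, -1, 1)]
```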
Table 23.1 Signs in involutions depending on k mod 4

    k mod 4:    0    1    2    3    general k
    \alpha(x)   +    -    +    -    (-1)^k x
    x^t         +    +    -    -    (-1)^{k(k-1)/2} x
    \hat{x}     +    -    -    +    (-1)^{k(k+1)/2} x

23.1.5 Structure and Classification of Clifford Algebras
The structure of Clifford algebras can be worked out using a simple methodology. For this purpose, let us consider the space V with a quadratic form Q, and another vector space U of even dimension with a nonsingular bilinear form of discriminant d. The Clifford algebra of V + U is isomorphic to the tensor product of the Clifford algebras of (-1)^{\dim(U)/2} d \, V and U; that is, the space V with its quadratic form is multiplied by the factor (-1)^{\dim(U)/2} d. Over the reals, one can write the formulas

    Cl_{p+2,q}(R) = M_2(R) \otimes Cl_{q,p}(R),
    Cl_{p+1,q+1}(R) = M_2(R) \otimes Cl_{p,q}(R),
    Cl_{p,q+2}(R) = H \otimes Cl_{q,p}(R),    (23.22)
which in turn can be used to find the structure of all real Clifford algebras. Here M_2(R) stands for the algebra of 2 \times 2 matrices over R. In the theory of nondegenerate quadratic forms on real and complex vector spaces, the finite-dimensional Clifford algebras have been completely classified. In each case, the Clifford algebra is isomorphic to a matrix algebra over R, C, or H, or to a direct sum of two such algebras, though not in a canonical way. In the following, K(n) denotes the algebra of n \times n matrices with entries in the division algebra K, and the direct sum of two such algebras is denoted by K^2(n) = K(n) \oplus K(n). The Clifford algebra on C^n with the quadratic form given by Eq. 23.16 is denoted by Cl_n(C). When n is even, the algebra Cl_n(C) is central simple and, according to the Artin-Wedderburn theorem, is isomorphic to a matrix algebra over C. When n is odd, the center includes not only the scalars but the pseudoscalar I_n as well. Depending on p and q, one can always find in Cl_{p,q}(R) a pseudoscalar that squares to one (I_n^2 = 1). Using the pseudoscalar I_n, let us define the operators

    P_\pm = \frac{1}{2}(1 \pm I_n).    (23.23)

These two operators form a complete set of orthogonal idempotents. Since they are central, they allow a decomposition of Cl_n(C) into a direct sum of two algebras,

    Cl_n(C) = Cl_n^+(C) \oplus Cl_n^-(C),    (23.24)

where Cl_n^\pm(C) = P_\pm Cl_n(C). This decomposition is called a grading of Cl_n(C). The algebras Cl_n^\pm(C) are just the positive and negative eigenspaces of I_n, and the operators P_\pm are just the projection operators. Since I_n is odd, these algebras are mapped onto each other by the automorphism \alpha:

    \alpha(Cl_n^\pm(C)) = Cl_n^\mp(C).    (23.25)
These two isomorphic algebras are each central simple and thus isomorphic to a matrix algebra over C. The sizes of the matrices can be determined from the dimension 2^n of Cl_n(C). As a consequence of these considerations, one can classify the complex Clifford algebras straightforwardly, as shown in Table 23.2. For the real case, the classification is a bit more difficult, as the periodicity is 8 rather than 2. According to the Artin-Wedderburn theorem, if n (or p - q) is even, the Clifford algebra Cl_{p,q}(R) is central simple and isomorphic to a matrix algebra
Table 23.2 Classification of complex Clifford algebras

    n        Cl_n(C)
    2m       C(2^m)
    2m+1     C(2^m) \oplus C(2^m)

Table 23.3 Complete classification of real Clifford algebras

    Cl_{p,q}(R), p + q = 2m:
    p-q mod 8    I_n^2    Cl_{p,q}(R)
    0            +        R(2^m)
    2            -        R(2^m)
    4            +        H(2^{m-1})
    6            -        H(2^{m-1})

    Cl_{p,q}(R), p + q = 2m + 1:
    p-q mod 8    I_n^2    Cl_{p,q}(R)
    1            +        R(2^m) \oplus R(2^m)
    3            -        C(2^m)
    5            +        H(2^{m-1}) \oplus H(2^{m-1})
    7            -        C(2^m)
over R or H; but if n (or p - q) is odd, the Clifford algebra is no longer central simple, but rather has a center that includes the pseudoscalar as well as the scalars. If n is odd and the pseudoscalar satisfies I_n^2 = +1, then, as in the complex case, the Clifford algebra can be decomposed into a direct sum of isomorphic algebras as follows:

    Cl_{p,q}(R) = Cl_{p,q}^+(R) \oplus Cl_{p,q}^-(R),    (23.26)

each of which is central simple and thus isomorphic to a matrix algebra over R or H. If n is odd and I_n^2 = -1, then the center of Cl_{p,q}(R) is isomorphic to C; it can then be considered a complex algebra, which is central simple and so isomorphic to a matrix algebra over C. Summarizing, there are in fact three properties that are necessary to determine the class of the algebra Cl_{p,q}(R), namely:
- n is even/odd;
- I^2 = \pm 1;
- the Brauer class of the algebra (n even) or of the even subalgebra (n odd) is R or H.
Each of these properties depends only on the signature p - q modulo 8. The size of the matrices is determined by the fact that Cl_{p,q}(R) has dimension 2^{p+q}. The complete classification is given in Table 23.3. Table 23.4 shows the results of this classification for p + q \le 5; in this table, p + q runs vertically and p - q runs horizontally. For example, the Clifford algebra Cl_{3,1}(R) \cong R(4) is found in row p + q = 4 and column p - q = 2. Note the symmetry about the columns 1, 5, and -3.
Table 23.4 Classification of real Clifford algebras for p + q \le 5

    p-q:    -5      -4      -3       -2     -1      0       1        2       3       4       5
    p+q=0                                           R
    p+q=1                                   C               R^2
    p+q=2                            H              R(2)            R(2)
    p+q=3                   H^2             C(2)            R^2(2)          C(2)
    p+q=4           H(2)             H(2)           R(4)            R(4)            H(2)
    p+q=5   C(4)            H^2(2)          C(4)            R^2(4)          C(4)            H^2(2)
    I^2:    -       +       +        -      -       +       +        -       -       +       +

23.1.6 Clifford Groups, Pin and Spin Groups, and Spinors
In this section, we assume that the space V is finite-dimensional and that the bilinear form of Q is nonsingular. The Clifford group \Gamma(V) (or Lipschitz group) is defined to be the set of invertible elements g of the Clifford algebra Cl_{p,q}(V) such that the map

    v \mapsto g v \hat{g}^{-1} \in V \quad \text{for all } v \in V    (23.27)

is an orthogonal automorphism of V. This equation represents the Clifford group action on the vector space V; it preserves the quadratic form Q and gives a homomorphism from the Clifford group to the orthogonal group. Since the universal algebra Cl_{p,q}(V) is uniquely defined up to isomorphism, \Gamma(V) is also defined up to isomorphism. The Clifford group \Gamma(V) contains all elements r of V of nonzero norm, which act on V by reflections in the hyperplane (R\{r\})^\perp, mapping v to v - 2\langle v, r \rangle r / Q(r). In characteristic 2, these mappings are called orthogonal transversions rather than reflections. Every orthogonal automorphism of V, as in Eq. 23.27, is the composite of a finite number of hyperplane reflections. An element g of \Gamma(V) represents a rotation of V if and only if g is representable as the product of an even number of elements of V; the set of such elements is denoted by \Gamma^0 = \Gamma^0(V). If g \in \Gamma(V) is representable as the product of an odd number of elements of V, then g represents an anti-rotation of V; the set of such elements is denoted by \Gamma^1 = \Gamma^1(V). One can then see the Clifford group as the disjoint union of the subgroup \Gamma^0 (a subgroup of index 2) and the subset \Gamma^1 (the elements of degree 1 in \Gamma). If the space V is finite-dimensional with a nondegenerate bilinear form, then the Clifford group maps onto the orthogonal group of V, and its kernel consists of the nonzero elements of the field F. This leads to the exact sequences

    1 \to F^* \to \Gamma \to O_V(F) \to 1,
    1 \to F^* \to \Gamma^0 \to SO_V(F) \to 1.    (23.28)
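The reflection description above can be illustrated numerically in the Euclidean case; the sketch below (ours) composes two hyperplane reflections and recovers a proper rotation, as the Cartan-Dieudonné viewpoint predicts:

```python
import numpy as np

def reflect(v, r):
    """Reflection of v in the hyperplane orthogonal to r:
    v -> v - 2 <v,r> r / Q(r), with the Euclidean form Q(r) = r.r."""
    return v - 2.0 * np.dot(v, r) / np.dot(r, r) * r

rng = np.random.default_rng(1)
r1, r2, v = rng.normal(size=(3, 3))

w = reflect(reflect(v, r1), r2)   # composite of two reflections
# matrix of the composite map, built column by column
R = np.column_stack([reflect(reflect(e, r1), r2) for e in np.eye(3)])
```

The composite preserves the quadratic form and has determinant +1: it is a rotation, matching the statement that even products of vectors (the elements of \Gamma^0) represent rotations.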
In arbitrary characteristic, the spinor norm Q on the Clifford group is defined by

    Q(g) = g^t g.    (23.29)

It is a homomorphism from the Clifford group to the group F^* of nonzero elements of the field F. The nonzero elements of F have spinor norm in the group (F^*)^2 of squares of nonzero elements of F. When V is finite-dimensional and nonsingular, one obtains an induced map from the orthogonal group of V to the group F^*/(F^*)^2, also known as the spinor norm. The spinor norm of the reflection in a vector r has image Q(r) in F^*/(F^*)^2, and this property defines it uniquely on the orthogonal group. This provides the exact sequences

    1 \to \{\pm 1\} \to Pin_V(F) \to O_V(F) \to F^*/(F^*)^2,
    1 \to \{\pm 1\} \to Spin_V(F) \to SO_V(F) \to F^*/(F^*)^2.    (23.30)
The reader should note that in characteristic 2 the group \{\pm 1\} has just one element. Next we explain, in the context of Clifford groups, the pin and spin groups. The pin group Pin_V(F) is the subgroup of the Clifford group \Gamma(V) of elements of spinor norm 1; similarly, the spin group Spin_V(F) is the subgroup of elements of Dickson invariant 0 in Pin_V(F). Usually, the spin group has index 2 in the pin group. As explained before, there is a homomorphism from the Clifford group \Gamma(V) onto the orthogonal group, and the special orthogonal group is defined as the image of \Gamma^0. There is also a homomorphism from the pin group to the orthogonal group; its image consists of the elements of spinor norm 1 \in F^*/(F^*)^2. The kernel comprises the elements +1 and -1, and it has order 2 unless F has characteristic 2. Similarly, there exists a homomorphism from the spin group to the special orthogonal group of V. When V = R^{p,q}, the notations for \Gamma, \Gamma^0, Pin_V, and Spin_V are \Gamma(p,q), \Gamma^0(p,q), Pin(p,q), and Spin(p,q), respectively; since Cl^0_{q,p} \cong Cl^0_{p,q}, we have \Gamma^0(q,p) \cong \Gamma^0(p,q) and Spin(q,p) \cong Spin(p,q). Often \Gamma^0(0,n) and Spin(0,n) are abbreviated as \Gamma^0(n) and Spin(n), respectively. The groups Pin(p,q), Spin(p,q), and Spin^+(p,q) are two-fold covering groups of O(p,q), SO(p,q), and SO^+(p,q), respectively.
As an example, let us consider the spin group Spin(3). The traceless Hermitian matrices x\sigma_1 + y\sigma_2 + z\sigma_3, with x, y, z \in R, represent vectors v = xe_1 + ye_2 + ze_3 \in R^3. The group of unitary unimodular matrices,

    SU(2) = \{U \in \mathrm{Mat}(2, C) \mid U^\dagger U = I, \ \det U = 1\},    (23.31)

represents the spin group Spin(3) = \{u \in Cl_3 \mid u\tilde{u} = 1, \ u\hat{u} = 1\}, or equivalently Spin(3) = \{u \in Cl_3^+ \mid u\tilde{u} = 1\}. The groups Spin(3) and SU(2) are both isomorphic to the group of unit quaternions S^3 = \{q \in H \mid q\bar{q} = 1\}. Taking an element u \in Spin(3), the mapping v \mapsto u v \tilde{u} corresponds to a rotation of R^3; thus, every element of SO(3) can be represented by an element of Spin(3).
Next we explain what spinors are. The Clifford algebras Cl_{p,q}(C) with p + q = 2n even are isomorphic to matrix algebras and have a complex representation of dimension 2^n. Restricting to the group Pin_{p,q}(R), one obtains a complex representation of the pin group of the same dimension, known as the spinor representation. Restricting further to the spin group Spin_{p,q}(R), it splits as the sum of two half-spin representations (or Weyl representations) of dimension 2^{n-1}. If p + q = 2n + 1 is odd, the Clifford algebra Cl_{p,q}(C) is a sum of two matrix algebras, each of which has a representation of dimension 2^n, and both are also representations of the pin group Pin_{p,q}(R).
By restricting to the spin group Spin_{p,q}(R), these become isomorphic; thus the spin group has a complex spinor representation of dimension 2^n. Generally speaking, spin groups and pin groups over any field F have similar representations, whose exact structure depends on the structure of the corresponding Clifford algebras: if a Clifford algebra has a factor that is a matrix algebra over some division algebra, the corresponding representations of the pin and spin groups are over that division algebra. To describe the real spin representations, one needs to know how the spin group sits inside its Clifford algebra. Consider the pin group Pin_{p,q}: it is the set of invertible elements in Cl_{p,q}(R) that can be expressed as a product of unit vectors,

    Pin_{p,q} = \{v_1 v_2 \cdots v_r \mid Q(v_i) = \pm 1 \ \text{for all } i\}.    (23.32)

In fact, the pin group corresponds to products of arbitrary numbers of reflections and is a cover of the full orthogonal group O(p,q). The spin group consists of those elements of Pin_{p,q} that are built by multiplying an even number of unit vectors; so, according to the Cartan-Dieudonné theorem, Spin is a cover of the group of proper rotations SO(p,q). As a consequence, the classification of the pin representations follows straightforwardly from the existing classification of the Clifford algebras; the spin representations are representations of the even subalgebras. To realize the spin representations in signature (p,q) as pin representations in signature (p, q-1) or (q, p-1), one can make use of either of the isomorphisms

    Cl^+_{p,q} \cong Cl_{p,q-1}, \quad \text{for } q > 0,
    Cl^+_{p,q} \cong Cl_{q,p-1}, \quad \text{for } p > 0,    (23.33)

where Cl^+_{p,q} is the even subalgebra.
23.2 Related Algebras

23.2.1 Gibbs' Vector Algebra
Josiah Willard Gibbs (1839-1903) and, independently, Oliver Heaviside (1850-1925) laid the foundations of a mathematical system called vector calculus to deal with the challenging engineering and physics problems of their time. In 1879, Gibbs delivered a course in vector analysis with applications to electricity and magnetism, and in 1881 he printed a private version of the first half of his Elements of Vector Analysis; the second half appeared in 1884 [70]. The first paper in which Heaviside introduced vector methods was his 1882-1883 paper titled The relation between magnetic force and electric current [61, 85]. In the preface to the third edition of his Treatise on Quaternions (1890) [184], Peter Guthrie Tait showed his disappointment at "how little progress has recently been made with the development of Quaternions." He further remarked, "Even Prof. Willard Gibbs must be ranked as one of the retarders of Quaternion progress, in virtue of his pamphlet on vector analysis; a sort of hermaphrodite monster, compounded of the notations of Hamilton and Grassmann." We can certainly accept that Tait's remark about Gibbs is correct: Gibbs indeed retarded quaternion progress. However, his "pamphlet" Elements of Vector Analysis undoubtedly marked the beginning of modern vector analysis. Basically, vector calculus consists of a three-dimensional linear vector space R^3 with two associated operations: first, the scalar product

    \alpha = a \cdot b, \quad \alpha \in R,    (23.34)

which computes a scalar proportional to the projection of the vector a onto the vector b; second, the cross product

    c = a \times b, \quad a, b, c \in R^3.    (23.35)

Since the two vectors a and b lie on a plane, their cross product generates a vector c orthogonal to this plane. Note that this scalar product is the inner product of the Clifford algebra. The cross product can be reformulated in geometric algebra via the concept of duality as follows:

    a \times b = (a \wedge b)^* = (a \wedge b) I_3^{-1},    (23.36)

where a, b \in G_3, \wedge is the wedge product, and I_3 = e_1 e_2 e_3 is the pseudoscalar of the 3D Euclidean geometric algebra G_3. Some identities of the Gibbs vector calculus can be straightforwardly rewritten in geometric algebra: the triple scalar product of three vectors a, b, c \in R^3,

    a \cdot (b \times c) = a \cdot (b \wedge c)^* = a \cdot ((b \wedge c) I_3^{-1}) = (a \wedge b \wedge c) \cdot I_3^{-1} = \det([a, b, c]);    (23.37)

the resulting determinant is a scalar representing the volume of the parallelepiped spanned by the three noncoplanar vectors. Another useful identity is the triple vector product, which can be computed in geometric algebra by applying successively duality and the generalized inner product of Eq. 1.45:

    a \times (b \times c) = (a \wedge (b \wedge c)^*)^* = (a \wedge ((b \wedge c) I_3^{-1})) I_3^{-1} = -a \cdot (b \wedge c) = b (a \cdot c) - c (a \cdot b).    (23.38)

Note that the wedge product in Clifford algebra is valid in any dimension, whereas the cross product is only defined in a 3D vector space.
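Both (23.37) and (23.38) can be confirmed numerically with ordinary vector calculus (our quick check):

```python
import numpy as np

rng = np.random.default_rng(3)
a, b, c = rng.normal(size=(3, 3))

# (23.37): the triple scalar product equals det[a, b, c]
lhs1 = np.dot(a, np.cross(b, c))
rhs1 = np.linalg.det(np.array([a, b, c]))

# (23.38): the triple vector product expansion b(a.c) - c(a.b)
lhs2 = np.cross(a, np.cross(b, c))
rhs2 = b * np.dot(a, c) - c * np.dot(a, b)
```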
After 1898, many operations of differential geometry were defined using Gibbs' vector calculus; again, all of them can be reformulated in Clifford algebra and even generalized beyond the 3D vector space. Gibbs' vector calculus was useful for the development of certain areas, like electrical engineering, but unfortunately it slowed down development in physics. From a much broader perspective, progress would have been faster had researchers and engineers adopted for their work not only the quaternions but, better still, the powerful Clifford algebra framework. This claim is even more valid at present, owing to the increasing complexity of the problems and the fortunate fact that researchers now have very convenient computational resources at lower cost.
23.2.2 Exterior Algebras

The exterior algebra is the algebra of the exterior product; it is also called an alternating algebra or Grassmann algebra, after Hermann Grassmann [73, 74]. In 1844, Grassmann's brilliant contribution appeared under the full title "Die lineale Ausdehnungslehre, ein neuer Zweig der Mathematik, dargestellt und durch Anwendungen auf die übrigen Zweige der Mathematik, wie auch auf die Statik, Mechanik, die Lehre vom Magnetismus und die Krystallonomie erläutert." Grassmann was unrecognized at the time and worked as a teacher at the Friedrich-Wilhelms-Schule (high school) in Stettin. An idea of the reception of his contribution can be gained from the following quotation, extracted from a letter written to Grassmann in 1876 by his publisher: "Your book Die Ausdehnungslehre has been out of print for some time. Since your work hardly sold at all, roughly 600 copies were used in 1864 as waste paper and the remainder, a few odd copies, have now been sold with the exception of one copy, which remains in our library."

The exterior algebra $\Lambda(V)$ over a vector space $V$ contains $V$ as a subspace, and its multiplication is called the exterior product or wedge product $\wedge$, which is associative and bilinear. The wedge product of two elements of $V$ is defined by

$$x \wedge y = x \otimes y \pmod{I}. \tag{23.39}$$
This product is anticommutative on elements of $V$. The exterior algebra $\Lambda(V)$ for a vector space $V$ is constructed by forming monomials via the wedge product: $x$, $x_1\wedge x_2$, $x_1\wedge x_2\wedge x_3$, etc. A monomial is called a decomposable $k$-vector, because it is built by the wedge product of $k$ linearly independent vectors $x_1, x_2, \ldots, x_k \in V$. The sums formed from linear combinations of the monomials are the elements of an exterior algebra. The exterior algebra for a vector space $V$ can also be described as the quotient vector space

$$\Lambda^k V := \bigotimes^k V / W_k, \tag{23.40}$$
23.2 Related Algebras
where $W_k$ is the subspace of $k$-tensors generated by transpositions, such as $W_2 = \{x\otimes y + y\otimes x\}$, and $\otimes$ denotes the vector space tensor product. Thus, the equivalence class $[x_1\otimes\cdots\otimes x_k]$, or $k$-vector, is denoted, as said above, $x_1\wedge x_2\wedge\cdots\wedge x_k$. For instance,

$$x\wedge y + y\wedge x = 0, \tag{23.41}$$

since the representatives add to an element of $W_2$; thus

$$x\wedge y = -y\wedge x. \tag{23.42}$$

More generally, if $x_1, x_2, x_3, \ldots, x_k \in V$ and $\sigma$ is a permutation of the integers $[1,\ldots,k]$, then

$$x_{\sigma(1)}\wedge x_{\sigma(2)}\wedge\cdots\wedge x_{\sigma(k)} = \mathrm{sgn}(\sigma)\, x_1\wedge x_2\wedge\cdots\wedge x_k, \tag{23.43}$$

where $\mathrm{sgn}(\sigma)$ is the signature of the permutation $\sigma$. If any of $x_1, x_2, \ldots, x_k \in V$ are linearly dependent, then

$$x_1\wedge x_2\wedge\cdots\wedge x_k = 0. \tag{23.44}$$
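The defining properties (23.42) and (23.44) are easy to check numerically. The following is a minimal sketch (not from the book) that represents the 2-vector $x\wedge y$ as the antisymmetrized outer product of two coordinate vectors:

```python
import numpy as np

def wedge(x, y):
    """Wedge product of two vectors, represented as the
    antisymmetrized tensor x⊗y - y⊗x (a 2-vector)."""
    return np.outer(x, y) - np.outer(y, x)

x = np.array([1.0, 2.0, 3.0])
y = np.array([0.0, 1.0, -1.0])

# Eq. (23.42): anticommutativity
assert np.allclose(wedge(x, y), -wedge(y, x))
# Eq. (23.44): linearly dependent vectors wedge to zero
assert np.allclose(wedge(x, 3.0 * x), 0.0)
```

The array representation is chosen only for illustration; any faithful model of $\Lambda^2(V)$ behaves the same way.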
The subspace spanned by all possible decomposable $k$-vectors is also called the $k$th exterior power of $V$, denoted $\Lambda^k(V)$. Exterior powers are commonly used in differential geometry to define differential forms and to compute their wedge products. The exterior product of a $k$-vector and a $p$-vector yields a $(k+p)$-vector; symbolically, the wedge of the two corresponding subspaces reads

$$\Lambda^k(V) \wedge \Lambda^p(V) \to \Lambda^{k+p}(V). \tag{23.45}$$
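The grading rule (23.45) can be exercised with a small sketch (ours, not the book's) in which a $k$-vector is an antisymmetric rank-$k$ array and the wedge is the antisymmetrized tensor product:

```python
import numpy as np
from itertools import permutations
from math import comb, factorial

def sign(p):
    """Parity of a permutation, via its inversion count."""
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def alt(T):
    """Project a rank-k tensor onto its antisymmetric part (a k-vector)."""
    return sum(sign(p) * np.transpose(T, p)
               for p in permutations(range(T.ndim))) / factorial(T.ndim)

def wedge(A, B):
    """Wedge of a k-vector and a p-vector: a (k+p)-vector, as in Eq. (23.45)."""
    return alt(np.tensordot(A, B, axes=0))

rng = np.random.default_rng(0)
x, y, z = rng.standard_normal((3, 4))   # three vectors of an n = 4 space

# grades add under the wedge: vector ∧ vector ∧ vector is a 3-vector
assert wedge(wedge(x, y), z).ndim == 3
# total dimension of Λ(V) for n = 4: Σ_k C(4, k) = 2^4
assert sum(comb(4, k) for k in range(5)) == 16
```

The dimension count in the last line anticipates the direct-sum decomposition discussed next.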
Thus the exterior algebra is a graded algebra built as the direct sum of the $k$th exterior powers of $V$:

$$\Lambda(V) = \Lambda^0(V)\oplus\Lambda^1(V)\oplus\Lambda^2(V)\oplus\cdots\oplus\Lambda^n(V) = \bigoplus_{k=0}^{n}\Lambda^k(V), \tag{23.46}$$

where $\Lambda^0(V) = F$ and $\Lambda^1(V) = V$. Each of these spaces $\Lambda^k(V)$ is spanned by $\binom{n}{k}$ $k$-vectors, where $\binom{n}{k} := \frac{n!}{(n-k)!\,k!}$. Thus $\Lambda(V)$ is spanned by $\sum_{k=0}^{n}\binom{n}{k} = 2^n$ elements. The $k$-vectors have a clear geometric interpretation; for example, the 2-vector or bivector $x_1\wedge x_2$ represents a planar space spanned by the vectors $x_1$ and $x_2$ and weighted by a scalar representing the area of the oriented parallelogram with sides $x_1$ and $x_2$. In an analogous way, the 3-vector or trivector represents the spanned 3D
23 Clifford Algebras and Related Algebras
space, weighted by the volume of the oriented parallelepiped with edges $x_1$, $x_2$, and $x_3$.

If $V^*$ denotes the dual space to the vector space $V$, then for each $\vartheta \in V^*$ one can define an antiderivation on the algebra $\Lambda(V)$:

$$i_\vartheta : \Lambda^k V \to \Lambda^{k-1} V. \tag{23.47}$$

Consider $X \in \Lambda^k V$. Then $X$ is a multilinear mapping of $V^*$ to $\mathbb{R}$, thus it is defined by its values on the $k$-fold Cartesian product $V^*\times V^*\times\cdots\times V^*$. If $y_1, y_2, \ldots, y_{k-1}$ are $k-1$ elements of $V^*$, then define

$$(i_\vartheta X)(y_1, y_2, \ldots, y_{k-1}) = X(\vartheta, y_1, y_2, \ldots, y_{k-1}). \tag{23.48}$$
In the case of a pure scalar $g \in \Lambda^0 V$, it is clear that $i_\vartheta g = 0$. The interior product fulfills the following properties:

(i) For each $k$ and each $\vartheta \in V^*$, $i_\vartheta : \Lambda^k V \to \Lambda^{k-1} V$. By convention, $\Lambda^{-1} = 0$.
(ii) If $x \in \Lambda^1 V$ ($= V$), then $i_\vartheta x = \vartheta(x)$ is the dual pairing between the elements of $V$ and $V^*$.
(iii) For each $\vartheta \in V^*$, $i_\vartheta$ is a graded derivation of degree $-1$: $i_\vartheta(x\wedge y) = (i_\vartheta x)\wedge y + (-1)^{\deg x}\, x\wedge(i_\vartheta y)$.

These three properties suffice to characterize the interior product, as well as to define it in the general infinite-dimensional case. Further properties of the interior product are $i_\vartheta\circ i_\vartheta = 0$ and $i_\vartheta\circ i_\zeta = -i_\zeta\circ i_\vartheta$. Suppose now that $V$ has finite dimension $n$; then the interior product induces a canonical isomorphism of vector spaces, namely

$$\Lambda^k(V^*)\otimes\Lambda^n(V) = \Lambda^{n-k}(V). \tag{23.49}$$
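The contraction formula (23.48) and the two algebraic properties just listed can be checked numerically. In the following sketch (an illustration of ours, with $k$-vectors stored as antisymmetric numpy arrays), the interior product is the contraction of a covector with the first tensor slot:

```python
import numpy as np
from itertools import permutations
from math import factorial

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def alt(T):
    """Antisymmetrize a rank-k tensor (projection onto the k-vectors)."""
    return sum(sign(p) * np.transpose(T, p)
               for p in permutations(range(T.ndim))) / factorial(T.ndim)

def interior(theta, X):
    """Interior product i_theta: contract theta with the first slot of X,
    as in Eq. (23.48); the degree drops by one."""
    return np.tensordot(theta, X, axes=(0, 0))

rng = np.random.default_rng(0)
X = alt(rng.standard_normal((4, 4, 4)))            # a 3-vector in Λ^3 of R^4
theta, zeta = rng.standard_normal(4), rng.standard_normal(4)

assert np.allclose(interior(theta, interior(theta, X)), 0)     # i∘i = 0
assert np.allclose(interior(theta, interior(zeta, X)),
                   -interior(zeta, interior(theta, X)))        # anticommutation
```

Both assertions follow from the antisymmetry of $X$, which is exactly the point of the two identities.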
A non-zero element of the top exterior power $\Lambda^n(V)$ (which is a one-dimensional space) is sometimes called a volume form or oriented form. Given a volume form $\omega$, the isomorphism is given explicitly by

$$\vartheta \in \Lambda^k(V^*) \to i_\vartheta\,\omega \in \Lambda^{n-k}(V). \tag{23.50}$$

If, in addition to a volume form, the vector space $V$ is equipped with an inner product identifying $V$ with $V^*$, then the resulting isomorphism

$$* : \Lambda^k(V) \to \Lambda^{n-k}(V) \tag{23.51}$$

is called the Hodge dual or, more commonly, the Hodge star operator.
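In Euclidean $\mathbb{R}^3$ the Hodge star of (23.51) maps bivectors to vectors, which is why oriented areas have normal vectors. A minimal sketch (our encoding, using the Levi-Civita symbol):

```python
import numpy as np

# Levi-Civita symbol ε_ijk of R^3
eps = np.zeros((3, 3, 3))
for i, j, k in [(0, 1, 2), (1, 2, 0), (2, 0, 1)]:
    eps[i, j, k], eps[j, i, k] = 1.0, -1.0

def hodge_bivector(B):
    """Hodge star Λ^2(R^3) -> Λ^1(R^3): (*B)_k = ε_ijk B_ij / 2."""
    return 0.5 * np.einsum('ijk,ij->k', eps, B)

def wedge(x, y):
    return np.outer(x, y) - np.outer(y, x)

e1, e2, e3 = np.eye(3)
# *(e1 ∧ e2) = e3: the dual of the oriented unit square is its unit normal
assert np.allclose(hodge_bivector(wedge(e1, e2)), e3)
```

This is the coordinate form of the familiar correspondence between bivectors and axial vectors in 3D.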
Suppose that $U$ and $V$ are a pair of vector spaces and $f : U \to V$ is a linear transformation; in other words,

$$\Lambda^1(f) = f : U = \Lambda^1(U) \to V = \Lambda^1(V). \tag{23.52}$$

Then, by the universal construction, there exists a unique homomorphism of graded algebras, namely

$$\Lambda(f) : \Lambda(U) \to \Lambda(V). \tag{23.53}$$

Note that $\Lambda(f)$ preserves the homogeneous degree. A $k$-graded element of $\Lambda(f)$ is given by transforming the decomposable elements individually,

$$\Lambda(f)(x_1\wedge\cdots\wedge x_k) = f(x_1)\wedge\cdots\wedge f(x_k). \tag{23.54}$$

Consider

$$\Lambda^k(f) = \Lambda(f)\big|_{\Lambda^k(U)} : \Lambda^k(U) \to \Lambda^k(V). \tag{23.55}$$
The transformation $\Lambda^k(f)$, relative to bases of $U$ and $V$, is the matrix of $k\times k$ minors of $f$. In the case that $U$ is of finite dimension $n$ and $U = V$, $\Lambda^n(f)$ is a mapping of a one-dimensional vector space to itself, and thus it is given by a scalar: the determinant of $f$.

If $F$ is a field of characteristic 0, then the exterior algebra of a vector space $V$ can be canonically identified with the vector subspace of $T(V)$ consisting of antisymmetric tensors. According to Eq. 23.40, the exterior algebra is the quotient of $T(V)$ by the ideal $I$ generated by $x\otimes x$. Let $T^r(V)$ be the space of homogeneous tensors of rank $r$. This space is spanned by the decomposable tensors

$$x_1\otimes\cdots\otimes x_r, \qquad x_i \in V. \tag{23.56}$$

The antisymmetrization, also called the skew-symmetrization, of a decomposable tensor is defined by

$$\mathrm{Alt}(x_1\otimes\cdots\otimes x_r) = \frac{1}{r!}\sum_{\sigma\in\mathfrak{S}_r}\mathrm{sgn}(\sigma)\, x_{\sigma(1)}\otimes\cdots\otimes x_{\sigma(r)}, \tag{23.57}$$

where the sum is taken over the symmetric group $\mathfrak{S}_r$ of permutations of the symbols $1,\ldots,r$. By linearity and homogeneity, this extends to an operation, also denoted by Alt, on the full tensor algebra $T(V)$. The image $\mathrm{Alt}(T(V))$ is called the alternating tensor algebra and is denoted by $A(V)$. Note that $A(V)$ is a vector subspace of $T(V)$, and it inherits the structure of a graded vector space from that on $T(V)$. $A(V)$ carries an associative graded product $\hat{\otimes}$ defined by
$$x\mathbin{\hat{\otimes}} y = \mathrm{Alt}(x\otimes y). \tag{23.58}$$

Even though this product differs from the tensor product, under the assumption that the field $F$ has characteristic 0, the kernel of Alt is precisely the ideal $I$, and there is a canonical isomorphism $A(V) \cong \Lambda(V)$.

In physics, the exterior algebra is an archetypal example of the so-called superalgebras, which are essential, for instance, in physical theories concerning fermions and supersymmetry. The exterior algebra also has remarkable applications in differential geometry, where it is used to define differential forms. One can intuitively interpret a differential form as a function on weighted subspaces of the tangent space of a differentiable manifold. Consequently, there is a natural wedge product for differential forms, and the differential forms play a crucial role in diverse areas of differential geometry.
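The two key facts about Alt used above, that symmetric tensors such as $x\otimes x$ lie in its kernel and that it is a projection, are easy to verify with a small sketch (our array-based illustration of Eq. 23.57):

```python
import numpy as np
from itertools import permutations
from math import factorial

def sign(p):
    return (-1) ** sum(p[i] > p[j] for i in range(len(p)) for j in range(i + 1, len(p)))

def alt(T):
    """Antisymmetrization operator of Eq. (23.57) on a rank-k tensor."""
    return sum(sign(p) * np.transpose(T, p)
               for p in permutations(range(T.ndim))) / factorial(T.ndim)

rng = np.random.default_rng(1)
x = rng.standard_normal(3)
T = rng.standard_normal((3, 3, 3))

assert np.allclose(alt(np.outer(x, x)), 0)   # x ⊗ x lies in the kernel of Alt
assert np.allclose(alt(alt(T)), alt(T))      # Alt is a projection (idempotent)
```

These are precisely the properties that make $A(V) = \mathrm{Alt}(T(V))$ a model of $\Lambda(V)$ in characteristic 0.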
23.2.3 Grassmann–Cayley Algebras

The Grassmann–Cayley algebra is based on the work of Hermann Grassmann on exterior algebra and the work of the British mathematician Arthur Cayley (1821–1895) on matrices and linear algebra. It is also known as double algebra. The Grassmann–Cayley algebra is a sort of modeling algebra for projective geometry. This mathematical system uses subspaces (brackets) as basic computational elements, and it facilitates the translation of synthetic projective statements into invariant algebraic statements in the bracket ring, which is the ring of projective invariants. Furthermore, this mathematical system is useful for treating, with geometric insight, tensor mathematics and for modeling conics and quadrics, among other forms.

The Bracket Ring. Let $S$ be a finite set of points $\{e_1, e_2, \ldots, e_n\}$ in $(d-1)$-dimensional projective space over a field $F$. Using homogeneous coordinates, each point is represented by a $d$-tuple, which corresponds to a column of the matrix

$$X = \begin{pmatrix} x_{1,1} & x_{1,2} & \cdots & x_{1,n}\\ x_{2,1} & x_{2,2} & \cdots & x_{2,n}\\ \vdots & \vdots & & \vdots\\ x_{d,1} & x_{d,2} & \cdots & x_{d,n} \end{pmatrix}. \tag{23.59}$$

Assume now that the entries of the matrix $X$ are algebraically independent indeterminates over $F$, so that we can define a bracket as follows:

$$[e_{i_1}, e_{i_2}, \ldots, e_{i_d}] = \det\begin{pmatrix} x_{1,i_1} & x_{1,i_2} & \cdots & x_{1,i_d}\\ \vdots & \vdots & & \vdots\\ x_{d,i_1} & x_{d,i_2} & \cdots & x_{d,i_d} \end{pmatrix}. \tag{23.60}$$
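Numerically, a bracket is just the determinant of $d$ selected columns of $X$. The sketch below (ours, for illustration) also shows why brackets generate projective invariants: under a projective transformation every bracket is rescaled by the same determinant, so suitable bracket ratios are invariant.

```python
import numpy as np

def bracket(X, idx):
    """Bracket of Eq. (23.60): determinant of the d columns of X selected by idx."""
    return np.linalg.det(X[:, idx])

rng = np.random.default_rng(2)
X = rng.standard_normal((3, 5))      # five points of the projective plane (d = 3)
T = rng.standard_normal((3, 3))      # a projective transformation

b = bracket(X, [0, 1, 2])
# Antisymmetry: swapping two points flips the sign
assert np.isclose(bracket(X, [1, 0, 2]), -b)
# Under x -> Tx every bracket is scaled by det(T); bracket ratios with equal
# numbers of factors above and below are therefore projective invariants
assert np.isclose(bracket(T @ X, [0, 1, 2]), np.linalg.det(T) * b)
```

The antisymmetry check anticipates relation (ii) of the bracket ring discussed next.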
The bracket ring $B$ of $S$ (over $F$, in rank $d$) is the subring of the polynomial ring $F[x_{1,1}, x_{1,2}, \ldots, x_{d,n}]$ generated by all possible brackets. For the projective group, the first theorem of invariant theory states that the projective invariants of the set of points $S$ are precisely the elements of $B$, or bracket polynomials, which are homogeneous with respect to the different values the elements of $S$ may take. The equation of a typical projective invariant reads

$$[x_1,x_2,x_3][x_4,x_5,x_6][x_1,x_4,x_7] - 3[x_1,x_2,x_4][x_3,x_5,x_7][x_1,x_4,x_6]. \tag{23.61}$$

Note that this is a purely symbolic expression in terms of points of a certain geometric configuration and coefficients belonging to the field $F$. Remarkably, the coordinates are not explicit, and the invariants have a geometric meaning that is completely coordinate-free. The use of coordinate-free formulas is of major relevance for representing and computing complex physical relations. The advantages of coordinate-free symbolic algebraic expressions for representing geometric conditions and constraints are that they resemble the way we humans regard geometry and that this algebra is conceptually much closer to the essence of geometry than the straightforward algebra of coordinates. In fact, we can translate synthetic geometric statements into the bracket algebra by using the Grassmann–Cayley algebra, and we can try to translate invariant algebraic statements, or homogeneous bracket equations, back. The drawback of this procedure is that the bracket algebra is more complicated than the straightforward polynomial algebra in the coordinates themselves, because the brackets are not algebraically independent. They satisfy the following relations:

(i) $[x_1, x_2, \ldots, x_k] = 0$ if any $x_i = x_j$, $i \neq j$.
(ii) $[x_1, x_2, \ldots, x_k] = \mathrm{sign}(\sigma)\,[x_{\sigma(1)}, x_{\sigma(2)}, \ldots, x_{\sigma(k)}]$.
(iii) $[x_1, x_2, \ldots, x_k][y_1, y_2, \ldots, y_k] = \sum_{j=1}^{k}[x_1, x_2, \ldots, x_{k-1}, y_j][y_1, y_2, \ldots, y_{j-1}, x_k, y_{j+1}, \ldots, y_k]$.

The relations of type (iii) are called Grassmann–Plücker relations or syzygies; they correspond to the generalized Laplace expansions in the ring $B$. The second fundamental theorem of invariant theory for projective invariants states that all relations among brackets are consequences of relations of the types (i)–(iii).

Plücker Coordinates. Consider $k$ independent columns $c_{j_1}, c_{j_2}, \ldots, c_{j_k}$ of the matrix $X$ of Eq. 23.59; pick $k$ of the $d$ rows, indexed in ascending order by $i_1, i_2, \ldots, i_k$. The Plücker coordinate is then defined as the determinant of the corresponding minor:

$$P_{i_1,i_2,\ldots,i_k} = \det\begin{pmatrix} x_{i_1,j_1} & x_{i_1,j_2} & \cdots & x_{i_1,j_k}\\ x_{i_2,j_1} & x_{i_2,j_2} & \cdots & x_{i_2,j_k}\\ \vdots & \vdots & & \vdots\\ x_{i_k,j_1} & x_{i_k,j_2} & \cdots & x_{i_k,j_k} \end{pmatrix}. \tag{23.62}$$
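For a 3D line ($k = 2$, $d = 4$) the Plücker coordinates are the six $2\times 2$ minors of the matrix whose columns are two homogeneous points on the line. The following sketch (an illustration of ours) computes them and verifies that they depend on the line only up to scale:

```python
import numpy as np
from itertools import combinations

def pluecker(cols):
    """All k×k minors of Eq. (23.62) of a d×k matrix of column points,
    with row indices taken in ascending order."""
    d, k = cols.shape
    return np.array([np.linalg.det(cols[list(rows), :])
                     for rows in combinations(range(d), k)])

# A 3D line through two homogeneous points (k = 2, d = 4)
X1 = np.array([1.0, 0.0, 2.0, 1.0])
X2 = np.array([3.0, 1.0, 0.0, 1.0])
P = pluecker(np.column_stack([X1, X2]))          # 6 = C(4, 2) coordinates

# Any other pair of points spanning the same line gives a proportional vector:
# here det([X1+X2, 2·X2]) = 2·det([X1, X2]) for every minor
Q = pluecker(np.column_stack([X1 + X2, 2.0 * X2]))
assert np.allclose(Q, 2.0 * P)
```

This scale ambiguity is exactly the "up to a nonzero scalar" dependence stated for the Plücker vector below.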
The Plücker coordinate vector

$$P = c_{j_1}\vee c_{j_2}\vee\cdots\vee c_{j_k} = \big(P_{i_1,i_2,\ldots,i_k}\big) \tag{23.63}$$

is a vector over $F$ of length $\binom{d}{k}$, and it depends, up to a nonzero scalar, only on the subspace $U = \mathrm{span}(c_{j_1}, c_{j_2}, \ldots, c_{j_k})$ of $V = F^d$. As an illustration, consider a line $U$ in $\mathbb{R}^3$ spanned by two points $X_1$ and $X_2$; then, using $k = 2$ and $d = 4$, its Plücker vector is

$$P = X_1\vee X_2 = \begin{pmatrix} X_{11}\\ X_{12}\\ X_{13}\\ 1 \end{pmatrix}\vee\begin{pmatrix} X_{21}\\ X_{22}\\ X_{23}\\ 1 \end{pmatrix} = (P_{01}, P_{02}, P_{03}, P_{12}, P_{31}, P_{23})^t = (N,\, R\times N)^t, \tag{23.64}$$
where the 1 corresponds to the homogeneous coordinate of each 3D point, $N$ stands for the orientation of the line, and $R$ is the position vector of any point reaching the line $U$ from the origin.

Extensors and Join and Meet Operations. The previous section gave an introduction to the exterior algebra $\Lambda(V)$; in this section, we extend $\Lambda(V)$ to the Grassmann–Cayley algebra. Among the essentially equivalent definitions of the exterior algebra, let us mention three: the definition as a universal object for alternating multilinear maps on $V$; the definition as the quotient of the tensor algebra on $V$ by the ideal generated by all tensors of the form $x\otimes x$ for $x \in V$; and an interesting one that assumes that $V$ is a Peano space (earlier called a Cayley space), meaning that $V$ is a vector space endowed with a non-degenerate alternating $d$-linear form, or bracket. One can then define the exterior algebra $\Lambda(V)$ as the quotient of the free associative algebra over $V$ by the ideal generated by all expressions, resulting in linear combinations, for all $k \le d$, of $k$-products

$$\sum_i \alpha_i\, x_{1,i}\, x_{2,i}\cdots x_{k,i}, \tag{23.65}$$

with $\alpha_i \in F$ and $x_{j,i} \in V$ for all $i, j$, such that for all $y_1, y_2, \ldots, y_{d-k}$,

$$\sum_i \alpha_i\, [x_{1,i}, x_{2,i}, \ldots, x_{k,i}, y_1, y_2, \ldots, y_{d-k}] = 0. \tag{23.66}$$

Instead of the usual symbol $\wedge$ for the exterior product in $\Lambda(V)$, one uses $\vee$ and refers to it as the join operation. This product is associative, distributive over addition, and anti-symmetric. The exterior algebra is a graded algebra built as the direct sum of the $k$th exterior powers of $V$:
$$\Lambda(V) = \bigoplus_{k=0}^{n}\Lambda^k(V). \tag{23.67}$$
If one chooses a basis $\{e_1, \ldots, e_d\}$ of $V$ over $F$, then a basis for $\Lambda^k(V)$ over $F$ is

$$\{\, e_{i_1}\vee e_{i_2}\vee\cdots\vee e_{i_k} \mid 1 \le i_1 < i_2 < \cdots < i_k \le d \,\}. \tag{23.68}$$
Note that in the exterior algebra $\Lambda(V)$ there is no need to choose an explicit basis for $V$; one has a coordinate-free symbolic algebra, which in fact helps to mimic coordinate-free geometric operations in the $(d-1)$-dimensional projective space corresponding to $V$, in other words, in an affine space embedded in the projective space.

Let $x_1, x_2, \ldots, x_k \in V$ and compute the join of these $k$ vectors: $X = x_1\vee x_2\vee\cdots\vee x_k$, which can be written simply as $X = x_1 x_2\cdots x_k$. If $X \neq 0$, then the $k$ vectors involved are linearly independent; in this case $X$ is called an extensor of step $k$, or a decomposable $k$-vector. Let $Y = y_1 y_2\cdots y_l$ be another extensor, of step $l$; then $X\vee Y = x_1\vee x_2\vee\cdots\vee x_k\vee y_1\vee y_2\vee\cdots\vee y_l = x_1 x_2\cdots x_k\, y_1 y_2\cdots y_l$ is an extensor of step $k+l$. In fact, $X\vee Y \neq 0$ if and only if $x_1, x_2, \ldots, x_k, y_1, y_2, \ldots, y_l$ are distinct and linearly independent.

Grassmann–Cayley Algebra. Next we endow the exterior algebra $\Lambda(V)$ with a second operation $\wedge$, called the meet operation. If $X = x_1 x_2\cdots x_k$ and $Y = y_1 y_2\cdots y_l$ with $k + l \ge d$, then

$$X\wedge Y = \sum_\sigma \mathrm{sgn}(\sigma)\,[x_{\sigma(1)}, \ldots, x_{\sigma(d-l)}, y_1, \ldots, y_l]\; x_{\sigma(d-l+1)}\cdots x_{\sigma(k)}, \tag{23.69}$$

where the sum is taken over all permutations $\sigma$ of $\{1, 2, \ldots, k\}$ such that $\sigma(1) < \sigma(2) < \cdots < \sigma(d-l)$ and $\sigma(d-l+1) < \sigma(d-l+2) < \cdots < \sigma(k)$. These permutations are called shuffles of the $(d-l,\, k-(d-l))$ split of $X$. The meet of an extensor $X$ of step $k$ and an extensor $Y$ of step $l$ is an extensor of step $k+l-d$. The meet is associative, and it is anti-commutative in the following sense:

$$X\wedge Y = (-1)^{(d-k)(d-l)}\, Y\wedge X. \tag{23.70}$$

The meet is dual to the join, where duality exchanges vectors with covectors, or extensors of step $n-1$. The definitions of join and meet are extended to arbitrary elements of $\Lambda(V)$ by distributivity; the extended operations remain well defined and associative. The meet operation corresponds to the lattice meet of subspaces, $\overline{X\wedge Y} = \overline{X}\cap\overline{Y}$, only if $\overline{X}\cup\overline{Y}$ spans $V$. The Grassmann–Cayley algebra is the vector space $\Lambda(V)$ equipped with the operations join $\vee$ and meet $\wedge$. In the Grassmann–Cayley algebra framework, one can translate geometric incidence theorems or incidence relations of projective geometry into a conjunction of Grassmann–Cayley statements. Provided that those statements involve only join and
Fig. 23.2 Cayley factorization: (1) projective geometry → (2) Grassmann–Cayley algebra → (3) bracket algebra → (4) coordinate algebra; Cayley factorization leads from the bracket algebra back to geometry
meet operations and not additions, they can be translated back to projective geometry relatively easily. Furthermore, Grassmann–Cayley statements may in turn be expanded into bracket statements via the definitions and properties of join and meet. Every simple Grassmann–Cayley statement is equivalent to a finite conjunction of bracket statements. Conversely, writing a bracket statement as a simple Grassmann–Cayley statement is not always a simple task; this problem is called Cayley factorization. In this manner, one works with a language that is invariant with respect to the projective general linear group. However, one can move a further step away from coordinate-free geometric computations by introducing vector coordinates; as a result, the statements in the larger algebra may include non-invariant expressions. Cayley factorization, as depicted in Fig. 23.2, plays a key role in automated geometric theorem proving, such as the proof of theorems in projective and Euclidean geometry [6, 39, 113, 193].
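As a concrete instance of the meet, consider two lines of the projective plane ($d = 3$), each spanned by two points. Equation (23.69) has only two $(1,1)$ shuffles, and the result can be cross-checked against the classical cross-product construction; the encoding below is our illustration:

```python
import numpy as np

def bracket(a, b, c):
    return np.linalg.det(np.column_stack([a, b, c]))

def meet_lines(x1, x2, y1, y2):
    """Meet (Eq. 23.69) of the step-2 extensors X = x1∨x2 and Y = y1∨y2
    in d = 3: the (1,1) shuffles give [x1,y1,y2]x2 - [x2,y1,y2]x1,
    the intersection point of the two projective lines (step 2+2-3 = 1)."""
    return bracket(x1, y1, y2) * x2 - bracket(x2, y1, y2) * x1

# Two lines, each spanned by two homogeneous points
x1, x2 = np.array([1.0, 0.0, 1.0]), np.array([0.0, 1.0, 1.0])
y1, y2 = np.array([1.0, 1.0, 1.0]), np.array([2.0, 0.0, 1.0])

p = meet_lines(x1, x2, y1, y2)
# Cross-check via line coordinates: (x1×x2) × (y1×y2) is the same point
q = np.cross(np.cross(x1, x2), np.cross(y1, y2))
assert np.allclose(p, q)
# Here p = (1, -1, 0): the two lines are parallel and meet at infinity
```

The agreement of the two computations is the identity $(a\times b)\times(c\times d) = [a,c,d]\,b - [b,c,d]\,a$ in disguise.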
Chapter 24
Notation
This chapter includes a list of notations with a brief description.

$\mathbb{R}$ – the real numbers
$S^n$ – the unit sphere in $\mathbb{R}^{n+1}$
$\mathbb{C}$ – the complex numbers
$\mathbb{H}$ – the quaternion algebra
$\mathbb{R}^n$ – a vector space of dimension $n$ over the field $\mathbb{R}$ with a Euclidean signature
$\mathbb{R}^{p,q}$ – a vector space of dimension $n = p+q$ over the field $\mathbb{R}$ with signature $(p,q)$
$\mathbb{R}^{m\times n}$ – the direct product $\mathbb{R}^m\otimes\mathbb{R}^n$
$G_n$ – the geometric algebra over $\mathbb{R}^n$
$G_{p,q}$ – the geometric algebra over $\mathbb{R}^{p,q}$
$G_{p,q,r}$ – the special or degenerate geometric algebra over $\mathbb{R}^{p,q}$
$G^k_{p,q}$ – the $k$-vector space of $G_{p,q}$
$x$ – a scalar element
$\mathbf{x}$ – a vector of $\mathbb{R}^n$
$\boldsymbol{x}, X$ – a vector and a multivector of the geometric algebra
$X_k$ – a blade of grade $k$
$F$ – the flag, a set of geometric entities of an object frame
Operator symbols

$XY$ – geometric product of the multivectors $X$ and $Y$
$X * Y$ – scalar product of $X$ and $Y$
$X\cdot Y$ – inner product of $X$ and $Y$
$X\wedge Y$ – wedge product of $X$ and $Y$
$X\cap Y$ – meet operation of $X$ and $Y$
$X\cup Y$ – join product of $X$ and $Y$
$X^{-1}$ – inverse of $X$
$\langle X\rangle_k$ – projection of $X$ onto grade $k$
$X^*$ – dual of $X$
$\widetilde{X}$ – reverse of $X$
$\overline{X}$ – conjugate of $X$
$\|X\|$ – norm of $X$
$P_l(\langle X\rangle_k)$ – projection of $\langle X\rangle_k$ onto $\langle Y\rangle_l$
$P_l^{\perp}(\langle X\rangle_k)$ – projection of $\langle X\rangle_k$ onto the orthogonal complement of $\langle Y\rangle_l$
E. Bayro-Corrochano, Geometric Computing: For Wavelet Transforms, Robot Vision, Learning, Control and Action, DOI 10.1007/978-1-84882-929-9 24, c Springer-Verlag London Limited 2010
Chapter 25
Useful Formulas for Geometric Algebra
Geometric Product

$xy = x\cdot y + x\wedge y$
$yx = x\cdot y - x\wedge y$
$x\cdot X_r = \frac{1}{2}\big(xX_r - (-1)^r X_r x\big)$
$x\wedge X_r = \frac{1}{2}\big(xX_r + (-1)^r X_r x\big)$

Multivectors

$X = \langle X\rangle_0 + \langle X\rangle_1 + \cdots = \sum_r \langle X\rangle_r$
$X_r Y_s = \langle X_r Y_s\rangle_{|r-s|} + \langle X_r Y_s\rangle_{|r-s|+2} + \cdots + \langle X_r Y_s\rangle_{r+s}$
$X_r\cdot Y_s = \langle X_r Y_s\rangle_{|r-s|}$; $X_r\cdot\alpha = 0$ for a scalar $\alpha$
$X_r\wedge Y_s = \langle X_r Y_s\rangle_{r+s}$
$\langle XY\rangle = \langle YX\rangle$
$X_r\cdot(Y_s\cdot Z_t) = (X_r\wedge Y_s)\cdot Z_t$, for $r+s\le t$ and $r,s>0$
$X_r\cdot(Y_s\cdot Z_t) = (X_r\cdot Y_s)\cdot Z_t$, for $r+t\le s$
$X\times(Y\times Z) + Z\times(X\times Y) + Y\times(Z\times X) = 0$

Products of frequent use of vectors $x, y, z, w$ and a bivector $X$

$x\cdot(y\wedge z) = x\cdot y\, z - x\cdot z\, y$
$(x\wedge y)\cdot(z\wedge w) = x\cdot w\, y\cdot z - x\cdot z\, y\cdot w$
$x\cdot(y\cdot X) = (x\wedge y)\cdot X$
$(x\wedge y)\times(z\wedge w) = y\cdot z\, x\wedge w - x\cdot z\, y\wedge w + x\cdot w\, y\wedge z - y\cdot w\, x\wedge z$
$(x\wedge y)\times X = (x\cdot X)\wedge y + x\wedge(y\cdot X)$

Bivectors

$XY = X\cdot Y + X\times Y + X\wedge Y$
$YX = X\cdot Y - X\times Y + X\wedge Y$
$Y\cdot X_r = \langle Y X_r\rangle_{r-2}$
$Z\times(X\cdot Y) = (Z\times X)\cdot Y + X\cdot(Z\times Y)$, for all $X$, $Y$
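The first two identities, $xy = x\cdot y + x\wedge y$ and $yx = x\cdot y - x\wedge y$, can be verified with a minimal sketch of the geometric product in $G_2$; the coefficient-array encoding over the basis $(1, e_1, e_2, e_{12})$ is ours, chosen only for illustration:

```python
import numpy as np

def gp(a, b):
    """Geometric product in G_2, from the basis table:
    e1*e1 = e2*e2 = 1, e1*e2 = e12 = -e2*e1, e12*e12 = -1."""
    a0, a1, a2, a12 = a
    b0, b1, b2, b12 = b
    return np.array([
        a0*b0 + a1*b1 + a2*b2 - a12*b12,   # scalar part
        a0*b1 + a1*b0 - a2*b12 + a12*b2,   # e1 part
        a0*b2 + a2*b0 + a1*b12 - a12*b1,   # e2 part
        a0*b12 + a12*b0 + a1*b2 - a2*b1,   # e12 part
    ])

x = np.array([0.0, 2.0, 1.0, 0.0])   # the vector 2e1 + e2
y = np.array([0.0, 1.0, 3.0, 0.0])   # the vector e1 + 3e2

dot   = x[1]*y[1] + x[2]*y[2]        # x·y (scalar)
wedge = x[1]*y[2] - x[2]*y[1]        # x∧y (e12 coefficient)

assert np.allclose(gp(x, y), [dot, 0, 0,  wedge])   # xy = x·y + x∧y
assert np.allclose(gp(y, x), [dot, 0, 0, -wedge])   # yx = x·y − x∧y
```

Adding and subtracting the two products recovers $x\cdot y = \frac{1}{2}(xy + yx)$ and $x\wedge y = \frac{1}{2}(xy - yx)$, the $r = 1$ case of the grade formulas above.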
Pseudoscalar $I$

$X_r\wedge(Y_s I) = (X_r\cdot Y_s)\, I$, $r+s\le n$
$X_r\cdot(Y_s I) = (X_r\wedge Y_s)\, I$, $r\le s$

Reflections and Rotations

$X_r \to (-1)^r\, n X_r n$
$X_r \to R X_r \widetilde{R}$
$R = e^{B\theta/2}$, $\widetilde{R}R = R\widetilde{R} = 1$

Motor Algebra: Rotors, Translators, and Motors

$T = 1 + I\frac{t}{2} = e^{I\frac{t}{2}}$
$R = \cos(\frac{\theta}{2}) + \sin(\frac{\theta}{2})\, n$
$R_s = \cos(\frac{\theta}{2}) + \sin(\frac{\theta}{2})\, l$, where $l = n + I m$
$M = T_s R_s = \big(1 + I\frac{t_s}{2}\big)\, e^{l\frac{\theta}{2}} = e^{l(\frac{\theta}{2} + I\frac{d}{2})}$
$M = T_s R_s = \big(1 + I\frac{t_s}{2}\big) R_s = R_s + I\frac{t_s}{2}R_s = R_s + I R_s'$
$M = \cos\big(\frac{\theta}{2} + I\frac{d}{2}\big) + \sin\big(\frac{\theta}{2} + I\frac{d}{2}\big)\, l$
$|M| = M\widetilde{M} = 1$
$R_s\widetilde{R}_s = 1$, $R_s\widetilde{R}_s' + R_s'\widetilde{R}_s = 0$, $t_s = 2R_s'\widetilde{R}_s$
$M = (a_0 + a) + I(b_0 + b) = T_s R_s$
$\widetilde{M} = (a_0 - a) + I(b_0 - b) = \widetilde{R}_s\widetilde{T}_s$
$\overline{M} = (a_0 + a) - I(b_0 + b)$
$\widetilde{\overline{M}} = (a_0 - a) - I(b_0 - b)$
$a_0 = \frac{1}{4}\big(M + \widetilde{M} + \overline{M} + \widetilde{\overline{M}}\big)$
$I b_0 = \frac{1}{4}\big(M + \widetilde{M} - \overline{M} - \widetilde{\overline{M}}\big)$
$a = \frac{1}{4}\big(M - \widetilde{M} + \overline{M} - \widetilde{\overline{M}}\big)$
$I b = \frac{1}{4}\big(M - \widetilde{M} - \overline{M} + \widetilde{\overline{M}}\big)$

Linear Algebra

$f(x\wedge y\wedge\cdots\wedge w) = f(x)\wedge f(y)\wedge\cdots\wedge f(w)$
$f(I) = \det(f)\, I$
$\bar{f}(y) = e_k\, f(e^k)\cdot y = \partial_x\, f(x)\cdot y$
$f(X_r)\cdot Y_s = f[X_r\cdot\bar{f}(Y_s)]$, $r\ge s$
$X_r\cdot\bar{f}(Y_s) = \bar{f}[f(X_r)\cdot Y_s]$, $r\le s$
$f^{-1}(X) = \det(f)^{-1}\, I\,\bar{f}(I^{-1}X)$
$\bar{f}^{-1}(X) = \det(f)^{-1}\, I\, f(I^{-1}X)$
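The rotation formula $X_r \to R X_r \widetilde{R}$ of the Reflections and Rotations block can be exercised numerically. The sketch below (our encoding, not the book's) uses the fact that the rotors of $G_3$, its even subalgebra, multiply like quaternions:

```python
import numpy as np

def qmul(a, b):
    """Hamilton product; quaternions model the rotors of G_3."""
    w1, x1, y1, z1 = a
    w2, x2, y2, z2 = b
    return np.array([
        w1*w2 - x1*x2 - y1*y2 - z1*z2,
        w1*x2 + x1*w2 + y1*z2 - z1*y2,
        w1*y2 + y1*w2 + z1*x2 - x1*z2,
        w1*z2 + z1*w2 + x1*y2 - y1*x2,
    ])

def rotate(v, axis, theta):
    """x -> R x R~ with the rotor R = cos(θ/2) + sin(θ/2) n."""
    n = np.asarray(axis, dtype=float)
    n = n / np.linalg.norm(n)
    R = np.concatenate([[np.cos(theta / 2)], np.sin(theta / 2) * n])
    Rt = R * np.array([1.0, -1.0, -1.0, -1.0])     # the reverse of R
    return qmul(qmul(R, np.concatenate([[0.0], v])), Rt)[1:]

# Rotating e1 by 90° about e3 gives e2
assert np.allclose(rotate(np.array([1.0, 0, 0]), [0, 0, 1.0], np.pi / 2),
                   [0, 1.0, 0], atol=1e-12)
```

Since $R\widetilde{R} = 1$, the map preserves lengths; the half-angle in $R$ is what makes the sandwich product produce a full rotation by $\theta$.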
Frames and Contractions

$e_i\cdot e^j = \delta_i^j$
$e^k = (-1)^{k+1}\, e_1\wedge\cdots\wedge\check{e}_k\wedge\cdots\wedge e_n\, E_n^{-1}$, $E_n = e_1\wedge e_2\wedge\cdots\wedge e_n$
$e_k\, e^k\cdot X_r = \partial_x\, x\cdot X_r = r X_r$
$e_k\, e^k\wedge X_r = \partial_x\, x\wedge X_r = (n-r) X_r$
$e_k X_r e^k = \partial_{\dot{x}} X_r \dot{x} = (-1)^r (n-2r) X_r$

Geometric Calculus

$\nabla(XY) = \dot{\nabla}\dot{X}Y + \dot{\nabla}X\dot{Y}$
$\nabla\wedge\nabla = 0$
$\nabla(x\cdot y) = y$, $\nabla x^2 = 2x$

Incidence Algebra

$P I^{-1} = \alpha I I^{-1} = \alpha = [P]$
$[x_1 x_2 x_3 \ldots x_n] = [x_1\wedge x_2\wedge x_3\wedge\cdots\wedge x_n] = (x_1\wedge x_2\wedge x_3\wedge\cdots\wedge x_n) I^{-1}$
$P = X_1\wedge X_2\wedge X_3\wedge X_4 = W_1 W_2 W_3 W_4 \langle(1+x_1)(1-x_2)(1+x_3)(1-x_4)\rangle_4$
$A^* = A I^{-1}$
$(A\wedge B)^* = [A\wedge B]$
$J = A\cup B = A\wedge B$
$(A\cap B)^* = A^*\cup B^*$; for $A = A'\wedge C$ and $B = B'\wedge C$, $A\cap B = (A^*\cdot B)$

Direct Distance

$e\cdot A_h = e\cdot(a_{1h}\wedge a_{2h}\wedge\cdots\wedge a_{kh}) = (a_2-a_1)\wedge(a_3-a_2)\wedge\cdots\wedge(a_k-a_{k-1})$
$d[a_{1h}\wedge\cdots\wedge a_{kh},\, b_h] = [\{e\cdot(a_{1h}\wedge\cdots\wedge a_{kh})\}\,(e\cdot b_h)]^{-1}\,[e\cdot(a_{1h}\wedge\cdots\wedge a_{kh}\wedge b_h)]$
$= [(a_2-a_1)\wedge\cdots\wedge(a_k-a_{k-1})]^{-1}\,[(a_2-a_1)\wedge\cdots\wedge(a_k-a_{k-1})\wedge(b-a_k)]$
$d[a_{1h}\wedge\cdots\wedge a_{rh},\, b_{1h}\wedge\cdots\wedge b_{sh}] = [\{e\cdot(a_{1h}\wedge\cdots\wedge a_{rh})\}\wedge\{e\cdot(b_{1h}\wedge b_{2h}\wedge\cdots\wedge b_{sh})\}]^{-1}\,[e\cdot(a_{1h}\wedge\cdots\wedge a_{rh}\wedge b_{1h}\wedge\cdots\wedge b_{sh})]$
$= [(a_2-a_1)\wedge\cdots\wedge(a_r-a_{r-1})\wedge(b_2-b_1)\wedge\cdots\wedge(b_s-b_{s-1})]^{-1}\,[(a_2-a_1)\wedge\cdots\wedge(a_r-a_{r-1})\wedge(b_1-a_r)\wedge(b_2-b_1)\wedge\cdots\wedge(b_s-b_{s-1})]$

Projective Geometry

$X\gamma_{n+1} = X\cdot\gamma_{n+1} + X\wedge\gamma_{n+1} = X\cdot\gamma_{n+1}\Big(1 + \frac{X\wedge\gamma_{n+1}}{X\cdot\gamma_{n+1}}\Big)$ (projective split)
$X\gamma_4 = X\cdot\gamma_4 + X\wedge\gamma_4 = X\cdot\gamma_4\Big(1 + \frac{X\wedge\gamma_4}{X\cdot\gamma_4}\Big) = X\cdot\gamma_4\,(1 + x)$
$L\cap\Phi = (X_1\wedge X_2)\cap(Y_1\wedge Y_2\wedge Y_3)$
$L\cap\Phi = [X_1 X_2 Y_2 Y_3] Y_1 + [X_1 X_2 Y_3 Y_1] Y_2 + [X_1 X_2 Y_1 Y_2] Y_3$
$L_1\wedge L_2 = 0$ (coplanar lines)
$L_1\cap L_2 = [X_1 X_2 Y_1] Y_2 - [X_1 X_2 Y_2] Y_1$ (intersecting lines)
$L = \Phi_1\cap\Phi_2 = (X_1\wedge X_2\wedge X_3)\cap(Y_1\wedge Y_2\wedge Y_3) = [X_1 X_2 X_3 Y_1](Y_2\wedge Y_3) + [X_1 X_2 X_3 Y_2](Y_3\wedge Y_1) + [X_1 X_2 X_3 Y_3](Y_1\wedge Y_2)$

Projective Invariants

$$Inv_1 = \frac{(X_3\wedge X_1)\cdot I_2^{-1}\,(X_4\wedge X_2)\cdot I_2^{-1}}{(X_4\wedge X_1)\cdot I_2^{-1}\,(X_3\wedge X_2)\cdot I_2^{-1}} = \frac{(t_3-t_1)(t_4-t_2)}{(t_4-t_1)(t_3-t_2)}$$

$$Inv_2 = \frac{(X_5\wedge X_4\wedge X_3)\cdot I_3^{-1}\,(X_5\wedge X_2\wedge X_1)\cdot I_3^{-1}}{(X_5\wedge X_1\wedge X_3)\cdot I_3^{-1}\,(X_5\wedge X_2\wedge X_4)\cdot I_3^{-1}} = \frac{A_{543}\, A_{521}}{A_{513}\, A_{524}}$$

$$Inv_3 = \frac{(X_1\wedge X_2\wedge X_3\wedge X_4)\cdot I_4^{-1}\,(X_4\wedge X_5\wedge X_2\wedge X_6)\cdot I_4^{-1}}{(X_1\wedge X_2\wedge X_4\wedge X_5)\cdot I_4^{-1}\,(X_3\wedge X_4\wedge X_2\wedge X_6)\cdot I_4^{-1}} = \frac{V_{1234}\, V_{4526}}{V_{1245}\, V_{3426}}$$
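$Inv_1$ is the classical cross-ratio of four collinear points; each bracket $(X_i\wedge X_j)\cdot I_2^{-1}$ reduces to a $2\times 2$ determinant. The following sketch (our numeric illustration) evaluates it and confirms its invariance under a projective map of the line:

```python
import numpy as np

def d2(a, b):
    """(A ∧ B)·I2^{-1} for homogeneous points of the projective line:
    the 2×2 determinant |a b|."""
    return a[0]*b[1] - a[1]*b[0]

# Four collinear points X_i = (t_i, 1) in homogeneous coordinates
t = [0.0, 1.0, 3.0, 4.0]
X = [np.array([ti, 1.0]) for ti in t]

inv1 = (d2(X[2], X[0]) * d2(X[3], X[1])) / (d2(X[3], X[0]) * d2(X[2], X[1]))
assert np.isclose(inv1, (t[2]-t[0])*(t[3]-t[1]) / ((t[3]-t[0])*(t[2]-t[1])))

# The cross-ratio is unchanged by any projective transformation of the line:
# every determinant picks up the same factor det(H), which cancels
H = np.array([[2.0, 1.0], [1.0, 3.0]])
Y = [H @ Xi for Xi in X]
inv1_h = (d2(Y[2], Y[0]) * d2(Y[3], Y[1])) / (d2(Y[3], Y[0]) * d2(Y[2], Y[1]))
assert np.isclose(inv1, inv1_h)
```

$Inv_2$ and $Inv_3$ generalize the same cancellation mechanism to areas and volumes.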
Conformal Geometric Algebra

$e_i^2 = 1$, $i = 1,\ldots,n$; $e_0 = \frac{e_- - e_+}{2}$, $e_\infty = e_- + e_+$
$E = e_\infty\wedge e_0 = e_+\wedge e_- = e_+ e_-$ (Minkowski plane)
$x_c = x_e + \alpha e_0 + \beta e_\infty$ (conformal split)
$P_E(x_c) = (x_c\cdot E)E = \alpha e_0 + \beta e_\infty \in \mathbb{R}^{1,1}$
$P_E^\perp(x_c) = (x_c\wedge E)E = x_e \in \mathbb{R}^n$
$x_c = P_E(x_c) + P_E^\perp(x_c)$
$x_c = x_c E^2 = (x_c\wedge E + x_c\cdot E)E = (x_c\wedge E)E + (x_c\cdot E)E$
$x_c = (x_c\wedge E)E + (x_c\cdot E)E = x_e + e_0 + \frac{1}{2}(k_1 + k_2)e_\infty = x_e + \frac{1}{2}x_e^2 e_\infty + e_0$
$x_c = x + \frac{1}{2}x^2 e_\infty + e_0$ (point)
$x_c^* = s_1\wedge s_2\wedge s_3\wedge s_4$ (dual point)
$s = p + \frac{1}{2}(p^2 - \rho^2)e_\infty + e_0$ (sphere)
$s^* = a\wedge b\wedge c\wedge d$ (dual sphere)
$x_c\wedge s^* = 0$
$\pi = n I_E - d e_\infty$, $n = (a-b)\wedge(a-c)$, $d = (a\wedge b\wedge c)I_E$ (plane)
$\pi^* = e_\infty\wedge a\wedge b\wedge c$ (dual plane)
$L = \pi_1\wedge\pi_2$, $L = n I_E - e_\infty m I_E$, $n = (a-b)$, $m = (a\wedge b)$ (line)
$L^* = e_\infty\wedge a\wedge b$ (dual line)
$z = s_1\wedge s_2 = s_1\wedge\pi_2$ (circle)
$z^* = a\wedge b\wedge c$ (dual circle)
$PP = s_1\wedge s_2\wedge s_3$, $PP = s\wedge L$ (point pair)
$PP^* = a\wedge b$ (dual point pair)
$s = A_r A_{r+1}^{-1}$; $s = z\pi^{-1}$, $s = PP\, L^{-1}$, $PP = sL = s\wedge L$

Lie Algebra of the Conformal Group

$[x\wedge y - (x\cdot E)(y\cdot E)]E = 0$
$E_i = e_+ e_-$, $B_{ij} = e_i e_j$ ($i < j = 1,\ldots,n$), $N_{ij} = e_i e_\pm$
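The conformal point embedding $x_c = x + \frac{1}{2}x^2 e_\infty + e_0$ can be checked numerically. The sketch below (our encoding; it assumes the standard bilinear form with $e_i\cdot e_j = \delta_{ij}$, $e_0^2 = e_\infty^2 = 0$, $e_0\cdot e_\infty = -1$, which follows from the $e_\pm$ definitions above):

```python
import numpy as np

def conf(x):
    """Embed a Euclidean point x of R^3 as x_c = x + ½x²e∞ + e0,
    stored as (x1, x2, x3, e0-coeff, e∞-coeff)."""
    x = np.asarray(x, dtype=float)
    return np.concatenate([x, [1.0, 0.5 * (x @ x)]])

def cdot(a, b):
    """Inner product with e_i·e_j = δ_ij, e0² = e∞² = 0, e0·e∞ = -1."""
    return a[:3] @ b[:3] - a[3]*b[4] - a[4]*b[3]

x, y = np.array([1.0, 2.0, 0.0]), np.array([4.0, -2.0, 2.0])
xc, yc = conf(x), conf(y)

assert np.isclose(cdot(xc, xc), 0.0)                          # points are null vectors
assert np.isclose(cdot(xc, yc), -0.5 * np.sum((x - y)**2))    # distances are encoded
```

The second assertion, $x_c\cdot y_c = -\frac{1}{2}\|x-y\|^2$, is what makes spheres and planes linear objects in this model.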
Conformal Transformations

$\sigma(x_c) = s\, x_c\, s^{-1}$ (inversion)
$x' = -\pi x \pi^{-1}$ (reflection)
$K_b = e_+ T_b e_+ = (e_\infty - e_0)(1 + b e_\infty)(e_\infty - e_0) = 1 + b e_0$ (transversion)
$T_a = 1 + \frac{1}{2}a e_\infty = e^{\frac{a}{2}e_\infty}$ (translator)
$R = n_2 n_1 = \cos(\frac{\theta}{2}) - \sin(\frac{\theta}{2})\, l = e^{-\frac{\theta}{2}l}$ (rotor)
$M = T R\widetilde{T} = \cos(\frac{\theta}{2}) - \sin(\frac{\theta}{2})\, L = e^{-\frac{\theta}{2}L}$ (motor)
$Q' = \prod_{i=1}^{n} M_i\; Q\; \prod_{i=1}^{n}\widetilde{M}_{n-i+1}$
$D_\lambda = (1+E) + \lambda(1-E) = e^{\lambda E}$ (dilator)
$E\big(x_e + \frac{1}{2}x_e^2 e_\infty + e_0\big)E = x_e - \frac{1}{2}x_e^2 e_\infty - e_0$ (involution)
$G = K_b T_a R_\alpha$ (conformal transformation)
$g(x_c) = G x_c (G^*)^{-1} = x'_c$

Differential Kinematics

$x'_p = \prod_{i=1}^{n} M_i\; x_p\; \prod_{i=1}^{n}\widetilde{M}_{n-i+1}$
$dx'_p = \sum_{j=1}^{n}\big[x'_p\cdot L'_j\big]\, dq_j$

Dynamics

$M\ddot{q} + C\dot{q} + G = \tau$
$M = V^T m V + \delta I$, $C = V^T m\dot{V}$, $G = V^T m a$

$$V = \begin{pmatrix} x'_1 & 0 & \cdots & 0\\ 0 & x'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ 0 & 0 & \cdots & x'_n \end{pmatrix}\begin{pmatrix} L'_1 & 0 & \cdots & 0\\ L'_1 & L'_2 & \cdots & 0\\ \vdots & \vdots & \ddots & \vdots\\ L'_1 & L'_2 & \cdots & L'_n \end{pmatrix} = XL$$

$(V^T m V + \delta I)\ddot{q} + V^T m\dot{V}\dot{q} + V^T F = \delta I\ddot{q} + V^T(mV\ddot{q} + m\dot{V}\dot{q} + F) = \delta I\ddot{q} + V^T m(V\ddot{q} + \dot{V}\dot{q} + a)$
References
1. Ablamowicks R. CLIFFORD Software packet using Maple for Clifford algebra computations. http://math.tntech.edu/rafal. 2. Arena, P., Caponetto, R., Fortuna, L., Muscato, G., and Xibilia, M.G. 1996. Quaternionic multilayer perceptrons for chaotic time series prediction. IEICE Trans. Fundamentals, E79-A(10):1–6. 3. Ashdown, M.A.J. 1998. Maple code for geometric algebra. http://www.mrao.cam.ac.uk/emaja. 4. Barrett-Technology http://www.barrett.com/robot/index.htm. 5. Baker, S., and Nayar, S. 1998. A theory of catadioptric image formation. In Proc. Int. Conf. Computer Vision, Bombay, India, pp. 35–42. 6. Barnabei, M., Brini, A., and Rota, G.-C. 1985. On the exterior calculus of invariant theory. Journal of Algebra, 96:120–160. 7. Barnett, V. 1976. The ordering of multivariate data. Journal of Royal Statistical Society A, 3:318–343. 8. Bayro-Corrochano, E. 1996. Clifford self-organizing neural network, Clifford wavelet network. Proc. 14th IASTED Int. Conf. Applied Informatics, Feb. 20–22, Innsbruck, Austria, pp. 271–274. 9. Bayro-Corrochano, E. 2005. Robot perception and action using conformal geometry. In Handbook of Geometric Computing. Applications in Pattern Recognition, Computer Vision, Neurocomputing and Robotics., E. Bayro-Corrochano (Ed.), Springer-Verlag, Heidelberg, Chap. 13, pp. 405–458. 10. Bayro-Corrochano, E., Buchholz, S., and Sommer, G. 1996. Self-organizing Clifford neural network. IEEE ICNN’96, Washington, DC, June, pp. 120–125. 11. Bayro-Corrochano, E., Daniilidis, K., and Sommer, G. 1997. Hand-eye calibration in terms of motions of lines using geometric algebra. In 10th Scandinavian Conference on Image Analysis, Vol. I, Lappeenranta, Finland, pp. 397–404. 12. Bayro-Corrochano, E., Daniilidis, K., and Sommer, G. 2000. Motor algebra for 3D kinematics. The case of the hand–eye calibration. In International Journal of Mathematical Imaging and Vision, 13(2):79–99. 13. Bayro-Corrochano, E., and Lasenby, J. 1998. 
Geometric techniques for the computation of projective invariants using n uncalibrated cameras. In Proceedings of the Indian Conference on Computer Vision and Image Processing, New Delhi, India, December 21–23, pp. 95–100. 14. Bayro-Corrochano, E., Lasenby, J., and Sommer, G. 1996. Geometric algebra: A framework for computing point and line correspondences and projective structure using n uncalibrated cameras. In IEEE Proceedings of ICPR’96, Vienna, Austria, Vol. I, August, pp. 334–338. 15. Bayro-Corrochano, E., and L´opez-Franco, C. 2004. Omnidirectional vision: Unified model using conformal geometry. In Proc. European Conference on Computer Vision, Prague, Czech Republic, pp. 536–548. 16. Bayro-Corrochano, E., and Rivera-Rovelo, J. 2004. Non-rigid registration and geometric approach for tracking in neurosurgery. International Conference on Pattern Recognition, Cambridge, UK, pp. 717–720.
17. Bayro-Corrochano, E., and Rosenhahn, B. 2000. Computing the intrinsic camera parameters using Pascal’s theorem. In Geometric Computing with Clifford Algebra (G. Sommer, Ed.), Chapter 16, Springer-Verlag, Heidelberg, Germany. 18. Bayro-Corrochano, E., and Sobczyk, G. 2000. Applications of Lie algebras in the geometric algebra framework. In Geometric Algebra Applications with Applications in Science and Engineering (E. Bayro-Corrochano and G. Sobczyk, Eds.), Birkh¨auser, Boston. 19. Bayro-Corrochano, E., and Zhang, Y. 2000. The motor extended Kalman filter: A geometric approach for 3D rigid motion estimation. Journal of Mathematical Imaging and Vision, 13(3):205–227. 20. Belinfante, J., and Kolman, B. 1972. Lie Groups and Lie Algebras: With Applications and Computational Methods, SIAM, Philadelphia. 21. Bell I. CCC MV 1.3.0 to 1.6 sources supporting N 63. http://www.iancgbell.clara.net/ maths/index.htm. 22. Benosman, R., and Kang, S. 2000. Panoramic Vision, Springer-Verlag, New York. 23. Bernard, C. 1997. Discrete wavelet analysis for fast optic flow computation. Applied and Computational Harmonic Analysis, 11(1): 32–63. 24. Biglieri, E., and Yao, K. 2000. Some properties of singular value decomposition and their application to digital signal processing. Signal Processing, 18:277–289. 25. Blaschke, W. 1960. Kinematik und Quaternionen. VEB Deutscher Verlag der Wissenschaften, Berlin. 26. Borst, C., Fischer, M., and Hirzinger, G. 1999. A Fast and Robust Grasp Planner for Arbitrary 3D Objects. ICRA99, International Conference on Robotics and Automation, pp. 1890–1896. 27. Bottou, L., Ortes, C., Denker, J., Drucker, H., Guyon, I., Jackel, L., LeCun, Y., M¨uller, U., Sackinger, E., Simard, P., and Vapnik, V. 1994. Comparison of classifier methods: A case study in handwriting digit recognition. In International Conference on Pattern Recognition, pp. 77–87, IEEE Computer Society Press. 28. Brannan, D., Esplen, M., and Gray, J. 2002. 
Geometry, Cambridge University Press, New York. 29. B¨ulow T. 1999. Hypercomplex Fourier transforms. Ph.D. thesis, Computer Science Institute, Christian Albrechts Universit¨at, Kiel. 30. Burges, C.J.C. 1998. A tutorial on support vector machines for pattern recognition. Knowledge Discovery and Data Mining, 2(2):1–43, Kluwer Academic Publishers. 31. Canudas, C., Siciliano, B., and Bastin, G. 1996. Theory of Robot Control, Springer-Verlag, London. 32. Carlsson, S. 1994. The double algebra: An effective tool for computing invariants in computer vision. Applications of Invariance in Computer Vision, Lecture Notes in Computer Science 825; Proceedings of the 2nd Joint Europe-U.S. Workshop, Azores, October 1993. SpringerVerlag. 33. Carlsson, S. 1998. Symmetry in perspective. In Proceedings of the European Conference on Computer Vision, Freiburg, Germany, pp. 249–263. 34. Cerejeiras, P., Ferreira, M., K¨ahler, U., and Sommen, F. 2007. Continuous wavelet transform and wavelet frames on the sphere using Clifford analysis. Submitted to AIMS Journals. 35. Chantler, M.J. 1994. The effect of variation in illuminant direction on texture classification. Ph.D. thesis, Dept. of Computing and Electrical Engineering, Heriot-Watt University. 36. Chen, H. 1991. A screw motion approach to uniqueness analysis of head–eye geometry. In IEEE Conf. on Computer Vision and Pattern Recognition, Maui, Hawaii, June 3-6, pp. 145–151. 37. Chernov, V.M. 1995. Discrete orthogonal transforms with data representation in composition algebras. In Scandinavian Conference on Image Analysis, Uppsala, Sweden, pp. 357–364. 38. Chou, J.C.K., and Kamel, M. 1991. Finding the position and orientation of a sensor on a robot manipulator using quaternions. International Journal of Robotics Research, 10(3):240–254. 39. Chou, S., Schelter, W., and Yang, J. 1987. Characteristic sets and Gr¨obner bases in geometry theorem proving. In Computer-Aided Geometric Reasoning (H. Crapo, Ed.), INRIA, Rocquencourt, France, pp. 
29–56.
40. Chui, H., and Rangarajan, A. 2000. A new point matching algorithm for non-rigid registration. IEEE Conf. on Computer Vision and Pattern Recognition (CVPR), Vol. 2, pp. 44–51. 41. Clifford, W.K. 1873. Preliminary sketch of bi-quaternions. Proc. London Math. Soc., 4: 381–395. 42. Clifford, W.K. 1878. Applications of Grassmann’s extensive algebra. Am. J. Math., 1:350– 358. 43. Clifford, W.K. 1882. On the classification of geometric algebras. In Mathematical Papers by William Kingdon Clifford (R. Tucker, Ed.), Macmillan, London. [Reprinted by Chelsea, New York, 1968; Title of talk announced already in Proc. London Math. Soc. 7(1876):135. 44. Colios, C., and Trahanias, P.E. 2001. A framework for visual landmark identification based on projective and point-permutation invariant vectors. Robotics and Autonomous Systems Journal, 35:37–51. 45. Csurka, G., and Faugeras, O. 1998. Computing three dimensional project invariants from a pair of images using the Grassmann–Cayley algebra. Journal of Image and Vision Computing, 16:3–12. 46. Cybenko, G. 1989. Approximation by superposition of a sigmoidal function. Mathematics of Control, Signals and Systems, 2:303–314. 47. Daniilidis, K., and Bayro-Corrochano, E. 1996. The dual quaternion approach to hand– eye calibration. IEEE Proceedings of the International Conference on Pattern Recognition (ICPR’96), Vol. I, Vienna, Austria, August, pp. 318–322. 48. Denavit, J., and Hartenberg, R.S. 1955. A kinematic notation for the lower-pair mechanism based on matrices. J. Opt. Soc. Am. A, pp. 465–484. 49. Dodwell, P.C. 1983. The Lie transformation group model of visual perception. Perception and Psychophysics, 34(1):1–16. 50. Doran, C.J.L. 1994. Geometric algebra and its applications to mathematical physics. Ph.D. thesis, University of Cambridge. 51. Dorst, L., Fontjine, D., and Mann, T. GAIGEN 2: Generates fast CCC or JAVA sources for low dimensional geometric algebra. 52. Dorst, L., Fontjine, D., and Mann, S. 2007. 
Geometric Algebra for Computer Science. An Object-Oriented Approach to Geometry, Morgan Kaufmann Series in Computer Science, Cambridge, MA. http://www.science.uva.nl/ga/gaigen/ 53. Dorst, L., Mann, S., and Bouma, T. 1999. GABLE: A Matlab tutorial for geometric algebra. http://www.carol.wins.uva.nl/gable. 54. Dress, A., and Havel, T. 1993. Distance geometry and geometric algebra. Foundations of Physics, 23(10):1357–1374. 55. Ebling, J., and Scheuermann, G. 2005. Clifford Fourier transform on vector fields. IEEE Transactions on Visualization and Computer Graphics, 11(4):469–479. 56. Ell, T.A. 1992. Hypercomplex spectral transformations. Ph.D. thesis, University of Minnesota. 57. Faugeras, O. 1995. Stratification of three-dimensional vision: projective, affine and metric representations. J. Opt. Soc. Am. A, 12(3):465–484. 58. Felsberg, M. 1998. Signal processing using frequency domain methods in Clifford algebra. M.S. thesis, Computer Science Institute, Christian Albrechts Universität, Kiel. 59. Fleet, D.J., and Jepson, A.D. 1990. Computation of component image velocity from local phase information. International Journal of Computer Vision, 5:77–104. 60. Fletcher, R. 1987. Practical Methods of Optimization, 2nd ed., John Wiley and Sons, New York. 61. FitzGerald, G.F. 1902. "[Review of] Heaviside's Electrical Papers". In Scientific Writings of the Late George Francis FitzGerald (J. Larmor, Ed.), Dublin. 62. Fontijne, D. 2007. Efficient implementation of geometric algebra. Ph.D. thesis, University of Amsterdam. http://www.science.uva.nl/ fontjine/phd.html. 63. Fritzke, B. 1995. A growing neural gas network learns topologies. In Advances in Neural Information Processing Systems, Vol. 7, MIT Press, Cambridge, MA.
64. Fu, K.S., and Mui, J.K. 1980. A survey on image segmentation. Pattern Recognition, 12: 395–403. 65. Fulton, W., and Harris, J. 1991. Representation Theory: A First Course. Springer-Verlag, New York. 66. Gabor, D. 1946. Theory of communication. Journal of the IEE, 93:429–457. 67. Gentile, A., Segreto, S., Sorbello, F., Vassallo, G., Vitabile, S., and Vullo, V. 2005. CliffoSor, an innovative FPGA-based architecture for geometric algebra. In Proceedings of 45th Congress of the European Regional Science Association (ERSA), Vrije, Amsterdam, August 23–27, pp. 211–217. 68. Georgiou, G.M., and Koutsougeras, C. 1992. Complex domain backpropagation. IEEE Trans. on Circuits and Systems, pp. 330–334. 69. Geyer, C., and Daniilidis, K. 2000. A unifying theory for central panoramic systems and practical implications. In Proc. Eur. Conf. on Computer Vision, Dublin, pp. 445–461. 70. Gibbs, J.W. 1884. Elements of Vector Analysis. Privately printed in two parts, 1881 and 1884, Yale University Press, New Haven, CT. Reprinted in The Scientific Papers of J. Willard Gibbs, Vol. 2, pp. 84–90, Dover, New York, 1961. 71. Golub, G.H., and van Loan, C.F. 1989. Matrix Computations, Johns Hopkins University Press, Baltimore, MD. 72. Granlund, G.H., and Knutsson, H. 1995. Signal Processing for Computer Vision, Kluwer Academic Publishers. 73. Grassmann, H.G. 1844. Die Lineale Ausdehnungslehre, Wiegand, Leipzig. 74. Grassmann, H. 1877. Der Ort der Hamilton’schen Quaternionen in der Ausdehnungslehre. Math. Ann., 12:375. 75. Gu, Y.L., and Luh, J.Y.S. 1987. Dual-number transformation and its applications to robotics. IEEE Journal of Robotics and Automation, RA–3(6):615–623. 76. Hahn, S.L. 1992. Multidimensional complex signals with single-orthant spectra. In Proc. IEEE, 80(8):1287–1300. 77. Hahn, S.L. 1996. Hilbert Transforms in Signal Processing, Artech House, Boston. 78. Hamilton, W.R. 1853. Lectures on Quaternions, Hodges and Smith, Dublin. 79. Hamilton, W.R. 1866. 
Elements of Quaternions, Longmans Green, London; Chelsea, New York, 1969. 80. Hartley, R. 1984. Lines and points in three views – A unified approach. In ARPA Image Understanding Workshop, Monterey, CA. 81. Hartley, R. 1993. Chirality invariants. In DARPA Image Understanding Workshop, pp. 745–753. 82. Hartley, R.I. 1994. Projective reconstruction and invariants from multiple images. IEEE Trans. PAMI, 16(10):1036–1041. 83. Hartley, R. 1998. The quadrifocal tensor. In ECCV98, LNCS, Springer-Verlag. 84. Hartley, R.I., and Zisserman, A. 2003. Multiple View Geometry in Computer Vision, 2nd ed., Cambridge University Press, Cambridge. 85. Heaviside, O. 1892. Electrical Papers, 2 vols., London. 86. Hestenes, D. 1966. Space-Time Algebra, Gordon and Breach, London. 87. Hestenes, D. 1986. New Foundations for Classical Mechanics, D. Reidel, Dordrecht. 88. Hestenes, D. 1991. The design of linear algebra and geometry. Acta Applicandae Mathematicae, 23:65–93. 89. Hestenes, D. 1993. Invariant body kinematics I: Saccadic and compensatory eye movements. Neural Networks, 7:65–77. 90. Hestenes, D. 1993. Invariant body kinematics II: Reaching and neurogeometry. Neural Networks, 7:79–88. 91. Hestenes, D. 2001. Old wine in new bottles: A new algebraic framework for computational geometry. In Geometric Algebra with Applications in Science and Engineering (E. Bayro-Corrochano and G. Sobczyk, Eds.), Birkhäuser, Boston. 92. Hestenes, D. 2003. Oersted Medal Lecture 2002: Reforming the mathematical language of physics. American Journal of Physics, 71(2):104–121.
93. Hestenes, D. 2009. New tools for computational geometry and rejuvenation of screw theory. In Geometric Algebra Computing for Engineering and Computer Science (E. Bayro-Corrochano and G. Scheuermann, Eds.), Springer, London. 94. Hestenes, D., and Sobczyk, G. 1984. Clifford Algebra to Geometric Calculus: A Unified Language for Mathematics and Physics, D. Reidel, Dordrecht. 95. Hestenes, D., and Ziegler, R. 1991. Projective geometry with Clifford algebra. Acta Applicandae Mathematicae, 23:25–63. 96. Hildenbrand, D., Pitt, J., and Koch, A. 2009. High-performance geometric algebra computing using Gaalop. In Geometric Algebra Computing for Engineering and Computer Science (E. Bayro-Corrochano and G. Scheuermann, Eds.), Springer, London. 97. Hitzer, E., and Mawardi, B. 2007. Uncertainty principle for the Clifford geometric algebra Cl_{n,0}, n = 3 (mod 4), based on Clifford Fourier transform. In Wavelet Analysis and Applications, Series: Applied and Numerical Harmonic Analysis (T. Qian, M.I. Vai, and X. Yuesheng, Eds.), Springer, New York, pp. 45–54. 98. Hoffman, W.C. 1966. The Lie algebra of visual perception. Journal of Mathematical Psychology, 3:65–98. 99. Hollerbach, J., and Nahvi, A. 1995. Total least squares in robot calibration. In 4th International Symposium on Experimental Robotics IV, pp. 274–282. 100. Horaud, R., and Dornaika, F. 1995. Hand–eye calibration. International Journal of Robotics Research, 14:195–210. 101. Hornik, K. 1989. Multilayer feedforward networks are universal approximators. Neural Networks, 2:359–366. 102. Hough, P. 1962. Methods and means for recognizing complex patterns. U.S. Patent 3 069 654. 103. Hsu, C.W., and Lin, C.J. 2001. A comparison of methods for multi-class support vector machines. Technical report, National Taiwan University, Taiwan. 104. Hsu, C.W., and Lin, C.J. 2002. A simple decomposition method for support vector machines. Machine Learning, 46:291–314. 105. Jancewicz, B. 1990. 
Trivector Fourier transformation and electromagnetic field. Journal of Mathematical Physics, 31(8):1847–1852. 106. Joachims, T. 1998. Making large-scale SVM learning practical. In Advances in Kernel Methods – Support Vector Learning (B. Schölkopf, C.J.C. Burges, and A.J. Smola, Eds.), MIT Press, Cambridge, MA. Journal of Machine Learning Research, 5:819–844. 107. Kaiser, G. 1994. A Friendly Guide to Wavelets, Birkhäuser, Boston. 108. Kantor, I.L., and Solodovnikov, A.S. 1989. Hypercomplex Numbers: An Elementary Introduction to Algebras, Springer-Verlag, New York. 109. Kingsbury, N. 1999. Image processing with complex wavelets. Phil. Trans. Roy. Soc. Lond. A, 357:2543–2560. 110. Knerr, S., Personnaz, L., and Dreyfus, G. 1990. Single-layer learning revisited: A stepwise procedure for building and training a neural network. In Neurocomputing: Algorithms, Architectures and Applications (J. Fogelman, Ed.), Springer-Verlag, New York. 111. Koenderink, J.J. 1990. The brain: a geometry engine. Psychological Research, 52:122–127. 112. Kunze, S. 1999. Ein Hand–Auge–System zur visuell basierten Lokalisierung und Identifikation von Objekten. Diploma thesis, Christian-Albrechts-Universität Kiel, Institut für Informatik und Praktische Mathematik. 113. Kutzler, B., and Sifter, S. 1986. On the application of Buchberger's algorithm to automated geometry theorem proving. Journal of Symbolic Computation, 2:389–398. 114. Lasenby, A.N. 1994. A 4D Maple package for geometric algebra manipulations in spacetime. http://www.mrao.cam.ac.uk/eclifford 115. Lasenby, J., Bayro-Corrochano, E.J., Lasenby, A., and Sommer, G. 1996. A new methodology for computing invariants in computer vision. In IEEE Proceedings of the International Conference on Pattern Recognition (ICPR'96), Vienna, Austria, Vol. I, pp. 393–397. 116. Lasenby, J., and Bayro-Corrochano, E. 1997. Computing 3D projective invariants from points and lines. In Computer Analysis of Images and Patterns, 7th Int. Conf., CAIP'97 (G. 
Sommer, K. Daniilidis, and J. Pauli, Eds.), Kiel, Springer-Verlag, pp. 82–89.
117. Lasenby, J., and Bayro-Corrochano, E. 1999. Analysis and computation of projective invariants from multiple views in the geometric algebra framework. In Special Issue on Invariants for Pattern Recognition and Classification (M.A. Rodrigues, Ed.), Int. Journal of Pattern Recognition and Artificial Intelligence, 13(8):1105–1121. 118. Lee, Y., Lin, Y., and Wahba, G. 2001. Multicategory support vector machines. Technical report 1043, University of Wisconsin, Department of Statistics, pp. 10–35. 119. Li, H., Hestenes, D., and Rockwood, A. 2001. Generalized homogeneous coordinates for computational geometry. In Geometric Computing with Clifford Algebra (G. Sommer, Ed.), Springer-Verlag, New York, pp. 27–59. 120. Li, H., Hestenes, D., and Rockwood, A. 2001. Generalized homogeneous coordinates for computational geometry. In Geometric Computing with Clifford Algebra (G. Sommer, Ed.), Springer-Verlag, New York. 121. Li, M., and Betsis, D. 1995. Hand–eye calibration. In Proc. Int. Conf. Computer Vision, Boston, June 20–23, pp. 40–46. 122. Lina, J.-M. 1997. Complex Daubechies Wavelets: Filters Design and Applications, ISAAC Conference, University of Delaware. 123. Lorensen, W., and Cline, H. 1987. Marching cubes: A high resolution 3D surface construction algorithm. Computer Graphics, 21(4):163–169. 124. Lounesto, P. 1987. CLICAL software packet and user manual. Helsinki University of Technology of Mathematics, Research report A248. 125. Lounesto, P. 1997. Clifford Algebras and Spinors, Cambridge University Press, Cambridge. 126. Luong, Q.T., and Faugeras, O.D. 1995. The fundamental matrix: Theory, algorithms, and stability analysis. 127. Magarey, J.F.A., and Kingsbury, N.G. 1998. Motion estimation using a complex-valued wavelet transform. IEEE Trans. Image Proc. 6:549–565. 128. Mallat, S. 1989. A theory for multiresolution signal decomposition: The wavelet representation. IEEE Trans. Patt. Anal. Mach. Intell., 11(7):674–693. 129. Mallat, S. 2001. 
A Wavelet Tour of Signal Processing, 2nd ed., Academic Press, San Diego. 130. Maybank, S.J., and Faugeras, O.D. 1992. A theory of self-calibration of a moving camera. International Journal of Computer Vision, 8(2):123–151. 131. Maybeck, P. 1979. Stochastic Models, Estimation and Control, Vol. 1, Academic Press, New York. 132. Mawardi, B., and Hitzer, E. 2006. Clifford Fourier transformation and uncertainty principle for the Clifford geometric algebra Cl_{3,0}. Advances in Applications of Clifford Algebras, 16(1):41–61. 133. Mawardi, B., and Hitzer, E. 2007. Clifford algebra Cl_{3,0}-valued wavelet transformation, Clifford wavelet uncertainty inequality and Clifford Gabor wavelets. To appear in International Journal of Wavelets, Multiresolution and Information Processing. 134. Medioni, G., Lee, M., and Tang, C. 2000. A Computational Framework for Segmentation and Grouping, Elsevier Science, Burlington, MA. 135. Meer, P., Lenz, R., and Ramakrishna, S. 1998. Efficient invariant representation. International Journal of Computer Vision, 26:137–152. 136. Mehrotra, K., Mohan, C.K., and Ranka, S. 1997. Elements of Artificial Neural Networks, MIT Press, Cambridge, MA. 137. Miller, W. 1968. Lie Theory and Special Functions, Academic Press, New York. 138. Mishra, B., and Wilson, P. 2005. Hardware implementation of a geometric algebra processor core. In Proceedings of IMACS International Conference on Applications of Computer Algebra, Nara, Japan. 139. Mitrea, M. 1994. Clifford Wavelets, Singular Integrals and Hardy Spaces. Lecture Notes in Mathematics 1575, Springer-Verlag, New York. 140. Mundy, J., and Zisserman, A. (Eds.) 1992. Geometric Invariance in Computer Vision, MIT Press, Cambridge, MA. 141. Muñoz, X. 2002. Image segmentation integrating color, texture and boundary information. Ph.D. thesis, University of Girona, Girona, Spain.
142. Needham, T. Visual Complex Analysis. Oxford University Press, New York. Reprinted 2003. 143. Nguyen, V., Gächter, S., Martinelli, A., Tomatis, N., and Siegwart, R. 2007. A comparison of line extraction algorithms using 2D range data for indoor mobile robotics. Auton. Robots, 23(2):97–111. 144. Pan, H.-P. 1996. Uniform full information image matching complex conjugate wavelet pyramids. Proceedings of the 18th ISPRS Congress, Vienna, Vol. XXXI. 145. Passino, K.M. 1998. Fuzzy Control, Addison-Wesley, Reading, MA. 146. Pearson, J.K., and Bisset, D.L. 1992. Back propagation in a Clifford algebra. Artificial Neural Networks, 2 (I. Aleksander and J. Taylor, Eds.), pp. 413–416. 147. Pellionisz, A., and Llinás, R. 1980. Tensorial approach to the geometry of brain function: Cerebellar coordination via a metric tensor. Neuroscience, 5:1125–1136. 148. Pellionisz, A., and Llinás, R. 1985. Tensor network theory of the metaorganization of functional geometries in the central nervous system. Neuroscience, 16(2):245–273. 149. Perantonis, S.J., and Lisboa, P.J.G. 1992. Translation, rotation, and scale invariant pattern recognition by high-order neural networks and moment classifiers. IEEE Trans. Neural Networks, 3(2):241–251. 150. Perwass, C.B.U. 2000. Applications of geometric algebra in computer vision. Ph.D. thesis, University of Cambridge. 151. Perwass, C.B.U. 2006. CLUCal: http://www.clucal.info/ 152. Perwass, C., Gebken, C., and Sommer, G. 2003. Implementation of a Clifford algebra coprocessor design on a field programmable gate array. In Clifford Algebras: Application to Mathematics, Physics, and Engineering (R. Ablamowicz, Ed.), 6th Int. Conf. on Clifford Algebras and Applications, Cookeville, TN, Progress in Mathematical Physics, Birkhäuser, Boston, pp. 561–575. 153. Platt, J.C., Cristianini, N., and Shawe-Taylor, J. 2000. Large margin DAGs for multiclass classification. In Advances in Neural Information Processing Systems, Vol. 12, MIT Press, Cambridge, MA, pp. 
547–553. 154. Poelman, C.J., and Kanade, T. 1994. A paraperspective factorization method for shape and motion recovery. In European Conference on Computer Vision (J.-O. Eklundh, Ed.), Stockholm, pp. 97–108. 155. Poggio, T., and Girosi, F. 1990. Networks for approximation and learning. IEEE Proc., 78(9):1481–1497. 156. Porteous, I.R. 1995. Clifford Algebras and the Classical Groups, Cambridge University Press, Cambridge. 157. Pozo, J.M., and Sobczyk, G. 2001. Realizations of the conformal group. Geometric Algebra with Applications in Science and Engineering (E. Bayro-Corrochano and G. Sobczyk, Eds.), Birkhäuser, Boston. 158. Press, W.H., Teukolsky, S.A., Vetterling, W.T., and Flannery, B.P. 1994. Numerical Recipes in C, Cambridge University Press, New York. 159. Quan, L. 1994. Invariants of 6 points from 3 uncalibrated images. In Proceedings of the European Conference on Computer Vision, Vol. II, pp. 459–470. 160. Ranjan, V., and Fournier, A. 1995. Union of spheres (UoS) model for volumetric data. In Proceedings of the 11th Annual Symposium on Computational Geometry, Vancouver, Canada, C2-C3, pp. 402–403. 161. Reyes-Lozano, L., Medioni, G., and Bayro-Corrochano, E. 2007. Registration of 3D points using geometric algebra and tensor voting. International Journal of Computer Vision, 75(3):351–369. 162. Riesz, M. 1958. Clifford Numbers and Spinors, Lecture Series no. 38, University of Maryland. Reprinted as facsimile (E.F. Bolinder and P. Lounesto, Eds.), Kluwer, Dordrecht, 1993. 163. Rivera-Rovelo, J., and Bayro-Corrochano, E. 2009. The use of geometric algebra for 3D modelling and registration of medical data. Journal of Mathematical Imaging and Vision, 34(1):48–60. 164. Rooney, J. 1978. On the three types of complex number and planar transformations. Environment and Planning B, 5:89–99. 165. Rosenhahn, B., Perwass, C., and Sommer, G. 2005. Pose estimation of 3D free-form curves. International Journal of Computer Vision, 62(3):267–289.
166. Rumelhart, D.E., and McClelland, J.L. 1986. Parallel Distributed Processing: Explorations in the Microstructure of Cognition, MIT Press, Cambridge, MA. 167. Sabata, B., and Aggarwal, J.K. 1991. Estimation of motion from a pair of range images: A review. CVGIP: Image Understanding, 54:309–324. 168. Sangwine, S.J. 1996. Fourier transforms of colour images using quaternion or hypercomplex numbers. Electronics Letters, 32(21):1979–1980. 169. Selig, J.M. 1999. Clifford algebra of points, lines and planes. Technical report SBU-CISM-99-06, South Bank University, School of Computing, Information Technology and Maths. 170. Selig, J. 2000. Robotics kinematics and flags. In Advances in Geometric Algebra with Applications in Science and Engineering (E. Bayro-Corrochano and G. Sobczyk, Eds.), Birkhäuser, Boston, 2000. 171. Semple, J.G., and Kneebone, G.T. 1985. Algebraic Projective Geometry, Oxford University Press, Oxford. Reprinted by Oxford Science Publications. 172. Shashua, A. 1994. Projective structure from uncalibrated images: Structure from motion and recognition. PAMI, 16(8):778–790. 173. Shashua, A., and Werman, M. 1995. Trilinearity of three perspective views and its associated tensor. In Proceedings ICCV'95, MIT Press. 174. Shiu, Y.C., and Ahmad, S. 1989. Calibration of wrist-mounted robotic sensors by solving homogeneous transform equations of the form AX = XB. IEEE Trans. Robotics Automation, 5:16–27. 175. Siadat, A., Kaske, A., Klausmann, S., Dufaut, M., and Husson, R. 1997. An optimized segmentation method for a 2D laser-scanner applied to mobile robot navigation. Proceedings of the 3rd IFAC Symposium on Intelligent Components and Instruments for Control Applications, pp. 153–158. 176. Siciliano, B. 1990. Robot Kinematics, Università degli Studi di Napoli, Italy. CRC Press Handbook, Vol. 1, pp. 1–50. 177. Sobczyk, G. 1997. The generalized spectral decomposition of a linear operator. The College Mathematics Journal, 28(1):27–38. 178. Sobczyk, G. 1997. 
Spectral integral domains in the classroom. Aportaciones Matemáticas, Serie Comunicaciones 20:169–188. 179. Sobczyk, G. 2000. Universal geometric algebra. In Advances in Geometric Algebra with Applications in Science and Engineering (E. Bayro-Corrochano and G. Sobczyk, Eds.), Birkhäuser, Boston, 2001. 180. Sorenson, H.W. 1966. Kalman filtering techniques. In Advances in Control Systems Theory and Applications, Vol. 3 (C.T. Leondes, Ed.), Academic Press, New York, pp. 218–292. 181. Sparr, G. 1994. Kinetic depth. In Proceedings of the European Conference on Computer Vision, Vol. II, pp. 471–482. 182. Stark, H. 1971. An extension of the Hilbert transform product theorem. Proc. IEEE, 59:1359–1360. 183. Study, E. 1903. Geometrie der Dynamen, Leipzig. 184. Tait, P.G. 1890. Elementary Treatise on Quaternions, 3rd ed., Cambridge, p. vi. 185. Tamura, H., Mori, S., and Yamawaki, T. 1980. Textural features corresponding to visual perception. IEEE Transactions on Systems, Man and Cybernetics, 8(6):460–473. 186. Tomasi, C., and Kanade, T. 1992. Shape and motion from image streams under orthography: A factorization method. Int. J. Computer Vision, 9(2):137–154. 187. Triggs, W. 1995. Matching constraints and the joint image. In IEEE Proceedings of the International Conference on Computer Vision (ICCV'95), Boston, pp. 338–343. 188. Tsai, R.Y., and Lenz, R.K. 1989. A new technique for fully autonomous and efficient 3D robotics hand/eye calibration. IEEE Trans. Robotics and Automation, 5:345–358. 189. Tsochantaridis, I., Hofmann, T., Joachims, T., and Altun, Y. 2004. Support vector machine learning for interdependent and structured output spaces. International Conference on Machine Learning (ICML). 190. Vapnik, V. 1998. Statistical Learning Theory, Wiley, New York. 191. Wang, C. 1992. Extrinsic calibration of a vision sensor mounted on a robot. IEEE Trans. Robotics and Automation, pp. 161–175.
192. Weston, J., and Watkins, C. 1998. Multi-class support vector machines. In Proceedings of ESANN99 (M. Verleysen, Ed.), Brussels, D. Facto Press. Technical Report CSD-TR-98-04, Royal Holloway, University of London. 193. White, N. 1991. Multilinear Cayley factorization. Journal of Symbolic Computation, 11:421–438. 194. Wiaux, Y., Jacques, L., and Vandergheynst, P. 2005. Correspondence principle between spherical and Euclidean wavelets. Astrophysical Journal, 632(1):15–28. 195. Xu, C. 1999. Deformable models with applications to human cerebral cortex reconstruction from magnetic resonance images. Ph.D. thesis, Johns Hopkins University, pp. 14–63. 196. Yaglom, M. 1968. Complex Numbers in Geometry, Academic Press, Leicester, UK. Zhang, L., and Ghosh, B.K. 2000. Line segment based map building and localization using 2D laser rangefinder. In Proceedings of the IEEE International Conference on Robotics and Automation, Vol. 3, pp. 2538–2543. 197. Zhang, Z. 1992. Iterative point matching for registration of free-form curves. Technical report 1658, INRIA. 198. Zhang, Z., and Faugeras, O. 1992. 3-D Dynamic Scene Analysis, Springer-Verlag, New York. 199. Zwicker, E. 1998. Psychoacoustics: Facts and Models, 2nd ed., Springer-Verlag, New York.
Index
A additive split, 150 affine – , n-dimensional plane, 126 – n-plane, 133 – plane, 316, 347 – transformation, 239 affine plane, 169 – , lines and planes, 170 algebra – , exterior is the quotient of T(V), 589 – , full tensor T(V), 589 – , graded, 589 – Grassmann–Cayley, 590 – of Grassmann–Cayley, 245 algebra automorphism, 579 algebra of incidence, 244, 381 algebras – of Clifford, 5 – of Grassmann, 5, 46 algorithm – , TPS-RPM, 567 – RANSAC, 537 alignment condition, 305 analytic signal – , quaternionic, 204, 214 analytical – , signal partial, 205 – , signal quaternionic, 207 angle – , dihedral, 90 angular momentum, 105 anticommutator product, 10 antisymmetric part, 9 Arthur Cayley, 590 associative algebra – , unital, 575 attitude condition, 305 automated visual inspection, 449
automorphism, 580 automorphisms, 86
B back-propagation algorithm, 285 backward transformation, 303 bi-quaternion, 64, 80 bilinear constraint, 260, 383 biological creatures, 277 bivector, 5, 9, 64, 66, 69 – , Faraday, 34 – , doubling, 172 blade, 6, 11 – , magnitude or modulus of a, 20 blade of grade r, 6 body–eye – , calibration problem, 497 body–eye calibration algorithm, 533 body–eye calibration problem, 498 bracket, 145, 245, 590 – homogeneous equations, 591 – ring, 590 – statement, 594 bracket polynomial, 379 brain tumor, 452 Brauer class of the algebra, 581
C camera self-localization, 390 carriers, 165 catadioptric – , image, 266 – , image plane, 269 – , projection, 265 – , sensor, 264 – , unified model, 267 Cauchy–Riemann equations, 32
Cayley – , Arthur, 590 Cayley Factorization, 593 Cayley space, 592 Cayley–Menger determinant, 169 classification of the Pin representations, 584 Clifford – , convolution, 30 – , correlation, 30 Clifford Algebra – , structure and classification of, 579 Clifford algebra, 5, 46 – , complete classification of real, 581 – , tensor product of, 580 Clifford algebras – , real and complex, 577 Clifford–Fourier Transform, 214 Clifford algebras Cl_{p,q}(C), 583 Clifford group – , homomorphism from the, 583 – Γ(V) (or Lipschitz group), 582 Clifford groups – Pin and Spin groups and Spinors, 581 Clifford multilayer perceptron, 280 Clifford product, 575 Clifford Support Vector Machines, 278, 430 CliffordSVM – , regression, 294 cognitive – systems, 4 collinearity – , quasi-, 398 – , test, 398 common dividend of lowest grade, 21, 247 commutator product, 10 commutator product of bivectors, 125 complex filters, 359 complex number, 595 composed number, 64 condition of Mercer, 288 conformal – , direct kinematics, 177 – , geometry, 155 – , main involution, 179 – , mapping, 170 – , spaces, 152 – , split, 150, 154 – , transformation, 173, 180 conformal geometric algebra, 149 conformal representation, 446 conic – , parameterized equation, 369 constraint – , bilinear, 260
– , quadrilinear, 264 – , trilinear, 261 contraction, 16 convex hull test, 399 convex quadratic problem, 437 convolution, 29, 211 coordinates – , barycentric, 115 coordinates Plücker, 244 coplanarity test, 399 Coriolis tensors, 344 cross product, 18, 585 cross-ratio, 253, 374 – , 1D, 268, 381 – , 2D, 269 – , generalization of the, 254 CSVM, 278, 430 – , for classification, 288 – , nonlinear classification, 292 CT image, 565 curl, 27 – , of a multivector field, 28
D degenerated motor, 82 Delaunay tetrahedrization, 557 Denavit–Hartenberg – parameters, 300, 307 – procedure, 300 dense vector field, 444 density theorem, 278, 286 derivative – , of a vector, 27 deterministic annealing, 568 differential geometry, 585 differential kinematics, 325 dilator, 179 Dirac theory, 39 direct kinematics, 300, 306 direct product, 595 direct sum, 592 directance, 151 directed distance, 141, 171 – of lines, 141 – of planes, 142 direction moment, 172 direction of the line, 244 directrix, 180 discrete quaternionic Fourier transform, 212 disparity estimation, 358 disparity images, 361 div, 27
divergence – , of a multivector field, 28 division algebra, 584 doubling technique, 212 dual – , equation of the point, 96 – , function of a variable, 63 – , unit shear number, 65 – angles, 301 – number, 64, 65 – spinor representation, 84 – terms of direction and moment, 83 duality, 19, 244, 246 dynamic equation, 107 dynamic motion model, 478 dynamics – of a robot arm, 340
E electromagnetic – , field strength, 34 end-effector, 300 endomorphisms, 127 endoscopes in surgeries, 501 endoscopic camera, 497 energy – , kinetic, 328 enforcing geometric constraint, 481 epipole, 392 estimation – , batch using SVD techniques, 459 – of 3D Euclidean transformation, 459 – of 3D rigid motion using 3D line observations, 470 – of the hand–eye motor using SVD, 464 – rotation, 476 estimation of 3D correspondences, 415 Euclidean – , covariant geometry, 150 – plane, 64 – signature, 240 Euler representation, 82 Euler–Lagrange equations, 344 even subalgebra, 80 exact sequences, 582 exponential mapping, 128 extended Kalman Filter – , motor, 477 extended Kalman filter, 473 – rotor, 474 extensors, 592 exterior algebras, 586 exterior algebra, 575
exterior power of V, 587 extrinsic parameters, 239 F factorization method for shape and motion, 393 Fast Fourier Transform, 218 feature space, 288 FFT, 218 field of the complex numbers, 578 filter complex Gabor, 213 Flag, 595 flag, 115, 144 – , involving points, lines and planes, 111 – line–plane, 111 – point–line, 111 – point–line–plane, 111 flag, soma, 177 flat point, 270 flats, 165 flow vector, 348 focal length, 242 forward transformation, 303 Fourier – , 2D transform, 203, 204 – , inverse quaternionic transform, 204 – , quaternionic transform, 201 Fourier Transform – , Clifford, 214 – , Space and Time Geometric Algebra, 218 – , n Dimensional Clifford, 219 Fourier transform – , discrete quaternionic, 212 – , inverse discrete quaternionic, 212 – , one dimensional, 203 – , two dimensional, 203 frame – , nonorthonormal, 7 – , reciprocal frame, 7 Frobenius theorem, 73 fundamental matrix, 260, 382, 383 – observed, 260 fuzzy surfaces, 417 G Gaussian, 280 – kernel, 348 general linear Lie algebra, 129 generalized dot product, 131 Generalized Gradient Vector Flow (GGVF), 444 generalized homogeneous coordinates, 154 generalized inner product, 11, 13, 17
geometric – correlation operator, 283 – cross-correlation, 282 – neuron, 281, 425 – product, 281 geometric algebra, 5, 46 – , Euclidean, 67 – , degenerated, 299 Geometric calculus in 2D, 31 geometric constraint, 407, 481 geometric indicator, 111 geometric product, 6, 15, 69, 89 Gibbs' vector algebra, 584 Gibbs – Josiah Willard, 584 grad, 27 grade involution, 20 graded algebras, 589 gradient information, 452 Gram matrix, 291 grasping operation, 304 Grassmann algebra, 119 Grassmann – , reciprocal subalgebras, 118 – Hermann, 590 Grassmann algebra, 586 Grassmann–Cayley algebra, 591–593 Grassmann–Cayley algebras, 590 Grassmann–Cayley statements, 593 grid reference frame, 496 group – , orthogonal SO(2), 130 – , spinor Spin(2), 131 – , sub-one-parameter, 134 groups Spin(3), 583 Growing Neural Gas (GNG), 444
H Haar measure, 232 Hahn, 208 Hamilton relations, 240 Hamilton W.R., 72 hand–eye – , using CGA, 545 hand–eye calibration – in CGA, 493 hand-eye problem, 460 – Euclidean transformation, 460 Hartley transform, 203, 210, 293 Heaviside – Oliver, 584 Hermann Grassmann, 590 Hermitian
– , inner product, 78 – , quaternionic function, 74 – function, 204 – symmetry, 206 Hermitian matrix, 583 Hesse distance, 94, 96, 172, 314 Hilbert transform, 205, 207 – , partial, 206 – , total, 206 Hodge – , dual, 18 Hodge dual, 246 Hodge star operator, 17 homogeneous – coordinates, 139 – representant, 140 homogeneous coordinates, 80, 87, 237, 241, 243, 299 homogeneous degree, 589 homogeneous subspaces, 6 homogeneous vector, 6 horosphere, 126, 133, 154 – , homogeneous model, 154 Hough Transform, 421 Hough transform – for line detection, 552 hyperplane, 153 hyperplane reflection, 582 hypersphere, 155 hypervolume, 64 I incidence algebra, 381 incidence operators, 19 incidence relation – , point, line and a plane, 110 inertia of the link, 330 inner product, 9 – , generalized, 11 – or contraction, 15 Inner Product Null Space, 157 interior product, 588 internal world, 277 intersection, 139 – of line and plane, 248 – of two lines, 250 – of two planes, 249 invariant, 255, 376 – , p², 398 – , point-permutation, 398 – , projective and permutation p²-, 401 – , recognition phase, 402 – , relative linear, 380
invariant theory, 591 invariants – , in the conformal space, 268 – , projective and permutation p², 271 inverse discrete quaternionic Fourier transform, 212 inverse kinematic – using conformal geometric algebra, 318 – using motor algebra, 308 – using the 3D affine plane, 315 inverse kinematics, 300, 308, 316 inverse point projection, 267 inversion, 173 involution, 179, 579 IPNS – , Inner-Product Null Space, 157 IPNS representation – , of the line, 161 – , of the point, line, plane, circle, sphere, 162 – , of the sphere, 158 isomorphism, 588 isomorphisms, 287 iterated closest point (ICP), 568
J Jacobi identity, 79, 130 join, 21, 139, 247 – operation, 592 join-image, 393, 394 joins, 21 joint, 300 joint transition, 300 Josiah Willard Gibbs, 584
K k-vector, 6 Kalman filter, 470 kernel – , Gaussian, 348 kinematics – , differential, 325 – direct, 306 – inverse, 308 kinetic energy, 328 – of a robot arm, 329 KKT conditions, 290 Klein–Gordon equation, 38
L Lagrange multiplier, 290 landmark identification, 398
Laplace expansion, 378 laser scanner, 533 learning machines – , polynomial, 288 – , radial basis, 288 – , two-layer neural networks, 288 learning rate, 286 Lie – , algebra generator, 177 – , algebra generators, 75 – , bivector algebra, 77 – , group manifold, 75 – , group of rotors, 75 – , group theory, 74 – algebras, 127 – bracket, 129 – groups, 127 – operators, 349 – perceptrons, 350 Lie algebra, 172 – , of the conformal group, 172 line – direction, 95 – moment, 95 line motion model, 478 linear algebra, 22 – , derivations, 30 linear motion equation, 480 linear shift-invariant, 213 Lobachevsky Nikolai, 152 local quaternionic phase, 214 logical cube, 559 Long Term Memory (LTM), 441 Lorentz transformation, 65, 66 Lorentzian metric, 33 Lorentzian space–time, 243 Lorenz attractor, 428 loudness image, 350
M
magnetic resonance (MR), 451
magnetic resonance image (MRI), 448
magnitude, 20
– of a rotor, 68
manifold, 127
– , high-dimensional, 443
manipulator
– SCARA, 300, 307
– Stanford, 300, 302, 308
map 2D, 537
map building, 536
– using laser and stereo vision, 545
map building 3D, 538
MAPLE, 307, 311, 315
marching
– cubes, 557
– spheres, 557
marching cubes algorithm, 415, 557
marching spheres
– , object registration, 564
matching laser reading, 533
matrix
– , Coriolis, 337
Maxwell equations, 32, 33, 35, 42
measurement model, 473
medical image analysis, 557
meet, 21, 22, 139, 247, 317
– operation, 592
meet dual to the join, 593
MEKF filter, 483
metric
– , Euclidean, 13, 242
– , Minkowski, 13, 242
metric geometry, 13, 242
Minkowski plane, 14, 149, 172
MLP
– , multivector-valued network, 427
– , real-valued network, 427
model of Cybenko, 283
model of motion of lines in 4D space, 485
modeling of conics, quadrics, 590
modulus, 20
moment of the line, 244
momentum, 286
motion, 396
– , 3D model of the line, 97
– , 3D model of the plane, 97
– , 3D model of the point, 97
– , 4D model of the line, 100
– , 4D model of the plane, 100
– , 4D model of the point, 100
– , motor model of the line, 98
– , motor model of the plane, 99
– , motor model of the point, 98
– , screw of the line, 98
motion equation
– , of a line, 110
– , of a plane, 110
– , of a point, 110
motor, 64, 80
– , degenerate, 85
– , dual part of the, 84
– reflections, 86
motor algebra
– , spatial velocity, 108
Motor-Extended Kalman Filter, 477
multidimensional descent learning rule, 285
multilayer perceptron, 278
– , complex, 279
multilinear mapping of V, 588
multiresolution analysis, 221
multivalued function (field), 26
multivector, 6
– , dual of a, 19
– , magnitude of a, 21
– , valued function, 25
– anti-involution, 286
– inhomogeneous, 15
– valued MLP network, 426
multivector field, 28
multivector product, 8
N
Neural Gas, 444
neuron
– , McCulloch–Pitts, 281
– , geometric, 281
nondegenerate quadratic form, 577, 580
non-spatial part, 244
nonlinear mappings, 278
nonorthonormal frames, 7
norm of a multivector, 285
null basis, 149
null cone, 153, 154
Null Space
– , Inner Product, 157
– , Outer Product, 157
null vector, 149, 150
null vectors, 154
number
– , complex, 63, 65
– , composed, 63
– , dual, 63
O
object registration, 564
observable system, 13, 242
odometry, 535
Oliver Heaviside, 584
omnidirectional camera, 539
one-parameter group, 128
operator
– , Hodge star, 17
OPNS
– , Outer Product Null Space, 157
OPNS representation
– , of the circle, 160
– , of the plane, 159
– , of the point, 161
– , of the point, line, plane, circle, sphere, 162
– , of the sphere, 159
optical
– plane, 256
– rays, 256
optical flow, 348
optimal hyperplane, 294
optimal Kalman gain matrix, 481
orthogonal automorphism, 582
orthogonal group, 583
orthogonal idempotents, 580
orthogonal transversions, 582
Outer Product Null Space, 157
outermorphism, 22
outstar output, 349
P
p²-invariants, 271
pair of points, 162
pan-tilt unit, 550
partial Hilbert transform, 206
Pascal theorem, 274, 370
Pascal theorem using brackets, 370
path following, 541
Pauli
– , and Dirac theories, 32
– , spin matrices, 37
pencil of lines, 369
perception–action cycle, 367
perceptrons, 349
perspective camera, 239
Peter Guthrie Tait, 585
Pin and Spin groups, 583
Pin group, 583
Pin group to the orthogonal group, 583
pitch, 82
Plücker coordinates, 95, 244, 259, 591
Plücker–Grassmann relations, 378
plane at infinity, 140
plunge, 166
– , and meet, 166
points at infinity, 387
polynomial ring, 591
potential energy, 335
prediction, 429
problem
– , encoder–decoder, 426
– , xor, 425
product
– , anticommutator, 10
– , associative graded, 589
– , commutator, 10
– , cross, 18
– , generalized inner, 13
– , inner, 19
– , outer, 19
projection, 12, 150
projective
– , fundamental invariant, 251
– , invariant fundamental, 373
– basis, 377
– depth, 391
– geometry, 13, 237, 242, 369
– invariant, 370, 381
– rays, 139
– reconstruction, 395
– split, 13, 87, 240–242, 253, 259, 260, 375
projective group, 591
pseudoscalar, 19, 240
Q
QFT, 230
– estimation of optical flow, 357
– estimation of quaternionic phase, 358
– preprocessing of speech, 352
– recognition of French phonemes, 355
quadratic form, 575
quadrifocal tensor, 264
quadrilinear constraint, 264
quantum mechanics, 37
quasi-collinearity, 398
quaternion, 69, 240
– , unit, 67
– algebra, 66
– Gabor filter, 353
quaternion algebra, 595
quaternion algebra of Hamilton, 578
Quaternion Fourier Transform, 230
quaternion Gabor kernel, 293
quaternionic
– wavelet filters, 361
– wavelet pyramid, 361
quaternionic filters, 359
quaternionic Fourier transform, 201
quaternionic Gabor filters, 201, 230
R
radial basis function, 284
– network, 279
reciprocal frame, 7
reciprocal frames
– , with curvilinear coordinates, 31
reciprocal null cones, 118
reflection, 67, 68, 90, 92
region of interest (ROI), 449
registration methods, 567
rejection, 12, 150
relativistic phenomena, 38
rendezvous method, 309
representation
– , 3D of the line, 93
– , 3D of the plane, 94
– , 3D of the point, 93
– , 4D of the line, 96
– , 4D of the plane, 96
– , 4D of the point, 96
– , asymmetric, 95
– , motor of the line, 95
– , motor of the plane, 95
– , motor of the point, 95
reversion, 20, 68
rigid body motion in CGA, 491
rigid motion, 299
robot navigation, 540
robust point matching (TPS–RPM), 568
Rodrigues formula, 89, 104, 114
rotation
– , orthogonal, 70
rotor, 67, 68, 70, 81, 82, 240
– , algebra of, 69
– , two conformal reflections, 177
– , unit, 69
– , orthogonal, 70
– algebra, 67
rotor, angular velocity, 105
rounds, 165
ruled surface, 180
– , Plücker conoid, 183
– , cone and a sphere, 182
– , cycloidal curves, 181
– , helicoid, 182
S
Schrödinger equation, 37
Schrödinger–Pauli equation, 35
screw, 82, 300
– axis line, 82
– transformation, 301
screw axis, 478
segmentation, 559
– , boundary- and region-based, 559
– , brain tumor, 562
– , texture, 560
segmenting a ventricle, 451
self-adjoint, 23
Self-Organizing Maps (SOM), 444
sensor calibration, 533
shape, 396
– constraint, 379
sigmoid, 280
signature, 64, 87
– , Euclidean, 87
simplex, 23, 141, 187
– , operators, 25
simplexes
– , and sphere, 168
Simpson rule, 186, 274
singular value decomposition, 393
skew-symmetrization, 589
space–time split, 13, 242
space–time system, 13, 242
spatial part, 244
spatial velocity, 105
– , rigid body, 101
special orthogonal group, 583
spectral form, 128
sphere
– , dual, 158
spin group, 65
spin group Spin(3), 583
spinor, 81, 82
– , dual-number representation of the, 84
spinor norm Q, 582
spinor operator, 37, 38
spinor representation, 35
spiral 3D, 430
split
– , additive, 150
– , conformal for points and simplexes, 151
– , multiplicative, 150
Stanford manipulator, 300, 309
statements
– , algebraic synthetic, 590
– , projective synthetic, 590
stereographic projection, 155
stereoscopic views, 437
streamline, 445
subgroup of index 2, 582
subspace of k-tensors, 587
superalgebras, 590
supersymmetry, 590
support
– multivector machine, 288
– vector machine, 288
surrounds, 165
symmetric part, 9
symmetry, 332, 409
T
Tait, Peter Guthrie, 585
tangent operators, 129
Taylor expansion, 76, 78
Taylor series, 63, 84
tensor
– , ellipsoid-type, 414
– , stick, 414
tensor calculus, 35
Tensor Voting, 405
– methodology, 409
tensors
– , second-order symmetric, 409
test
– collinearity, 398
– convex hull, 399
– coplanarity, 399
theorem
– , Apollonius, 186
– , Artin–Wedderburn, 580
– , Desargues, 146, 187, 274
– , Mercer, 292
– , Pascal, 187
– , in projective and Euclidean geometry, 594
– , second fundamental of invariant theory, 591
– Cartan–Dieudonné, 584
Time Delay Neural Network, 357
touching condition, 305
tracker
– endoscope, 491
– endoscope calibration, 494
– Polaris, 496
transform
– , Hartley, 203
– , Hilbert, 205, 207
transformation, 64, 65, 69
– , Euclidean, 80, 81
– , backward, 303, 313
– , forward, 303, 313
– , group on the plane, 64
– , prismatic, 301, 303
– , revolute, 301, 303
– , screw, 301
– , shear, 65
– of points, lines, planes, 301
translation, 173
translator, 82, 302
transversor, 176
trifocal tensor, 262, 384
trilinear constraint, 261, 262
trivector, 10, 66
twist, 82
– , coupled, 180
– , exponential, 104
U
union, 139
Union of Spheres, 557
unit pseudoscalar, 240
unit sphere, 595
universal algebra Clp,q(V), 582
universal approximators, 279
universal geometric algebra, 119
universal table of geometric algebras, 287
V
vector
– , bases orthogonal null, 122
– , bases reciprocal, 118
– field, 137
vector derivative, 27
velocity
– , linear and angular, 105
versor, 133, 173, 443
– , for dilation, 179
– , for inversion, 175
– , for transversion, 176
– , representation, 173
vertex, 180
visual invariants, 134
visual landmark construction, 399
voting
– , ball field, 414
– , sparse stick, 416
voting fields, 411
voting fields in 3D, 410
W
wavelet
– , 3-dimensional Clifford transform, 232, 233
– , Clifford transform, 233
– , Euclidean, 235
– , complex transform, 223
– , continuous conformal geometric algebra transform, 234
– , mother, 220
– , n-dimensional Clifford transform, 235
– , quaternion transform, 225
– , quaternionic analysis, 231
– , quaternionic pyramid, 228, 230
– , spherical, 235
wavelet pyramid, 223
wavelet transform
– , continuous, 220
wedge product, 9
Weyl representations, 583
William K. Clifford, 575
Witt basis, 123
Wolfe dual programming, 290
X
XOR problem, 425