Multibody Dynamics
Computational Methods in Applied Sciences Volume 12 Series Editor E. Oñate International Center for Numerical Methods in Engineering (CIMNE) Technical University of Catalunya (UPC) Edificio C-1, Campus Norte UPC Gran Capitán, s/n 08034 Barcelona, Spain
[email protected] www.cimne.com
For other titles published in this series, go to www.springer.com/series/6899
Carlo L. Bottasso
Multibody Dynamics Computational Methods and Applications
Editor C.L. Bottasso Politecnico di Milano Dipartimento di Ingegneria Aerospaziale Via La Masa, 34 20156 Milano Italy
[email protected]
ISBN 978-1-4020-8828-5
e-ISBN 978-1-4020-8829-2
Library of Congress Control Number: 2008933572
All Rights Reserved
© 2009 Springer Science + Business Media B.V.
No part of this work may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, electronic, mechanical, photocopying, microfilming, recording or otherwise, without written permission from the Publisher, with the exception of any material supplied specifically for the purpose of being entered and executed on a computer system, for exclusive use by the purchaser of the work.
Printed on acid-free paper
Preface
Multibody Dynamics is an area of Computational Mechanics which blends together various disciplines such as structural dynamics, multi-physics mechanics, computational mathematics, control theory and computer science, in order to deliver methods and tools for the virtual prototyping of complex mechanical systems. Multibody dynamics today plays a central role in the modeling, analysis, simulation and optimization of mechanical systems in a variety of fields and for a wide range of industrial applications.
The ECCOMAS Thematic Conference on Multibody Dynamics was initiated in Lisbon in 2003, and then continued in Madrid in 2005 with the goal of providing researchers in Multibody Dynamics with appropriate venues for exchanging ideas and results. The third edition of the Conference was held at the Politecnico di Milano, Milano, Italy, from June 25 to June 28, 2007. The Conference saw the participation of over 250 researchers from 32 different countries, presenting 209 technical papers, and proved to be an excellent forum for discussion and technical exchange on the most recent advances in this rapidly growing field.
This book is a collection of revised and expanded versions of papers presented at the Conference. The goal of this collection of works is to offer an up-to-date view on some of the most recent cutting-edge research developments in Multibody Dynamics. Contributions have been selected from all sessions of the Conference, and cover the areas of biomechanics (Ackermann and Schiehlen, Millard et al.), contact dynamics (Tasora and Anitescu), control, mechatronics and robotics (Bottasso), flexible multibody dynamics (Cugnon et al., Lunk and Simeon, Betsch and Sänger), formulations and numerical methods (Jay and Negrut), optimization (Collard et al.), real-time simulation (Binani et al.), software development, validation, education (Pennestrì and Valentini), and vehicle systems (Ambrósio et al.).
I hope you will find the reading of this collection enjoyable and stimulating, as we anxiously wait for the 2009 edition of this excellent Conference Series. Milano, May 2008
Carlo L. Bottasso
Contents
Preface

Physiological Methods to Solve the Force-Sharing Problem in Biomechanics
Marko Ackermann and Werner Schiehlen

Multi-Step Forward Dynamic Gait Simulation
Matthew Millard, John McPhee, and Eric Kubica

A Fast NCP Solver for Large Rigid-Body Problems with Contacts, Friction, and Joints
Alessandro Tasora and Mihai Anitescu

Solution Procedures for Maneuvering Multibody Dynamics Problems for Vehicle Models of Varying Complexity
Carlo L. Bottasso

Synthesis and Optimization of Flexible Mechanisms
Frederic Cugnon, Alberto Cardona, Anna Selvi, Christian Paleczny, and Martin Pucheta

The Reverse Method of Lines in Flexible Multibody Dynamics
Christoph Lunk and Bernd Simeon

A Nonlinear Finite Element Framework for Flexible Multibody Dynamics: Rotationless Formulation and Energy-Momentum Conserving Discretization
Peter Betsch and Nicolas Sänger

A Second Order Extension of the Generalized-α Method for Constrained Systems in Mechanics
Laurent O. Jay and Dan Negrut

Kinematical Optimization of Closed-Loop Multibody Systems
Jean-François Collard, Pierre Duysinx, and Paul Fisette

A Comparison of Three Different Linear Order Multibody Dynamics Algorithms in Limited Parallel Computing Environments
Adarsh Binani, James H. Critchley, and Kurt S. Anderson

Linear Dual Algebra Algorithms and their Application to Kinematics
Ettore Pennestrì and Pier Paolo Valentini

A Memory Based Communication in the Co-simulation of Multibody and Finite Element Codes for Pantograph-Catenary Interaction Simulation
Jorge Ambrósio, João Pombo, Frederico Rauter, and Manuel Pereira
Physiological Methods to Solve the Force-Sharing Problem in Biomechanics
Marko Ackermann (1) and Werner Schiehlen (2)
(1) Department of Biomedical Engineering, Cleveland Clinic Foundation, 9500 Euclid Avenue/ND20, 44195 Cleveland, OH, USA. E-mail: [email protected]
(2) Institute of Engineering and Computational Mechanics, University of Stuttgart, Pfaffenwaldring 9, 70569 Stuttgart, Germany. E-mail: [email protected]
Summary. The determination of individual muscle forces has many applications, including the assessment of muscle coordination and of internal loads on joints and bones, which is useful, for instance, for the design of endoprostheses. Because muscle forces cannot be directly measured without invasive techniques, they are often estimated from joint moments by means of optimization procedures that search for a unique solution among the infinitely many sets of muscle forces that generate the same joint moments. The conventional approach to solve this problem, static optimization, is computationally efficient but neglects the dynamics involved in muscle force generation and requires the use of an instantaneous cost function, often leading to unrealistic estimations of muscle forces. An alternative is dynamic optimization combined with motion tracking, which is, however, computationally very costly. Other alternative approaches recently proposed in the literature are briefly reviewed, and two new approaches are proposed to overcome the limitations of static optimization, delivering more realistic estimations of muscle forces while being computationally less expensive than dynamic optimization.
C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009
1 Introduction
Inverse dynamics is used to compute the net joint moments required to generate a measured motion. Although they give a clue about the intensity of the actuation required to accomplish the observed motion, net joint moments fail to deliver information on the forces applied by the individual muscles and other structures spanning the joints. Because the skeletal system is redundantly actuated by muscles, i.e. there are many more muscles than actuated degrees of freedom, and many muscles are multi-articular, spanning more than one joint, the direct translation of net moments into muscle forces is not possible. Therefore, conclusions about muscle activity from net joint moments are not very reliable (Zajac et al. [33]). Furthermore, the energy consumption involved, represented by the metabolic cost during human motion, cannot be accurately assessed.
In order to solve the mathematically indeterminate problem and assess muscle forces, optimization approaches are employed. The classical static optimization approach is characterized by the search for muscle forces that minimize a cost function and fulfill constraints, given basically by bounded muscle forces and by the equations of motion or, equivalently, the joint moments computed by inverse dynamics. The cost functions are mathematical expressions assumed to model some physiological criteria optimized by the central nervous system during a particular activity. In spite of being computationally efficient, the static optimization approach assumes an instantaneous optimal distribution of muscle forces and therefore suffers from two important limitations. Firstly, it neglects the muscle contraction and activation dynamics, which might lead to unphysiological estimations of muscle forces. Secondly, the cost function must be an instantaneous measure of performance, which excludes the possibility of using time-integral criteria such as the total metabolic cost expended. The latter limitation is especially important for the analysis of human walking, since metabolic cost is accepted to play an important role during locomotion.
The muscle activation and contraction dynamics can be taken into account by using dynamic optimization associated with the tracking of the prescribed kinematics. This approach is based on the search for optimal controls, in this case the neural excitations, that drive a forward-dynamics model of the musculoskeletal system to track the prescribed motion. Because many numerical integrations of the differential equations are necessary, a prohibitive computational effort is required to achieve a solution. This drawback prevents this approach from being widely used and has stimulated recent efforts to reduce the computational burden.
Some strategies based on dynamic optimization are presented in more detail in Section 3.2. An approach to solve the distribution problem in biomechanics is proposed in Section 4, see also Ackermann and Schiehlen [3]. It considers the muscle contraction and activation dynamics and permits the use of time-integral cost functions such as the total metabolic cost. This approach is called extended inverse dynamics (EID) because it requires, in addition to the inversion of the skeletal system dynamics, the inversion of the muscle contraction and activation dynamics. Since no numerical integration of the differential equations is required, the extended inverse dynamics is computationally less costly than dynamic optimization. A second, simplified approach, called modified static optimization (MSO), which permits the computation of muscle forces that fulfill the constraints given by the activation and contraction dynamics, is also proposed and presented in Section 5. The latter approach keeps the computational effort similar to that of static optimization, while considering the dynamics involved in the muscle force generation process.
2 Musculoskeletal System Dynamics and Energetics
The skeletal system is often modeled by a multibody system composed of rigid bodies whose dynamics is described by its equations of motion as
$$\mathbf{M}(\mathbf{y})\,\ddot{\mathbf{y}} + \mathbf{k}(\mathbf{y}, \dot{\mathbf{y}}) = \mathbf{q}(\mathbf{y}, \dot{\mathbf{y}}, \mathbf{f}^{gr}, \mathbf{h}) = \mathbf{q}^{r}(\mathbf{y}, \dot{\mathbf{y}}, \mathbf{f}^{gr}) + \mathbf{R}(\mathbf{y})\,\mathbf{f}^{m} , \tag{1}$$
where $\mathbf{M}$ is the symmetric, positive definite $f \times f$ mass matrix, $\mathbf{k}$ is the $f \times 1$ vector of generalized Coriolis forces, $\mathbf{q}^{r}$ is the $f \times 1$ vector of generalized forces other than the ones caused by the muscles, $\mathbf{f}^{m}$ is the $m \times 1$ vector of muscle forces, $\mathbf{R}$ is the $f \times m$ matrix that transforms the muscle forces into generalized forces, and $\mathbf{h} = \mathbf{A}\,\mathbf{f}^{m}$, where $\mathbf{A}$ is a $k \times m$ matrix that contains the muscle moment arms. The vector $\mathbf{q}^{r}$ includes the vector of ground reaction forces $\mathbf{f}^{gr}$ for both feet, since the contact forces between feet and ground are modeled as external applied forces. For more details see Schiehlen [23].
When large-scale musculoskeletal models are considered, Hill-type muscle models, see e.g. Zajac [32], are almost exclusively used. Hill-type muscle models are composed of a contractile element CE, which generates force and represents the muscle fibers, and of passive elements in parallel and in series with the CE, which model the tissue in parallel and in series with the muscle fibers. Figure 1 illustrates the three-element Hill-type muscle model, showing the CE, a series elastic element SE and a parallel elastic element PE. Figure 2 shows a scheme of the dynamics of the complete musculoskeletal system having the neural excitations as controls. The activation dynamics, describing the process leading from the neural excitation $u$ to a muscle activation state $a$, is modeled by a first-order differential equation, see e.g. He et al. [13], as
$$\dot{a} = \dot{a}(u, a) . \tag{2}$$
The first-order muscle contraction dynamics arising from the presence of the series elastic element (SE) and from the muscle force-length-velocity relation, refer e.g. to Ackermann [1], reads as
$$\dot{f}^{m} = \dot{f}^{m}(a, v^{m}, l^{m}, f^{m}) . \tag{3}$$
[Figure 1: contractile element CE with length $l^{ce}$ and pennation angle $\alpha_p$, parallel elastic element PE, series elastic element SE with length $l^{se}$, total muscle length $l^{m}$, muscle force $f^{m}$]
Fig. 1. Three-element Hill-type muscle model
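[Figure 2: block diagram with the blocks Activation Dynamics, Muscle Contraction Dynamics and Skeletal System Dynamics; signals $u$, $a$, $f^{m}$, $\mathbf{y}$, $\dot{\mathbf{y}}$, $l^{m}(\mathbf{y})$, $v^{m}(\mathbf{y}, \dot{\mathbf{y}})$ and metabolic cost rate $\dot{E}$]
Fig. 2. Scheme of the musculoskeletal system dynamics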
In (3), $f^{m}$ denotes the muscle force. Note in Fig. 2 that the muscle contraction dynamics is coupled to the skeletal system dynamics because it depends on the muscle length $l^{m}$ and on the muscle shortening velocity $v^{m}$. The vectors collecting these values are defined by the geometry of the musculoskeletal system and are modeled as functions of the generalized coordinates $\mathbf{y}$ and their derivatives $\dot{\mathbf{y}}$ as $\mathbf{l}^{m}(\mathbf{y})$ and $\mathbf{v}^{m}(\mathbf{y}, \dot{\mathbf{y}})$.
In order to estimate the metabolic cost rate $\dot{E}$ consumed by the muscles, phenomenological muscle energy expenditure expressions recently proposed in the literature, e.g. by Umberger et al. [31], can be used as
$$\dot{E} = \dot{E}(\mathbf{u}, \mathbf{a}, \mathbf{v}^{ce}, \mathbf{l}^{ce}, \mathbf{f}^{ce}, \mathbf{p}^{m}) , \tag{4}$$
where the muscle parameters are summarized in the vector $\mathbf{p}^{m}$, and the quantities of the contractile elements are found in the vectors $\mathbf{v}^{ce}$, $\mathbf{l}^{ce}$ and $\mathbf{f}^{ce}$.
3 Muscle Force-Sharing Problem in Biomechanics
The knowledge of loads in individual joint structures is important in fields like medicine, sport science or prosthesis design. For instance, the determination of muscle and ligament forces is required for an analysis of the risk of damage of the ACL (anterior cruciate ligament) in specific sport activities, for the design of hip and knee endoprostheses or for the planning and evaluation of orthopedic surgeries, Delp [10]. However, muscle forces cannot be measured directly in vivo without invasive techniques. This stresses the importance of techniques that permit the estimation of tissue loads from the skeletal system motion and the external applied forces, quantities that can be measured noninvasively.
Since many muscles span each joint of the skeletal system, a redundant system arises because muscle forces may contribute differently to the same joint moments, i.e. muscle forces cannot be uniquely determined from joint moments. From a mathematical point of view, there are more unknown muscle forces than equilibrium equations available, i.e. there is an infinite number of solutions for the muscle forces that fulfill the equilibrium equations and can generate the same observed motions of the skeletal system.
The dynamics of the skeletal system is described by the corresponding equations of motion of its multibody system model, (1). If the kinematics of the movement $\mathbf{y}(t)$, $\dot{\mathbf{y}}(t)$, $\ddot{\mathbf{y}}(t)$ and the applied ground forces $\mathbf{f}^{gr}(t)$ are known, (1) can be solved for $\mathbf{h}$, since the number of degrees of freedom $f$ of the mechanical system is greater than or equal to the number $k$ of unknown joint moments in the vector $\mathbf{h}$. On the other hand, the number of muscles $m$ is always greater than the number of degrees of freedom $f$ of the skeletal system, leading to the mentioned underdetermination.
It is reasonable to assume that the central nervous system distributes the muscle forces $\mathbf{f}^{m}$ in such a way as to optimize some physiological criteria, for instance, energy, fatigue or pain. This assumption is the basis for the optimization procedures to determine muscle forces presented further on.
A typical formulation of the optimization is as follows: find $\mathbf{f}^{m}(t)$ that minimizes a cost function $J$, subject to equality constraints given by the equations of motion in the form $\mathbf{R}\,\mathbf{f}^{m} = \mathbf{b}(\mathbf{y}, \dot{\mathbf{y}}, \ddot{\mathbf{y}}, \mathbf{f}^{gr})$, which represent linear constraints on the muscle forces, where $\mathbf{b} = \mathbf{M}\,\ddot{\mathbf{y}} + \mathbf{k} - \mathbf{q}^{r}$. Additionally, inequality constraints are given either by bounded muscle forces or by bounded neural excitations, as explained further on.
3.1 Static Optimization Approach
Static optimization is a computationally efficient approach to solve the muscle force-sharing problem presented in the previous section. It is based on the assumption of an instantaneous cost function, which allows the force-sharing problem to be solved for each time instant $t_j$ independently. The vector of muscle forces $\mathbf{f}_j^{m}$ at the instant $t_j$ is searched that minimizes a cost function $J_s(\mathbf{f}_j^{m})$.
The optimization is subject to physiological lower and upper bounds for the muscle forces and to constraints given by the equations of motion in (1). Some variations of this approach have been proposed in the literature, mainly to better account for muscle physiology, refer to Tsirakos et al. [30], but the basic strategy remains the same. Several instantaneous cost functions $J_s$ have been proposed to solve the muscle force-sharing problem, refer, e.g. to Tsirakos et al. [30] and da Silva [24] for extensive reviews and applications. The cost function proposed by Crowninshield and Brand [8] is one of the most frequently employed due to its physiological background related to muscle fatigue and reads as
$$J_s = \sum_{i=1}^{m} \left( \frac{f_{ij}^{m}}{PCSA_i} \right)^{3} , \tag{5}$$
where $f_{ij}^{m}$ is the force applied by muscle $i$ at time instant $t_j$, $PCSA_i$ is the physiological cross-sectional area of muscle $i$, and $m$ is the number of muscles considered.
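As an illustration of static optimization with the cost function (5), the following minimal sketch treats a single joint at one time instant. It assumes positive moment arms, a positive joint moment and inactive force bounds, in which case the stationarity conditions of the Lagrangian give a closed-form solution with $f_i$ proportional to $\sqrt{r_i\,PCSA_i^3}$. The function names and all numerical values are hypothetical and not taken from the chapter.

```python
import math

def static_optimization_one_joint(moment_arms, pcsa, joint_moment):
    """Minimize sum((f_i / PCSA_i)**3) subject to sum(r_i * f_i) = joint_moment.

    Closed-form solution for ONE joint, assuming positive moment arms, a
    positive joint moment and inactive lower/upper force bounds.
    """
    weights = [(r * p) ** 1.5 for r, p in zip(moment_arms, pcsa)]
    total = sum(weights)
    # Stationarity: 3 f_i^2 / PCSA_i^3 = lambda * r_i, so f_i ~ sqrt(r_i * PCSA_i^3);
    # the scale factor follows from the moment equilibrium.
    return [joint_moment * math.sqrt(r * p ** 3) / total
            for r, p in zip(moment_arms, pcsa)]

def fatigue_cost(forces, pcsa):
    """Instantaneous cost of the form (5): sum of cubed muscle stresses."""
    return sum((f / p) ** 3 for f, p in zip(forces, pcsa))

# Hypothetical data for two muscles spanning one joint.
moment_arms = [0.05, 0.03]   # m
pcsa = [10.0, 20.0]          # cm^2
joint_moment = 30.0          # N m
forces = static_optimization_one_joint(moment_arms, pcsa, joint_moment)
```

With several joints, multi-articular muscles or active bounds, no such closed form is available in general and a numerical constrained optimizer is needed.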
3.2 Dynamic Optimization Approach and Alternative Methods
The investigation of human motion coordination and muscle recruitment by solving the optimal control problem using neural excitations as controls was pursued in the past, e.g., in Hatze [11], Hatze and Buys [12] and in Davy and Audu [9]. Pandy et al. [20] proposed an alternative computational method for these problems, which consists in converting the optimal control problem into a parameter optimization problem, where the neural excitation histories are parameterized using a set of nodal points. This approach is claimed to circumvent the numerical difficulties that arise in solving the two-point boundary-value problem derived from the necessary conditions of optimal control theory. It has been successfully implemented to study normal walking using metabolic energy cost per unit of distance traveled as cost function, e.g. in Anderson and Pandy [5], Bhargava et al. [6], Umberger et al. [31]. Using forward-dynamics models of the musculoskeletal system, these studies reproduced features of normal human walking such as kinematics, optimal walking velocity and metabolic energy cost reasonably well. Other application fields include the simulation of vertical jumping (Spägele [25], Anderson and Pandy [4], Nagano and Gerritsen [17]) and cycling (Neptune and van den Bogert [18]). This approach is denoted dynamic optimization, in contrast to static optimization.
The advantage of dynamic optimization over static optimization resides in the consideration of the muscle contraction and activation dynamics, and in the possibility of using a time-integral cost function such as total metabolic cost. However, performing dynamic optimization with large-scale musculoskeletal models is extremely costly in terms of computational effort, requiring up to weeks for a 2-D musculoskeletal model with a reduced number of degrees of freedom, Menegaldo et al. [15], or months for a complex 3-D musculoskeletal model for walking using parallel super-computing facilities, Anderson and Pandy [5]. The high CPU times result mainly from the many numerical integrations of the differential equations required.
The typical applications of dynamic optimization look simultaneously for optimal controls, muscle forces and optimal motion patterns that minimize a cost function. For the cases in which the motion and the external applied forces are completely or partially prescribed or measured, the mechanical model must additionally track the known kinematics and apply the prescribed forces on the environment. This is achieved by augmenting the cost function with a term that quantifies the deviation from the prescribed kinematics and applied forces, see e.g. Neptune and van den Bogert [18], Neptune and Hull [19], Strobach et al. [27], Davy and Audu [9]. The introduction of the tracking term into the cost function transforms the problem into a multi-criteria optimization, which compromises the interpretation of the results, since the solution of the problem depends on the weighting factors chosen. Because the objective criteria are usually competing, the use of different weighting factors leads to different solutions.
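The dependence of a multi-criteria solution on the weighting factor can be seen in a deliberately simple scalar sketch. It assumes two competing quadratic criteria, a performance term minimized at `x_perf` and a tracking term minimized at `x_track`; these names and the quadratic form are illustrative assumptions, not the cost functions used in the cited studies.

```python
def weighted_optimum(x_perf, x_track, weight):
    """Minimizer of J(x) = (x - x_perf)**2 + weight * (x - x_track)**2.

    Setting dJ/dx = 0 gives x = (x_perf + weight * x_track) / (1 + weight).
    """
    return (x_perf + weight * x_track) / (1.0 + weight)

# The same two criteria yield different "optimal" solutions for different weights.
solutions = {w: weighted_optimum(0.0, 1.0, w) for w in (0.1, 1.0, 10.0)}
```

A small weight keeps the solution near the optimum of the performance criterion, while a large weight pulls it toward the tracked value, so any interpretation of the result is tied to the chosen weight.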
In order to reduce the prohibitive computational effort required to solve the muscle force distribution problem by dynamic optimization, new methods are being proposed. For instance, Menegaldo et al. [16] recently proposed a dynamic optimization approach based on the tracking of the moments at the joints, which are computed for example from measured kinematics by conventional inverse dynamics. By avoiding the forward integration of the skeletal system dynamics, it considerably reduces the computational time required while considering the muscle activation and contraction dynamics and allowing for the use of time-integral cost functions. Although this method seems very promising with respect to computational time, it involves the solution of a multi-criteria optimization.
Thelen et al. [29] and Thelen and Anderson [28] propose an algorithm called Computed Muscle Control (CMC) to solve the problem of muscle force distribution for known movement kinematics, based on a control algorithm that tracks the kinematics of a measured movement and uses measured external forces as input. This method is much faster than dynamic optimization approaches, because it requires only one forward integration of the state equations. It efficiently enforces the musculoskeletal system dynamics, but, in order to resolve the muscle redundancy, it still requires the use of an instantaneous cost function. Therefore, in contrast to dynamic optimization, the use of a time-integral cost function such as total metabolic cost is not possible.
Two new methods are proposed here, which present some advantages over the other methods described to solve the muscle redundancy problem. Both methods depend on the inversion of the contraction and activation dynamics. The first one, named extended inverse dynamics (EID) and described in detail in Section 4, permits the computation of muscle forces by using time-integral cost functions such as total metabolic cost and is computationally more efficient than dynamic optimization. The second method is described in Section 5 and is called modified static optimization (MSO). It is based on the minimization of an instantaneous cost function and is characterized by constraints on the muscle forces at the current time step, which are derived from the muscle forces at the previous time step and arise from the activation and contraction dynamics. A more detailed comparison of the approaches proposed here with others from the literature is presented in Ackermann [1].
4 Extended Inverse Dynamics
In this section a novel optimization procedure is proposed that considers the contraction and activation dynamics and permits the use of time-integral cost functions such as the total metabolic cost, while reducing the computation times compared to dynamic optimization. The proposed approach consists in formulating the problem as a large-scale optimization problem, whose optimization variables are the muscle forces at all time steps considered. The optimization is subject to equality constraints given by the equations of motion at all time steps, and to lower and upper bounds for the neural excitations. The goal is to minimize a time-integral cost function, which is here assumed to be the total metabolic cost, estimated using the recently proposed expressions of Umberger et al. [31], which can be used in conjunction with Hill-type muscle models. Because the approach is based on the inversion of the activation and contraction dynamics, it is named extended inverse dynamics (EID).
Figure 3 shows the general optimization scheme of the EID. The optimization variables are parameterized muscle force histories. The activations and neural excitations for each one of the muscles considered are computed by inverting the contraction and activation dynamics. The optimization is subject to two sets of constraints: (1) the constraints represented by neural excitations bounded by 0 and 1; (2) the constraints given by the equations of motion at all time steps, ensuring the compatibility between the muscle forces and the measured skeletal system motion and ground reaction forces. Thus, optimal (parameterized) muscle force histories are searched for that minimize total metabolic cost, fulfill the constraints given by the equations of motion for the given kinematics and measured ground reaction forces (GRF), and ensure physiological neural excitations bounded by 0 and 1. In the next sections, the elements of the proposed optimization scheme are explained in detail.
[Figure 3: optimization loop with the labels Optimization Variables (Muscle Forces); inverted contraction dynamics (Contr. Dyn. $^{-1}$); Muscle Activations; inverted activation dynamics (Act. Dyn. $^{-1}$); Neural Excitations with constraints $0 \le u \le 1$; Muscle Moment Arms, Joint Moments, Equations of Motion and Measured Kinematics & GRF as constraints; Metabolic Cost of Transport as cost function]
Fig. 3. Schematic representation of the extended inverse dynamics approach (EID) for walking
4.1 Parameterization of Muscle Forces
The parameterization of the force $f_i^{m}(t)$ applied by the $i$th muscle is performed using a set of $n$ nodes uniformly distributed along the duration of the motion of interest, resulting in a vector of time steps $\mathbf{t} = [t_1 \cdots t_j \cdots t_n]$, where $t_j - t_{j-1} = \Delta t$. The optimization variables are the muscle forces $f_{ij}^{m}$ for all muscles $i$, $i = 1 \ldots m$, at all nodes $j$, $j = 1 \ldots n$, summarized in an $mn \times 1$ vector of global muscle forces
$$\mathbf{F}^{m} = \begin{bmatrix} \mathbf{f}_1^{m\,T} & \mathbf{f}_2^{m\,T} & \cdots & \mathbf{f}_j^{m\,T} & \cdots & \mathbf{f}_n^{m\,T} \end{bmatrix}^{T} , \tag{6}$$
where $\mathbf{f}_j^{m} = [f_{1j}^{m} \cdots f_{ij}^{m} \cdots f_{mj}^{m}]^{T}$ is the vector of muscle forces at the time step $t_j$.
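The bookkeeping implied by (6) can be sketched as follows: uniformly spaced nodes, node-by-node stacking of the per-node force vectors, and the position of $f_{ij}^{m}$ inside the global vector. The helper names are hypothetical, and indices are zero-based in the code whereas the text counts from 1.

```python
def time_nodes(t_start, t_end, n):
    """Return n uniformly spaced nodes t_1 ... t_n with constant spacing dt."""
    dt = (t_end - t_start) / (n - 1)
    return [t_start + j * dt for j in range(n)]

def stack_forces(forces_by_node):
    """Stack n per-node force vectors (each of length m) into one list of length m*n."""
    return [force for node in forces_by_node for force in node]

def force_index(i, j, m):
    """Zero-based position in the stacked vector of muscle i at node j (both zero-based)."""
    return j * m + i

nodes = time_nodes(0.0, 1.0, 5)
# Two nodes, three muscles, arbitrary illustrative values.
F = stack_forces([[1.0, 2.0, 3.0], [4.0, 5.0, 6.0]])
```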
4.2 Inversion of the Contraction and Activation Dynamics
The inversion of the contraction and activation dynamics are important steps for the implementation of the approach proposed. The first step to invert the contraction dynamics (3) and find the time history of the muscle activation $a(t)$ is to compute the total muscle length $l^{m}$ (including the tendon) and the total muscle shortening velocity $v^{m}$ from the generalized coordinates in $\mathbf{y}$ and their time derivatives in $\dot{\mathbf{y}}$ as
$$l^{m} = l^{m}(\mathbf{y}) \quad \text{and} \quad v^{m} = v^{m}(\mathbf{y}, \dot{\mathbf{y}}) . \tag{7}$$
The second step consists in numerically differentiating the time history of the muscle force $f^{m}(t)$ to obtain $\dot{f}^{m}(t)$. Since the value of $f^{m}(t)$ is only available at discrete time steps $t_j$, $\dot{f}^{m}(t)$ is estimated only at these time instants by using the centered finite-difference formula, Chapra and Canale [7], for internal nodes or by using forward and backward finite-difference formulas for the initial ($j = 1$) and final ($j = n$) nodes. In the case of perfectly periodic motions, centered finite-divided-difference formulas are preferred for the extreme nodes as well, since they deliver more accurate estimations of the derivatives, see e.g. Chapra and Canale [7].
The length $l^{se}$ and shortening velocity $v^{se}$ of the muscle series elastic element (SE) can then be computed, respectively, by
$$l^{se} = l^{se}(f^{m}) \quad \text{and} \quad v^{se} = v^{se}(\dot{f}^{m}) , \tag{8}$$
where $l^{se}(f^{m})$ models the force-length relation of the SE. The contractile element (CE) shortening velocity, force and length, $v^{ce}$, $f^{ce}$ and $l^{ce}$, are computed according to Fig. 1 and with $\alpha_p \approx$ constant, respectively, as
$$v^{ce} = (v^{m} - v^{se}) \cos\alpha_p , \tag{9a}$$
$$f^{ce} = \frac{f^{m}}{\cos\alpha_p} - f^{pe}(l^{ce}) , \tag{9b}$$
$$l^{ce} = \frac{l^{m} - l^{se}}{\cos\alpha_p} . \tag{9c}$$
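The numerical differentiation step described above can be sketched as follows: centered differences at internal nodes, forward and backward differences at the first and last node, and an optional periodic variant. The function name is hypothetical, and the periodic branch assumes that the first and last samples represent the same instant of the cycle, one period apart.

```python
def time_derivative(values, dt, periodic=False):
    """Finite-difference estimate of the time derivative at uniformly spaced nodes.

    Requires at least three samples. Internal nodes use the centered formula;
    the end nodes use forward/backward formulas, or centered formulas that
    wrap around the cycle when periodic=True.
    """
    n = len(values)
    derivative = [0.0] * n
    for j in range(1, n - 1):
        derivative[j] = (values[j + 1] - values[j - 1]) / (2.0 * dt)
    if periodic:
        # values[0] and values[-1] are assumed to be the same sample one period apart,
        # so the neighbor "before" node 0 is values[-2].
        wrapped = (values[1] - values[-2]) / (2.0 * dt)
        derivative[0] = wrapped
        derivative[-1] = wrapped
    else:
        derivative[0] = (values[1] - values[0]) / dt        # forward difference
        derivative[-1] = (values[-1] - values[-2]) / dt     # backward difference
    return derivative
```

The same routine can be applied to the activation history when the activation dynamics is inverted.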
Finally, the muscle activation $a$ is obtained by solving the muscle force-length-velocity relation for $a$ as
$$f^{ce} - f^{ce}(a, v^{ce}, l^{ce}) = 0 . \tag{10}$$
For some models of the muscle force-length-velocity relation, $a$ can be written explicitly as a function of $v^{ce}$, $l^{ce}$ and $f^{ce}$. For other models this is not possible, and $a$ has to be computed numerically through (10) using a zero-finder algorithm, which considerably increases the computation time required. The procedure explained is repeated for all $n$ nodes and $m$ muscles considered.
The neural excitations are assessed by inverting the activation dynamics described by (2). The first step is to find the time derivative $\dot{a}(t)$ of the muscle activation $a(t)$. This is achieved by numerical differentiation of $a(t)$ using finite-divided-difference formulas such as the ones used to compute the time derivatives of the muscle forces. The values for the muscle activation $a(t)$ and its first time derivative $\dot{a}(t)$ are then inserted into (2), resulting in an algebraic equation which is either linear or quadratic in $u$. If a linear equation in $u$ arises, e.g. for the activation dynamics model of Zajac [32], solving (2) for $u$ is trivial. If a quadratic equation in $u$ arises, e.g. for the activation dynamics model of He et al. [13], the two roots of the polynomial are computed and one of them is chosen as the solution. The choice of the appropriate root is in most cases straightforward: selecting the root that is bounded by 0 and 1 and is closest to the muscle activation $a$ proved to be an effective rule.
4.3 Constraints
The optimization is subject to two kinds of constraints, as depicted in Fig. 3: the constraints that ensure the fulfillment of the equations of motion, and the lower and upper bounds for the required neural excitations. The fulfillment of the constraints is checked at the nodes considered, so that small violations in the region between nodes might occur. The magnitude of these violations is dictated by the number of nodes used. A properly chosen $\Delta t$ leads to negligible inter-node violations of the constraints.
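Returning briefly to the inversion of the activation dynamics in Section 4.2, the root-selection rule can be illustrated with a minimal sketch. Both model forms below are assumptions made for illustration: a generic linear form $\dot{a} = (u - a)/\tau$, and a bilinear form $\dot{a} = (u - a)(t_1 u + t_2)$ with $t_2 = 1/\tau_{deact}$ and $t_1 = 1/\tau_{act} - t_2$, which is quadratic in $u$. Whether these coincide with the models of Zajac [32] and He et al. [13] as used in the chapter should be checked against those references; the function names and time constants are hypothetical.

```python
import math

def excitation_linear(a, a_dot, tau):
    """Invert the assumed linear model a_dot = (u - a) / tau for u."""
    return a + tau * a_dot

def excitation_quadratic(a, a_dot, tau_act, tau_deact):
    """Invert the assumed model a_dot = (u - a) * (t1 * u + t2) for u.

    Returns the root bounded by 0 and 1 that is closest to a, or None if no
    admissible root exists. Requires tau_act != tau_deact so that t1 != 0.
    """
    t2 = 1.0 / tau_deact
    t1 = 1.0 / tau_act - t2
    # Expanded form: t1*u^2 + (t2 - t1*a)*u - (t2*a + a_dot) = 0
    qa, qb, qc = t1, t2 - t1 * a, -(t2 * a + a_dot)
    disc = qb * qb - 4.0 * qa * qc
    if disc < 0.0:
        return None
    roots = [(-qb + sign * math.sqrt(disc)) / (2.0 * qa) for sign in (1.0, -1.0)]
    admissible = [u for u in roots if 0.0 <= u <= 1.0]
    if not admissible:
        return None
    return min(admissible, key=lambda u: abs(u - a))
```

Returning `None` when no root lies in [0, 1] is one possible design choice; in an optimization loop such a case would correspond to a violated excitation bound.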
Fulfillment of Equations of Motion

The vectors of muscle forces $f^m_j$ contained in $F^m$, (6), have to satisfy the equations of motion (1) at all time instants $t_j$ considered as

\[ M(y_j)\,\ddot{y}_j + k(y_j, \dot{y}_j) = q^r(y_j, \dot{y}_j, f^{gr}_j) + R(y_j)\, f^m_j , \qquad j = 1 \ldots n . \tag{11} \]

Since the kinematics of the movement in $y_j$, $\dot{y}_j$, $\ddot{y}_j$ and the ground reaction forces $f^{gr}_j$ are measured or specified, the only unknowns are the muscle forces in $f^m_j$. Rearranging (11) yields a set of linear equations in the elements of $f^m_j$,

\[ R_j\, f^m_j = b_j , \tag{12} \]

where $b_j = M(y_j)\,\ddot{y}_j + k(y_j, \dot{y}_j) - q^r(y_j, \dot{y}_j, f^{gr}_j)$. Writing all the constraint equations given by the equations of motion at all time steps $j$ in a single matrix equation yields

\[ A_{eq}\, F^m = b_{eq} , \tag{13} \]

where $A_{eq}$ is an $fn \times mn$ block diagonal matrix and $b_{eq}$ is an $fn \times 1$ vector, constructed as

\[ A_{eq} = \begin{bmatrix} R_1 & & & & & 0 \\ & R_2 & & & & \\ & & \ddots & & & \\ & & & R_j & & \\ & & & & \ddots & \\ 0 & & & & & R_n \end{bmatrix} , \tag{14a} \]

\[ b_{eq} = \begin{bmatrix} b_1^T & b_2^T & \cdots & b_j^T & \cdots & b_n^T \end{bmatrix}^T . \tag{14b} \]
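The assembly of (13) from the per-node systems (12) can be sketched as follows. This is a minimal pure-Python illustration, not the authors' implementation; the function name `assemble_equality_constraints` and all numerical values are made up for the example.

```python
def assemble_equality_constraints(R_blocks, b_blocks):
    """Stack the per-node systems R_j f_j = b_j into A_eq F = b_eq.

    R_blocks: list of n matrices (each f rows by m columns, as nested lists).
    b_blocks: list of n right-hand sides (each a list of length f).
    """
    n_rows = sum(len(R) for R in R_blocks)
    n_cols = sum(len(R[0]) for R in R_blocks)
    A_eq = [[0.0] * n_cols for _ in range(n_rows)]
    b_eq = []
    row0 = col0 = 0
    for R, b in zip(R_blocks, b_blocks):
        # Copy block R_j onto the diagonal; off-diagonal entries stay zero.
        for r, row in enumerate(R):
            for c, val in enumerate(row):
                A_eq[row0 + r][col0 + c] = float(val)
        b_eq.extend(float(x) for x in b)
        row0 += len(R)
        col0 += len(R[0])
    return A_eq, b_eq

# Toy data: f = 2 coordinates, m = 3 muscles, n = 2 nodes (hypothetical values).
R_blocks = [
    [[0.05, -0.03, 0.00], [0.00, 0.04, -0.02]],
    [[0.06, -0.02, 0.00], [0.00, 0.03, -0.03]],
]
b_blocks = [[10.0, -4.0], [12.0, -6.0]]
A_eq, b_eq = assemble_equality_constraints(R_blocks, b_blocks)
```

Because the blocks are uncoupled, a sparse storage format would be the natural choice for realistic numbers of nodes; the dense nested lists here are only meant to make the structure of (14a) visible.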
Bounds for Neural Excitations

The second group of constraints is represented by neural excitations bounded by 0 and 1 as

\[ 0 \le u_{ij} \le 1 , \qquad i = 1 \ldots m , \quad j = 1 \ldots n . \tag{15} \]
The fulfillment of these constraints for the whole period considered guarantees the fulfillment of the lower and upper bounds for the activations and muscle forces. Although not strictly necessary, explicit constraints on muscle activations and muscle forces can nevertheless be formulated in addition. This measure, in particular the explicit bounds on the optimization variables, reduces the search space of the optimization variables, which can, depending on the optimization algorithm employed, reduce the number of iterations required to converge to an optimum.

4.4 Cost Function

One of the advantages of the extended inverse dynamics approach over static optimization is that it permits time-integral cost functions such as the total energy expenditure, which is accepted to be the primary performance criterion during walking, Ralston [21]. Hence, for the simulation results discussed further on, all dealing with walking, the total metabolic cost is adopted as cost function. The total metabolic cost can be estimated by recently proposed phenomenological muscle energy expenditure models, e.g. Umberger et al. [31], as a function of the neural excitation $u$, the muscle activation $a$, the muscle CE force $f^{ce}$, length $l^{ce}$ and shortening velocity $v^{ce}$, and a set of muscle-specific parameters $p^m$ for all $m$ muscles considered as

\[ J_{EID} = \sum_{i=1}^{m} \int_{t_1}^{t_n} \dot{E}_i\big(u_i(t), a_i(t), v^{ce}_i(t), l^{ce}_i(t), f^{ce}_i(t), p^m_i\big)\, dt , \tag{16} \]
where $\dot{E}_i$ is the metabolic cost rate for muscle $i$. The integral in (16) is evaluated numerically using the values of the variables at the discrete time instants $t_j$. Thus, the extended inverse dynamics approach proposed consists in an optimization scheme formulated as: find the optimal global vector of muscle forces $F^m_{opt}$ that minimizes the total metabolic cost (16), subject to the linear equality constraints given by the equations of motion at all time steps (13), to the inequality constraints for the neural excitations (15), and, if advantageous, to additional lower and upper bounds for the activations and muscle forces. There are many different numerical methods that can be used to solve a nonlinear optimization problem with nonlinear constraints. Here, the Sequential Quadratic Programming (SQP) algorithm implemented in the function fmincon, available in the Optimization Toolbox of Matlab®, is used.
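The numerical evaluation of the time integral in (16) from values at the discrete time instants can be sketched as below. The chapter does not state which quadrature rule is used; the trapezoidal rule here is an assumption, and the function name `total_metabolic_cost` and the rate values are hypothetical.

```python
def total_metabolic_cost(t, E_dot):
    """Trapezoidal estimate of the sum over muscles of the integral of E_dot_i.

    t:     list of node times t_1 ... t_n (seconds).
    E_dot: one list per muscle with the metabolic cost rate (W) at each node.
    """
    total = 0.0
    for rates in E_dot:
        for j in range(1, len(t)):
            # Area of one trapezoid between consecutive nodes.
            total += 0.5 * (rates[j - 1] + rates[j]) * (t[j] - t[j - 1])
    return total

# Two hypothetical muscles over one second, sampled at three nodes.
t = [0.0, 0.5, 1.0]
E_dot = [
    [100.0, 100.0, 100.0],  # constant rate: contributes 100 J
    [0.0, 50.0, 100.0],     # linearly increasing rate: contributes 50 J
]
J = total_metabolic_cost(t, E_dot)
```

In the actual optimization the rates would themselves depend on the optimization variables through the inverted contraction and activation dynamics, so this sum would be re-evaluated at every cost function call.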
5 Modified Static Optimization

The extended inverse dynamics approach proposed in the previous section accounts for the activation and contraction dynamics and uses a time-integral cost function. The price for these desirable features is a computational effort some orders of magnitude higher than that required by static optimization, although lower than that required by dynamic optimization, see Ackermann [1]. The high computational effort is a limiting factor for the use of the more elaborate approaches, dynamic optimization and extended inverse dynamics. Especially in applications in which a rather rough estimation of muscle forces is sufficient and instantaneous cost functions are assumed to be reasonable models of the underlying muscle force distribution laws adopted by the central nervous system (CNS), static optimization will still be the first choice. However, completely neglecting the activation and contraction dynamics, i.e. treating the muscle as a perfect force generator capable of delivering the required amount of force instantaneously, can lead to unphysiological solutions.

In this section an alternative approach is proposed, which modifies the static optimization approach so as to consider the muscle activation and contraction dynamics while requiring a reduced computational effort. The approach, named modified static optimization, formulates the optimization problem in the same way as static optimization for each time step considered, with the difference that additional nonlinear constraints are defined which ensure neural excitations bounded by 0 and 1. This measure guarantees the compatibility of the current muscle forces with the activation and contraction dynamics. These additional constraints can be interpreted as additional upper and lower bounds on the current muscle forces, obtained from the maximal allowed variations of the muscle forces that are still compatible with the activation and contraction dynamics. The upper and lower bounds, $f^m_{j,max}$ and $f^m_{j,min}$, respectively, are implicitly formulated depending on the states, i.e. the muscle forces $f^m_{j-1}$ and activations $a_{j-1}$, at the previous time instant $t_{j-1}$, as depicted in Fig. 4.

Fig. 4. Schematic representation of the implicit additional lower and upper bounds on the muscle force $f^m_j$ as a function of the states at the previous time instant $t_{j-1}$ in the modified static optimization approach. [Plot of $f^m(t)$ versus $t$ between $t_{j-1}$ and $t_j$, showing $f^m_{j-1}$, the maximal variations $\Delta f^+_{max}(f^m_{j-1}, a_{j-1})$ and $\Delta f^-_{max}(f^m_{j-1}, a_{j-1})$, and the resulting bounds $f^m_{j,max}$ and $f^m_{j,min}$.]

Therefore, the formulation of the optimization problem for the time step $j$ is identical to the one for static optimization, Section 3.1, with additional upper and lower bounds for the neural excitations of the muscles considered as

\[ 0 \le u_{ij} \le 1 , \qquad i = 1 \ldots m , \quad j = 2 \ldots n . \tag{17} \]
The computation of the neural excitations requires the inversion of the contraction and activation dynamics as done for the extended inverse dynamics approach, Section 4.2, with the difference that only information from the previous time steps is used. The procedure is briefly explained in the following, focusing on the differences from the extended inverse dynamics. The index $i$ referring to a specific muscle is omitted. The first step consists in computing the first derivatives of the muscle forces $\dot{f}^m_j$ from the values of the muscle forces at the previous and current time steps, $f^m_{j-1}$ and $f^m_j$, respectively, using a backward finite-divided-difference formula. The total muscle length $l^m_j$ and shortening velocity $v^m_j$ are computed from the generalized coordinates $y_j$ and their derivatives. With this information the contraction dynamics (3) is inverted as explained in Section 4.2, so that $a_j$ is computed by

\[ a_j = a_j(\dot{f}^m_j, v^m_j, l^m_j, f^m_j) , \qquad j = 2 \ldots n . \tag{18} \]
The next step consists in the inversion of the activation dynamics, which requires the estimation of the first time derivatives of the activations $\dot{a}_j$, again using a backward finite-divided-difference formula. The neural excitations are then computed, as explained in Section 4.2, by inserting $a_j$ and $\dot{a}_j$ into (2) and solving it for $u_j$ as

\[ \dot{a}_j - \dot{a}_j(a_j, u_j) = 0 , \qquad j = 2 \ldots n . \tag{19} \]
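The two steps above, a backward difference for $\dot{a}_j$ followed by solving the activation dynamics for $u_j$, can be sketched as follows. The sketch assumes a simple linear first-order model $\dot{a} = (u - a)/\tau$ in place of the chapter's equation (2); this model, the time constant and the function names are illustrative assumptions only.

```python
def backward_difference(x_prev, x_curr, dt):
    """First-order backward finite-divided-difference estimate of dx/dt."""
    return (x_curr - x_prev) / dt

def excitation_from_activation(a_prev, a_curr, dt, tau=0.05):
    """Invert the assumed model a_dot = (u - a) / tau for u at the current step."""
    a_dot = backward_difference(a_prev, a_curr, dt)
    return a_curr + tau * a_dot

def is_admissible(u, tol=1e-9):
    """Check the bound 0 <= u <= 1 up to a small tolerance."""
    return -tol <= u <= 1.0 + tol

dt = 0.01  # hypothetical time step in seconds
u_slow = excitation_from_activation(0.40, 0.45, dt)  # gentle rise in activation
u_fast = excitation_from_activation(0.40, 0.80, dt)  # abrupt rise in activation
```

With these made-up numbers the gentle rise yields an excitation inside the admissible range, whereas the abrupt rise would require an excitation well above 1, which is the kind of solution the additional constraints are meant to exclude.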
The procedure explained above to estimate the neural excitation at $t_j$ shows that $u_j$ is a function of the muscle force $f^m_{j-1}$ and activation $a_{j-1}$ at the previous time step $t_{j-1}$, of the time step size $\Delta t$, of the total muscle length $l^m_j$ and shortening velocity $v^m_j$ at $t_j$, and of the searched muscle force $f^m_j$ at $t_j$. This results in implicit constraints on the searched muscle forces $f^m_j$ given by the bounds on the neural excitation in the form

\[ 0 \le u_j(f^m_j, f^m_{j-1}, a_{j-1}, y_j, \dot{y}_j, \Delta t) \le 1 , \qquad j = 2 \ldots n , \tag{20} \]
for all muscles considered. Therefore, muscle forces at $t_j$ are searched that minimize the instantaneous cost functions of Section 3.1 and fulfill the nonlinear constraints given by (20) and the linear constraints given by the equations of motion (11).

The first time step in the modified static optimization receives a special treatment. The muscle forces $f^m_1$ at the first time step $t_1$ are computed by conventional static optimization without the additional constraints on the neural excitations. The activations $a_{i1}$, $i = 1 \ldots m$, at $t_1$ are approximately estimated by inversion of the contraction dynamics, as explained in Section 4.2, with the difference that the required estimates of the derivatives of the muscle forces $\dot{f}^m_1$ are obtained by a forward finite-divided-difference formula using the muscle forces at the first and second time steps, $f^m_1$ and $f^m_2$, respectively, both computed by conventional static optimization.

The constraints on the rate of change of the muscle forces can be so restrictive that infeasibilities may occur, i.e. time instants for which no muscle forces can be found that fulfill all the constraints. This occurs to a great extent because the muscle forces computed at the previous time steps are fixed. The incidence of such infeasibilities for the extended inverse dynamics approach is much lower, because there the complete time histories of the muscle forces can be adjusted so as to guarantee fulfillment of the constraints at all time steps. One drawback of the modified static optimization is the necessity of using backward finite-divided-difference formulas to estimate derivatives numerically, which causes greater truncation errors in comparison to centered finite-divided-difference formulas. Furthermore, the activations at the initial time step have to be determined approximately, because no values for the muscle forces at the previous time step are available.
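The difference in truncation error between backward and centered formulas can be illustrated numerically. The sketch below uses the two-point backward and centered formulas on $\sin t$, whose derivative is known exactly; the test function and step sizes are chosen only for illustration and are unrelated to the muscle model.

```python
import math

def backward_diff(f, t, dt):
    """Two-point backward difference, truncation error of order dt."""
    return (f(t) - f(t - dt)) / dt

def centered_diff(f, t, dt):
    """Two-point centered difference, truncation error of order dt**2."""
    return (f(t + dt) - f(t - dt)) / (2.0 * dt)

t0 = 1.0
exact = math.cos(t0)  # exact derivative of sin at t0

def errors(dt):
    """Absolute errors of both formulas for step size dt."""
    e_back = abs(backward_diff(math.sin, t0, dt) - exact)
    e_cent = abs(centered_diff(math.sin, t0, dt) - exact)
    return e_back, e_cent

err_back, err_cent = errors(0.01)
err_back_half, err_cent_half = errors(0.005)

# Halving the step roughly halves the backward error and quarters the centered one.
ratio_back = err_back / err_back_half
ratio_cent = err_cent / err_cent_half
```

The centered formula needs the value at $t_{j+1}$, which is not yet available when the time steps are solved one after another; this is why the modified static optimization has to rely on the less accurate backward formula.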
6 Application to Normal and Disturbed Gaits

In this section both approaches proposed to solve the muscle force-sharing problem in biomechanics are applied to normal and mechanically disturbed gaits measured in a gait analysis laboratory, Ackermann and Gros [2]. The extended inverse dynamics and the modified static optimization are compared to static optimization.

6.1 Model of the Musculoskeletal System

A 2-D mechanical model of the skeletal system of the right lower limb is adopted here, composed of three rigid bodies: the thigh, the shank and the foot. The motion takes place in the sagittal plane and is described by three generalized coordinates and two rheonomic constraints, refer to Fig. 5b. The generalized coordinates are the angle $\alpha$ describing the rotation of the thigh, the angle $\beta$ describing the knee flexion, and the angle $\gamma$ for the ankle plantar flexion. The two rheonomic constraints are the horizontal and vertical positions of the hip joint, $x_{hip}$ and $z_{hip}$, respectively. The pelvis and trunk are assumed to remain vertical throughout the gait cycle, which is reasonable for normal walking. The masses, center of mass locations, and mass moments of inertia of the three segments in the sagittal plane are computed using the tables in de Leva [14] as functions of the subject’s body mass, stature, thigh length and shank length. The motion and the ground reaction forces were measured in a gait analysis laboratory.

The eight muscle groups considered in this analysis are shown in Fig. 5a. The Hill-type muscle model is composed of a contractile element CE and a series elastic element SE, while the force of the parallel elastic element PE is set to zero, Fig. 1. In this model all the structures in parallel to the CE and the SE are represented by total passive moments at the joints, which also include the moments generated by all other passive structures crossing the joints, such as ligaments. The formulas for the passive moments at the hip, knee and ankle are functions of $\alpha$, $\beta$ and $\gamma$ as proposed by Riener and Edrich [22]. Linear damping is added to the knee and hip joints, with coefficients taken as the approximate average values obtained in the pendulum experiments performed by Stein et al. [26]. The models adopted for the muscle activation and the muscle force-length-velocity relation are based to a great extent on the models in Nagano and Gerritsen [17], with a few modifications. For details on the models used and for the corresponding parameters refer to Ackermann [1].

Fig. 5. Musculoskeletal model of the lower limb: (a) muscle units; (b) mechanical model with generalized coordinates. [Muscle groups in (a): 1 – Iliopsoas, 2 – Rectus Femoris, 3 – Glutei, 4 – Hamstrings, 5 – Vasti, 6 – Gastrocnemius, 7 – Tibialis Anterior, 8 – Soleus. Panel (b) shows the trunk, pelvis, thigh, shank and foot, the hip, knee and ankle joints, the hip position $x_{hip}$, $z_{hip}$ and the angles $\alpha$, $\beta$, $\gamma$.]

6.2 Application to the Normal Walking

The results obtained by using extended inverse dynamics (EID), modified static optimization (MSO) and static optimization (SO) for the normal gait of one subject are presented in Table 1. For the SO and for the MSO the cost function proposed by Crowninshield and Brand [8] is used, refer to Section 3.1. The cost function adopted for the EID is the total metabolic cost (16). The metabolic cost for the MSO is computed by inverting the contraction and activation dynamics after the computation of the optimal muscle forces for each time instant $j$. The initial guesses for the muscle forces in the static optimization are zero. The initial guesses used in the many low-dimension optimizations involved in the MSO and in the single large-scale optimization in the EID are the optimal muscle forces obtained as solutions of the SO. The analysis of the results in Table 1 shows that the computation time required for the EID approach is four orders of magnitude higher than the computation times required for the SO and MSO.
This difference can be explained by the much higher dimension of the optimization problem in the EID with respect to the optimizations in the SO and MSO. While in the EID a single large-scale optimization problem with many optimization variables is solved, the SO and the MSO require the solution of many low-dimension optimization problems. This is the price that has to be paid for the use of a time-integral cost function such as the metabolic cost. Although the computational effort for the EID (33.6 h) can be considered high with respect to the SO and MSO, it is probably much lower than that required for a dynamic optimization with tracking of the kinematics as discussed in Section 3.2.

Table 1. Solutions for the force-sharing problem for the measured normal walking using: Static Opt. – static optimization; Mod. Static Opt. – modified static optimization; Ext. Inverse Dyn. – extended inverse dynamics

Approach                   Initial guess      Computation time (s)   Opt. variables   Metabolic cost (J)
Static Opt. (SO)           $f^m_{j,0} = 0$    5.3                    8 (67×)          –
Mod. Static Opt. (MSO)     Solution SO        7.6                    8 (67×)          254.8
Ext. Inverse Dyn. (EID)    Solution SO        $4.5 \times 10^4$      400 (8 × 50)     201.0

The results show the computational efficiency of the MSO. The MSO required a computational effort comparable to the one required by the SO, while considering the activation and the contraction dynamics. This shows the potential of the MSO for applications that need fast and realistic estimations of muscle forces, muscle activations and neural excitations when instantaneous cost functions are adopted. The MSO can, however, lead to infeasibilities due to the restrictive constraints imposed by the limitation on the allowable changes in the muscle forces. Nevertheless, the possible infeasibilities cannot be attributed to a failure of the method, but rather to inconsistencies arising from errors in the measurements, oversimplification of the models or the assumption of an incorrect cost function.

The metabolic cost for the SO cannot be estimated using (16), because the muscle forces computed lead to muscle activations and neural excitations that infringe their lower and upper bounds, see Fig. 6, and are therefore outside the range for which these expressions are valid. This occurs because the SO does not consider the activation and contraction dynamics. On the contrary, the solution for the muscle forces delivered by the MSO can be used to compute muscle activations and neural excitations that fulfill the constraints by inverting the contraction and the activation dynamics. These values can then be inserted into (16), which delivers estimations of the metabolic cost as shown in Table 1.
Fig. 6. Results for the static optimization (SO); TOr – toe off of the right foot; HSr – heel strike of the right foot. [Panels: muscle forces $f^m_i$ (scale 0 to 1.5 kN), muscle activations $a_i$ (0 to 1) and neural excitations $u_i$ (0 to 1) over the gait cycle from TOr through HSr to TOr, for the muscle groups Ilio, RF, Glu, Ham, Vasti, Gas, TA and Sol.]

Fig. 7. Results for modified static optimization (MSO). [Same panels, axes and muscle groups as in Fig. 6.]
The metabolic cost estimated from the muscle forces computed with the MSO is 27% greater than the metabolic cost estimated with the EID. A difference is expected, since the EID minimizes the metabolic cost, while in the MSO the instantaneous cost function of Crowninshield and Brand [8] is minimized, which is related to muscle fatigue. The relatively large difference shows that the cost function of Crowninshield and Brand [8] is not well related to the metabolic energy consumption. The absolute value of the total metabolic cost of transport obtained with the EID, assuming that the total metabolic cost expended during walking is approximately twice that expended by one leg, is 311 J/m, which agrees well with values found in the literature.

The optimal muscle forces for the normal gait computed using SO, MSO and EID and the corresponding muscle activations and neural excitations are presented in Figs. 6, 7 and 8, respectively. Figure 6 clearly shows the infringement of the lower and upper bounds for the neural excitations, indicating nonphysiological muscle force histories. Figure 7 shows that the MSO smoothes the muscle force curves, avoiding unphysiologically fast variations and keeping the neural excitations bounded by 0 and 1. The smoothing effect is especially visible for the muscle groups rectus femoris (RF) and vasti (Vasti). The results of the EID in Fig. 8 also fulfill the bounds on the activations and neural excitations. The constraints on the neural excitations cause cocontraction, even in single joint antagonist muscles, although cocontraction of antagonists is clearly uneconomical. This occurs because muscles are not ideal actuators and cannot be switched on and off instantaneously. Indeed, the single joint antagonistic muscle pairs iliopsoas (Ilio) and glutei (Glu), and tibialis anterior (TA) and soleus (Sol) present practically no cocontraction, i.e. almost no simultaneous activation, when SO is used, see Fig. 6. On the contrary, if MSO or EID are used, cocontraction is observed for these muscles, refer to Figs. 7 and 8. Because the EID minimizes energy consumption, its results show less cocontraction than those of the MSO, but a considerable amount of cocontraction can still be observed in the results of the EID for the mentioned muscles, especially in the regions of activation and de-activation.

Fig. 8. Results for the extended inverse dynamics (EID). [Panels: muscle forces $f^m_i$ (scale 0 to 1.5 kN), muscle activations $a_i$ (0 to 1) and neural excitations $u_i$ (0 to 1) over the gait cycle from TOr through HSr to TOr, for the muscle groups Ilio, RF, Glu, Ham, Vasti, Gas, TA and Sol.]

Application to the Walking with an Ankle Weight

In this section the effect of adding a 1.7 kg ankle weight during the swing phase of the gait is investigated using the extended inverse dynamics with the total metabolic cost as cost function. The bars in Fig. 9 show the metabolic cost estimated for three different scenarios. The first bar from the left shows the metabolic cost, 48.3 J, computed for the measured kinematics of the lower limb of the subject during the swing phase without weight. The second bar shows the metabolic cost, 49.9 J, for the measured kinematics during the swing phase with a 1.7 kg ankle weight. The third bar from the left shows the metabolic cost, 59.6 J, obtained using the kinematics measured for the normal walking and adding a 1.7 kg ankle weight to the skeletal system model. As expected, the swing phase with an ankle weight requires a higher metabolic cost than the normal swing phase, although the observed difference of 3% is slight. An interesting result is shown by the third bar from the left: the metabolic cost increases by about 23% if the kinematics is kept the same as for the normal walking when the ankle weight is added.
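The percentages quoted in this section follow directly from the reported metabolic costs. The short check below recomputes them from the values given in the text and in Table 1; the helper name `percent_increase` is introduced here only for illustration.

```python
def percent_increase(reference, value):
    """Relative increase of value over reference, in percent."""
    return 100.0 * (value - reference) / reference

# Swing-phase metabolic costs in joules, as reported for Fig. 9.
normal = 48.3            # measured normal kinematics, no weight
weight_adapted = 49.9    # measured kinematics with the 1.7 kg ankle weight
weight_unadapted = 59.6  # normal kinematics with the weight added to the model

inc_adapted = percent_increase(normal, weight_adapted)
inc_unadapted = percent_increase(normal, weight_unadapted)

# Normal gait, Table 1: MSO (254.8 J) relative to EID (201.0 J).
mso_vs_eid = percent_increase(201.0, 254.8)
```

Rounded to whole percent, these reproduce the 3%, 23% and 27% stated in the text.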
Fig. 9. The bars in the diagram on the left show the metabolic cost $E$ [J] for the swing phase (subject 2) obtained with the EID, from left to right: during normal walking (Normal), during walking with a 1.7 kg ankle weight (Weight), and during walking with the normal kinematics and a 1.7 kg ankle weight added to the model (Normal & Weight). On the right, a schematic representation of the musculoskeletal model of the lower limb with the added ankle weight is depicted. In the middle, a picture of the lower limb of the subject with an ankle weight attached to his ankle is shown
This means that, after the addition of the ankle weight, the subject naturally adapts the motion of the lower limb in such a way as to reduce the metabolic cost required, leading to only a slight increase of energy expenditure with respect to the undisturbed gait. If the adaptations in the kinematics were not performed, the increase in the metabolic cost would be much higher (third bar in Fig. 9). This observation highlights the importance of the motion adaptations in the reduction of energy expenditure during gait.
7 Conclusion

The conventional method to solve the muscle force-sharing problem is called static optimization and, although computationally efficient, suffers from two important limitations: (1) it neglects the dynamics involved in the muscle force generation process, which can lead to unphysiological muscle force histories, and (2) it requires the use of instantaneous cost functions, excluding the possibility of using time-integral cost functions such as the total metabolic cost, which was shown to play a key role during walking. Dynamic optimization associated with a tracking of the measured kinematics is an alternative that overcomes these limitations. This approach requires, however, extremely high computational costs due to the many numerical integrations of the high-dimensional system equations that are necessary. Two alternative approaches are proposed to overcome the limitations of static optimization, delivering more realistic muscle force estimations while being computationally less expensive than dynamic optimization.

One approach, named extended inverse dynamics, delivers physiological estimations of muscle forces by considering the muscle activation and contraction dynamics and by permitting the use of time-integral cost functions such as the total metabolic cost. Although the improvements provided by this approach make it computationally much more expensive than static optimization, it is less expensive than dynamic optimization, because it does not require any numerical integration of the state equations. The second proposed approach, named modified static optimization, offers a viable alternative to static optimization by considering the muscle activation and contraction dynamics while requiring a low computational effort.

The two proposed approaches are applied to estimate muscle force histories for the normal gait of a subject measured in a gait analysis laboratory using a musculoskeletal model of the lower limbs. The approaches are compared to static optimization with respect to computational effort and fulfillment of the constraints that guarantee the consideration of the dynamics involved in the process of muscle force generation. The extended inverse dynamics is used to investigate walking with an ankle weight. It is shown that the subject naturally adapts the kinematics of the swinging leg to reduce energy consumption as a response to the addition of an ankle weight. The choice of a proper approach depends on the accuracy required, the computational facilities available, and the particularities of the problem.
References

1. Ackermann M (2007) Dynamics and energetics of walking with prostheses. Ph.D. thesis, University of Stuttgart, Shaker Verlag, Aachen
2. Ackermann M, Gros H (2005) Measurements of human gaits. Internal Report ZB-144, Institute B of Mechanics, University of Stuttgart, Stuttgart
3. Ackermann M, Schiehlen W (2006) Dynamic analysis of human gait disorder and metabolical cost estimation. Arch Appl Mech 75:569–594
4. Anderson FC, Pandy MG (1999) A dynamic optimization solution for vertical jumping. Comput Meth Biomech Biomed Eng 2:201–231
5. Anderson FC, Pandy MG (2001) Dynamic optimization of human walking. J Biomech Eng 123:381–390
6. Bhargava LJ, Pandy MG, Anderson FC (2004) A phenomenological model for estimating metabolic energy consumption in muscle contraction. J Biomech 37:81–88
7. Chapra SC, Canale RP (1985) Numerical methods for engineers. McGraw-Hill, New York
8. Crowninshield RD, Brand RA (1981) Physiologically based criterion of muscle force prediction in locomotion. J Biomech 14:793–801
9. Davy DT, Audu ML (1987) A dynamic optimization technique for the muscle forces in the swing phase of the gait. J Biomech 20:187–201
10. Delp SL (1990) Surgery simulation: a computer graphics system to analyze and design musculoskeletal reconstructions of the lower limb. Ph.D. thesis, Department of Mechanical Engineering, Stanford University, Stanford, CA
11. Hatze H (1976) The complete optimization of a human motion. Math Biosci 28:99–135
12. Hatze H, Buys JD (1977) Energy-optimal controls in the mammalian neuromuscular system. Biol Cybern 27:9–20
13. He J, Levine WS, Loeb GE (1991) Feedback gains for correcting small perturbations to standing posture. IEEE T Automat Contr 36:322–332
14. de Leva P (1996) Adjustments to Zatsiorsky-Seluyanov’s segment inertia parameters. J Biomech 29:1223–1230
15. Menegaldo LL, Fleury AT, Weber HI (2003) Biomechanical modeling and optimal control of human posture. J Biomech 36:1701–1712
16. Menegaldo LL, Fleury AT, Weber HI (2006) A ‘cheap’ optimal control approach to estimate muscle forces in musculoskeletal systems. J Biomech 39:1787–1795
17. Nagano A, Gerritsen KGM (2001) Effects of neuromuscular strength training on vertical jumping performance - a computer simulation study. J App Biomech 17:113–128
18. Neptune RR, van den Bogert AJ (1998) Standard mechanical energy analyses do not correlate with muscle work in cycling. J Biomech 31:239–245
19. Neptune RR, Hull ML (1998) Evaluation of performance criteria for simulation of submaximal steady-state cycling using a forward dynamic model. J Biomech Eng 120:334–341
20. Pandy MG, Anderson F, Hull DG (1992) A parameter optimization approach for the optimal control of large-scale musculoskeletal systems. J Biomech Eng 114:450–460
21. Ralston HJ (1976) Energetics of human walking. In: Herman RM et al. (eds) Neural control of locomotion. Plenum, New York, pp 77–98
22. Riener R, Edrich T (1991) Identification of passive elastic joint moments in the lower extremities. J Biomech 32:539–544
23. Schiehlen W (2006) Computational dynamics: theory and applications of multibody systems. Eur J Mech A-Solid 25:566–594
24. da Silva MPT, Ambrosio JAC (2004) Human motion analysis using multibody dynamics and optimization tools. Ph.D. thesis, Instituto de Engenharia Mecânica, Lisboa, Portugal
25. Spägele T (1998) Modellierung, Simulation und Optimierung menschlicher Bewegung (in German). Ph.D. thesis, Institute A of Mechanics, University of Stuttgart, Stuttgart
26. Stein RB, Lebiedowska MK, Popovic DB, Scheiner A, Chizeck HJ (1996) Estimating mechanical parameters of leg segments in individuals with and without physical disabilities. IEEE T Rehabil Eng 4:201–211
27. Strobach D, Kecskemethy A, Steinwender G, Zwick B (2005) A simplified approach for rough identification of muscle activation profiles via optimization and smooth profile patches. In: Proceedings of MULTIBODY DYNAMICS 2005, ECCOMAS Thematic Conference, Madrid, Spain
28. Thelen DG, Anderson FC (2006) Using computed muscle control to generate forward dynamic simulations of human walking from experimental data. J Biomech 39:1107–1115
29. Thelen DG, Anderson FC, Delp SL (2003) Generating dynamic simulations of movement using computed muscle control. J Biomech 36:321–328
30. Tsirakos D, Baltzopoulos V, Bartlett R (1991) Inverse optimization: functional and physiological considerations related to the force-sharing problem. Crit Rev Biomed Eng 25:371–407
31. Umberger BR, Gerritsen KGM, Martin PE (2003) A model of human muscle energy expenditure. Comput Meth Biomech Biomed Eng 6:99–111
32. Zajac FE (1989) Muscle and tendon: properties, models, scaling, and application to biomechanics and motor control. Crit Rev Biomed Eng 19:359–411
33. Zajac FE, Neptune RR, Kautz SA (2003) Biomechanics and muscle coordination of human walking part II: lessons from dynamical simulations and clinical implications. Gait Posture 17:1–17
Multi-Step Forward Dynamic Gait Simulation

Matthew Millard, John McPhee, and Eric Kubica

Systems Design Engineering, University of Waterloo, 200 University Ave West, Waterloo ON, Canada
E-mail:
[email protected]
Summary. A predictive forward-dynamic simulation of human gait would be extremely useful to many researchers and professionals. Metabolic efficiency is one of the defining characteristics of human gait. Forward-dynamic simulations of human gait can be used to calculate the muscle load profiles for a given walking pattern, which in turn can be used to estimate metabolic energy consumption. One approach to predicting human gait is to search for, and converge on, metabolically efficient gaits. This approach demands a high-fidelity model; errors in the kinetic response of the model will affect the predicted muscle loads and thus the calculated metabolic cost. If the kinetic response of the model is not realistic, the simulated gait will not reflect how a human would walk. The foot forms an important kinetic and kinematic boundary condition between the model and the ground: joint torque profiles, muscle loads, and thus metabolic cost will be adversely affected by a poorly performing foot contact model. A recent approach to predicting human gait is reviewed, and new foot contact modelling results are presented.
1 Introduction

Human and animal gait has been studied by using experiments to tease out the neural, muscular and mechanical mechanisms that are employed to walk. Inverse dynamic simulation is the most common simulation technique used to study human gait. Inverse dynamics works backwards from an observed motion in an effort to find the forces that caused the motion – inverse dynamics is not predictive. In contrast, forward dynamics can be used to determine how a mechanism will move when it is subjected to forces – making forward dynamics predictive. Forward dynamic human gait simulations usually only simulate a single step [4,11] in an effort to avoid modelling foot contact and balance control systems. The few multi-step forward-dynamic simulations in the literature have used a relatively fixed gait [24, 27]. In contrast, Peasgood et al.’s [23] forward dynamic simulation is predictive: the simulated gait is altered in an effort to find metabolically efficient or ‘human-like’ gaits, allowing it to estimate how a person would walk in a new situation – e.g. with a new lower-limb prosthesis, or more flexible muscles. A computer simulation that is able to reliably predict how a person would walk in a new situation would be extremely useful to many health care professionals and researchers studying human gait.

Peasgood et al.’s system finds ‘human-like’ or metabolically minimal gaits by searching for joint trajectories for the hip, knee and ankle that minimize metabolic cost per distance traveled. The model is not supported or balanced by any artificial means, and so poorly chosen trajectories can overwhelm the balance controller, causing the model to fall. This study was undertaken to evaluate and extend Peasgood et al.’s work, and to identify the shortcomings of current multi-step forward dynamic gait simulations.

C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009
2 Methods
Peasgood et al.’s system represents the first attempt at developing a predictive, multi-step gait simulation that searches for metabolically efficient gaits. Nearly 1,000 ten-step simulations were required to find a metabolically efficient, ‘human-like’ gait. Originally the 1,000 gait simulations took 10 days to perform on a single computer using the popular mechanical modeling package MSC.Adams [21]. DynaFlexPro [9], another modeling package developed since Peasgood et al.’s work, offers substantial performance advantages over Adams: the updated version of Peasgood et al.’s predictive system now takes only 8 hours to run. For the present study, Peasgood et al.’s system was examined, corrected, improved and re-implemented in DynaFlexPro.
2.1 Dynamic Model
Peasgood et al. developed a predictive gait simulation using a 2D, seven segment, nine degree of freedom (dof), anthropomorphic model, shown in Fig. 1, with a continuous foot contact model. This is a fairly standard model topology for gait studies. The upper body is simplified into a single body representing the head, arms and trunk (HAT); the thigh and shank are each one segment, as is the foot [1, 3, 13]. An additional simplification has been made in this model by fusing the HAT to the pelvis. There was an unintended error in Peasgood et al.’s original model: an extra body was attached to the foot with a moment of inertia of 1.5 kg m², which is comparable to that of the HAT segment. A convergence study was performed on both the DynaFlexPro and the corrected Adams gait models by dropping both unactuated models onto the floor from the same initial conditions. The convergence of each model was checked individually. The results from the DynaFlexPro model converged for every simulation, whereas the Adams model failed to converge with an integrator
Multi-Step Forward Dynamic Gait Simulation
Fig. 1. Peasgood et al.’s seven segment, nine degree of freedom, planar gait model with a 2-point continuous foot contact model

Table 1. Performance comparison between the Adams and DynaFlexPro 2D seven segment gait models for a 10 second simulation. The Adams simulation with an integrator error tolerance of $10^{-5}$ failed to converge. The relative error increases from the hip position to the foot angle: the large mass of the HAT attenuates position error of the hip, while foot position is more sensitive to errors due to its light mass. The stiffness of the heel contact makes the simulated contact forces very sensitive to errors

Integrator    Simulation time                  Maximum relative error (%)
error tol.    Adams          DynaFlexPro       Left hip     Right ankle   Right heel
              GSTIFF (I3)    ode15s (NDF)      disp. (x)    angle         contact force
$10^{-5}$     29             4.1               3.02         5.65          14.30
$10^{-7}$     33             7.3               0.09         0.16          0.27
$10^{-9}$     36             30                0.24         0.48          0.73
error tolerance of $10^{-5}$. The maximum relative error between the Adams and DynaFlexPro result sets is shown in Table 1 for the horizontal position of the left hip, the angle of the right ankle and the contact force developed under the right heel. The relative error was computed by taking the largest absolute difference between the two simulations and dividing it by the largest absolute value from the DynaFlexPro result set. Interestingly, the simulations with an integrator error tolerance of $10^{-7}$ had the smallest relative error, and allowed the DynaFlexPro model to simulate four times faster than the Adams model as shown in Table 1.
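The relative error measure described above can be written down in a few lines. The following is a minimal sketch, assuming both result sets have already been sampled at the same output times; the function name and the sample values are hypothetical and are not taken from the study.

```python
def max_relative_error(candidate, reference):
    """Largest absolute difference between two equally sampled signals,
    divided by the largest absolute value of the reference, in percent."""
    if len(candidate) != len(reference):
        raise ValueError("signals must share the same sample times")
    largest_difference = max(abs(c - r) for c, r in zip(candidate, reference))
    largest_reference = max(abs(r) for r in reference)
    return 100.0 * largest_difference / largest_reference


# Hypothetical hip positions (m) at four shared output times.
reference_hip = [0.00, 0.52, 1.03, 1.60]  # stands in for the DynaFlexPro result
candidate_hip = [0.00, 0.53, 1.03, 1.58]  # stands in for the Adams result
print(f"{max_relative_error(candidate_hip, reference_hip):.2f} %")
```

Normalising by the largest reference value rather than pointwise avoids division by near-zero samples, which matters for signals such as contact forces that vanish during the swing phase.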
2.2 Foot Contact
Foot contact forces were calculated using a two-point foot contact model, with point contacts located at the heel and metatarsal. Normal forces were calculated using the Adams implementation [22] of the continuous Hunt-Crossley [18] point contact model:

$$f_n = -k y^{p} - c(y)\,\dot{y}. \tag{1}$$
The Hunt-Crossley contact model calculates normal force ($f_n$) as a function of penetration depth ($y$), penetration rate ($\dot{y}$), material stiffness ($k$, $p$), and material damping ($c(y)$). The implementation of the model ramps up damping ($c(y)$) as a function of penetration depth, to prevent the instantaneous normal force that would be created using a simple damping term such as ($c_{max}\dot{y}$). A dry Coulomb model was used to calculate the force of friction between the points and the plane:

$$f_f = \mu(\dot{x})\, f_n. \tag{2}$$
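Eqs. (1) and (2) can be sketched as follows. This is an illustration under stated assumptions, not the Adams implementation: the damping ramp is taken to be a cubic step over a hypothetical depth `ramp_depth`, the friction coefficient is assumed to pass smoothly through zero below the stiction velocity, and $y \ge 0$ is measured into the ground, so that the negative $f_n$ of Eq. (1) pushes the foot back out. All names and parameter values are hypothetical.

```python
import math


def cubic_step(x, x0, h0, x1, h1):
    """Cubic ramp from h0 (x <= x0) to h1 (x >= x1) with zero end slopes."""
    if x <= x0:
        return h0
    if x >= x1:
        return h1
    s = (x - x0) / (x1 - x0)
    return h0 + (h1 - h0) * s * s * (3.0 - 2.0 * s)


def normal_force(y, y_dot, k, p, c_max, ramp_depth):
    """Eq. (1): f_n = -k*y**p - c(y)*y_dot for penetration depth y > 0."""
    if y <= 0.0:
        return 0.0  # no contact
    # Assumed ramp: damping grows from 0 to c_max over ramp_depth.
    c = cubic_step(y, 0.0, 0.0, ramp_depth, c_max)
    return -k * y**p - c * y_dot


def friction_coefficient(x_dot, mu_s, mu_d, v_s, v_d):
    """Signed mu(x_dot): ramp through zero below v_s (an assumption),
    then a cubic step from mu_s at v_s to mu_d at v_d."""
    speed = abs(x_dot)
    if speed <= v_s:
        return cubic_step(x_dot, -v_s, -mu_s, v_s, mu_s)
    return math.copysign(cubic_step(speed, v_s, mu_s, v_d, mu_d), x_dot)


def friction_force(x_dot, f_n, mu_s, mu_d, v_s, v_d):
    """Eq. (2): f_f = mu(x_dot) * f_n."""
    return friction_coefficient(x_dot, mu_s, mu_d, v_s, v_d) * f_n


# Hypothetical heel contact: 4 mm penetration, closing at 5 cm/s, sliding at 2 cm/s.
f_n = normal_force(0.004, 0.05, k=2.0e6, p=1.5, c_max=800.0, ramp_depth=0.002)
f_f = friction_force(0.02, f_n, mu_s=0.8, mu_d=0.6, v_s=0.01, v_d=0.1)
print(f_n, f_f)
```

Because $f_n$ is negative under this sign convention and $\mu$ is odd in $\dot{x}$, the product in Eq. (2) opposes the sliding direction without a separate sign term.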
This friction model has stiction ($\mu_s$) and dynamic friction ($\mu_d$) values that are interpolated using a cubic step function [22] between the stiction velocity ($v_s$) and the sliding velocity ($v_d$) using the tangential contact velocity ($\dot{x}$) as an input. The particular contact and friction parameters used for the gait simulation were chosen by the pattern search routine (described later) to match the ground reaction forces created during healthy gait [26].
2.3 Joint Trajectory Control
Pre-computed joint trajectories are used to define the gait of the model at the position level. Each joint is actuated using a proportional-derivative (PD) controller that modifies and regulates the predefined joint trajectories. The initial joint trajectories were taken from an existing experimental data set of a healthy gait of an average-sized male [26] and interpolated using a five-term Fourier series:

$$\theta_j(t) = C_0 + \sum_{k=1}^{5}\left[A_k \sin\left(\frac{2\pi k t}{\text{period}}\right) + B_k \cos\left(\frac{2\pi k t}{\text{period}}\right)\right]. \tag{3}$$
Some adjustments were made to the trajectories in order to apply them to a sagittal plane gait model: the swing phase of the ankle trajectory had to be altered to prevent the foot from dragging on the ground. This makes sense because the 2D sagittal plane model cannot use hip roll and body sway in the frontal plane to adjust the floor clearance of the swing limb, unlike the subject used in the experiment data set. The interpolated joint trajectories were applied to the PD joint controllers to achieve an initial simulated gait. The optimization routine adjusts the values of the Fourier series coefficients for each limb to search for new gaits. The same Fourier coefficients are used for each limb, offset in phase by π radians, restricting the model to walk with a symmetric gait.
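The trajectory generation and joint control described in this section might be sketched as follows. The code evaluates the Fourier series of Eq. (3), obtains the contralateral limb by shifting the gait cycle by π radians, and forms a PD torque from the tracking errors. The coefficient values, gains and function names are hypothetical; the controller in [23] may differ in detail.

```python
import math


def joint_angle(t, period, c0, a, b, phase=0.0):
    """Eq. (3); `phase` shifts the whole gait cycle (pi = half a stride)."""
    cycle = 2.0 * math.pi * t / period + phase
    return c0 + sum(
        a_k * math.sin(k * cycle) + b_k * math.cos(k * cycle)
        for k, (a_k, b_k) in enumerate(zip(a, b), start=1)
    )


def joint_rate(t, period, a, b, phase=0.0):
    """Analytic time derivative of joint_angle (the constant C0 drops out)."""
    omega = 2.0 * math.pi / period
    cycle = omega * t + phase
    return sum(
        k * omega * (a_k * math.cos(k * cycle) - b_k * math.sin(k * cycle))
        for k, (a_k, b_k) in enumerate(zip(a, b), start=1)
    )


def pd_torque(theta_ref, rate_ref, theta, rate, kp, kd):
    """Proportional-derivative joint torque; kp and kd are hypothetical gains."""
    return kp * (theta_ref - theta) + kd * (rate_ref - rate)


# Hypothetical five-term coefficient set for one joint (radians).
A = [0.30, 0.10, 0.02, 0.01, 0.005]
B = [0.20, -0.05, 0.01, 0.00, 0.002]
PERIOD, C0, t = 1.2, 0.1, 0.3

left_ref = joint_angle(t, PERIOD, C0, A, B)
right_ref = joint_angle(t, PERIOD, C0, A, B, phase=math.pi)  # symmetric gait
torque = pd_torque(left_ref, joint_rate(t, PERIOD, A, B), 0.25, 0.0, kp=300.0, kd=30.0)
print(left_ref, right_ref, torque)
```

Note that the phase offset multiplies every harmonic index $k$, so a shift of π radians is equivalent to evaluating the same series half a period later. This is also why changing a single coefficient alters the whole gait cycle of both limbs.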
2.4 Balance and Velocity Control
A balanced gait and a desired forward velocity are achieved by manipulating the pitch of the HAT. The pitch controller works by monitoring the orientation of the HAT relative to a desired set angle and speeding up or slowing down the progression of the legs through the joint trajectories to keep the HAT at the desired angle. When the HAT pitches forward (backward) beyond the desired set angle, the legs are driven faster (slower) to walk ahead of (behind) the HAT. The velocity controller is very similar to the pitch controller: when the model is moving too slowly (quickly), the reference angle for the pitch controller is increased (decreased), causing the model to lean forward (backward), making the balance controller force the model to walk faster (slower). A detailed account of the pitch and velocity controllers can be found in Peasgood et al.’s original paper [23]. The pitch and velocity controllers balanced the model, but only over a very narrow range: the model could not initiate gait from a standstill, but had to begin the simulation with carefully selected initial conditions. These initial conditions were used for every simulation.
2.5 Pattern Search Optimization Routine
Peasgood et al. tuned the control system parameters and the joint trajectories using a pattern search optimization routine. The algorithm is conceptually described below. A more formal treatment of the material can be found in Lewis et al. [20].
1. Repeat for all parameters:
   (a) Add amounts +∆ and −∆ (called the grid size) to one parameter.
   (b) Evaluate the objective function. Save parameter changes that improve the objective function for later use.
2. Update all parameters with the improved values from Step 1.
3. Evaluate the objective function. If it improves, accept the new parameter set from Step 2; else use the original parameter set.
4. Decrease ∆ by half, return to Step 1. Continue until ∆ is below a predefined tolerance.
The performance of this algorithm relies on the assumption that a set of individual changes to the joint trajectories will collectively result in an improvement. This assumption is valid if the set of parameters are independent. Peasgood et al.’s assumption of independence does not hold when applied to joint trajectories: a beneficial change to the hip joint trajectory may cause the model to fall when combined with a beneficial change to the knee joint trajectory. Thus this search routine only ever improved the objective function when a set of individual parameter changes was found that just happened to collectively improve the simulated gait.
The pattern search optimization routine was used to find joint trajectories that minimized metabolic cost. In an optimization run that had 717 simulations only once did all of the individual improvements found by the pattern search routine result in a more efficient gait when used collectively. This one single improvement was able to decrease the metabolic cost of the simulated gait by 21.5%. An examination of the optimization log file revealed that there were many individual parameter changes that improved the objective function but were ignored. Further investigation showed that a set of individually beneficial parameter changes caused the model to fall when applied simultaneously. The pattern search algorithm was adjusted to take advantage of good individual parameter changes immediately, resulting in a greedy pattern search routine. A further adjustment was made by allowing the pattern search to continue making adjustments to a single parameter that improved the objective function until the improvements ceased.
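A minimal sketch of the greedy variant described above is given below, again assuming minimization. Each improving change is accepted immediately, and the search keeps stepping the same parameter in the same direction until the improvement stops. The coupled objective is a hypothetical stand-in for interacting joint trajectories: either parameter change alone removes the cost, but applying both together does not.

```python
def greedy_pattern_search(objective, start, delta, tolerance):
    """Accept each improving single-parameter change as soon as it is found."""
    params = list(start)
    best = objective(params)
    while delta >= tolerance:
        for i in range(len(params)):
            for step in (delta, -delta):
                # Keep stepping this parameter while the objective improves.
                while True:
                    probe = list(params)
                    probe[i] += step
                    value = objective(probe)
                    if value >= best:
                        break
                    params, best = probe, value
        delta *= 0.5
    return params, best


def coupled_cost(p):
    """Hypothetical objective in which the two parameters interact."""
    return (p[0] + p[1] - 1.0) ** 2


# Moving either parameter by +1 alone gives zero cost, but moving both gives
# the same cost as the starting point, so a simultaneous update is rejected.
params, best = greedy_pattern_search(coupled_cost, [0.0, 0.0], delta=1.0, tolerance=0.5)
print(params, best, coupled_cost([1.0, 1.0]))
```

The inner loop terminates only if the objective is bounded below along each search direction, which holds for a cost such as metabolic energy but should be guarded with an iteration limit in general.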
3 Results
The joint angles for the final simulated gait and a healthy human gait [26] are shown in Fig. 2. The standard deviations of the joint angles, torques and ground reaction forces for the current results are negligible, indicating that the gait is very consistent. The joint trajectories of the knee and hip are similar between all three data sets, but the ankle joint trajectories and torques are quite dissimilar. The log file of the optimization routine revealed that increasing the ankle extension led to a significant reduction in metabolic cost. The adjusted pattern search routine was able to find a gait with 47.6% less metabolic cost, a reduction 26.1 percentage points larger than the 21.5% achieved with Peasgood et al.’s original approach. The foot contact model produced ground reaction forces that differ substantially from those observed during normal human gait [26], as shown in Fig. 3. The poor performance of the foot contact model is partly responsible for the joint torque differences seen between healthy human gait and the simulated results in Fig. 2. The kinematics of the foot contact model also exhibited heel and metatarsal compressions exceeding 40.0 mm, far greater than the compression levels of real human heel [10] and metatarsal pads [7]. The kinematics and kinetics of this gait differ from healthy human gait [26], and are highly influenced by differences between the simulated foot contact model and a human foot.
4 Discussion
One of the biggest shortcomings of the current system is that the balance controller is so sensitive to changes in gait parameters that very little of the gait space can be searched without making the model fall. The latest optimization run consisted of 721 simulations; 543 of these simulations resulted in the
[Figure 2: six panels plotted against Percent of Stride (%). Ankle Kinematics, Knee Kinematics and Hip Kinematics show Joint Angle (degrees); Ankle Joint Torque (Plantar Flexion), Knee Joint Torque (Flexion) and Hip Joint Torque (Extension) show Joint Torque/BW (m). Legend: Winter; Millard et al.; +1 Std; −1 Std.]
Fig. 2. Joint trajectory and torque comparison between Winter’s recordings of human gait [26], and the current results
model falling. As well, the current system is not well suited to making changes to single parameters without having potentially disastrous effects: changing any one of the Fourier coefficients will alter the entire gait cycle. A parameter change that improves the efficiency of the stance phase, may cause the model to fall during the swing phase. A more advanced balance control system that allows the swing and stance phases to be tuned separately would be a great improvement to the current system.
[Figure 3: two panels plotted against Percent of Stride (%). Normal Ground Reaction Force (Force/BW) and Horizontal Ground Reaction Force (Friction Force/BW). Legend: Winter; Millard et al.; +1 Std; −1 Std.]
Fig. 3. Normal and friction force comparison between Winter’s recordings [26] and the current two-point foot contact model
The computationally efficient, but low-fidelity foot contact model produced ground reaction forces and foot pad compressions that were drastically different than those observed in healthy human gait, and negatively affected the simulated joint kinetics. A high-fidelity foot contact model is especially important for a predictive gait simulation: contact forces at the foot will affect the loads at the joints of the legs, and thus the metabolic cost of the leg muscles. If the model does not have a realistic foot contact model, it will be impossible to produce metabolic cost estimates that correspond to what one would expect from a human [28]. A predictive gait simulation without a high-fidelity foot contact model could not converge to a ‘human-like’ gait.
5 Foot Contact Modeling
Foot contact models are typically not validated separately from the gait simulation [23, 24, 27]. This approach is problematic: if the ground reaction force representation is poor, it is impossible to know whether this is due to an error in the foot contact model or to the way the foot is being used by the assumed control system. The only foot contact model that was validated separately from the gait simulation [12] was validated in a naive way: ankle joint torques and forces estimated from an inverse dynamics analysis were applied to a forward dynamic simulation of the foot model; the fidelity of the foot model was evaluated by comparing the kinematics of the simulated foot to the experimental data. This approach is naive because the quantization and measurement error that is inherent in an experimental inverse dynamics analysis will cause the forward dynamic simulation to diverge from the experimental observations, even if the model is perfect. None of the lumped-parameter foot contact models published to date [12, 23, 24, 27] provides convincing evidence of emulating a real human foot.
The approach taken in the current work to assess candidate foot contact models is different from previous attempts [12]: a contact model that was suitable for modeling heel tissue was first identified, then candidate foot contact models were created using this contact model. Ground reaction force profiles were used to assess the fidelity of each model: a realistic foot contact model should develop the same ground reaction forces as a human foot when driven through the same kinematic path. A simple experiment was undertaken to gather the data required to test the candidate foot contact models: a subject’s ankle position and ground reaction force profiles during normal gait were recorded using Optotrak infrared diodes (IREDs) and a force plate. The subject walked at three different subjective paces (slow, normal and fast) in two different load conditions: bodyweight (BW) and 113% bodyweight. The different velocity and loading conditions were used to assess the sensitivity of the model to cadence and load. The heavier loading condition was achieved by having the subject carry a cinder block. The following sections detail recent work to create and validate a new foot contact model.
5.1 Foot Pad Contact Properties
Studies to determine the stiffness and damping properties of human foot pads have failed to produce consistent results. Traditionally, in vivo experimental results disagree by orders of magnitude with in vitro experiments. In the past, in vivo experiments have measured the tissue compression and load by impacting an instrumented mass into a subject’s heel [19, 25]. As long as the skeletal system of the body acts like a perfect ground, the deceleration of the mass will be entirely due to the compression of the heel pad. Aerts et al. [2] were able to demonstrate experimentally that this assumption is invalid: significant amounts of energy are lost through the body, skewing the stiffness values reported from in vivo pendular experiments to nearly one-sixth of the published in vitro values. In vitro stiffness and damping estimates obtained using an Instron material testing machine are also suspect because the tissue may not be representative of living foot pad tissue from the general population. An in vivo experimental procedure was developed to estimate foot stiffness and damping:
1. The compression of the heel pad was inferred by tracking the position of the fibular trochlea of the calcaneus using an Optotrak IRED. The fibular trochlea of the calcaneus is a bony protrusion on the lateral side (outside) of the heel bone. A marker was also placed on the medial side (inside) of the calcaneus.
2. The force acting on the heel pad was measured using a force plate. Only the heel was placed on the force plate.
3. The subject voluntarily lowered their heel onto the force plate at three subjective speeds: slow, medium and fast. The heel was slowly raised. The fast trials had to be discarded due to undersampling, despite sampling the data at 200 Hz.
This experimental method assumes that there is no significant IRED marker movement relative to the calcaneus. The distance between the lateral and medial calcaneus markers was examined to estimate skin stretch: the distance of 68.0 mm changed by 2.0 mm on average during a load cycle, indicating that skin stretch has likely skewed the data. The hysteresis loops obtained during the preliminary experiment have energy losses ranging from 21% to 37%. This level of energy dissipation is somewhat similar to the 17–19% reported by Gefen et al.’s in vivo study [10] and far lower than the 46.5–65.5% reported by Aerts et al. Direct measurement of the heel pad tissue compression will be needed in order to produce more precise results.
5.2 Volumetric Contact Model
Theoretical contact modelling is a very active research area [16], with relatively sparse experimental work [8, 14]. Unstable normal directions are one of the numerical problems that can arise during the simulation of contacting bodies with complicated geometry. A new contact model based on interpenetration volumes [16] has been developed to overcome many of the numerical instabilities of existing contact models and is currently being used by the Canadian Space Agency to simulate Canadarm operations. This contact model was chosen as an ideal candidate for a new foot contact model because of its desirable numerical properties. Gonthier et al. [16] analytically derived expressions for the normal force $\mathbf{f}_n$ and rolling resistance $\boldsymbol{\tau}_t$ for a linearly elastic Winkler foundation of stiffness $k$ and damping $a$ impacted by a body with a normal velocity of $v_{cn}$:

$$\mathbf{f}_n = k V (1 + a v_{cn})\,\hat{\mathbf{n}}, \tag{4a}$$
$$\boldsymbol{\tau}_t = k a\, \mathbf{J}_c \cdot \boldsymbol{\omega}_t. \tag{4b}$$
These very general expressions assume it is possible to calculate the volume of interpenetration (V ), and its inertia matrix (Jc ). These parameters can be very challenging to compute for arbitrarily shaped bodies, and so analytical expressions for V and Jc were developed for spherical primitives. The foot contact model was then created out of an array of spherical elements. Vectors and matrices are shown in boldface; scalars in regular type. Although it is often reported that human heel tissue stiffness is dependent on strain [17] (using a penetration depth model), it is not clear if this dependence is due to geometry of the pad or the tissue itself: V is a nonlinear function of penetration depth, and might account for the nonlinearity of the heel response. A preliminary study using the experimental procedure described in the previous section was undertaken to garner in vivo hysteretic load curves of the heel. A single spherical element was chosen to represent the heel pad. The stiffness and damping parameters were tuned to try to make the response of the model match the experimental data set. Due to the nonlinearity of the problem, a full-enumeration optimization routine was used to
[Figure 4: four panels of Force (N) against Compression (mm). Slow Load Rate: 0.7 cm/s (BW); Med. Load Rate: 3.1 cm/s (BW); Slow Load Rate: 1.1 cm/s (113% BW); Med. Load Rate: 5.0 cm/s (113% BW). Legend: Heel Experiment; Heel Model.]
Fig. 4. Compression load cycles of a tuned volumetric sphere vs experimental data. Stiffness and damping are constant. The label ‘BW’ stands for body weight, and the load rate reported is the maximum normal velocity the heel achieves as it contacts the floor
find a good set of parameters for the simulated heel pad. The results in Fig. 4 show that a single volumetric spherical contact element was able to achieve good agreement with the experimental in vivo load curves in all but one of the trials. Since so few experimental trials were undertaken, it is impossible to know whether the ill-fitting trial is a consequence of the ‘memory’ of foot tissue observed in vitro [2] or of a fundamental difference between the contact model and the contact properties of human heel pads. The preliminary results were encouraging enough to pursue a foot contact model using Gonthier et al.’s linearly elastic volumetric contact model and spherical contact elements.
5.3 Friction Modeling
Every foot contact model developed to date has made use of a Coulomb friction model without any experimental justification. There has not been any effort to date to develop experiments to determine the shear and friction properties of the human heel pad in vivo or in vitro. Typically the tangential ground reaction forces found in simulated feet are accompanied by unrealistically high initial transient forces [12, 24], or at the very least force profiles that
deviate [23] from experimental ground reaction force recordings [26]. Initially a Coulomb model was adopted to see how it would perform with the new foot contact models.
5.4 Foot Contact Modelling Results
Contact force computation usually represents a large computational burden when simulating a dynamic system. Thus a simple, yet high-fidelity foot contact model is very desirable. Accordingly, foot contact model topologies began with the very simple and progressed in complexity, as shown in Fig. 5, to achieve the desired fidelity. Two-dimensional foot contact models were driven at the ankle through experimentally gathered foot trajectories. An optimization routine was used to tune the contact and friction properties of every spherical element, and to make slight adjustments to the geometry of the foot. It is important to note that the slight flex through the mid-foot at the tarsal joints [5] is not being modelled: simulating this flexure could be computationally expensive due to the stiff nature of the foot.
5.5 Two-Sphere Single Segment Foot Contact Model
The first contact model tested consisted of a single-segment rigid foot with volumetric spherical contacts for the heel and the metatarsal, shown in Fig. 5a, with a Coulomb friction model. The foot was tuned to fit the normal and friction force profiles of all of the trials. The normal ground reaction forces of best fit are shown in Fig. 6. Curiously, the model was able to fit the faster paced trials far better than the slow trials, which show some significant deviations. The optimization routine found a solution that yielded metatarsal penetration depths of nearly 20.0 mm, far greater than the 7.0 mm observed in other studies [7].
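The normal force of such a two-sphere foot could be sketched as below. The sketch assumes flat ground, for which the interpenetration volume of a sphere is a spherical cap, $V = \pi h^2 (3R - h)/3$ with $h$ the penetration depth, and it evaluates only the magnitude of Eq. (4a). The sphere offsets, radii, stiffness and damping values are hypothetical, the planar foot kinematics are a simplification, and clamping the force at zero is an added assumption that is not taken from [16].

```python
import math


def cap_volume(radius, depth):
    """Volume of the part of a sphere lying below flat ground."""
    depth = min(max(depth, 0.0), 2.0 * radius)
    return math.pi * depth**2 * (3.0 * radius - depth) / 3.0


def sphere_normal_force(centre_height, centre_velocity, radius, k, a):
    """Magnitude of Eq. (4a) for one sphere: k * V * (1 + a * v_cn)."""
    depth = radius - centre_height
    if depth <= 0.0:
        return 0.0  # sphere is clear of the ground
    v_cn = -centre_velocity  # approach speed, positive when moving down
    # Assumption: the contact cannot pull the foot towards the ground.
    return max(0.0, k * cap_volume(radius, depth) * (1.0 + a * v_cn))


def foot_normal_force(ankle_z, ankle_vz, pitch, pitch_rate, spheres):
    """Sum the sphere forces of a rigid planar foot driven at the ankle.

    Each sphere is (dx, dz, radius, k, a), with (dx, dz) its centre in the
    foot frame relative to the ankle; positive pitch raises the toes.
    """
    total = 0.0
    for dx, dz, radius, k, a in spheres:
        z = ankle_z + dx * math.sin(pitch) + dz * math.cos(pitch)
        vz = ankle_vz + (dx * math.cos(pitch) - dz * math.sin(pitch)) * pitch_rate
        total += sphere_normal_force(z, vz, radius, k, a)
    return total


# Hypothetical heel and metatarsal spheres (m, N/m^3, s/m).
HEEL = (-0.05, -0.06, 0.03, 1.0e7, 0.5)
METATARSAL = (0.12, -0.06, 0.03, 1.0e7, 0.5)
print(foot_normal_force(0.085, 0.0, 0.0, 0.0, [HEEL, METATARSAL]))
```

Because the cap volume grows nonlinearly with depth, the force-compression curve stiffens with penetration even though $k$ is constant.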
Fig. 5. Foot contact models consisting of two, three, and four spheres shown in a, b, and c. Models a and b have been tested; model c is hypothesized to be the least sensitive to changes in walking velocity
[Figure 6: six panels of Contact Force (N) against Time (s). Slow Walk (BW); Med Walk (BW); Fast Walk (BW); Slow Walk (113% BW); Med Walk (113% BW); Fast Walk (113% BW). Legend: Experiment; Model.]
Fig. 6. The normal force developed between the two-sphere foot-contact model and the ground. ‘BW’ stands for ‘bodyweight’
When the foot was tuned to fit the normal forces seen in each trial individually, a better result was obtained; however, the geometry of the foot was different for every trial: the metatarsal contact was placed closer to the heel for the slow trials. One explanation for this behaviour lies in the role of the toes: during slow walking the toes contribute very little to the normal force profile, shifting the average center of pressure of the forefoot towards the ankle. During fast walking the toes contribute more heavily to the normal force profile, shifting the average center of pressure of the forefoot towards the toes. The friction forces predicted by the model were far below what was recorded during the experiment. Assuming that a Coulomb friction model was inadequate, a more advanced friction model was sought. Bristle friction models [15] have been used as a substitute for Coulomb friction models in robotics simulations because both true stiction and conservative material shear can be simulated; a Coulomb friction model can simulate neither. A bristle friction model mimics the forces developed between contacting bodies using the tangential displacement ($z$) and shear rate ($\dot{z}$) of viscoelastic bristles to generate friction forces, as shown in Fig. 7 and in Eq. 5:

$$f_{br} = k_{br} z + a_{br}\,\psi(\dot{z}). \tag{5}$$
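The idea behind Eq. 5 can be illustrated with a much simpler one-dimensional model. The sketch below is not the 3D model of [15]: it is a reset-integrator-style stand-in in which the bristle deflection $z$ follows the tangential velocity until it reaches a hypothetical saturation deflection $\mu f_n / k_{br}$, after which the contact slips at the Coulomb level. Taking $\psi(\dot{z}) = \dot{z}$ while sticking is an assumption, as are all names and parameter values.

```python
def bristle_step(z, v, dt, k_br, a_br, mu, f_n):
    """Advance the bristle deflection z by one time step of length dt.

    v is the tangential sliding velocity and f_n the normal load magnitude.
    Returns the new deflection and the bristle force of Eq. 5.
    """
    z_max = mu * f_n / k_br  # deflection at which the bristle starts to slip
    slipping = (v > 0.0 and z >= z_max) or (v < 0.0 and z <= -z_max)
    z_dot = 0.0 if slipping else v
    z_new = max(-z_max, min(z_max, z + z_dot * dt))
    force = k_br * z_new + a_br * z_dot  # psi(z_dot) = z_dot is assumed here
    return z_new, force


# Hypothetical parameters: slide for 0.2 s, then hold the contact still.
K_BR, A_BR, MU, F_N, DT = 1.0e5, 10.0, 0.8, 700.0, 1.0e-3
z = 0.0
for _ in range(200):
    z, sliding_force = bristle_step(z, 0.1, DT, K_BR, A_BR, MU, F_N)
for _ in range(10):
    z, holding_force = bristle_step(z, 0.0, DT, K_BR, A_BR, MU, F_N)
print(sliding_force, holding_force)
```

Unlike a velocity-interpolated Coulomb coefficient, which returns zero force at zero sliding velocity, the deflected bristle keeps carrying load when the contact stops moving. This is the true stiction referred to above.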
Fig. 7. A bristle friction model relies on the state of imaginary bristles on the surface of the contacting bodies to develop friction forces
The function $\psi(\dot{z})$ modulates the friction model from a bristle friction model for slip rates below the Stribeck velocity to a Coulomb friction model for slip rates above the Stribeck velocity. The 3D bristle friction model presented in [15] was employed, but without the dwell-time dependency. The model produced encouraging normal contact force profiles in Fig. 6. Its performance indicated that toes may indeed play an important role in foot contact, and that a Coulomb friction model appears to be inadequate for modelling the tangential forces developed between the foot and the ground. In addition, the optimization routine should be constrained to find solutions that limit the compression of the foot pads to realistic levels.
5.6 Three-Sphere Two Segment Foot Contact Model
The foot contact model shown in Fig. 5b incorporated a toe segment (adding one dof to the model) to improve the normal ground reaction force profile and a bristle friction model to improve the friction force profile; in addition, the parameter tuning routine was restricted to solutions that had plausible compressions at the heel [10] and metatarsal [6] foot pads. Although the compressions seen at the heel and metatarsal contacts were kept to plausible levels, the normal ground reaction force profiles were noticeably degraded relative to the previous model, as shown in Fig. 8. The fast walking trial profiles resemble the experimentally gathered normal ground reaction forces; however, the model performs very poorly for the remainder of the trials, far worse than the previous model. Some insight into why both models produce their best results for the fast walking trials and poor results for the slow walking trials can be obtained by examining the center of pressure (COP) of the subject’s foot in the direction of travel, shown in Fig. 9. The location of the COP has been plotted in Fig. 9 along with dividing lines marking the transitions from the heel to the mid-foot, from the mid-foot to the 1st metatarsal pad and from the 1st metatarsal pad to the toes. Figure 9 clearly shows that the COP spends a significant amount of time in the mid-foot region for the slow trials, which progressively lessens as the walking speed increases. Figure 9 suggests that the models perform poorly for the slow trials because the mid-foot contributes
Fig. 8. The normal force developed between the three-sphere foot-contact model with a toe contact and the ground
Fig. 9. The center of pressure in the direction of travel was segmented for each trial to show where the COP is relative to the foot. As the trials progress from slower to faster gaits, the COP spends less and less time in the mid-foot region. This may explain why the contact models pictured in Fig. 5a and b perform poorly at slow walking speeds: the tissue between the metatarsal and heel contacts is not being modelled
Fig. 10. Experimental and simulated friction force profiles, making use of a bristle friction model [15]
significantly to the normal contact force profile, yet this area of the foot is not being modelled. The previous two-sphere model performed better than the current three-sphere model because the compression of the foot pads was not restricted: the additional compression seen at the metatarsal contact allowed the model to artificially include the contribution of the mid-foot. The results of this model appear to indicate that it is necessary to model mid-foot contact. Accordingly, a candidate foot contact model that includes a mid-foot contact is shown in Fig. 5c. This candidate foot contact model has yet to be implemented and tested. The friction force profile shown in Fig. 10 was much improved over the Coulomb model, though it failed to match all data sets shortly after foot contact and only roughly approximated the tangential forces towards the end of the foot contact. The poor representation of the friction force profiles may not be due to the model. The experimentally measured shear movements might also be too small to measure accurately: skin stretch and Optotrak IRED position error might be drowning the signal.
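The center-of-pressure segmentation behind this argument could be reproduced along the following lines. This is a minimal sketch, assuming uniformly sampled COP positions measured along the foot from the back of the heel; the region boundaries and sample values are hypothetical and are not the values used for Fig. 9.

```python
from bisect import bisect_right

REGIONS = ("heel", "mid-foot", "metatarsal", "toes")


def cop_time_fractions(cop_positions, boundaries):
    """Fraction of stance samples whose COP lies in each foot region.

    `boundaries` holds the three dividing lines (heel/mid-foot,
    mid-foot/metatarsal, metatarsal/toes) in ascending order.
    """
    if not cop_positions:
        raise ValueError("at least one COP sample is required")
    counts = [0] * len(REGIONS)
    for x in cop_positions:
        counts[bisect_right(boundaries, x)] += 1
    return {name: n / len(cop_positions) for name, n in zip(REGIONS, counts)}


# Hypothetical COP positions (m from the heel) over one stance phase.
cop = [0.01, 0.02, 0.08, 0.10, 0.15, 0.16, 0.17, 0.22]
print(cop_time_fractions(cop, boundaries=[0.06, 0.14, 0.20]))
```

Comparing the mid-foot fraction across the slow, medium and fast trials gives a single number per trial for the trend described above.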
6 Conclusions
Multi-step, forward-dynamic human gait simulations do not yet have the fidelity to create precise predictions of how humans would walk in new situations. Peasgood et al.’s [23] system was a first attempt at developing a predictive human gait simulation. Although Peasgood et al.’s system was the first to
show that prosthetic gait has a greater metabolic cost than healthy gait in silico using a forward dynamic simulation, the predicted kinetics of Peasgood et al.’s healthy model were significantly different from published joint kinetics of human gait found using inverse dynamics analysis [26]. A high-fidelity kinetic response is required for high-fidelity gait predictions since metabolic cost is a function of muscle stress and thus joint torque: if the kinetic response of the model is poor, the model will not be able to converge to a human-like gait. The joint torque profiles of the simulated gait are highly influenced by the ground reaction forces applied at the foot. Foot contact models were created using spherical elements and contact forces were calculated using Gonthier et al.’s volumetric contact model. The models were validated by driving the ankle joint through experimentally recorded ankle trajectories and examining the quality of match between the ground reaction forces developed at the simulated foot, and the human foot. Current modelling efforts indicate that it is important to represent the heel, mid-foot, and metatarsal foot pads in the contact model. It is important to note that it may not be possible to develop a 2D foot contact model that perfectly replicates the forces created between a 3D foot and the ground: foot roll in the frontal plane [5] is ignored in 2D. Additionally, a bristle friction model was found to predict foot friction forces better than a Coulomb friction model. The results of the bristle friction model were not ideal: it remains unclear if the friction model is inadequate or if noise in the experimentally collected foot kinematics is making this validation approach difficult. The sensitivity of Peasgood et al.’s balance controller to initial conditions and joint trajectories prevented the gait space from being searched widely. Nearly six out of seven simulations resulted in a fall. 
High-fidelity predictive human gait simulations would benefit greatly from an advanced balance controller that is not sensitive to initial conditions, and is able to initiate, maintain, and terminate gait as adeptly as a human. There is a lot of fundamental research that needs to be conducted before high-fidelity predictive human gait simulations can be developed. The predicted kinetics of a healthy human gait should follow those found using inverse dynamics [26]. Since the kinetics of the leg joints are highly influenced by the ground reaction forces at the foot, a high-fidelity foot contact model is required. Additionally, the model needs an advanced balance controller in order to search the gait space to find a metabolically minimal, or ‘human-like’ gait.
Acknowledgements This research was supported by the Natural Sciences and Engineering Research Council of Canada. The authors would like to thank Mike Peasgood for his assistance with the MSC.Adams model and Mike MacLellan for his assistance with the experiments described in this paper.
References 1. Ackermann M, Schiehlen W (2006) Dynamic analysis of human gait disorder and metabolic cost estimation. Arch Appl Mech 75:569–594 2. Aerts P, Kerr RF, De Clecq D, Illsley DW, McNeil AR (1995) The mechanical properties of the human heel pad: a paradox resolved. J Biomech 28:1299–1308 3. Anderson F, Pandy M (1999) Static and dynamic optimization solutions for gait are practically equivalent. J Biomech 34:153–161 4. Anderson F, Pandy M (2001) Dynamic optimization of human walking. J Biomech Eng 123:381–390 5. Carson MC, Harrington ME, Thompson N, O’Connor JJ, Theologis TN (2001) Kinematic analysis of a multi-segment foot model for research and clinical applications: a repeatability analysis. J Biomech 34:1299–1307 6. Cavagna GA, Heglund NC, Taylor CR (1977) Mechanical work in terrestrial locomotion: two basic mechanisms for minimizing energy expenditure. Amer J Reg Integ Comp Phys 233(5):243–261 7. Cavanagh P (1999) Plantar soft tissue thickness during ground contact in walking. J Biomech 32:623–628 8. Crook AW (1952) A study of some impacts between metal bodies by a piezoelectric method. Proc Roy Soc Lond, Ser A 212:377–390 9. DynaFlexPro http://www.maplesoft.com/dynaflexpro. Accessed May 2, 2008 10. Gefen A, Megido-Ravid M, Itzchak Y (2001) In vivo biomechanical behavior of the human heel pad during the stance phase of gait. J Biomech 34:1661–1665 11. Gerritsen KGM, Bogert AV, Nigg BM (1995) Direct dynamics simulation of the impact phase in heel-toe running. J Biomech 28:661–668 12. Gilchrist L, Winter D (1996) A two-part viscoelastic foot model for use in gait simulations. J Biomech 29(6):795–798 13. Gilchrist L, Winter D (1997) A multisegment computer simulation of normal human gait. IEEE Trans Rehab Eng 5(4):290–299 14. Goldsmith W (1960) Impact: the theory and physical behavior of contacting solids. Edward Arnold, London 15. 
Gonthier Y, McPhee J, Piedboeuf J, Lange C (2004) A regularized contact model with asymmetric damping and dwell-time dependent friction. Mult Syst Dyn 11:209–233 16. Gonthier Y, McPhee J, Lange C, Piedboeuf JC (2007) On the implementation of Coulomb friction in a volumetric-based model for contact dynamics. In: Proceedings of ASME IDETC, Las Vegas, NV, USA 17. Guler H, Berme N, Simon S (1998) A viscoelastic sphere model for the representation of plantar soft tissue during simulations. J Biomech 31:847–853 18. Hunt K, Crossley F (1975) Coefficient of restitution interpreted as damping in vibroimpact. Trans ASME J App Mech 42(E):440–445 19. Kinoshita H, Francis PR, Murase T, Kawai S, Ogawa T (1996) The mechanical properties of the heel pad in elderly adults. Eur J Appl Phys 73:404–409 20. Lewis R, Torczon V (1999) Pattern search algorithms for bound constrained minimization. SIAM J Optim 9(4):264–269 21. MSC.Adams http://www.mscsoftware.com/products/adams.cfm. Accessed May 2, 2008 22. MSC Software 2005r2 Contact. In: Adams/Solver Fortran help 23. Peasgood M, Kubica E, McPhee J (2007) Stabilization and energy optimization of a dynamic walking gait simulation. ASME J Comp Nonl Dyn 2:65–72
24. Taga G (1995) A model of the neuro-musculo-skeletal system for human locomotion. Biol Cybern 73:97–111 25. Valiant G (1984) A determination of the mechanical characteristics of the human heel pad in vivo. Ph.D. thesis, The Pennsylvania State University, State College, PA 26. Winter D (2005) Biomechanics and motor control of human movement. Wiley, Hoboken, NJ, 3rd edition 27. Wojtyra M (2003) Multibody simulation model of human walking. Mech Based Design Struct Mach 31(3):357–377 28. Zarrugh MY, Todd FN, Ralston HJ (1974) Optimization of energy expenditure during level walking. Eur J App Phys 33:293–306
A Fast NCP Solver for Large Rigid-Body Problems with Contacts, Friction, and Joints
Alessandro Tasora1 and Mihai Anitescu2
1 Università degli Studi di Parma, Dipartimento di Ingegneria Industriale, 43100 Parma, Italy. E-mail: [email protected]
2 Mathematics and Computer Science Division, Argonne National Laboratory, 9700 South Cass Avenue, Argonne, IL 60439, USA. E-mail: [email protected]
Summary. The simulation of multibody systems with rigid contacts entails the solution of nonsmooth equations of motion. The dynamics is nonsmooth because of the discontinuous nature of noninterpenetration, collision, and adhesion constraints. We propose a solver that is able to handle the simulation of multibody systems of vast complexity, with more than 100,000 colliding rigid bodies. The huge number of nonsmooth constraints arising from unilateral contacts with friction gives rise to a nonlinear complementarity problem (NCP), which we solve by means of a high-performance iterative method. The method has been implemented as a high-performance software library, written in C++. Complex simulation scenarios involving thousands of moving parts have been extensively tested, showing a remarkable performance of the numerical scheme compared to other algorithms.
1 Introduction
When simulating nonsmooth systems affected by discontinuities, such as those imposed by frictionless or frictional contacts, the direct application of numerical methods for ordinary differential equations and differential algebraic equations can be hard, if not impossible [10]. Some naive numerical approaches try to circumvent the complexity of nonsmooth dynamics by introducing a smooth and stiff approximation of the problem, at the cost of requiring prohibitively small time steps to achieve stability. Other approaches try to solve the nonsmooth integral by piecewise integration, but such methods cannot be applied to systems with a large number of contacts because stop-and-restart schemes can easily enter infinite or ill-posed loops. These limitations motivate our research on innovative methods that deal directly with the discontinuities of the equations of motion. Unlike the case of smooth dynamics, where for the past decade efficient numerical methods have been able to compute solutions in linear time [12], the case of nonsmooth dynamics introduces many numerical difficulties that are still under active and fertile investigation in the mathematical community. Among the various potential applications of nonsmooth dynamics are granular flows, rock soil dynamics, refueling of pebble-bed nuclear reactors, interaction between robots and large environments, and real-time simulations for augmented reality, all problems that are challenging or still impossible to simulate because of the severe computational requirements.
We formulate the problem in terms of differential inclusions over the speed-impulse space [9], a fertile approach that expresses the problem as a differential complementarity problem (DCP), requiring the solution of a single NCP per time step [4]. Posing the problem in terms of speeds and impulses rather than in terms of accelerations and reaction forces has various advantages. However, here we do not enter into the details of the integration scheme, except to mention that our focus is a recently proposed optimization-based scheme [5], for which details will be presented as needed for setting up the NCPs. For additional information about such schemes the interested reader can examine the bibliography [11]. Since the main bottleneck is by far the NCP, which must be solved at each step, we will focus on this part. The difficulty of this process rests with the fact that the number of complementarity constraints increases with the number of contacts and collisions, leading to higher and higher computational overhead [13].
Previous efforts in solving NCPs arising in nonsmooth dynamics were based on approximating them by Linear Complementarity Problems (LCPs), since solvers for LCPs are available, such as the simplex-based pivoting methods of Lemke or Dantzig; nonetheless, these pivoting methods suffer from exponential growth in computational cost and cannot practically handle systems with more than a few hundred contacts [14]. A more promising way to handle NCPs, followed in this work, is to use iterative methods, in particular the projected SOR and projected Gauss-Seidel methods [7, 8]. Although the idea of solving frictional problems by means of iterative schemes with projections is not new [1, 16], our iterative method is based on an improved projected block symmetric overrelaxation scheme that aims at the best computational performance when applied to a large number of contacts with friction. With some requirements on the spectral radius of the iteration matrix, and since the scheme from [5] results in convex subproblems, we can prove that, when applied to our formulation, the iteration is a fixed-point contraction that converges monotonically to the NCP solution [6].
2 Implementation
Let us introduce the position vector $q \in \mathbb{R}^{n_q}$, the speed vector $\dot{q} \in \mathbb{R}^{n_q}$, the mass matrix $[M] \in \mathbb{R}^{n_q \times n_q}$, and the sum of external applied forces and inertial forces $f_t(t^{(l)}, q^{(l)}, \dot{q}^{(l)}) \in \mathbb{R}^{n_q}$.
The vector of Lagrangian multipliers, representing the unknown impulses in constraints and in contact points, is the vector $\gamma \in \mathbb{R}^{n_c}$, which can be partitioned into $p$ subvectors $\gamma_i \in \mathbb{R}^{n_i}$ such that $\gamma = \{\gamma_1^T, \gamma_2^T, \ldots, \gamma_p^T\}^T$.
In detail, for each bilateral scalar constraint there is a corresponding one-dimensional impulse $\gamma_i \in \mathbb{R}^1$, $\forall i \in G_C$, along with the corresponding bilateral constraint equation $C_i(q,t) = 0 \in \mathbb{R}^1$ and Jacobian $[C_q(q,t)]_i = \nabla_q C_i(q,t)^T \in \mathbb{R}^{n_q \times 1}$.
Similarly, for each unilateral constraint we have a corresponding one-dimensional impulse $\gamma_i \in \mathbb{R}^1$, $\forall i \in G_D$, along with the corresponding unilateral constraint inequality $D_i(q,t) \ge 0 \in \mathbb{R}^1$ and Jacobian $[D_q(q,t)]_i = \nabla_q D_i(q,t)^T \in \mathbb{R}^{n_q \times 1}$.
If we also introduce contact constraints with nonlinear friction cones in three-dimensional space, the $\gamma$ vector includes the vectors of unknown normal and tangential impulses, that is, the three-dimensional $\gamma_i \in \mathbb{R}^3$, $\forall i \in G_F$, along with the corresponding equations $F_i(q) = 0 \in \mathbb{R}^3$ and Jacobian $[F_q(q)]_i = \nabla_q F_i(q,t)^T \in \mathbb{R}^{n_q \times 3}$. These vectors and matrices have a block structure corresponding to the three components $n, u, v$ (normal, u-tangential, v-tangential), that is, $\gamma_i = \{\gamma_{i,n}, \gamma_{i,u}, \gamma_{i,v}\}^T$ and $[F_q]_i = \left[\, [F_q]_{i,n}^T \,|\, [F_q]_{i,u}^T \,|\, [F_q]_{i,v}^T \,\right]$. Each contact point has a friction coefficient $\mu_i$, $\forall i \in G_F$.
With a time step $h$ and a stabilization coefficient $0 < K < 1$, we expand the original first-order DCP formulation of [4] as follows:
\begin{align}
[M](\dot{q}^{(l+1)} - \dot{q}^{(l)}) &= \sum_{i \in G_C} [C_q]_i^T \gamma_i + \sum_{i \in G_D} [D_q]_i^T \gamma_i + \sum_{i \in G_F} [F_q]_i^T \gamma_i + h f_t(t^{(l)}, q^{(l)}, \dot{q}^{(l)}), \tag{1a} \\
0 &= \frac{K}{h} C_i(q^{(l)}, \dot{q}^{(l)}, t) + \frac{\partial C_i}{\partial t} + [C_q]_i \dot{q}^{(l+1)}, \quad i \in G_C, \tag{1b} \\
0 &\le \frac{K}{h} D_i(q^{(l)}) + [D_q]_i \dot{q}^{(l+1)} \;\perp\; \gamma_i \ge 0, \quad i \in G_D, \tag{1c} \\
0 &\le \frac{K}{h} F_{i,n}(q^{(l)}) + [F_q]_{i,n} \dot{q}^{(l+1)} \;\perp\; \gamma_{i,n} \ge 0, \quad i \in G_F, \tag{1d} \\
(\gamma_{i,u}, \gamma_{i,v})_{i \in G_F} &= \operatorname*{argmin}_{\mu_i \gamma_{i,n} \ge \sqrt{(\gamma_{i,u})^2 + (\gamma_{i,v})^2}} \; \dot{q}^T \left( \gamma_{i,u} [F_q]_{i,u}^T + \gamma_{i,v} [F_q]_{i,v}^T \right), \tag{1e} \\
q^{(l+1)} - q^{(l)} &= h \dot{q}^{(l+1)}. \tag{1f}
\end{align}
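To make the role of Eqs. (1a), (1c) and (1f) concrete, the following minimal sketch applies them to a single particle falling onto a flat floor, a case in which the complementarity condition can be resolved in closed form. The function names `step_particle` and `simulate_drop`, the numerical values and the choice K = 0.5 are assumptions made only for illustration; this is not the solver described in this chapter.

```python
def step_particle(y, v, h, mass=1.0, gravity=9.81, K=0.5):
    """Advance a 1D particle above a floor at y = 0 by one time step h.

    Hypothetical illustration of Eqs. (1a), (1c), (1f) with a single
    unilateral constraint D(q) = y >= 0 and Jacobian [D_q] = 1.
    """
    # Speed that would result without any contact impulse (gamma = 0).
    v_free = v - h * gravity
    # Eq. (1c): 0 <= (K/h) * y + v_new, complementary to gamma >= 0.
    # Either the free speed already satisfies the inequality (gamma = 0),
    # or the inequality is active and fixes v_new = -(K/h) * y.
    v_new = max(v_free, -(K / h) * y)
    # Eq. (1a): mass * (v_new - v) = gamma - h * mass * gravity.
    gamma = mass * (v_new - v_free)
    # Eq. (1f): position update with the end-of-step speed.
    y_new = y + h * v_new
    return y_new, v_new, gamma


def simulate_drop(y0=1.0, h=0.01, steps=1000):
    """Drop the particle from rest at height y0 and record every step."""
    y, v = y0, 0.0
    history = []
    for _ in range(steps):
        y, v, gamma = step_particle(y, v, h)
        history.append((y, v, gamma))
    return history
```

With these assumed values the particle does not penetrate the floor, and once it is at rest the impulse per step approaches h·m·g, that is, the impulse of the weight over one time step.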
The DCP above includes the complementarity constraints (1c) and (1d), in which at least one of the two inequalities must hold as an equality at any time. An alternative way to write a complementarity constraint $a \ge 0 \perp b \ge 0$ is by inner product: $a \ge 0$, $b \ge 0$, $\langle a, b \rangle = 0$. The extremely nonlinear nature of complementarity conditions is the main cause of computational overhead. Moreover, the introduction of the original Coulomb model in the DCP leads to an NCP that can be nonconvex under some circumstances [3]. A possible workaround is to relax the original Coulomb assumptions and to replace the complementarity Eq. (1d) with the following:
\[
0 \le \frac{K}{h} F_{i,n}(q^{(l)}) + [F_q]_{i,n} \dot{q}^{(l+1)} - \mu_i \sqrt{([F_q]_{i,u} \dot{q})^2 + ([F_q]_{i,v} \dot{q})^2} \;\perp\; \gamma_{i,n} \ge 0, \quad i \in G_F. \tag{2}
\]
This modification to the original scheme has negligible effects for small values of $\dot{q}$ or $\mu$, converges to the same class of weak solutions as the original scheme described in [2, 11] for $h \to 0$, and has the positive effect of making the NCP always convex [5]. The resulting NCP is also a convex cone complementarity problem (CCP), hence a special case of a convex variational inequality (VI).
For a more compact notation, we assume that constraints are ordered such that the indices in $G_C$ precede those in $G_D$, which in turn precede those in $G_F$. Hence, we can introduce the following vectors and matrices:
\begin{align}
\gamma_E &= \{\gamma_C^T \,|\, \gamma_D^T \,|\, \gamma_F^T\}^T \in \mathbb{R}^{n_c}, \tag{3a} \\
[E_q] &= \left[\, [C_q]^T \,|\, [D_q]^T \,|\, [F_q]^T \,\right]^T \in \mathbb{R}^{n_q \times n_c}, \tag{3b} \\
b_E &= \{b_C^T \,|\, b_D^T \,|\, b_F^T\}^T \in \mathbb{R}^{n_c}, \tag{3c}
\end{align}
where, if we assume $m_C$ bilateral constraints, $m_D$ unilateral constraints, and $m_F$ frictional contacts, we have the following:
\begin{align}
b_C &= \left\{ \frac{K}{h} C_1 + \frac{\partial C_1}{\partial t}, \; \frac{K}{h} C_2 + \frac{\partial C_2}{\partial t}, \; \ldots, \; \frac{K}{h} C_{m_C} + \frac{\partial C_{m_C}}{\partial t} \right\}^T, \tag{4a} \\
b_D &= \left\{ \frac{K}{h} D_1, \; \frac{K}{h} D_2, \; \ldots, \; \frac{K}{h} D_{m_D} \right\}^T, \tag{4b} \\
b_F &= \left\{ \frac{K}{h} F_{1,n}, 0, 0, \; \frac{K}{h} F_{2,n}, 0, 0, \; \ldots, \; \frac{K}{h} F_{m_F,n}, 0, 0 \right\}^T. \tag{4c}
\end{align}
We now introduce $k = [M]\dot{q}^{(l)} + h f_t$, $[N] = [E_q][M]^{-1}[E_q]^T$ and $r = [E_q][M]^{-1} k + b_E$. We also introduce the convex cones $\Upsilon_i$, defined as follows:
\[
\Upsilon_i = \begin{cases}
\mathbb{R}, & i \in G_C, \\
\mathbb{R}^+, & i \in G_D, \\
\left\{ (\gamma_n, \gamma_u, \gamma_v) \in \mathbb{R}^3 \;|\; \mu_i \gamma_n \ge \sqrt{\gamma_u^2 + \gamma_v^2} \right\}, & i \in G_F.
\end{cases} \tag{5}
\]
Their polar cones $\Lambda_i = \Upsilon_i^\circ$, where we denote by $\circ$ the polar set, can be immediately calculated as
\[
-\Lambda_i = \begin{cases}
\{0\}, & i \in G_C, \\
\mathbb{R}^+, & i \in G_D, \\
\left\{ (s_n, s_u, s_v) \in \mathbb{R}^3 \;|\; s_n \ge \mu_i \sqrt{s_u^2 + s_v^2} \right\}, & i \in G_F.
\end{cases} \tag{6}
\]
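Denoting the total index set by $G = G_C \cup G_D \cup G_F$ and the direct sums of these cones by $\Upsilon = \bigoplus_{i \in G} \Upsilon_i$ and $\Lambda = \bigoplus_{i \in G} \Lambda_i = \Upsilon^\circ$, and using Eqs. (1a) to (6), we obtain a canonical CCP (for full details, see [6]):
\[
([N]\gamma_E + r) \in -\Lambda, \quad \gamma_E \in \Upsilon, \quad \langle [N]\gamma_E + r, \, \gamma_E \rangle = 0, \tag{7}
\]
which can be solved for the unknowns $\gamma_E$. One can then obtain the unknown speeds with the simple formula
\[
\dot{q} = [M]^{-1}(k + [E_q]^T \gamma_E). \tag{8}
\]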
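As a small illustration of Eqs. (7) and (8), the sketch below assembles $[N]$ and $r$ explicitly for a toy problem and checks the three CCP conditions for a candidate $\gamma_E$. It assumes a diagonal mass matrix and unilateral scalar constraints only (cones $\mathbb{R}^+$), and it stores the constraint Jacobian with one row per constraint; the function names and the two-body example are hypothetical and are not taken from the authors' library.

```python
def assemble_ccp(mass_diag, jac_rows, k, b_e):
    """Return dense [N] = E M^-1 E^T and r = E M^-1 k + b_E.

    mass_diag: diagonal of [M]; jac_rows: one Jacobian row per constraint.
    """
    nq = len(mass_diag)
    # Jacobian rows scaled by the inverse of the diagonal mass matrix.
    scaled = [[row[j] / mass_diag[j] for j in range(nq)] for row in jac_rows]
    n_mat = [[sum(a[j] * s[j] for j in range(nq)) for s in scaled]
             for a in jac_rows]
    r_vec = [sum(s[j] * k[j] for j in range(nq)) + b
             for s, b in zip(scaled, b_e)]
    return n_mat, r_vec


def ccp_residuals(n_mat, r_vec, gamma):
    """Return (min of N*gamma + r, min of gamma, |<N*gamma + r, gamma>|)."""
    slack = [sum(n * g for n, g in zip(row, gamma)) + r
             for row, r in zip(n_mat, r_vec)]
    gap = abs(sum(s * g for s, g in zip(slack, gamma)))
    return min(slack), min(gamma), gap


def recover_speeds(mass_diag, jac_rows, k, gamma):
    """Eq. (8): qdot = M^-1 (k + E^T gamma)."""
    nq = len(mass_diag)
    return [(k[j] + sum(row[j] * g for row, g in zip(jac_rows, gamma)))
            / mass_diag[j] for j in range(nq)]


def stacked_bodies_example(h=0.01, gravity=9.81):
    """Two unit masses at rest, stacked on a floor (vertical motion only)."""
    mass_diag = [1.0, 1.0]
    jac_rows = [[1.0, 0.0],    # floor / lower body contact
                [-1.0, 1.0]]   # lower body / upper body contact
    k = [-h * gravity, -h * gravity]  # k = M*qdot + h*f with qdot = 0
    b_e = [0.0, 0.0]                  # zero gaps, so no stabilization term
    return mass_diag, jac_rows, k, b_e
```

For this example the impulses that hold both bodies at rest are $\gamma_E = \{2hg, hg\}^T$; passing them to `ccp_residuals` gives a nonnegative slack, nonnegative impulses and a vanishing inner product, and `recover_speeds` returns zero speeds.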
To solve Eq. (7), which represents the most relevant computational burden, we propose an iterative scheme. We make the following assumptions (the iteration index $r$ used as a superscript should not be confused with the vector $r$ of Eq. (7)):
A1 Matrix $[N]$ of the NCP is symmetric and positive semidefinite.
A2 There exists a positive number $\alpha > 0$ such that, at any iteration $r$, $r = 0, 1, 2, \ldots$, we have that $[B^r] \succ \alpha I$.
A3 There exists a positive number $\beta > 0$ such that, at any iteration $r$, $r = 0, 1, 2, \ldots$, we have that $(\gamma^{r+1} - \gamma^r)^T \left( (\lambda \omega [B^r])^{-1} - \tfrac{1}{2}[N] \right) (\gamma^{r+1} - \gamma^r) \ge \beta \left\| \gamma^{r+1} - \gamma^r \right\|^2$.
With these assumptions, we can prove the convergence to the NCP solution by means of the fixed-point iteration
\[
\gamma_E^{r+1} = \Pi_\Upsilon \left( \gamma_E^r - \omega [B^r] \left( [N]\gamma_E^r + r \right) \right), \tag{9}
\]
where we introduced the nonsmooth projection mapping $\Pi_\Upsilon(\cdot) : \mathbb{R}^{n_c} \to \mathbb{R}^{n_c}$ onto the convex cone $\Upsilon$. See [6] for a detailed proof. For the iteration matrix $[B^r]$ we use the following block-diagonal structure:
\[
[B^r] = \begin{bmatrix}
\eta_1 I_{n_1} & 0 & \cdots & 0 \\
0 & \eta_2 I_{n_2} & \cdots & 0 \\
\vdots & \vdots & \ddots & \vdots \\
0 & 0 & \cdots & \eta_p I_{n_p}
\end{bmatrix}. \tag{10}
\]
The complete projection operator $\Pi_\Upsilon : \mathbb{R}^{n_c} \to \mathbb{R}^{n_c}$ can be expressed as $\Pi_\Upsilon = \{\Pi_{\Upsilon_1}(\gamma_1)^T, \ldots, \Pi_{\Upsilon_p}(\gamma_p)^T\}^T$. Each single projection operator $\Pi_{\Upsilon_i}(\cdot)$ behaves as $\Pi_{\Upsilon_i}(\gamma_i) = \operatorname{argmin}_{\zeta \in \Upsilon_i} \|\gamma_i - \zeta\|$, so that it is globally nonexpansive, being a projection onto the convex set $\Upsilon_i$. For the multipliers introduced by friction constraints, it is enough to introduce a straightforward mapping $\Pi_{\Upsilon_i}(\cdot) : \mathbb{R}^3 \to \mathbb{R}^3$, $i \in G_F$, to be applied to each of the triplets of contact multipliers $\gamma_i$, $i \in G_F$. Such a projection can be implemented as depicted in Fig. 1, where three subcases are detected:
Fig. 1. Nonexpansive projection on the ith convex friction cone
• When $\gamma_i$ is inside the $\Upsilon_i$ cone, the vector is left untouched.
• When $\gamma_i$ is inside the polar cone $\Upsilon_i^\circ$, it maps to the origin $\{0, 0, 0\}$.
• Otherwise $\gamma_i$ is projected to the nearest point on the surface of the $\Upsilon_i$ cone.
One can verify that such a mapping satisfies $\|\Pi_\Upsilon(a) - \Pi_\Upsilon(b)\| \le \|a - b\|$; hence the nonexpansive property holds. This orthogonal projection onto the surface of the cone can be obtained by means of a few operations. First we compute $\gamma_r = \sqrt{\gamma_u^2 + \gamma_v^2}$. The reaction is inside the friction cone for $\gamma_r < \mu \gamma_n$ and inside the polar cone for $\gamma_r < -\frac{1}{\mu} \gamma_n$. For the remaining case, applying the Pythagorean theorem to triangle OCA and the similarity between triangles OCA and OCD, we get
\[
\Pi_n = \frac{\gamma_r \mu + \gamma_n}{\mu^2 + 1}, \quad \Pi_r = \mu \Pi_n, \quad \Pi_u = \gamma_u \frac{\Pi_r}{\gamma_r}, \quad \Pi_v = \gamma_v \frac{\Pi_r}{\gamma_r}. \tag{11}
\]
The projective operator, modified for the complete case including bilateral constraints $C$, generic unilateral constraints $D$, and frictional contacts $F$, becomes the following nonsmooth mapping:
\[
\Pi_\Upsilon : \begin{cases}
\Pi_i = \gamma_i, & \forall i \in G_C, \\
\Pi_i = \gamma_i, & \forall i \in G_D \wedge \gamma_i > 0, \\
\Pi_i = 0, & \forall i \in G_D \wedge \gamma_i \le 0, \\
\Pi_i = \gamma_i, & \forall i \in G_F \wedge \gamma_r < \mu_i \gamma_n, \\
\Pi_i = \{0, 0, 0\}^T, & \forall i \in G_F \wedge \gamma_r < -\frac{1}{\mu_i} \gamma_n, \\
\Pi_{i,n} = \dfrac{\gamma_r \mu_i + \gamma_n}{\mu_i^2 + 1}, \;\; \Pi_{i,u} = \gamma_u \dfrac{\mu_i \Pi_{i,n}}{\gamma_r}, \;\; \Pi_{i,v} = \gamma_v \dfrac{\mu_i \Pi_{i,n}}{\gamma_r}, & \forall i \in G_F \wedge \gamma_r > \mu_i \gamma_n \wedge \gamma_r > -\frac{1}{\mu_i} \gamma_n.
\end{cases} \tag{12}
\]
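The three subcases of the friction-cone projection translate almost directly into code. The following is a minimal sketch of Eqs. (11) and (12) for a single contact, assuming a strictly positive friction coefficient; the function name `project_friction_cone` and its argument order are hypothetical and do not reflect the interface of the authors' C++ library.

```python
import math


def project_friction_cone(gamma_n, gamma_u, gamma_v, mu):
    """Project (gamma_n, gamma_u, gamma_v) onto the cone mu*gamma_n >= gamma_r.

    Sketch of the three subcases of Eq. (12) for one frictional contact,
    assuming mu > 0.
    """
    gamma_r = math.hypot(gamma_u, gamma_v)  # tangential magnitude
    if gamma_r < mu * gamma_n:
        # Inside the friction cone: left untouched.
        return (gamma_n, gamma_u, gamma_v)
    if gamma_r < -gamma_n / mu:
        # Inside the polar cone: mapped to the origin.
        return (0.0, 0.0, 0.0)
    # Remaining case: nearest point on the cone surface, Eq. (11).
    pi_n = (gamma_r * mu + gamma_n) / (mu * mu + 1.0)
    pi_r = mu * pi_n
    if gamma_r == 0.0:
        # Only reachable at the apex (gamma_n == 0); avoids dividing by zero.
        return (pi_n, 0.0, 0.0)
    return (pi_n, gamma_u * pi_r / gamma_r, gamma_v * pi_r / gamma_r)
```

For instance, with $\mu = 1$ the purely tangential impulse $(0, 1, 0)$ lies outside both cones and is mapped to $(0.5, 0.5, 0)$, which sits on the cone surface.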
Iterations can be computed without storing the $[N]$ matrix explicitly: only the sparse Jacobians of the constraints are stored, hence resulting in strictly O(n) space requirements. To this end, we have developed efficient data structures and sparsity-preserving multiplication algorithms. Below we present the final algorithm with some improvements, most notably the addition of an acceleration parameter $\lambda$ and the implementation of immediate symmetric updates as in SSOR fixed-point methods. For reasons of space, we omit the simplifications that allow us to compute the iteration without explicitly building the $[N]$ matrix, and we do not discuss the details of how to choose the $\eta$ values.
ALGORITHM
1. For $i = 1, 2, \ldots, p$ compute the matrices $s_i = [M]^{-1}[E_q]_i^T$ and the matrices $g_i = [E_q]_i s_i$ (the latter being simple scalars for $i \in G_C \cup G_D$).
2. For $i \in G_F$, compute $\eta_i = \dfrac{3}{\mathrm{Trace}(g_i)}$.
3. For $i \in G_C \cup G_D$, compute $\eta_i = \dfrac{1}{g_i}$.
4. If warm starting with some initial guess $\gamma^*$, initialize reactions as $\gamma^0 = \gamma^*$, otherwise $\gamma^0 = 0$. Initialize speeds: $\dot{q} = \sum_{i=1}^{p} s_i \gamma_i^0 + [M]^{-1} k$.
5. For $i = 1, 2, \ldots, p$, perform the updates $\delta_i^r = \gamma_i^r - \omega \eta_i ([E_q]_i \dot{q}^r + b_i)$; $\gamma_i^{r+1} = \lambda \Pi_\Upsilon(\delta_i^r) + (1 - \lambda)\gamma_i^r$; $\Delta\gamma_i^{r+1} = \gamma_i^{r+1} - \gamma_i^r$; $\dot{q} := \dot{q} + s_i \Delta\gamma_i^{r+1}$.
6. $r := r + 1$.
7. For $i = p, \ldots, 2, 1$, optionally perform the symmetric updates $\delta_i^r = \gamma_i^r - \omega \eta_i ([E_q]_i \dot{q}^r + b_i)$; $\gamma_i^{r+1} = \lambda \Pi_\Upsilon(\delta_i^r) + (1 - \lambda)\gamma_i^r$; $\Delta\gamma_i^{r+1} = \gamma_i^{r+1} - \gamma_i^r$; $\dot{q} := \dot{q} + s_i \Delta\gamma_i^{r+1}$.
8. $r := r + 1$.
9. Repeat the loop from step 5 until convergence or until the maximum number of iterations is reached.
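The steps of the algorithm can be sketched in a few lines for the simplest setting. The code below is a minimal illustration, not the authors' implementation: it assumes a diagonal mass matrix, scalar constraints only (bilateral or unilateral, so the projection is either the identity or a clamp at zero), a cold start, and a stopping test on the largest impulse change. The names `solve_scalar_ccp`, `max_iters` and `tol` are hypothetical.

```python
def solve_scalar_ccp(mass_diag, jac_rows, b, k, unilateral,
                     omega=1.0, lam=1.0, max_iters=200, tol=1e-14,
                     symmetric=False):
    """Matrix-free projected iteration for scalar constraints.

    mass_diag: diagonal of [M]; jac_rows: one Jacobian row per constraint;
    unilateral[i] is True for i in G_D and False for i in G_C.
    Returns the impulses gamma and the end-of-step speeds qdot.
    """
    nq, p = len(mass_diag), len(jac_rows)
    # Step 1: s_i = M^-1 E_i^T and g_i = E_i s_i (scalars here).
    s = [[row[j] / mass_diag[j] for j in range(nq)] for row in jac_rows]
    # Step 3: eta_i = 1 / g_i.
    eta = [1.0 / sum(row[j] * s_i[j] for j in range(nq))
           for row, s_i in zip(jac_rows, s)]
    # Step 4: cold start, so the speeds reduce to M^-1 k.
    gamma = [0.0] * p
    qdot = [k[j] / mass_diag[j] for j in range(nq)]

    def update(i):
        # Steps 5 and 7: single-constraint update with immediate speed update.
        residual = sum(jac_rows[i][j] * qdot[j] for j in range(nq)) + b[i]
        delta = gamma[i] - omega * eta[i] * residual
        projected = max(0.0, delta) if unilateral[i] else delta
        new_gamma = lam * projected + (1.0 - lam) * gamma[i]
        d_gamma = new_gamma - gamma[i]
        gamma[i] = new_gamma
        for j in range(nq):
            qdot[j] += s[i][j] * d_gamma
        return abs(d_gamma)

    for _ in range(max_iters):
        change = max(update(i) for i in range(p))
        if symmetric:
            change = max(change, max(update(i) for i in reversed(range(p))))
        if change < tol:
            break
    return gamma, qdot
```

Applied to two unit masses stacked at rest on a floor (Jacobian rows `[1, 0]` and `[-1, 1]`, `k = [-h*g, -h*g]`, `b = [0, 0]`), the iteration approaches the impulses `[2*h*g, h*g]` and zero speeds, without the product $[E_q][M]^{-1}[E_q]^T$ ever being formed.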
3 Examples In this section, we present results from the application of our algorithm to two examples: a small-scale spider robot simulation example and a large-scale granular flow example. 3.1 Spider Robot A walking robot has been modeled and simulated (see Fig. 2). The robot has six legs, each being actuated by two servomotors, for a total of 12 actuators.
Fig. 2. Example: frames from a real-time simulation of a walking robot with 37 articulated parts, 12 motors, and frictional contacts between legs and floor
Multiple linkages and revolute joints are used to constrain the gait motion of the leg, hence leading to a total of 37 moving parts. Each leg can collide with the environment. In detail, the feet collide with the ground in an intermittent pattern, introducing nonsmooth phenomena that must be handled by the solver. The rigid contacts are affected by a friction coefficient µ = 0.6. Since this numerical approach does not introduce artificial stiffness into the system (unlike approaches that use spring-dashpot regularization at the point of contact), large integration time steps are allowed: we use h = 0.01 s. Although 50 iterations were used for each NCP, the CPU time for computing each time step is about 1 ms on a 2 GHz laptop. That is, the simulation is almost ten times faster than the real-time requirement. For simpler systems (pendulum, four-bar linkages, etc.) where a smaller number of constraints are used and a lower number of iterations can meet the tolerance requirements, the simulation can run almost 100 times faster than real time.
3.2 Granular Flow
As a benchmark for measuring the performance of the solver, we used the granular flow simulation depicted in Fig. 3. A box is filled with a cascade of small spheres, which fill the inner space and slowly flow out from a lateral opening. All spheres are subject to rigid contacts with friction. This benchmark represents one of the most critical scenarios because, in general, the stability of the integration and the speed of convergence of the solver are both affected negatively by dense packing of objects. Multiple tests were performed with an increasing number of spheres, up to 50,000 colliding objects. The solver is able to handle problems with more than two million unknowns. On average, the CPU time for a single time step, with 120 iterations and 50,000 objects, is 18 s.
Fig. 3. Stress test: granular flow of spheres falling into a shaker. The figure shows the case with 9,000 spheres. A system of more than 100,000 complementarity constraints is solved at each step
Note that the large number of iterations is mandatory if a strict tolerance is required: in this benchmark, the error of interpenetration between the colliding geometries does not exceed 0.002d, with d being the radius of the spheres. If a less precise solution were tolerated, a smaller number of iterations could be used, hence improving the CPU performance even further. As a side note, we remark that the NCP solver acts on potential contacts fed by a collision-detection algorithm. This algorithm should be as efficient as possible; otherwise the time spent in computing the points of contacts will be comparable to the time used for solving the NCP problem.1
4 Conclusions We have developed a fast computational method to handle multibody systems encompassing nonsmooth phenomena, such as those caused by contacts and friction. The method has been implemented as a C++ API in the Chrono::Engine middleware and extensively tested with complex simulation scenarios [15]. Benchmarks and comparisons show the superior performance of the method compared to other algorithms. Our algorithm can run in O(n) space and time. Thus it can simulate in real-time even the most complex mechanical systems.
1 Although this case of contact between spheres is trivial as far as the complexity of the contact features is concerned, we stress the importance of implementing a fast and reliable collision-detection engine that also handles more general cases. For example, our collision engine is able to compute collisions between generic compound shapes, convex or not, and exploits a sweep-and-prune broad phase that avoids superlinear algorithmic complexity.
Acknowledgements Mihai Anitescu was supported by the Mathematical, Information, and Computational Sciences Division subprogram of the Office of Advanced Scientific Computing Research, Office of Science, U.S. Department of Energy, under Contract DE-AC02-06CH11357.
References 1. Alart P, Curnier A (1991) A mixed formulation for frictional contact problems prone to Newton-like solution methods. Comput Meth Appl Mech Eng 92:353– 375 2. Anitescu M, Potra FA (1997) Formulating dynamic multi-rigid-body contact problems with friction as solvable Linear Complementarity Problems. Nonlinear Dyn 14:231–247 3. Anitescu M, Hart GD (2003) Solving nonconvex problems of multibody dynamics with contact and small friction by sequential convex relaxation. Mech Based Design Mach Struct 31(3):335–356 4. Anitescu M, Hart GD (2004) A constraint-stabilized time-stepping approach for rigid multibody dynamics with joints, contact and friction. Int J Numer Meth Eng 60(14):2335–2371 5. Anitescu M (2004) Optimization-based simulation of nonsmooth dynamics. Math Progr 105(1):113–143 6. Anitescu M, Tasora A (2007) An iterative approach for cone complementarity problems for nonsmooth dynamics. Argonne National Laboratory Preprint ANL/MCS-P1413-0507 7. Kocvara M, Zowe J (1994) An iterative two-step algorithm for linear complementarity problems. Numer Math 68:95–107 8. Mangasarian OL (1977) Solution of symmetric linear complementarity problems by iterative methods. J Optim Theory Appl 22:465–485 9. Monteiro-Marques MDP (1993) Differential inclusions in nonsmooth mechanical problems: Shocks and dry friction. Progress in Nonlinear Differential Equations and Their Applications, vol 9. Birkhauser Verlag, Berlin 10. Pfeiffer F, Glocker C (1996) Multibody Dynamics with Unilateral Contacts. Wiley, New York 11. Stewart E (1998) Convergence of a time-stepping scheme for rigid body dynamics and resolution of Painleve’s problems. Arch Rat Mech Anal 145(3):215–260 12. Tasora A (2001) An optimized Lagrangian-multiplier approach for interactive multibody simulation in kinematic and dynamical digital prototyping. In: Casolo F (ed) Proceedings of VIII ISCSB. CLUP, Milano 13. 
Tasora A (2006) An iterative fixed-point method for solving large complementarity problems in multibody systems. In: Proceedings of XVI GIMC, Bologna
14. Tasora A, Manconi E, Silvestri M (2006) Un nuovo metodo del simplesso per il problema di complementarità lineare mista in sistemi multibody. In: Proceedings of AIMETA 2005, 11–15 September 2005, Firenze 15. Tasora A (2006) Chrono::Engine site: www.deltaknowledge.com/chronoengine 16. Tonge R, Zhang L, Sequeira D (2006) Method and program solving LCPs for rigid body dynamics. U.S. Patent 7079145 B2
Solution Procedures for Maneuvering Multibody Dynamics Problems for Vehicle Models of Varying Complexity
Carlo L. Bottasso
Dipartimento di Ingegneria Aerospaziale, Politecnico di Milano, Via La Masa 34, 20156 Milano, Italy. E-mail: [email protected]
Summary. We describe a suite of procedures for the solution of optimal control problems in vehicle dynamics, which are applicable to the modeling of maneuvers at the boundaries of the vehicle operating envelope. The procedures described here cater for a wide range of multibody vehicle models of varying complexity, from coarse models capable of capturing only the slower scale solution components, all the way to fine scale models characterized by many degrees of freedom and faster solution scales. The methods are of general applicability, and are readily applicable to existing multibody software implementations. Numerical applications in rotorcraft flight mechanics are used to illustrate some of the features of the proposed procedures.
1 Introduction
Maneuvering Multibody Dynamics (MMBD) is concerned with the off-line simulation of transient maneuvers at the boundaries of the operating envelope of a vehicle, using multibody virtual prototypes. Such problems fall within the category of Optimal Control (OC) problems. Given a vehicle model, a Maneuver Optimal Control Problem (MOCP) defines a maneuver in terms of a cost function and constraints, and seeks a solution for the model control inputs and associated vehicle states which minimize the cost while satisfying all constraints, including the vehicle equations of motion. The capability to solve MMBD problems can be a valuable asset, for example in aerospace and automotive applications. In fact, limit operating conditions often drive the vehicle design, while its safety in limit cases determines best practices and procedures for its operation. Yet, algorithms to solve OC problems are not available in general purpose multibody software codes, which are in fact designed as initial value solvers. In this paper we discuss three solution procedures which support a large spectrum of vehicle models and of MMBD problems. The three procedures described here enable the efficient solution of MOCPs, accounting for the growing complexity and cost of the numerical processes as one increases the level of modeling detail of the vehicle and the desire to resolve faster scale solution components. Multibody models of vehicles typically involve multiple interacting fields (structural, aerodynamic, hydraulic, etc.), and are often based on general purpose software implementations. Given the complexity of such computer programs, and the widespread use of these codes in industry, it is important to consider MMBD solution procedures which can be coupled to such codes with minimum changes. For this reason, the present paper limits its attention to approaches which can be coupled with black-box multibody solvers, i.e. methods which do not require the manipulation of the model equations of motion. Previous and related work by the author and his colleagues in this area, with specific reference to rotorcraft aeromechanics, has been described in Refs. [10–12, 14] and references therein.
2 Vehicle Models Consider a generic multi-physics vehicle model M described in terms of a set of non-linear differential algebraic equations written as fSD (x˙ SD , xSD , λ, xC , u, t) = 0, c(xSD , t) = 0, fC (x˙ C , xC , xSD , u, t) = 0,
(1a) (1b) (1c)
where xSD are the structural dynamics states, λ are constraint-enforcing Lagrange multipliers, xC are the states describing the coupled fields (e.g. aerodynamic, hydraulic, etc.), and u is the control input vector. Equations (1a) group together the equations of dynamic equilibrium and the kinematic equations. Equations (1b) represent mechanical joint constraint equations, while Eq. (1c) are the coupled physics governing equations. Finally, the notation ˙ = d(·)/dt indicates a derivative with respect to time t. (·) For the sake of a simpler discussion, in the following we will consider that the Lagrange multipliers λ and redundant structural dynamics states can always be formally eliminated in favor of a minimal set of coordinates [17]; this is done only to lighten the notation, since in general other considerations often make the use of minimal sets of coordinates not convenient.1 Therefore, the governing equations will be assumed to be of the ordinary differential type. Furthermore, it will be convenient to use a more synthetical form of the above equations in the following pages, and hence we will write the vehicle model simply as 1
The interested reader can refer to [8] for a description of the solution of optimal control problems for multibody models in redundant form.
Solution Procedures for Maneuvering Multibody Dynamics Problems
˙ x, u, t) = 0, f (x, y = h(x),
59
(2a) (2b)
Here, $x \in \mathbb{R}^n$, $u \in \mathbb{R}^m$ and $f : \mathbb{R}^n \times \mathbb{R}^n \times \mathbb{R}^m \times \mathbb{R}_{\geq 0} \to \mathbb{R}^n$. We notice that, although $n$ can be large, $m$ is in general small for a vehicle to be controllable by a human operator (e.g., $m = 4$ for many fixed and rotary wing vehicles). In addition, Eq. (2b) defines a vector of outputs $y \in \mathbb{R}^l$, with $h(\cdot) : \mathbb{R}^n \to \mathbb{R}^l$. The outputs will typically represent some global vehicle states which describe its gross motion, such as position, orientation, linear and angular velocity of a vehicle-embedded frame with respect to an inertial frame of reference, or other quantities useful for formulating the MOCP. Hence, $l$ is also typically a small number.

Equations (2a) are solved for the forward simulation problem by providing a time history of control inputs $u(t)$ and initial conditions on the states $x(0) = x_0$. In terms of its states, the response of system M to $u(t)$ can be formally written as

\[ x(t) = \Phi_u(x_0, t), \tag{3} \]

for $t \in \Omega$, where $\Phi_{(\cdot)}(\cdot, \cdot)$ is the state flow function. Accordingly, one obtains also the associated values of the outputs through (2b) as

\[ y(t) = h(\Phi_u(x_0, t)) = \Psi_u(y_0, t), \tag{4} \]

where $\Psi_{(\cdot)}(\cdot, \cdot)$ is the output flow function and $y_0 = h(x_0)$.
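The flow notation of Eqs. (3) and (4) can be illustrated with a minimal sketch. It assumes a toy double integrator in explicit form (a special case of Eq. (2a)), a position-only output map, and classical fourth-order Runge-Kutta marching; the names `rhs`, `output`, `state_flow` and `output_flow` are hypothetical and do not belong to any vehicle code. For convenience, `output_flow` takes the initial state $x_0$ rather than $y_0$.

```python
def rhs(x, u, t):
    """Explicit form x_dot = F(x, u, t) of a toy double integrator (assumed model)."""
    position, velocity = x
    return (velocity, u(t))


def output(x):
    """Output map y = h(x): here only the position is observed (assumption)."""
    return (x[0],)


def state_flow(u, x0, t_final, n_steps=200):
    """Approximate the state flow Phi_u(x0, t_final) by classical RK4 marching."""
    h = t_final / n_steps
    x, t = tuple(x0), 0.0
    for _ in range(n_steps):
        k1 = rhs(x, u, t)
        k2 = rhs(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k1)), u, t + 0.5 * h)
        k3 = rhs(tuple(xi + 0.5 * h * ki for xi, ki in zip(x, k2)), u, t + 0.5 * h)
        k4 = rhs(tuple(xi + h * ki for xi, ki in zip(x, k3)), u, t + h)
        x = tuple(
            xi + h / 6.0 * (a + 2.0 * b + 2.0 * c + d)
            for xi, a, b, c, d in zip(x, k1, k2, k3, k4)
        )
        t += h
    return x


def output_flow(u, x0, t_final):
    """Approximate the output flow as h(Phi_u(x0, t_final)), cf. Eq. (4)."""
    return output(state_flow(u, x0, t_final))


# Unit constant control from rest: the exact response is x(t) = (t^2 / 2, t).
x_T = state_flow(lambda t: 1.0, (0.0, 0.0), 2.0)
y_T = output_flow(lambda t: 1.0, (0.0, 0.0), 2.0)
```

A black-box simulator would play the role of `state_flow` here: only the ability to march from given initial conditions under a given control history is needed.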
3 The Maneuver Optimal Control Problem

Consider a plan to transition model M in finite time from one state condition to another while achieving a given task. The starting and arrival states are often, although not necessarily, trim states, i.e. relative equilibria for the vehicle dynamics [16] (see footnote 2). Clearly, given starting and arrival states, there is an infinite number of ways to transition between the two. A way to remove this arbitrariness is to formulate a maneuver as a MOCP [9, 11]:

\[ \min_{x, y, u, T} \; J = J(y, u, T, T_i), \tag{5a} \]
\[ \text{s.t.:} \quad f(\dot{x}, x, u, t) = 0, \tag{5b} \]
\[ y = h(x), \tag{5c} \]
\[ g(x, y, u, t, T, T_i) \leq 0. \tag{5d} \]

Footnote 2: Initial and final states are typically determined by solving a separate trim problem off-line, whose details depend on the specifics of the vehicle model being considered. See for example Refs. [13, 22].
The problem is defined on the interval $\Omega = [0, T]$, $t \in \Omega$, where the final time $T$ is typically unknown and must be determined as part of the solution. Specific events might be associated with unknown time instants $T_i$, $0 < T_i < T$ (for example, the jettisoning of part of the cargo of a flying vehicle). The to-be-minimized cost is denoted $J$ and, depending on the problem, might account for maneuver duration, control activity, fuel consumption, etc., or some other given functions of interest. Furthermore, the maneuver definition is completed by providing a set of problem-dependent equality and inequality constraints (Eq. (5d)) which translate the operating envelope of the vehicle, the performance and procedural requirements as dictated by norms and regulations (for example, in the case of the certification of flying vehicles), and all other necessary maneuver-defining constraints. All such constraints are typically expressed in terms of the outputs $y$. Finally, Eq. (5d) also includes the initial and final conditions on the vehicle states $x$.

The solution of the MOCP for system M can be formally written as

\[ [u^*(t), T^*] = \mathrm{MOCP}(M), \tag{6} \]
where $u^*(t)$, $t \in \Omega^* = [0, T^*]$, is the cost minimizing control signal and $T^*$ the cost minimizing maneuver duration. The associated state and output responses are

\[ x^*(t) = \Phi_{u^*}(x_0, t), \tag{7a} \]
\[ y^*(t) = \Psi_{u^*}(y_0, t), \tag{7b} \]

for $t \in \Omega^*$.

3.1 Problem Complexity

In the context of the present discussion, a meaningful measure $C$ of the MOCP complexity is given by the expression

\[ C = n \frac{T}{\tau}, \tag{8} \]
where n is the number of states in the model, T is the maneuver duration and τ is the characteristic time length associated with the fastest solution scale component that one needs to resolve. Problems of modest complexity use vehicle models with a small number of states n, and a solution is sought which captures only those components which are slow compared to the overall maneuver duration (T /τ small). On the contrary, problems of high complexity use models with many states (n large) which capture fine scale solution components, that are possibly very fast compared to the maneuver duration (T /τ large).
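As a small worked illustration of Eq. (8), the sketch below evaluates the complexity measure for two made-up cases. The numbers of states, durations and time scales are purely hypothetical and are not taken from any model discussed in this chapter.

```python
def mocp_complexity(n_states, maneuver_duration, fastest_time_scale):
    """Complexity measure C = n * T / tau of Eq. (8)."""
    return n_states * maneuver_duration / fastest_time_scale


# Hypothetical modest problem: few states, only slow components resolved.
c_modest = mocp_complexity(n_states=20, maneuver_duration=12.0, fastest_time_scale=0.5)

# Hypothetical high-complexity problem: many states and fast components.
c_high = mocp_complexity(n_states=1000, maneuver_duration=12.0, fastest_time_scale=0.01)

ratio = c_high / c_modest
```

For the same maneuver duration, the second case is more complex than the first by the product of the increase in $n$ and the increase in $T/\tau$.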
4 Direct vs. Indirect Methods

The indirect approach to the solution of the MOCP (5) amounts to augmenting the cost (5a), by adjoining the governing equations (5b) with a set of co-states, and adjoining all other constraint conditions (5d) with Lagrange multipliers (possibly using slack variables in the case of inequality constraints). By imposing the stationarity of the augmented cost, one derives a new set of equations with their associated boundary conditions, which govern the optimal control problem [15]. The resulting boundary value problem can then be discretized using a suitable numerical method defined on a computational grid. This transforms the infinite dimensional boundary value problem into a discrete problem, whose unknowns are the values of the variables (states, co-states, Lagrange multipliers, controls) on the computational grid.

It is clear that the implementation of this classical approach requires the manipulation of the governing equations (5b) in order to derive the optimal control equations. This is a non-trivial task for most modern complex multiphysics vehicle models, which might necessitate the use of symbolic manipulation or automatic differentiation tools to be carried out effectively. More importantly, this approach must be ruled out whenever one does not have access to the analytical expression of Eqs. (5b) or their software implementation, as is the case whenever the vehicle model is implemented in a black-box computer program.

To overcome these limitations of the indirect approach, one can use the direct method [7]. In this case, instead of first optimizing and then discretizing, one first discretizes the model equations (5b) through a numerical method. The choice of a numerical method has the effect of discretizing also states, inputs and outputs, which become algebraic parameters. This in turn allows one to express cost function (5a) and constraints (5d) in terms of the discrete parameters.
This process defines a discrete parameter optimization or nonlinear programming (NLP) problem [18], which can be written in general as

\[ \min_z \; K(z), \tag{9a} \]
\[ \text{s.t.} \quad a(z) = 0, \tag{9b} \]
\[ b(z) \leq 0, \tag{9c} \]
where z is a set of algebraic unknowns, and K is a scalar objective function which represents an approximation of the cost J of Eq. (5a). The equality constraints (9b) are generated by the discretization of the equations of motion (5b), while the inequality constraints (9c) by all maneuver-defining constraints (5d). The specific form of the vector of algebraic unknowns and of the constraints depends on the method used for performing the discretization, as detailed below. Necessary conditions for a constrained optimum for problem (9) are obtained, similarly to the optimal control case, by combining the objective K
with the constraints through the use of Lagrange multipliers, and imposing the stationarity of the augmented objective function.

By using the direct approach, one does not need to manipulate the equations of motion of the vehicle, since all that is required is the evaluation of the discretized equations on a step or sequence of steps, and this enables the interfacing to black-box vehicle simulators. For further possible advantages of the direct method, the interested reader is referred to Ref. [7].

For robustness, it is usually better to consider a scaled version of the NLP problem (9), where the NLP variables $z$ are replaced by scaled variables $\hat{z} = \mathrm{diag}(w_z) z$, $w_z$ being scaling coefficients chosen so that the new unknowns are $\hat{z} \approx O(1)$. The constraints (9b, c) are scaled in the same way.

Problem (9) can be solved by using single or multi-model approaches: the former are described in Section 5 and use the sole vehicle model M, while the latter, described in Section 6, approximate the solution of (9) by iterating between M and a suitable reduced model of it.
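The variable scaling described above can be sketched as follows. The choice of weights as reciprocals of user-supplied typical magnitudes is an assumption made for illustration (the text only requires that the scaled unknowns be of order one), and the sample values are hypothetical.

```python
def scaling_weights(typical_magnitudes):
    """Weights w_z chosen as reciprocals of typical magnitudes (an assumed rule)."""
    return [1.0 / abs(value) for value in typical_magnitudes]


def scale(z, w_z):
    """Scaled variables z_hat = diag(w_z) z."""
    return [w * value for w, value in zip(w_z, z)]


def unscale(z_hat, w_z):
    """Recover the physical variables z from the scaled ones."""
    return [value / w for w, value in zip(w_z, z_hat)]


# Hypothetical unknowns with very different magnitudes:
# a position [m], a control deflection [rad] and the final time [sec].
z = [1200.0, 0.02, 11.3]
typical = [1000.0, 0.01, 10.0]

w_z = scaling_weights(typical)
z_hat = scale(z, w_z)        # all entries are now of order one
z_back = unscale(z_hat, w_z)
```

The optimizer would then work on `z_hat`, with the map back to physical variables applied before each evaluation of cost and constraints.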
5 Single-Model Approaches

There are two main single-model solution procedures for the direct approach described above, namely direct transcription and direct multiple shooting, which are reviewed next. Both methods lead to sparse NLP problems (9), which can be efficiently solved using Sequential Quadratic Programming (SQP) [4].

5.1 Direct Transcription

We consider the partition $0 = t_0 < t_1 < \ldots < t_N = T$ of the time interval $\Omega$, where the generic time element is denoted $\Omega^i = [t_i, t_{i+1}]$, $i = (0, N-1)$, of time step size $h^i = t_{i+1} - t_i$. Here and in the following, quantities associated with the generic element vertex $i$ are indicated using the notation $(\cdot)_i$, while quantities associated with the generic element $i$ are labeled $(\cdot)^i$. Clearly, $h^i = h^i(T)$, i.e. the time step size is a function of the final time, when $T$ is unknown. To avoid further complicating the discussion, we do not consider here and in the following the presence of possible internal events $T_j$, although their inclusion does not pose any conceptual difficulty.

In each time element $\Omega^i$, the governing equations (5b) are discretized using a suitable numerical method. The resulting discrete equations are expressed here as

\[ f_h(x_{i+1}, x_i, u^i, h^i) = 0, \quad i = (0, N-1), \tag{10} \]

where $f_h$ is an algorithmic approximation of function $f$, $x_i$, $x_{i+1}$ are the values of the state vector at $t_i$ and $t_{i+1}$, respectively, while $u^i$ represents the value of the control vector within the step. In general there might be additional internal stages for both the state and the control variables, depending on
the numerical method. For notational simplicity we do not consider that case here. With respect to this point, note further that in the case of higher order schemes with internal stages, Eq. (10) might have been obtained by static elimination of these stages at the element level.

In the direct transcription case, the NLP problem (9) is defined as follows. First, the NLP variables are chosen as:

\[ z = (x_{i=(0,N)}, u^{i=(0,N-1)}, T)^T, \tag{11} \]
i.e. they are defined as the discrete states and control values on the computational grid, and the final time. Next, the cost $J$ of Eq. (5a) is discretized in terms of $z$ as given by (11), obtaining the discrete cost $K$ of Eq. (9a). Then, the model governing ODEs (5b) and their associated outputs (5c) are discretized over each step using Eq. (10), and become the set of NLP equality constraints appearing in Eq. (9b). Finally, all other problem constraints and bounds, Eq. (5d), are expressed in terms of the NLP variables $z$ and become the NLP inequality constraints of Eq. (9c).

The optimality conditions of the resulting discrete NLP problem converge to the optimality conditions of the optimal control problem (5) as the grid is refined and the number of discrete optimization variables goes to infinity [20].

The resulting problem is potentially large; in fact, the problem size is proportional to the complexity $C$ defined in Eq. (8), i.e. it grows not only with the number of states $n$, but also with the number of time steps $N$, which increases with the ratio $T/\tau$. This means that, although the problem is very sparse and has a banded nature due to the fact that each time step couples together only a few of the discrete parameters, the direct transcription approach is applicable only to problems of not excessive complexity. The NLP problem Jacobian can be efficiently evaluated using automatic differentiation (e.g., see Ref. [19]), but in the case of black-box models this is not possible and the Jacobian must be obtained by sparse finite differencing. This has typically a substantial impact on the computational cost, a fact that once again somewhat hinders the use of this approach for problems with a large number of states and fast solution components.

5.2 Direct Multiple Shooting

We consider a partition of the time domain $\Omega$ given by $0 = t_0 < t_1 < \ldots < t_M = T$ with $\Omega^j = [t_j, t_{j+1}]$, $j = (0, M-1)$, where each $\Omega^j$ is a shooting segment.
Here and in the following, quantities associated with the generic vertex between segments $j$ are indicated using the notation $(\cdot)_j$, while quantities associated with the generic segment $j$ are labeled $(\cdot)^j$. In each shooting segment $\Omega^j$, the controls are discretized as $u^j(t) = \sum_{i=1}^{N_c^j} s_i(t) u_i^j$, where $s_i(t)$ are basis functions, for example cubic splines, and $u_i^j$ are $N_c^j$ unknown discrete control values. Notice that we confine the control approximations on each shooting segment, instead of considering interpolations across segment
boundaries; this has the effect of decreasing the computational cost of finite differencing by increasing the problem sparsity. Constraints are imposed at the shooting segment boundaries to enforce the continuity of the controls.

In the case of direct multiple shooting, the NLP problem (9) is defined as follows. First, the set of NLP variables is chosen as:

\[ z = (x_{j=(0,M)}, u^{j=(0,M-1)}_{i=(1,N_c^j)}, T)^T, \tag{12} \]
i.e. they are defined as the discrete values of the states at the interfaces between shooting segments, the discrete values of the controls within each segment, and the final time. Next, the governing ODEs (5b) are marched in time within each shooting segment $\Omega^j$, starting from the initial conditions provided by the values of the states $x_j$ at the left boundary of the segment. The effect of the forward integration is to generate a discrete time history of states within $\Omega^j$, which we label $x_i^j$, $i = (1, N^j)$, where $N^j$ is the number of steps taken in that segment. The last value of this sequence is named $\tilde{x}_{j+1} = x^j_{N^j}$, and represents the new estimate of the state variables at the right boundary of the shooting segment. Segments are then glued together by imposing the following equality constraints

\[ x_j - \tilde{x}_j = 0, \quad j = (2, M). \tag{13} \]

In the direct multiple shooting case, the cost $J$ of Eq. (5a) is discretized in terms of $z$ as given by (12) and evaluated using the segment time histories $x_i^j$; this yields the discrete cost $K$ of Eq. (9a). Next, the gluing conditions (13) are used to express the set of NLP equality constraints appearing in Eq. (9b). All other problem constraints and bounds, Eq. (5d), are expressed in terms of the NLP variables $z$ and become the NLP inequality constraints of Eq. (9c).

Multiple shooting segments are introduced for curing the well known instabilities of the single shooting method [3]. In fact, when using single shooting, small changes in the solution early in the shooting segment can produce dramatic effects at the end of the segment itself, because the system nonlinearities provide a mechanism for amplifying these changes; similar problems are found when analyzing unstable systems, as for example when considering rotorcraft vehicles. Consequently, convergence becomes very difficult if not impossible with single shooting.
In most practical cases, the rather heuristic approach of breaking the problem domain into multiple segments alleviates these problems. Considering the implementation of trajectory optimization solution procedures using black-box software, the direct multiple shooting method offers the advantage of a higher level interaction with the program as compared to the previously discussed direct transcription case. In fact, we just need to (a) set the initial conditions at the beginning of each shooting segment and march in time till the end of the segment, under the action of given control inputs, and (b) gather the solution of the forward simulation within and at the end
of each segment. It is reasonable to assume that any black-box simulator will at least provide these minimum features. In general, the NLP problem Jacobian must be computed through sparse centered finite differencing by perturbation of the unknowns.

With multiple shooting, it is common to use adaptive step procedures to advance the solution in time within each segment, so that highly accurate solutions can be obtained. The size of the NLP problem clearly depends on the number of segments but, since the segments are consecutive, the resulting problem is sparse. Even more importantly, the size of the NLP problem does not depend on the number of internal steps taken in each segment, unlike in the case of direct transcription, and hence it does not depend on $T/\tau$. This is crucial if the vehicle model M has fast dynamic components which need small time steps to be resolved. In fact, treating such problems with the direct transcription approach will typically lead to very dense grids and hence to extremely large NLP problems, which would imply overwhelming computational costs.

Although this is an important advantage of direct multiple shooting over direct transcription with respect to the problem complexity, we remark that, for the applications in rotorcraft flight mechanics that we have studied so far, obtaining a converged solution typically requires of the order of $10^2 n$ to $10^3 n$ or more forward simulations, including those needed for computing the required gradients by finite differencing. Although the NLP problem size now grows only with $n$ (and $M$, although this is typically rather small, say of the order of $10^1$), the overall computing cost will still be substantial for models of very high complexity because of the cost implied by the large number of necessary forward simulations. A way to circumvent this problem is to use a multi-model approach, as described in Section 6.
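The gluing defects of the multiple shooting method and their finite-difference Jacobian can be sketched on a toy problem. Everything below is an illustrative assumption: a scalar model $\dot{x} = -x + u$, one constant control value per segment, equal segment lengths, fixed-step RK4 marching, and hypothetical function names. The sketch forms one defect per segment and differentiates the defects with respect to the node states by centered finite differences, which exposes the sparsity mentioned above.

```python
import math


def rhs(x, u):
    """Toy scalar model x_dot = -x + u (an assumption, not the vehicle model)."""
    return -x + u


def march(x, u, dt, n_steps=10):
    """Advance the state across one shooting segment with classical RK4."""
    h = dt / n_steps
    for _ in range(n_steps):
        k1 = rhs(x, u)
        k2 = rhs(x + 0.5 * h * k1, u)
        k3 = rhs(x + 0.5 * h * k2, u)
        k4 = rhs(x + h * k3, u)
        x += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return x


def gluing_defects(node_states, controls, dt):
    """Defect (node state minus marched estimate) at the right end of each segment."""
    return [
        node_states[j + 1] - march(node_states[j], controls[j], dt)
        for j in range(len(controls))
    ]


def defect_jacobian(node_states, controls, dt, eps=1e-6):
    """Centered finite differences of the defects with respect to the node states."""
    n_def, n_var = len(controls), len(node_states)
    jac = [[0.0] * n_var for _ in range(n_def)]
    for k in range(n_var):
        plus, minus = list(node_states), list(node_states)
        plus[k] += eps
        minus[k] -= eps
        d_plus = gluing_defects(plus, controls, dt)
        d_minus = gluing_defects(minus, controls, dt)
        for j in range(n_def):
            jac[j][k] = (d_plus[j] - d_minus[j]) / (2.0 * eps)
    return jac


dt = 0.5
controls = [1.0, 0.0, -1.0, 0.5]      # one constant control value per segment
nodes = [0.0]
for u in controls:                     # consistent node states from one forward pass
    nodes.append(march(nodes[-1], u, dt))

defects = gluing_defects(nodes, controls, dt)   # vanish for a continuous trajectory
jac = defect_jacobian(nodes, controls, dt)      # each row couples two adjacent nodes
```

Each row of `jac` has only two non-zero entries, one for the node at the left end of the segment and one for the node at its right end, regardless of how many internal steps `march` takes. A sparse finite differencing scheme would exploit this pattern to perturb several unknowns at once.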
State Dependent Constraints

One possible issue with the multiple shooting approach is the solution of problems with state dependent constraints [6]. In fact, only state variables at the shooting segment boundaries enter into the definition of the optimization unknowns (see the definition of $z$ given in Eq. (12)). Therefore, one cannot directly impose constraints or bounds on the values assumed by the states within the shooting segments; in other words, what happens to the solution within the segments is outside of the control of the optimization procedure, in marked contrast with the direct transcription method. This also implies that, contrary to the direct transcription case, it is not possible to prove that the numerical solution of the discrete problem converges to the optimal control solution when the number of time steps goes to infinity and state constraints are active.

This problem can be handled by looking more closely at the difference between the direct transcription and multiple shooting methods. In fact, consider the assembly of the two consecutive steps $i - 1$ and $i$ using the direct transcription method. This can be written as
\[ f_h(x_i, x_{i-1}, u^{i-1}, h^{i-1}) = 0, \tag{14a} \]
\[ f_h(x_{i+1}, x_i, u^i, h^i) = 0, \tag{14b} \]
which implies the boolean identification of the state vector at the end of the first step with the state vector at the beginning of the second. Consider now the following alternative form of the assembly:

\[ f_h(\tilde{x}_i, x_{i-1}, u^{i-1}, h^{i-1}) = 0, \tag{15a} \]
\[ f_h(\tilde{x}_{i+1}, x_i, u^i, h^i) = 0, \tag{15b} \]
\[ x_i - \tilde{x}_i = 0. \tag{15c} \]
In this case the state vector at the end of the first step, $\tilde{x}_i$, is regarded as a separate variable from the state vector at the beginning of the second, $x_i$, and the continuity of the states between the two steps is ensured by condition (15c). Clearly, the two forms of the assembly are perfectly equivalent, in the sense that at convergence they yield exactly the same solution. However, using the redundant state vector formulation of the problem, one can solve Eq. (15a) in terms of $\tilde{x}_i$, for example using a Newton-like iteration, and replace the result in Eq. (15c); the same procedure, repeated for all steps, effectively eliminates all the transcription constraints (15a), leaving only the gluing conditions (15c).

Consider now the multiple shooting case, with a time partition characterized by $N^j = 1$ for each shooting segment $\Omega^j$, $j = (1, M-1)$. In other words, we march one single time step per shooting segment, ending up with many short segments instead of a few long ones, as in the classical multiple shooting method. Notice that the length of the segments will now have to be small, since we are taking a single step within each one of them; however, to avoid too stringent constraints on the time step length, we recommend in this case the use of an implicit time marching scheme.

It is clear at this point that the resulting gluing conditions (13) for the multiple shooting approach with one step per segment are identical to the gluing conditions (15c) for the direct transcription approach after elimination of the transcription constraints. In conclusion, an implementation of direct transcription using the redundant state vector form with static condensation of the transcription constraints is equivalent, at convergence, to a multiple shooting approach based on the same temporal discretization scheme using one time step per segment.

Therefore, it appears that a way to deal with state inequalities using a direct multiple shooting approach is as follows.
At each time step within each segment, the state dependent inequality constraints are evaluated in terms of the segment-internal state values $x_i^j$. If violations are detected within a shooting segment, that segment is broken into small segments in the violation area. Next, the NLP solution is restarted from the current solution projected onto the new grid. This time, however, on the newly introduced segments the vehicle model will be advanced by one step covering the whole segment length
using an implicit scheme. The procedure of segment update is repeated until convergence. As a matter of fact, this means that one is switching to a direct transcription treatment of the area where the violation was detected, which now enables the rigorous enforcement of the inequality constraints. Notice however that if the solution has fast components, this approach might not always be feasible because of the computational cost associated with a large number of direct transcription steps of small size.

An alternative, simpler but approximate way of dealing with the problem is to simply break the segment where violations are detected into small ones while maintaining a shooting strategy on each one of them, and continue with refinement until no more internal violations are found or until they become smaller than some given acceptable tolerance. This approach is approximate in nature, since here again segment internal quantities are not constrained directly, and therefore one cannot guarantee that constraint violations will be avoided.

Yet another approximate way to handle state inequalities is to introduce segment internal control points. Indicating with $t^j_{c_i}$ the $i$th control point in segment $j$, the states $x(t^j_{c_i})$ at that time instant are evaluated in terms of the solution of the segment time marching problem. These quantities can be regarded as functions of the segment initial states and of the segment controls, i.e. $x(t^j_{c_i}) = x(t^j_{c_i}, x_j, u^j)$. Constraints can at this point be written in terms of these quantities, and appended to the NLP problem. Both approximate approaches work in practice for problems of engineering interest, and the first of the two is demonstrated later on in this work in the examples section.
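The detection of segment-internal violations and the splitting of the offending segments can be sketched as follows. This is only an illustration of the grid update, not of the NLP restart: the scalar model $\dot{x} = 2\cos(2t)$, the bound, the node locations and all function names are assumptions. The node states satisfy the bound, as they would if they were NLP unknowns, while the marched history inside two of the segments does not.

```python
import math


def rhs(x, t):
    """Toy scalar model x_dot = 2 cos(2 t) (assumed); exact solution x = sin(2 t)."""
    return 2.0 * math.cos(2.0 * t)


def march_segment(x_start, t0, t1, n_steps=20):
    """RK4 time marching inside one segment; returns the internal state history."""
    h = (t1 - t0) / n_steps
    x, t, history = x_start, t0, []
    for _ in range(n_steps):
        k1 = rhs(x, t)
        k2 = rhs(x + 0.5 * h * k1, t + 0.5 * h)
        k3 = rhs(x + 0.5 * h * k2, t + 0.5 * h)
        k4 = rhs(x + h * k3, t + h)
        x += h / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        t += h
        history.append(x)
    return history


def violated_segments(node_times, node_states, bound):
    """Indices of segments whose internal states exceed the bound |x| <= bound."""
    bad = []
    for j in range(len(node_times) - 1):
        history = march_segment(node_states[j], node_times[j], node_times[j + 1])
        if max(abs(x) for x in history) > bound:
            bad.append(j)
    return bad


def bisect_segments(node_times, bad):
    """New grid in which every violating segment is split at its midpoint."""
    new_times = []
    for j in range(len(node_times) - 1):
        new_times.append(node_times[j])
        if j in bad:
            new_times.append(0.5 * (node_times[j] + node_times[j + 1]))
    new_times.append(node_times[-1])
    return new_times


node_times = [0.0, math.pi / 2.0, math.pi, math.pi + 0.2]
node_states = [math.sin(2.0 * t) for t in node_times]   # all within the bound
bound = 0.9

bad = violated_segments(node_times, node_states, bound)
refined_times = bisect_segments(node_times, bad)
```

In an actual solution procedure the current NLP solution would then be projected onto `refined_times` and the optimization restarted, repeating the check until no internal violations remain or they fall below a tolerance.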
6 Multi-Model Approaches

For MOCPs of high complexity the procedures described so far become excessively expensive. In order to substantially reduce the computational cost, the only option is to drastically reduce the number of forward numerical simulations over the maneuver of interest conducted with the vehicle multibody model. In this section we describe a procedure that achieves this goal by introducing a second coarse model of the same vehicle; the basic idea is to iterate between the two models, conducting the expensive operations at the coarse level, while leaving the cheaper operations at the fine one.

6.1 Reduced Models

Consider a model $\bar{M}$ described by some system of governing equations

\[ \bar{f}(\dot{\bar{x}}, \bar{x}, u, t) = 0, \tag{16a} \]
\[ \bar{y} = \bar{h}(\bar{x}), \tag{16b} \]

where $\bar{x} \in \mathbb{R}^{\bar{n}}$, $\bar{f} : \mathbb{R}^{\bar{n}} \times \mathbb{R}^{\bar{n}} \times \mathbb{R}^m \times \mathbb{R}_{\geq 0} \to \mathbb{R}^{\bar{n}}$, $\bar{y} \in \mathbb{R}^l$, and $\bar{h}(\cdot) : \mathbb{R}^{\bar{n}} \to \mathbb{R}^l$. $\bar{M}$ is a reduced model of M if $\bar{n} \ll n$ and

\[ \bar{y}(t) = y(t) + e_{rec}(t), \tag{17} \]

where the output flows are $\bar{y}(t) = \bar{\Psi}_u(\bar{y}_0, t)$ and $y(t) = \Psi_u(y_0, t)$, $e_{rec}(t)$ being the reconstruction error, small in an appropriate norm.

Equation (17) states that, starting from the same given initial conditions $\bar{y}_0 = y_0$, reduced model $\bar{M}$ and plant M produce output responses which differ by the reconstruction error $e_{rec}(t)$ when subjected to the same input signal $u(t)$. Clearly, the reconstruction error measures the fidelity of $\bar{M}$ to M. Typically, $\bar{M}$ will give a good approximation of M at the slower scales, while the error $e_{rec}(t)$ will be mostly due to unmodeled or unresolved faster solution components.

6.2 Approximating MOCP Solutions with the MMSA Algorithm

Consider now the MOCP for $\bar{M}$, whose solution can be formally written as

\[ [u^*(t), T^*] = \mathrm{MOCP}(\bar{M}), \tag{18} \]
where $u^*(t)$ is the minimizing control signal and $T^*$ the maneuver duration. This problem can be solved efficiently with the single-model methods described in Section 5, since its complexity $C$ is now drastically reduced with respect to the same problem formulated for model M. In fact, $\bar{M}$ is described by a much smaller number of states and is characterized by slower solution scales than in the case of M. Ultimately, the choice of using direct transcription or direct multiple shooting for the solution of problem (18) will depend on the complexity of $\bar{M}$.

There are now two possible uses of the minimizing solution expressed by Eq. (18). A first option would be to use open-loop steering, by feeding $u^*(t)$ directly to the fine-scale model M. This would give the fine-scale output response

\[ y^{**}(t) = \Psi_{u^*}(y_0, t). \tag{19} \]

However, if M is unstable, as is often the case, the open-loop strategy leads to catastrophic divergence of the steering process. Hence, a better approach in practical applications is to adopt closed-loop steering.

In this second case, one considers the output response induced in the reduced model $\bar{M}$ by the minimizing control $u^*$, i.e.

\[ \bar{y}^*(t) = \bar{\Psi}_{u^*}(\bar{y}_0, t). \tag{20} \]
This output signal is now fed to a tracking controller which generates control inputs $u_{cl}$ which steer the fine-scale model M along the goal trajectory $\bar{y}^*(t)$. This closed-loop control signal can be formally written as

\[ u_{cl}(t) = \Theta_{\bar{y}^*}(\bar{y}_0, t), \tag{21} \]

where $\Theta_{(\cdot)}(\cdot, \cdot)$ is the control law function. The application of this control law induces an output response on the fine scale model M which reads

\[ y^{**}(t) = \Psi_{u_{cl}}(y_0, t). \tag{22} \]

The difference between the output response $y^{**}(t)$ and the goal signal $\bar{y}^*(t)$ is termed the tracking error:

\[ e_{track} = y^{**}(t) - \bar{y}^*(t). \tag{23} \]

Recall now that $y^*(t)$ indicates the solution of the MOCP for M as given by Eq. (7b). However, due to the complexity of the problem, we have instead computed a solution $y^{**}(t)$ for M, which has been obtained by solving first the MOCP for $\bar{M}$ and then applying closed loop steering. The error $e_{mm}$ in the computed outputs due to this multi-model approach can be estimated as

\[ e_{mm} = y^{**}(t) - y^*(t), \tag{24a} \]
\[ \phantom{e_{mm}} = \bar{y}^*(t) + e_{track} - y^*(t), \tag{24b} \]
\[ \phantom{e_{mm}} = e_{track} + e_{rec}. \tag{24c} \]
This shows that the error due to the use of the multi-model solution procedure instead of the single-model one depends on the tracking and reconstruction errors. Clearly, in addition to these effects, one should not forget that the procedure will be based on numerical processes (e.g., for the discretization of problem (18) and the time marching of (22)). Grouping for simplicity all these effects in a numerical discretization error $e_h$, the computed multi-model error becomes

\[ e_{mm,h} = e_h + e_{track} + e_{rec}. \tag{25} \]

This shows that the multi-model solution converges to the single-model one (which is however not computable in practice for problems of excessive complexity), if one can make the three error terms above converge to zero. The numerical discretization error $e_h$ will converge to zero if the underlying numerical processes converge to the true solution as their respective grid sizes tend to zero, which is true for the methods used here. The tracking error term $e_{track}$ depends on the "goodness" of the control law $u_{cl}(t)$, assuming that the to-be-tracked signal $\bar{y}^*(t)$ is a feasible trajectory for M (more on this later). Here again, it is conceivable that, by choosing an appropriate controller and its activation frequency, one can make this error as small as desired.

Finally, Eq. (25) shows that one also needs to render the reconstruction error $e_{rec}$ appropriately small, and vanishingly small in the limit. This means being capable of identifying a reduced model which can represent the output behavior of the plant with sufficient accuracy. Since this is in practice hard if
not impossible to do a priori, a practical way to accomplish this goal is to use the iterative Multi-Model Steering Algorithm (MMSA) [11], whose steps are as follows:

1. Solution of the MOCP for $\bar{M}$
2. Closed-loop steering of M
3. Update of reduced model $\bar{M}$
4. If $\|e_{rec}\|$ small, break, else go to 1
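The four steps above can be sketched as a loop on a deliberately trivial example. All ingredients are stand-ins chosen for illustration and are not the procedures of Ref. [11]: the fine model is a scalar linear system with a parameter hidden from the optimizer, the reduced model has the same structure with a free parameter `p`, the reduced MOCP is replaced by the computation of the constant control that reaches a target output, tracking uses a proportional law, and the model update is a least-squares fit. Here the reconstruction error is measured by replaying the closed-loop input on the reduced model.

```python
A_FINE = 1.5            # fine-model parameter, treated as unknown by the other steps
DT, N_STEPS = 0.01, 200
TARGET = 1.0            # desired final output
GAIN = 5.0              # proportional tracking gain (assumed)


def fine_step(y, u):
    """One explicit Euler step of the fine model (stand-in for a black-box code)."""
    return y + DT * (-A_FINE * y + u)


def reduced_response(p, inputs):
    """Output history of the reduced model y_dot = -p y + u, starting from rest."""
    ys = [0.0]
    for u in inputs:
        ys.append(ys[-1] + DT * (-p * ys[-1] + u))
    return ys


def solve_reduced_mocp(p):
    """Step 1 (stand-in): constant control bringing the reduced model to TARGET."""
    unit_gain = reduced_response(p, [1.0] * N_STEPS)[-1]
    u_star = TARGET / unit_gain          # valid because the toy model is linear
    return u_star, reduced_response(p, [u_star] * N_STEPS)


def steer_fine_model(u_star, goal):
    """Step 2: closed-loop tracking of the goal trajectory on the fine model."""
    ys, us = [0.0], []
    for k in range(N_STEPS):
        u_cl = u_star + GAIN * (goal[k] - ys[-1])
        us.append(u_cl)
        ys.append(fine_step(ys[-1], u_cl))
    return ys, us


def identify(ys, us):
    """Step 3: least-squares estimate of p from the fine-model input/output data."""
    num = sum(y * (u - (y_next - y) / DT) for y, y_next, u in zip(ys, ys[1:], us))
    den = sum(y * y for y in ys[:-1])
    return num / den


def mmsa(p, tol=1e-6, max_iter=10):
    for iteration in range(1, max_iter + 1):
        u_star, goal = solve_reduced_mocp(p)                  # step 1
        y_fine, u_cl = steer_fine_model(u_star, goal)         # step 2
        y_red = reduced_response(p, u_cl)
        e_rec = max(abs(a - b) for a, b in zip(y_red, y_fine))
        p = identify(y_fine, u_cl)                            # step 3
        if e_rec < tol:                                       # step 4
            break
    return p, y_fine[-1], iteration


p_final, y_end, n_iter = mmsa(p=1.0)
```

Because the toy reduced model can represent the fine one exactly, the identification removes the reconstruction error after one update; with a genuinely reduced model the error would only be decreased, and the quality of the final solution would depend on how small it can be made.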
The reduced model update step 3. above can be based on a system identification [11, 21] procedure, whose goal is to try to reduce the reconstruction error. This can be done in several different ways, for example by parameterizing the reduced model output function such that

\[ \bar{\Psi}_u(\bar{y}_0, p, t) = \Psi_u(y_0, t) + e_{rec}(t), \tag{26} \]
where $p$ are free parameters which are identified at each iteration so as to reduce the reconstruction error $e_{rec}(t)$ [11, 12].

MMSA can be interpreted as a divide and conquer approach: instead of attacking directly the original maneuver problem on the fine scale model, one solves the same problem on a reduced model. Next, this information is used for steering the fine scale model in closed-loop along the computed solution, which can be done at acceptable computational cost since it only amounts to a standard time marching closed-loop problem. One then uses the difference between fine scale and reduced model outputs to improve the reduced model, by using system identification. Finally, the reduced model problem is solved again, and so on. If the system identification is able to reduce (and in the limit, eliminate) the reconstruction error, then the tracking signal for the next steering problem is guaranteed to be a feasible trajectory for the fine scale model. Hence, if a suitable tracking controller is used, the tracking error can be made vanishingly small. This, for small (in the limit, vanishingly small) numerical discretization errors, means that the solution in terms of system outputs computed with the multi-model approach converges to the single-model one.

6.3 State Dependent Constraints

Since the MMSA works at two levels, if constraints are present in the maneuver problem, they must be enforced at both of them.

The coarse level pass amounts to the solution of the MOCP for $\bar{M}$. This can be handled either by direct transcription or by direct multiple shooting, for which the treatment of constraints has already been covered in the previous sections.

The fine level pass amounts to the solution of a tracking problem for M, based on the use of a suitable controller. The incorporation of state constraints in tracking controllers depends heavily on the specific control approach being used. One formulation which allows one to rigorously treat generic state
constraints during tracking is Non-Linear Model Predictive Control (NMPC). The approach amounts to solving at each time step a tracking optimal control problem over a prediction window using a reduced model; this leads to a problem that is formally identical to the MOCP for $\bar{M}$, which can be solved efficiently using either direct transcription or direct multiple shooting. Since the prediction window is typically much shorter than the maneuver, one can often use direct transcription in this case, which rigorously allows for the handling of constraints. The detailed explanation of the tracking NMPC formulation goes beyond the scope of the present paper, but the interested reader is referred to [10, 11] for details.
7 Numerical Examples

In this section we report a few numerical examples which highlight some of the key features of the procedures described above. These problems, which are more fully described in Refs. [11, 14], all deal with the flight mechanics of a generic medium-size multi-engine four-bladed utility helicopter.

At first, we demonstrate the handling of state constraints and the use of refinement procedures, both in the context of direct transcription and direct multiple shooting. To this end, we consider a moderate complexity multibody helicopter model implemented with the FLIGHTLAB code [2] performing a 90-deg turn in minimum time, starting from and returning to straight and level trimmed flight at 50 m/sec (see Fig. 1). Throughout the maneuver, we set a bound of ±20 deg/sec for the body attached components of the angular velocity. We consider a cost function taking the form

\[ J = T + \frac{1}{T} \int_0^T \dot{u} \cdot W \dot{u} \, dt. \tag{27} \]

The first term enforces the minimum time condition, while $W = \mathrm{diag}(w_{\dot{u}})$ is a diagonal matrix of tunable weighting factors which penalize the control rates. In Fig. 2 we show the time history of the roll rate obtained with the direct transcription method. The left diagram refers to the solution obtained on
Fig. 1. Minimum time 90-deg turn
Fig. 2. Minimum time 90-deg turn, direct transcription method. Left: roll rate computed on initial uniform grid; right: solution for adaptively refined grid. [Both panels plot the roll rate p [deg/sec] against time [sec], together with the upper and lower bounds.]
Fig. 3. Minimum time 90-deg turn, direct transcription method. Initial uniform grid and representation of the grid evolution throughout four local refinement iterations. [The plot shows the initial mesh, the first to fourth refinements and the final mesh against time [sec].]
an initial uniform grid of 24 steps, while the right one presents the solution computed after four adaptive refinement iterations, which led to a grid having a total of 117 steps. Refinement was obtained by using the rather crude, but simple and practical, approach of repeating each time step. More precisely, in this case transcription was performed using the implicit midpoint rule, which has no step-internal state stages and uses a centered mid-step value of the control inputs. Nodal values of the controls were first obtained by recovery from the computed mid-step values, and then linearly interpolated within each time step. Next, using the re-interpolated control inputs, each step was repeated with a fourth-order Runge-Kutta scheme, starting from the initial conditions provided by the solution at the left node of that time step. The percent difference between the original solution at the right node of each time step and the newly obtained one was taken as a local error indicator to drive time step refinement by simple bisection. Figure 3 shows the initial grid and a representation of its evolution throughout the local refinement iterations. Comparing
this figure with the solution, it is clear that local refinement clusters the time steps where there are rapid variations in the helicopter response. It appears that the overall behavior of the solution is reasonably captured also on the initial grid, although the local details are clearly different. The optimum time changes from the value $T^*_0 = 12.0$ sec obtained on the initial grid to the value $T^*_4 = 11.3$ sec of the final one. These results are somewhat typical of what can be achieved with direct transcription: reasonable answers can be obtained quickly even on rather coarse and simple grids, although, if one is interested in the details of the solution, then adaptive refinement must be used in order to cluster the time steps in the areas of sharp gradients, so as to achieve a greater accuracy while containing the computational cost. Note that in all cases, state constraints are straightforwardly enforced and rigorously satisfied by the solution. Next, we solve the same problem using direct multiple shooting, covering the computational domain with 16 shooting arcs. The time history of the computed roll rate is shown in Fig. 4. It appears that the roll rate bounds are exceeded within some of the shooting arcs, as clearly shown in the zoomed detail of the plot reported in the right part of the same figure. In fact, as previously noticed, the state time history computed within the shooting arcs cannot be constrained during optimization. On the other hand, the state variables at the arc interfaces, shown using ◦ symbols, do satisfy the required bounds, as shown in the figure, since they appear explicitly among the variables of the discrete optimization problem. Here again, local adaptive refinement can help in improving the quality of the solution, in this case by reducing the effect of the local constraint violations. Figure 5 shows the time history of roll rate after 2 refinement steps, which led to a total of 26 shooting arcs.
In this case, the refinement strategy was based on checking whether there were violations within each arc, and splitting the arc in two when violations of the constraints were detected. It appears from the figure that violations are much reduced in this case, and the state constraints can be considered satisfied to a reasonable engineering tolerance.
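The two refinement strategies used in this section can be sketched on a toy problem. The block below is only an illustration under stated assumptions: the scalar model x' = -x + u(t), the control history `control`, the normalization of the error indicator by the largest state magnitude, and all function names are hypothetical, and the known control function is used in place of the re-interpolated nodal controls. In the actual procedures the optimization problem is solved again after each refinement, which is not shown here.

```python
import math


def control(t):
    """Hypothetical control history with a sharp variation around t = 1."""
    return math.tanh(20.0 * (t - 1.0))


def rhs(t, x):
    """Toy scalar dynamics x' = -x + u(t)."""
    return -x + control(t)


def midpoint_step(t, x, h):
    """Implicit midpoint rule (solved in closed form for this linear model)."""
    return ((1.0 - 0.5 * h) * x + h * control(t + 0.5 * h)) / (1.0 + 0.5 * h)


def rk4_step(t, x, h):
    """Classical fourth-order Runge-Kutta step, used to repeat each step."""
    k1 = rhs(t, x)
    k2 = rhs(t + 0.5 * h, x + 0.5 * h * k1)
    k3 = rhs(t + 0.5 * h, x + 0.5 * h * k2)
    k4 = rhs(t + h, x + h * k3)
    return x + h * (k1 + 2.0 * k2 + 2.0 * k3 + k4) / 6.0


def refine_grid(grid, x0, tol):
    """Bisect every step whose repeated solution differs by more than tol."""
    xs = [x0]
    for t0, t1 in zip(grid, grid[1:]):
        xs.append(midpoint_step(t0, xs[-1], t1 - t0))
    scale = max(abs(x) for x in xs) or 1.0
    new_grid = [grid[0]]
    for i, (t0, t1) in enumerate(zip(grid, grid[1:])):
        repeated = rk4_step(t0, xs[i], t1 - t0)   # restart from the left node
        if abs(repeated - xs[i + 1]) / scale > tol:
            new_grid.append(0.5 * (t0 + t1))
        new_grid.append(t1)
    return new_grid


def split_violating_arcs(arcs, state, bound, samples=20):
    """Split in two every shooting arc with a bound violation in its interior."""
    new_arcs = []
    for t0, t1 in arcs:
        interior = [state(t0 + (t1 - t0) * j / samples)
                    for j in range(1, samples)]
        if any(abs(p) > bound for p in interior):
            mid = 0.5 * (t0 + t1)
            new_arcs += [(t0, mid), (mid, t1)]
        else:
            new_arcs.append((t0, t1))
    return new_arcs


uniform = [0.25 * i for i in range(9)]            # 8 steps on [0, 2]
refined = refine_grid(uniform, x0=-1.0, tol=0.01)
arcs = split_violating_arcs([(0.0, 1.0), (1.0, 2.0), (2.0, 3.0)],
                            lambda t: 25.0 * math.exp(-(t - 1.5) ** 2 / 0.02),
                            bound=20.0)
```

On this example only the two steps adjacent to the sharp control variation are bisected, and only the arc containing the bound violation is split, which mirrors the clustering behavior discussed in the text.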
Fig. 4. Minimum time 90-deg turn, direct multiple shooting method. Time history of roll rate, and zoomed view in a region of local constraint violations (right). [Both panels plot the roll rate p [deg/sec] against time [sec], together with the upper and lower bounds.]
Fig. 5. Minimum time 90-deg turn, direct multiple shooting method. Time history of roll rate computed on the final adapted grid. [The plot shows the roll rate p [deg/sec] against time [sec], together with the upper and lower bounds.]
Fig. 6. Minimum time 90-deg turn, control input time histories. Left: direct multiple shooting method; right: direct transcription method. [Both panels plot the collective, pedal, lateral and longitudinal control inputs [deg] against time [sec].]
Finally, in Fig. 6 we compare the control inputs obtained on the final grids for both methods. The solutions appear quite similar, differences being mainly due to the dissimilar control input discretization and distribution of the associated degrees of freedom on the computational domain. Furthermore, in the case of multiple shooting the optimum maneuver time was found to be $T^* = 11.7$ sec, in reasonable agreement with the one computed using direct transcription. In terms of computational cost, we have noticed that, when one considers rather coarse discretizations and loose accuracy, the direct transcription strategy is typically significantly faster than multiple shooting. For example, but with no aim at giving general indications, in the present problem the ratio between the CPU times for the two methods on the respective initial grids was close to four in favor of direct transcription, although we have observed cases where this ratio can grow up to about ten. On the other hand, when more accurate solutions are required and the refinement procedures are activated, this difference tends to reduce, because the computational cost grows more rapidly in the case of direct transcription than in the case of direct multiple shooting, as previously explained.
Fig. 7. Category-A take-off procedures [1]
To demonstrate the convergence properties of the MMSA, we consider next the vertical take-off of the same vehicle from a confined area under Category-A certification requirements, as described in Ref. [1] and synthetically illustrated in Fig. 7. In order to meet the certification requirements, the helicopter must be able to continue the take-off maneuver after an engine failure takes place past the critical decision point [1]. Furthermore, the vehicle must also be able to land safely when an engine failure occurs during the climb to the decision point. The satisfaction of these requirements implies stringent limitations on the maximum take-off weight. Here we consider for brevity only the rejected take-off case. The objective of the analysis is to find the maximum altitude from which a safe landing is still possible, in the sense that the resulting impact velocity with the ground is compatible with the energy absorption capacity of the vehicle. This requirement can be translated into an optimal control problem with unknown final time, where the rotorcraft fall is maximized while satisfying maximum allowable limits on the final horizontal and vertical velocity components. The multibody helicopter model $M$ used for this example is based on the comprehensive finite element-based formulation of Ref. [5] coupled with Peters-He aerodynamics, while the reduced model $\bar{M}$ is a rigid-body model with momentum-theory rotor aerodynamics [23].
The MOCP cost function is defined as

\[ J = -Z(T) + \frac{1}{T} \int_0^T \dot{u} \cdot W \dot{u} \, \mathrm{d}t. \qquad (28) \]
The first term in the cost enforces the maximum altitude loss from the initial hover condition, while the second term penalizes high input rates. The MOCP input and output constraints include bounds on the collective and cyclic controls and their rates, bounds on the final time, on the rotor speed and on the available power. A simple engine model describes the maximum power available during an emergency. The initial conditions are given by the trimmed hovering flight states, while the final conditions are given by the maximum allowable impact velocity components, $V_X(T) < 3.05$ m/sec, $V_Z(T) < 3.66$ m/sec, and by the request for a null final pitch rate $q(T) = 0$ deg/sec. This problem was solved with MMSA, with a tracker based on direct transcription NMPC and system identification based on the solution of a variant of the output error method [11, 21]. The MOCP solution for the reduced model was obtained by direct transcription. Three iterations of the MMSA (reduced model MOCP solve, closed-loop steering, model update) were required to converge the tracking errors. Figure 8 shows the fuselage pitch attitude. The line marked with symbols corresponds to the reduced model solution, while the solid line shows the corresponding multibody outputs obtained during steering. Good matching between the reduced and full solutions is observed throughout the maneuver, which indicates convergence of the MMSA, i.e. the achievement of small tracking and reconstruction errors. These results are indicative of the performance of the procedure, since the other reduced vehicle states also showed similarly small errors.
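Both cost functions (27) and (28) share the same control-rate penalty, which is straightforward to evaluate on a discrete control history. The following sketch assumes piecewise-linear controls between grid nodes, so that the control rates are constant on each step and the integral is exact; the function names and the sample data are illustrative assumptions, and the value passed as `z_final` simply stands for Z(T) with whatever sign convention the model uses.

```python
def control_rate_penalty(times, controls, weights):
    """(1/T) * integral of u_dot . W u_dot dt for piecewise-linear controls.

    times:    grid nodes t_0 < t_1 < ... < t_N
    controls: one list of control values per node
    weights:  diagonal entries of W, one per control channel
    """
    total = 0.0
    for k in range(len(times) - 1):
        h = times[k + 1] - times[k]
        for w, u0, u1 in zip(weights, controls[k], controls[k + 1]):
            rate = (u1 - u0) / h          # constant rate on this step
            total += w * rate * rate * h
    return total / (times[-1] - times[0])


def minimum_time_cost(times, controls, weights):
    """Cost of the form (27): maneuver duration plus control-rate penalty."""
    duration = times[-1] - times[0]
    return duration + control_rate_penalty(times, controls, weights)


def rejected_takeoff_cost(z_final, times, controls, weights):
    """Cost of the form (28): -Z(T) plus control-rate penalty."""
    return -z_final + control_rate_penalty(times, controls, weights)


# Two control channels on a three-node grid (made-up numbers).
times = [0.0, 1.0, 2.0]
controls = [[0.0, 0.0], [1.0, 0.0], [1.0, 2.0]]
weights = [0.5, 0.25]
penalty = control_rate_penalty(times, controls, weights)
```

Larger entries in `weights` make rapid control activity more expensive relative to the first term of the cost, which is how the weighting factors are tuned in practice.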
Fig. 8. Optimal helicopter rejected take-off maneuver. Fuselage pitch attitude for the reduced (line with symbols) and multibody (solid line) models. [The plot shows the pitch attitude [deg] against time [s].]
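The three-step MMSA iteration mentioned above (reduced model MOCP solve, closed-loop steering, model update) has the following overall structure. The sketch is a deliberately simple toy and not the implementation used for the results of this section: the scalar models, the constant-control "plan", the proportional tracker standing in for NMPC, and the least-squares update standing in for the output error method are all assumptions, as are all names. Because the toy reduced model has the same structure as the "full" one, it converges in two passes.

```python
def simulate(a, controls, h, x0=0.0, reference=None, gain=0.0):
    """Explicit Euler for x' = -a*x + u, optionally with a proportional tracker."""
    xs, applied = [x0], []
    for k, u in enumerate(controls):
        if reference is not None:
            u = u + gain * (reference[k] - xs[-1])   # feedback correction
        applied.append(u)
        xs.append(xs[-1] + h * (-a * xs[-1] + u))
    return xs, applied


def plan(a_hat, target, n, h):
    """Reduced-model planning: constant control that reaches target at step n."""
    factor = sum((1.0 - a_hat * h) ** j for j in range(n))
    controls = [target / (h * factor)] * n
    reference, _ = simulate(a_hat, controls, h)
    return controls, reference


def identify(xs, applied, h):
    """Least-squares estimate of a from the recorded data, x_dot - u = -a*x."""
    num = sum(x * ((x1 - x) / h - u) for x, x1, u in zip(xs, xs[1:], applied))
    den = sum(x * x for x in xs[:-1])
    return -num / den


def mmsa(a_true=1.5, a_guess=0.5, target=1.0, n=50, h=0.02,
         gain=5.0, tol=1e-6, max_iter=10):
    """Plan on the reduced model, steer the full model, update, repeat."""
    a_hat, errors = a_guess, []
    for _ in range(max_iter):
        controls, reference = plan(a_hat, target, n, h)
        xs, applied = simulate(a_true, controls, h,
                               reference=reference, gain=gain)
        errors.append(max(abs(x - r) for x, r in zip(xs, reference)))
        if errors[-1] < tol:
            break                      # tracking error converged
        a_hat = identify(xs, applied, h)
    return a_hat, errors


a_hat, errors = mmsa()
```

The list `errors` records the largest tracking error of each pass, which plays the role of the convergence measure discussed in the text.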
8 Conclusions

We have described a suite of solution procedures for maneuver problems in vehicle dynamics. The problem complexity, which is dictated by the number of degrees of freedom in the vehicle model and by the ratio of the maneuver duration to the fastest solution scales, determines which solution procedure is the most efficient for the problem at hand. In particular, we have argued that:

• The direct transcription method is best suited for problems of low to moderate complexity. The method is characterized by good robustness and handles all constraint types straightforwardly. It is typically able to yield solutions of moderate accuracy rather quickly, but it may become prohibitive when one seeks high accuracy or when dealing with problems of higher complexity.
• The direct multiple shooting method is best suited for problems of moderate complexity. Being based on forward time marching processes, it is in general less robust than direct transcription, especially for unstable problems. Moreover, it cannot ensure the exact satisfaction of state constraints throughout the whole solution, unless one uses a hybrid method, switching to direct transcription in the areas where constraints are active. Since this can in general be expensive, the practical alternative is to enforce constraints only at specific control points. Although this is typically acceptable for engineering solutions of practical problems of industrial relevance, it may occasionally require some degree of attention on the part of the user.
• The multi-model steering algorithm is best applicable to those problems of high complexity where direct multiple shooting becomes excessively costly. The approach amounts to a two-level solution procedure, where the costlier operations (i.e. the solution of the MOCP) are relegated to the coarse level, and the cheaper ones (i.e. forward time marching) to the fine one. The key ingredients for an effective implementation are a tracking controller and a system identification procedure, which must be capable of improving the predictive capabilities of the coarse reduced model, a problem which deserves attention for achieving good performance.
We finally remark that these methods can be used in conjunction with a hierarchy of models of the same vehicle: solutions can be computed starting from the crudest model and then used as initial guesses for initializing the computation on the next level model, using each time the most appropriate solution procedure. This synergistic use of the various methods has yet to be fully explored, but some preliminary experience has shown good promise for further performance improvements and better robustness.
Acknowledgements

The present research is supported by Agusta-Westland. Further support is provided by the US Army Research Office, through a grant with the Georgia Institute of Technology and a sub-contract with the Politecnico di Milano, with Dr. Gary Anderson as technical monitor. The author gratefully acknowledges the contribution of several collaborators and former students, and especially of Alessandro Croce, Domenico Leonello, Giorgio Maisano and Luca Riviello, who have been instrumental in the implementation of the software procedures and the development of all numerical results.
References

1. Anonymous (1999) Advisory circular 29-2C, certification of transport category rotorcraft. Federal Aviation Administration, Department of Transportation, Washington, DC, USA
2. Anonymous (2008) Advanced Rotorcraft Technology, Inc., 1685 Plymouth Street, Suite 250, Mountain View, CA 94043, http://www.flightlab.com
3. Ascher UM, Mattheij RMM, Russell RD (1995) Numerical solution of boundary value problems for ordinary differential equations. Classics in applied mathematics, 13, SIAM, Philadelphia, PA
4. Barclay A, Gill PE, Rosen JB (1997) SQP methods and their application to numerical optimal control. Report NA 97–3, Department of Mathematics, University of California, San Diego, CA
5. Bauchau OA, Bottasso CL, Nikishkov YG (2001) Modeling rotorcraft dynamics with finite element multibody procedures. Math Comp Model 33:1113–1137
6. Betts JT (1998) Survey of numerical methods for trajectory optimization. J Guid Contr Dyn 21:193–207
7. Betts JT (2001) Practical methods for optimal control using non-linear programming. SIAM, Philadelphia, PA
8. Bottasso CL, Croce A (2004) Optimal control of multibody systems using an energy preserving direct transcription method. Mult Syst Dyn 12:17–45
9. Bottasso CL, Croce A, Leonello D, Riviello L (2005) Optimization of critical trajectories for rotorcraft vehicles. J Am Hel Soc 50:165–177
10. Bottasso CL, Croce A (2005) Two-level model-based control of flexible multibody systems. In: Proceedings of the 6th European Conference on Structural Dynamics (EURODYN 2005), Paris, France
11. Bottasso CL, Chang CS, Croce A, Leonello D, Riviello L (2006) Adaptive planning and tracking of trajectories for the simulation of maneuvers with multibody models. Comp Meth Appl Mech Eng 195:7052–7072
12. Bottasso CL, Croce A, Leonello D (2007) Neural-augmented planning and tracking pilots for maneuvering multibody dynamics. In: García Orden JC, Goicolea JM, Cuadrado J (eds) Multibody dynamics, computational methods and applications. Springer, Dordrecht, The Netherlands
13. Bottasso CL, Riviello L (2007) Rotorcraft trim by a neural-augmented model-predictive auto-pilot. Mult Syst Dyn 18:299–321
14. Bottasso CL, Maisano G, Scorcelletti F (2008) Trajectory optimization procedures for rotorcraft vehicles, their software implementation and applicability to models of varying complexity. In: Proceedings of the AHS 64th Annual Forum and Technology Display, Montreal, Canada
15. Bryson AE, Ho YC (1975) Applied optimal control. Wiley, New York
16. Frazzoli E (2001) Robust hybrid control for autonomous vehicle motion planning. Ph.D. thesis, Massachusetts Institute of Technology, Cambridge, MA
17. Geradin M, Cardona A (2000) Flexible multibody dynamics, a finite element approach. Wiley, New York
18. Gill PE, Murray W, Wright MH (1981) Practical optimization. Academic, London/New York
19. Griewank A, Juedes D, Mitev H, Utke J, Vogel O, Walther A (1996) ADOL-C: a package for the automatic differentiation of algorithms written in C/C++. ACM TOMS 22:131–167
20. Hull DG (1997) Conversion of optimal control problems into parameter optimization problems. J Guid Contr Dyn 20:57–60
21. Jategaonkar RV (2006) Flight vehicle system identification: a time domain methodology. Progress in Astronautics and Aeronautics, AIAA, Reston, VA
22. Peters DA, Barwey D (1996) A general theory of rotorcraft trim. Math Prob Eng 2:1–34
23. Prouty RW (1990) Helicopter performance, stability, and control. R.E. Krieger Publishing, Malabar, FL
Synthesis and Optimization of Flexible Mechanisms

Frederic Cugnon1, Alberto Cardona2, Anna Selvi1, Christian Paleczny3, and Martin Pucheta2

1 Samtech S.A., Rue des Chasseurs Ardennais 8, 4031 Liège, Belgium. E-mails: [email protected], [email protected]
2 Cimec-Intec (UNL-Conicet), Güemes 3450, 3000 Santa Fe, Argentina. E-mails: [email protected], [email protected]
3 Snecma – Groupe Safran, Rond-Point René Ravaud, Réau, 77550 Moissy Cramayel, France. E-mail: [email protected]
Summary. This paper summarizes research carried out in two European projects: SYNAMEC and SYNCOMECS. As a result of this activity, an integrated computer-aided tool for the synthesis and design of flexible mechanisms has been developed. Within the SYNAMEC project, a software system was developed for mechanism type synthesis (choice of topology), integration into an existing mechanism and preliminary dimensional synthesis (choice of sizes and physical properties). This tool was improved in the SYNCOMECS project to include compliance in its components library. Once the preliminary design is achieved, the combined use of the SAMCEF-Mecano, SAMCEF-Field and BOSS-Quattro software allows detailed models of the system to be optimized, taking into account all mechanical effects. Aeronautical applications are shown where, starting from an initial configuration suggested by the integrated type synthesis tool, the user can optimize and design the system from early pre-design sizing to detailed stress analysis using a single software in a GUI environment.
1 Introduction

The conception of aeronautical mechanisms is an activity in which the designer is confronted with the difficult task of managing a wide range of variables and mechanism configurations, subjected to severe aerodynamic and manufacturing constraints. The purpose of this work is to present an integrated tool to eliminate the lengthy trial and error procedure usually followed during the conceptual design phase of mechanical systems. Also, since all operations going from type synthesis to detailed stress analysis are performed with the same software tool, problems related to data transfer are avoided.

C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009
Compared to classical methods based on rigid multibody solvers, the use of Finite Element software for mechanism simulation makes it easy to introduce flexibility at a very early stage in the design process. In this way, unexpected dynamical effects are detected from the beginning, whereas their correction could require costly design modifications if they were discovered later in the design process. An efficient optimization tool is finally used to solve inverse problems in which values for any variables (positions, dimensions, etc.) are computed by requiring that the results of the nonlinear transient problem (like the response of a flexible mechanism submitted to dynamic loading) match given specifications.
2 Design Process

The proposed design process is the major result of the SYNAMEC and SYNCOMECS projects and is divided into three phases [2, 12]. First, starting from early design specifications, type synthesis aims to select a mechanism topology, e.g. the number of bodies and the type of linkages. Selection criteria are mainly the positions of some trajectory points and the definition of an allowed area where the mechanism should be contained. At this level, the real (CAD) geometry is not known and the simulation must be fast, since a large number of solutions have to be investigated. The second phase, dimensional synthesis, takes into account more criteria and variables, like simple properties (sections, . . . ) but also geometrical dimensions coming from the CAD. Simulation is done using parametric models. Optimization techniques involving both gradient-based and evolutionary algorithms are used. During the final phase, detailed design, additional non-linear effects such as material behavior or contact conditions are taken into account using the capability of SAMCEF-Mecano for mixing mechanism and structural Finite Element analysis in the same software.

2.1 Type Synthesis

Type synthesis is based on the exploitation of a library of linkage types and on the exploration of alternatives with optimization techniques for non-differentiable objective functions, such as genetic algorithms [10, 11]. In order to define data for the type synthesis solver, the kinematics task is discretized into a number of precision points, where initial, final and some intermediate states are defined for the prescribed parts. These settings are easily defined using the SAMCEF-Field graphics interface in a FEM-like description of mechanisms (Fig. 1). The synthesis software is then launched to solve the type synthesis and initial dimensioning problems in two different substages.
Fig. 1. Process flow chart. [Flow chart: 1 Design specifications; I. Synthesis: 2 Type synthesis (layout design), 3 Integration into mechanism, 4 Preliminary analysis, 5 Dimensional synthesis; II. Detailed design: 6 Detailed mechanical design, 7 Detailed analysis; outcome: Optimal mechanism design.]
In the first substage, the type synthesis solver produces a graph representation of the kinematics problem, called the initial graph. Then, it runs a subgraph search of this initial graph in an atlas of mechanisms also represented by graphs. Non-isomorphic subgraph matchings are saved as feasible alternatives (up to a number of solutions defined by the user). Each mechanism alternative is afterwards analyzed and decomposed into several Single-Open Chains (SOCs): dyads and triads [10, 11]. The kind and number of the free parameters needed for each SOC solver module is determined next, e.g. bounds on boxes for pivot locations, missing angles or stretching factors, etc. The sketch of the mechanism as well as the default bounds of the variables involved are presented to the user for the subsequent substage. We remark that this substage is carried out in a completely automatic way. The second substage, initial sizing, is then launched. The design space is defined by the set of free parameters, if any. The user can set the values of their bounds, and a genetic algorithm is used to sweep the design space. The fitness function consists in the minimization of the size of the mechanisms together with three weighted constraints: minimal length of link dimensions, non-inversion of transmission angle, and allowed space violation. Alternatively, instead of considering non-inversion of transmission angle as a constraint, a
full kinematics analysis is made for each individual to compute the fitness function. The obtained solution (an alternative inserted in the environment, with some dimensions and properties) matches the user's requirements to a first approximation (trajectories, obstacles and allowable space given by points or planes) and is the starting point for the next step, the dimensional synthesis.

2.2 Dimensional Synthesis

Once the mechanism topology and initial linkage dimensions are defined, the next step in the design process is to use gradient-based optimization techniques to fine-tune the parameters by including more design criteria. Typically, this design step aims to find optimal values, in a complete parametric model, for dimensions (and/or point positions), by taking into account the above-refined criteria, using the geometrical information (curves, surfaces and solids: for trajectory, interferences, etc.). During this phase gradient-based optimization algorithms are used. Usually such techniques are limited to a few design variables because sensitivities are computed by finite differences, where, for each variable, a new complete simulation is performed with a small change in the variable's value. In order to save computation time, a semi-analytic method is integrated in SAMCEF-Mecano to compute the values of the sensitivities of nodal and elemental results during the nominal simulation.

2.3 Detailed Design

Full non-linear flexible systems can also be modelled using the same software to perform a detailed analysis of the system, taking into account additional CAD details, loadings and physical effects. The user is thus able to check the system behavior under real conditions and to perform stress analysis with realistic dynamic loads.
3 Coupling Kinematical Joints and Finite Elements

3.1 Multibody Simulation Using Finite Element Approach

In the past, most mechanisms were modelled as rigid multibody systems. Today, large mechanisms that combine high speeds, light weight and increased external loading need to be modelled taking member flexibility into account. In most multibody simulation tools, flexibility is introduced by decomposing motion into a rigid body displacement and small deformations. This capability is usually introduced by using super elements generated by some external Finite Element software. Another approach consists in considering mechanism analysis as a branch of structural dynamics. A very general method based on the use of Cartesian coordinates to represent body kinematics and
on the representation of finite rotations by the rotational vector theory, was introduced in Refs. [4, 7]. The Finite Element concept then makes it easy to generate a complete library of mechanical elements (members and joints) that can be used for mechanism analysis. This approach was adopted in the software SAMCEF-Mecano [1]. The introduction of constraints in the system of equations of motion is briefly described below. The weak instability of constrained dynamics is avoided by introducing some numerical damping for the high frequencies, using algorithms like Hilber-Hughes-Taylor [8] or Chung-Hulbert [5]. Flexibility can thus easily be introduced using non-linear beams and super elements, but also by meshing some members with finite elements (volumes, shells, . . . ), which allows both geometrical and material non-linearities to be taken into account.

3.2 Contribution of Kinematical Joints to Discrete Equations of Motion

A constrained dynamics problem is represented by a system of differential-algebraic equations that can be written (in the case of purely holonomic constraints) in the following discrete form:

\[ \begin{cases} M \ddot{q} + g_{int}(\lambda, q, \dot{q}, t) = g_{ext}, \\ \Phi(q, t) = 0, \end{cases} \qquad (1) \]

where $M$ is the mass matrix, $g_{int}$ are the internal loads (which include the constraint forces), $g_{ext}$ the external loads, $q$ the positions, $t$ the time, $\lambda$ the Lagrange multipliers and $\Phi$ represents the kinematic constraints. In the SAMCEF-Mecano approach, each joint is treated in a way similar to finite elements; that is, relations between its local degrees of freedom representing the equations of constraint of the joint are computed, and then are added to the global system of equations. Since the Newton-Raphson method is employed to solve the non-linear system of equations, the joints add the above-mentioned relations in the form of an internal forces vector $g_{int}$ and a tangent stiffness $S$. The expressions for the internal forces vector and tangent stiffness are given below for holonomic constraints:

\[ g_{int} = \begin{bmatrix} B^T (p\Phi + k\lambda) \\ k\Phi \end{bmatrix}, \qquad S = \begin{bmatrix} p B^T B & k B^T \\ k B & 0 \end{bmatrix}. \]

The matrix

\[ B = \frac{\partial \Phi}{\partial q} \]

is the Jacobian matrix of the constraint equations $\Phi(q, t)$, while the factors $p$ and $k$ are respectively the penalty and scaling factors of the constraints. The scaling factor is used to make the constraints of the same order of magnitude as the internal elastic forces, while the penalty term is added to improve convergence in an augmented Lagrangian approach.
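As a small worked instance of these expressions, consider a single holonomic constraint that keeps a point at a fixed distance from the origin in the plane. The choice of constraint, the function name `joint_contribution` and the numerical values are assumptions made for illustration; the block simply evaluates the internal forces vector and tangent stiffness in the form printed above, so the second-derivative term of the constraint is not included in the tangent.

```python
def joint_contribution(q, lam, length, p, k):
    """Internal forces and tangent stiffness of one distance constraint.

    Constraint: Phi(q) = (x^2 + y^2 - length^2) / 2, unknowns ordered (x, y, lambda).
    p is the penalty factor and k the scaling factor.
    """
    x, y = q
    phi = 0.5 * (x * x + y * y - length * length)
    B = [x, y]                       # Jacobian dPhi/dq (one row)
    factor = p * phi + k * lam
    # g_int = [B^T (p*Phi + k*lambda); k*Phi]
    g_int = [B[0] * factor, B[1] * factor, k * phi]
    # S = [[p B^T B, k B^T], [k B, 0]]
    S = [
        [p * B[0] * B[0], p * B[0] * B[1], k * B[0]],
        [p * B[1] * B[0], p * B[1] * B[1], k * B[1]],
        [k * B[0], k * B[1], 0.0],
    ]
    return g_int, S


# Point (3, 4) on a circle of radius 5: the constraint is satisfied (Phi = 0).
g_int, S = joint_contribution(q=(3.0, 4.0), lam=2.0, length=5.0, p=10.0, k=0.5)
```

In an assembled model these small arrays would be added to the global residual and tangent matrix at the rows and columns of the joint's degrees of freedom, in the same way as the contribution of any other element.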
4 Mechanism Optimization

During the type synthesis phase, a first approximation to the solution is obtained, to which gradient-based optimization methods can be applied. These methods are employed for the dimensional synthesis because of their better efficiency. The required number of simulations and the CPU time are reduced if the solver is able to provide the dynamic solution together with its derivatives with respect to the design parameters in a single calculation. The semi-analytic capability implemented in SAMCEF-Mecano offers the possibility to compute the values of the sensitivities with respect to all the variables at once. This procedure is more advantageous than a finite difference method, since it can deal much faster with a large number of variables.

4.1 Sensitivity Analysis

The general form of the dynamic equilibrium equations for constrained systems was given in Section 3.2. The method of computation of sensitivities consists in differentiating Eq. (1) with respect to the design variables $x$ [3], thus obtaining the general sensitivity problem:

\[ \begin{cases} M \ddot{Q} + G_{int}(\Lambda, Q, \dot{Q}, t) = G_{ext}, \\ \Psi(Q, t) = 0, \end{cases} \]

with:

\[ G_{int} = C_t \dot{Q} + K_t Q + B^T \Lambda, \]

\[ G_{ext} = \frac{\partial g_{ext}}{\partial x} - \frac{\partial g_{int}}{\partial x} - \frac{\partial M}{\partial x} \ddot{q}, \]

\[ \Psi = B Q + \frac{\partial \Phi}{\partial x}, \]

where the capital letters $Q$, $\dot{Q}$, $\ddot{Q}$ indicate the x-derivatives of displacements, velocities and accelerations. Notice that

\[ K_t = \frac{\partial g_{int}}{\partial q}, \qquad C_t = \frac{\partial g_{int}}{\partial \dot{q}}, \qquad B = \frac{\partial g_{int}}{\partial \lambda} = \frac{\partial \Phi}{\partial q}, \]

are the same tangent matrices defined for the initial problem. In the case of kinematics analysis, inertial terms are neglected and the problem is simplified to:

\[ \begin{cases} G_{int}(\Lambda, Q, \dot{Q}, t) = G_{ext}, \\ \Psi(Q, t) = 0. \end{cases} \]
The tangent matrix and the internal forces vector are known at the end of each time step, so the only terms to evaluate are the perturbed internal forces. In practice, the converged tangent matrix, internal forces vector and perturbed internal forces vector are all computed by making an additional iteration when the Newton-Raphson scheme has converged. In doing so, we guarantee that the sensitivities are calculated with the converged tangent matrix. This technique allows sensitivities to be obtained on all selected nodal results (displacements, speeds, accelerations, forces, reactions, . . . ) for all the defined perturbations. The sensitivities on elemental results (forces, stresses, . . . ) are also available after some post-processing.

4.2 Application to Mechanism Optimization

The selected optimization example is a flap-tab mechanism system that has to be designed to position flap bodies in the airflow in such a way that these surfaces give the highest efficiency for some pre-defined setting angles. The basic components of the mechanisms used to guide and structurally support these moving elements are hinges, spherical bearings, linkages, levers and tracks. Starting from the optimum settings of flap bodies, required to achieve aerodynamic performance, we select some control points on the flap bodies and derive a set of discrete positions that are used to build ideal trajectories. Distance sensors are introduced in the model to measure distances to trajectories and to some control points, and to evaluate interferences with an obstacle. The optimization problem is thus to minimize distances to trajectories with some constraints on initial and final settings; the variables used are the positions of the kinematical joints synchronizing the second flap body with the first one. The model is shown in Fig. 2, while Fig. 3 shows the windows used to define the optimization problem.
Starting with a non-working mechanism, an optimized solution (satisfaction of the desired kinematics without interference) is obtained after 12 iterations of a gradient-based optimizer. Thanks to the semi-analytical computation of sensitivities, only 12 non-linear finite element analyses are needed, which reduces the calculation time by a factor of 5 compared to a finite differences approach. The improvement of the system is illustrated by Fig. 4, where the trajectories before and after optimization are plotted.
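The difference between the semi-analytical sensitivity computation and a finite differences approach can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration: the two-degree-of-freedom model, the functions `g_int`, `tangent` and `solve`, and the single design variable `p` are hypothetical and are not taken from the software discussed here. The sketch only shows the principle that the converged tangent matrix is reused together with perturbed internal forces, so that no additional non-linear analysis is needed for each design variable.

```python
import numpy as np

def g_int(q, p, c=0.5):
    """Hypothetical internal forces of a 2-DOF model; p is a design variable."""
    K = np.array([[p + 1.0, -1.0], [-1.0, 2.0]])
    return K @ q + c * q**3

def tangent(q, p, c=0.5):
    """Tangent matrix d g_int / d q of the hypothetical model."""
    K = np.array([[p + 1.0, -1.0], [-1.0, 2.0]])
    return K + np.diag(3.0 * c * q**2)

def solve(p, g_ext, tol=1e-12, max_iter=50):
    """Newton-Raphson solution of g_int(q, p) = g_ext, started from q = 0."""
    q = np.zeros_like(g_ext)
    for _ in range(max_iter):
        r = g_int(q, p) - g_ext
        if np.linalg.norm(r) < tol:
            break
        q = q - np.linalg.solve(tangent(q, p), r)
    return q

def semi_analytical_sensitivity(q, p, eps=1e-7):
    """dq/dp from the converged tangent matrix and perturbed internal forces."""
    dg_dp = (g_int(q, p + eps) - g_int(q, p)) / eps   # perturbed internal forces
    return -np.linalg.solve(tangent(q, p), dg_dp)     # one linear solve only

g_ext = np.array([1.0, 0.5])
p0 = 1.0
q_star = solve(p0, g_ext)                      # one non-linear analysis
dq_semi = semi_analytical_sensitivity(q_star, p0)

# Reference: finite differences need a second non-linear analysis per variable.
h = 1e-6
dq_fd = (solve(p0 + h, g_ext) - q_star) / h

print("semi-analytical:", dq_semi)
print("finite differences:", dq_fd)
```

With several design variables, the finite differences variant repeats the non-linear solve once per variable, whereas the semi-analytical variant only adds one linear solve per variable at the converged state.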
5 Complete Design Process of an Aeronautical System
A nozzle system is selected to illustrate the complete design process from early pre-design type synthesis to detailed stress analysis (Fig. 5). Further details about this system are available elsewhere [6, 9].
Fig. 2. Flap-tab model
Fig. 3. BOSS Quattro optimization window
At the earliest stage, the problem consists in finding a mechanism that can synchronize the flap positions for a given actuation. Nozzle sections are given for four operating configurations, which are the input data for the type synthesis
Fig. 4. Trajectory of second flap body before and after optimization
Fig. 5. Nozzle problem definition (figure labels: primary flap, secondary flap, actuator; A1, A2, A3; L1, L2, L3; S1, S2)
tool. This first step provides the user with a set of possible rigid solutions that match most of the prescribed kinematical conditions. Some solutions are shown in Fig. 6. At this stage, each proposed topology can be imported into the graphical user interface, and a kinematics simulation can be performed to check whether the system meets all requirements. If needed, the dimensional synthesis phase is used to improve the mechanism. After reaching a satisfactory solution from the kinematics point of view, an enhanced model can be defined and analyzed, for instance by introducing some flexibility or loading, or by taking into account dynamic effects, as shown in Fig. 7, where Alternative 4 has been selected and analyzed. We remark that the synthesis, validation and optimization processes with detailed analysis can thus be carried out in the same environment. Finally, once the concept is defined, a detailed 3D design using CAD tools can be produced and validated by simulation, again using the SAMCEF Field environment.
Fig. 6. First six solutions proposed by the type synthesis tool (panels: Alternative 0 to Alternative 5)
The studied structure is subjected to thermal and aerodynamic loading. Since the cylindrical nozzle surfaces are made of several components, one of the critical points of these systems is to guarantee the air tightness between hot
Fig. 7. Simplified flexible model for optimization (figure labels: prismatic joint with motor condition for the motion; beams with circular profile, flexible elements; shells, flexible elements; rigid hinge joints; loads: pressure on flaps)
Fig. 8. Detail of rigid model from CAD (figure labels: actuator; controlled flap; sealing flap; controlled flap without actuator)
and cold fluxes. This particular context requires the definition of complex articulated flexible models. In this case, hot components were meshed with volume elements in order to define contact conditions and to map the non-uniform temperature field. Flexibility was introduced in the bodies connecting the two nozzle surfaces by using super element techniques, while cold components were assumed rigid to keep the calculation cost at a reasonable level. Figures 8 and 9 display the problem in its final form and the finite element mesh of parts of the mechanism.
6 Conclusions
This work proposed a procedure to design, optimize and validate mechanisms. An integrated computer-aided tool has been developed to allow users to work in the same environment throughout the design process.
Fig. 9. Detailed flexible model
This integrated software uses finite element techniques to describe and optimize the response of flexible mechanisms. It can be used from the very early pre-design stage to the final design validation including stress analysis. The design process was illustrated by aeronautical applications. However, it can be applied to the design of any mechanical system, such as classical or compliant mechanisms and deformable structures, and more generally to solve most mechanical inverse problems.
Acknowledgements
The authors gratefully acknowledge support from the European Community through grants SYNAMEC (SYNthesis tool for Aeronautical MEChanisms design), project UE 2001-001-0058, G4RD-CT-2001-00622, and SYNCOMECS (SYNthesis of COmpliant MEChanical Systems), project UE FP6-2003-AERO-1516183. Alberto Cardona and Martín Pucheta also acknowledge support from Consejo Nacional de Investigaciones Científicas y Técnicas and from Universidad Nacional del Litoral.
References
1. Anonymous (2006) SAMCEF Users' Manual, Samtech SA
2. Anonymous (2007) SYNCOMECS project web-site www.syncomecs.org
3. Behar L, Braibant V (1996) Sensitivity analysis for optimization of flexible multi-body systems. Proceedings of the 1996 ASME Design Engineering Technical Conference and Design for Manufacturing Conference, August 18–22, 1996, Irvine, California, paper 96-DETC/DAC-1454
4. Cardona A (1989) An integrated approach to mechanism analysis. Ph.D. thesis, Université de Liège, Belgium
5. Chung J, Hulbert GM (1993) A time integration algorithm for structural dynamics with improved numerical dissipation: the generalized-α method. J Appl Mech 60:371–375
6. Gabellini E, Selvi A, Paleczny C (2003) Three-D kinematics study with Samcef Mecano Motion on a nozzle of a turbojet engine: pre and post-processing in Samcef Field. In: Proceedings of the Samtech Users' Conference, Toulouse, France, February 3–4 2003
7. Géradin M, Cardona A (2001) Flexible multibody dynamics: a finite element approach. Wiley, Chichester
8. Hilber HM, Hughes TJR, Taylor RL (1977) Improved numerical dissipation for time integration algorithms in structural dynamics. Earthquake Eng Struct Dyn 5:283–292
9. Prétot P, Gabellini E, Paleczny C, Lombard JP (2004) Influence de la modélisation des mécanismes couplés à la structure dans le dimensionnement des structures aéronautiques. In: Proceedings of the MICADO 2004 Conference, Porte de Versailles, Paris, France, March 30, 31 and April 1 2004
10. Pucheta MA, Cardona A (2005) Type synthesis and initial sizing of planar linkages using graph theory and classic genetic algorithms starting from parts prescribed by user. In: Proceedings of the Multibody Dynamics 2005, ECCOMAS Thematic Conference, Madrid, Spain
11. Pucheta MA, Cardona A (2007) An automated method for type synthesis of planar linkages based on a constrained subgraph isomorphism detection. Mult Syst Dyn 18:233–258
12. Remouchamps A (2006) Synthesis tool for aeronautical mechanisms design (SYNAMEC project). In: Proceedings of the Fifth Community Aeronautical Days, Vienna, Austria, 19–21 June 2006
The Reverse Method of Lines in Flexible Multibody Dynamics
Christoph Lunk1 and Bernd Simeon2
1 Zentrum Mathematik, TU München, Boltzmannstraße 3, 85748 Garching, Germany (currently at iwis motorsysteme). E-mail: [email protected]
2 Zentrum Mathematik, TU München, Boltzmannstraße 3, 85748 Garching, Germany. E-mail: [email protected]
Summary. Adaptivity is a crucial prerequisite for efficient and reliable simulations. In multibody dynamics, adaptive time integration methods are standard today, but the treatment of elastic bodies is still based on an a priori fixed spatial discretization. This contribution introduces a basic algorithm in the fashion of the reverse method of lines that is able to adapt both the spatial grid and the time step size from step to step. Two examples, a catenary with a moving pantograph head and a flexible slider crank mechanism, illustrate the approach.
1 Introduction
The field of flexible multibody dynamics has seen a fast growing demand in recent years due to a strong trend towards lightweight and high-precision mechanical systems. Since flexible multibody systems contain both rigid and elastic bodies as well as the usual interconnections like joints or springs and dampers, one faces here a specific combination of models and simulation methods from rigid body mechanics and from structural analysis. In particular, the mathematical model of a flexible multibody system is heterogeneous by nature as elastic bodies are governed by partial differential equations (PDEs) and rigid bodies by ordinary differential equations (ODEs) or differential-algebraic equations (DAEs). The literature on flexible multibody systems is rich, and various simulation codes offer corresponding features nowadays. To give some basic references, we mention among others the monographs of Bremer and Pfeiffer [8], Géradin and Cardona [14], Schwertassek and Wallrapp [23], and Shabana [24]. Recent developments in the numerical analysis of PDEs, however, have so far received only little attention in the field, and it is the objective of this contribution to show the potential for further improvement by including new ideas such as adaptive procedures for error control in time and space.
C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009
Simulation methods for flexible multibody systems typically employ first a space discretization of the PDEs. This “first space then time” approach is also known as the method of lines in the field of time-dependent PDEs. It reduces the overall equations to an extended system of ODEs or DAEs. Standard interfaces between finite element and multibody codes facilitate this step considerably. Beams are the most frequent elastic members, but plates, shells, and even full three-dimensional structures have also become widespread in practice. The resulting semidiscretized system has basically the same structure as the equations of motion in rigid multibody dynamics. The only difference is that it features two types of state variables, namely, those for the gross motion, i.e., translation and rotation, and those for the elastic deformation. It can be solved by the ODE or DAE time integrators that have become state-of-the-art in the field, see Eich-Soellner and Führer [12] for an extensive overview. Gross motion and elastic deformation, however, may have widely different time scales, and this can turn the time integration into a challenging problem, which means that stability properties and numerical dissipation are important issues. Besides the time integration task, flexible multibody systems lead to another problem that deserves particular attention from the point of view of numerical analysis. The usual goal of the semidiscretization is to approximate the elastic deformation with very few degrees of freedom. In complex applications, the result and the quality of this step depend strongly on engineering judgement since there is no error control in space available. On the other hand, the error in time is controlled by the time discretization method, and this feature has been accepted as a standard in various applications. We thus observe a discrepancy in the quality of the numerical methods used in flexible multibody dynamics.
In order to close this gap, we take up an idea that has been successfully used in time-dependent PDEs, the so-called reverse method of lines, which applies “first time then space” and provides a means to include adaptivity in space. The algorithmic framework was originally designed for parabolic partial differential equations and goes back to Bornemann [5] and Lang [20]. As further reference we mention also Bangerth and Rannacher [3] who concentrate mainly on fluid dynamics applications such as the Navier-Stokes equations. In the following, our main reference is the work of Bornemann and Schemann [6] on solving the wave equation. The paper is organized as follows: as a starting point, Section 2 summarizes the equations of motion both in the rigid and elastic case. Section 3 introduces the basic idea of the reverse method of lines, illustrated by the implicit Euler method. Thereafter in Section 4, second-order time integration schemes for the unconstrained and the constrained equations of motion are presented, among them an extension of the α-method of structural dynamics [10] to the differential-algebraic case. In Section 5, the overall algorithm is outlined, and remarks on the usage of a posteriori error estimators in space are given.
Finally, in Section 6 simulation results for a catenary system with moving pantograph head and for a flexible slider crank mechanism are reported.
2 Equations of Motion
In this section, we briefly state the underlying equations of motion and discuss the typical modelling approaches in flexible multibody dynamics. We start with the rigid body case and then concentrate on elastic bodies and corresponding constraints.
2.1 Systems of Rigid Bodies
Consider a system of rigid bodies where the vector $q(t) \in \mathbb{R}^{n_q}$ denotes the position coordinates of all bodies depending on time $t$. According to Euler and Lagrange, the equations of constrained mechanical motion read
\[ M(q)\,\ddot{q} = f(q, \dot{q}, t) - G^T(q)\lambda, \tag{1a} \]
\[ 0 = g(q). \tag{1b} \]
Here, $M(q) \in \mathbb{R}^{n_q \times n_q}$ stands for the mass matrix, $f(q, \dot{q}, t) \in \mathbb{R}^{n_q}$ for the applied forces, $g(q) \in \mathbb{R}^{n_\lambda}$ for the holonomic constraints and $G(q) = \partial g(q)/\partial q \in \mathbb{R}^{n_\lambda \times n_q}$ for the constraint Jacobian. Besides the position coordinates $q$, the Lagrange multipliers $\lambda(t) \in \mathbb{R}^{n_\lambda}$, $n_\lambda < n_q$, are also unknowns. It is well-known that the equations of motion (1) form a differential-algebraic equation (DAE) of index 3 if the constraint Jacobian $G$ has full rank and the mass matrix $M$ is invertible. As a prelude to the treatment of constraints for elastic bodies below, we give an alternative interpretation of the full rank criterion for the constraint Jacobian. We can reformulate this criterion in terms of the singular values of $G$, which are given by the singular value decomposition [15, Section 2.5]
\[ U^T G(q) V = \mathrm{diag}(\sigma_1, \ldots, \sigma_{n_\lambda}) \in \mathbb{R}^{n_\lambda \times n_q} \]
with orthogonal matrices $U \in \mathbb{R}^{n_\lambda \times n_\lambda}$ and $V \in \mathbb{R}^{n_q \times n_q}$. The singular values are ordered as $\sigma_1 \ge \sigma_2 \ge \ldots \ge \sigma_{n_\lambda} \ge 0$, and for the full rank of $G$ we require $\sigma_{\min} := \sigma_{n_\lambda} > 0$. Omitting the argument $q$ of $G$ for brevity, we can reformulate this criterion by observing that
\[ \frac{\lambda^T G G^T \lambda}{\lambda^T \lambda} = \frac{\mu^T \mathrm{diag}(\sigma_1^2, \ldots, \sigma_{\min}^2)\,\mu}{\mu^T \mu} \ge \sigma_{\min}^2 \]
for $\lambda \neq 0$ and $\mu := U^T \lambda$. In case of $\mu = (0, \ldots, 0, 1)^T \in \mathbb{R}^{n_\lambda}$, this inequality is sharp and we conclude
\[ \sigma_{\min}^2 = \min_{\lambda} \frac{\lambda^T G G^T \lambda}{\lambda^T \lambda} \quad \Longleftrightarrow \quad \sigma_{\min} = \min_{\lambda} \frac{\|G^T \lambda\|_2}{\|\lambda\|_2}. \]
Moreover, using the definition of the operator norm we get the identity
\[ \|G^T \lambda\|_2 = \max_{p} \frac{\|p^T G^T \lambda\|_2}{\|p\|_2} = \max_{v} \frac{\lambda^T G v}{\|v\|_2}, \]
since $\|p^T G^T \lambda\|_2 = |\lambda^T G p|$. Overall, we have thus derived the minimax characterization
\[ \min_{\lambda} \max_{v} \frac{\lambda^T G v}{\|\lambda\|_2 \|v\|_2} = \sigma_{\min}(G) > 0 \tag{2} \]
as equivalent to the full rank criterion.
2.2 Elastic Bodies
In case of elastic bodies, the mathematical model involves a coupling of the above equations of motion with the PDEs that govern the deformation. In the engineering literature on flexible multibody systems, however, mathematical modelling and numerical treatment are often intertwined. The elastic body is first discretized in space, which results in a finite dimensional structure. Thereafter, the equations of motion and the coupling conditions are formulated in terms of certain shape functions or finite element displacements [24]. In this contribution, we postpone the introduction of elastic approximations to the following sections and prefer instead to formulate the governing equations as partial differential-algebraic equations (PDAEs).
Unconstrained Motion
We first consider a single elastic body occupying the domain $\Omega \subset \mathbb{R}^3$. The surface or boundary $\partial\Omega$ of $\Omega$ consists of two parts. On the segment $\Gamma_0$ the motion is prescribed, and on $\Gamma_1$ a surface traction is given. If we neglect the gross motion for the moment, i.e., assume the setting of linear elasticity, the displacement field $u(x, t) \in \mathbb{R}^3$ maps the point $x \in \Omega$ to its deformed state $x + u(x, t)$. Cauchy's equations of motion then read
\[ \rho\,\ddot{u}(x, t) = \operatorname{div} \sigma(u(x, t)) + \beta(x, t) \quad \text{in } \Omega, \tag{3a} \]
\[ u(x, t) = u_0(x, t) \quad \text{on } \Gamma_0, \tag{3b} \]
\[ \sigma(u(x, t))\,n(x) = \tau(x, t) \quad \text{on } \Gamma_1. \tag{3c} \]
Here, the scalar ρ denotes the mass density, the vector β(x, t) the density of body forces, σ(u) ∈ R3×3 is the stress tensor, u0 (x, t) stands for the given
Dirichlet boundary conditions, and finally $\tau(x, t)$ expresses the given surface tractions with normal vector $n(x)$. Hooke's law relates stress tensor $\sigma$ and strain tensor $\varepsilon = \frac{1}{2}(\nabla u + \nabla u^T)$ via the relation $\sigma(u) = C \cdot \varepsilon(u)$ with elasticity tensor $C$. The most widespread modelling approach in flexible multibody dynamics uses a floating frame of reference in order to take spatial rotations and translations into account. This leads to a decomposition [23]
\[ \varphi(x, t) = y(t) + A(\alpha(t))\,(x + u(x, t)), \tag{4} \]
where $\varphi(x, t) \in \mathbb{R}^3$ is the motion of a material point of the elastic body, $y(t) \in \mathbb{R}^3$ the translation between the inertial and the floating frame, and $A(\alpha) \in SO(3)$ a rotation matrix that depends on the angles $\alpha$. In the following, we will, whenever appropriate, omit the nonlinear relation (4) for the floating reference frame and work with the partial differential equation (3) for the sake of clarity. However, the results and algorithms are straightforward to generalize to (4) though the notation becomes quite tedious. In order to pass from (3) to a weak formulation or work principle, we multiply each expression by test functions (or virtual displacements) $v$ that are admissible, i.e., that satisfy the homogeneous boundary condition $v = 0$ on $\Gamma_0$. By integration over the domain $\Omega$ and Green's Theorem, one arrives at the following formulation, which is equivalent to the Principle of Virtual Work: At each point of time, find the displacement field $u(\cdot, t)$ such that $u = u_0$ on $\Gamma_0$ and
\[ \int_\Omega v^T \rho\,\ddot{u}\,dx + \int_\Omega \sigma(u) : \varepsilon(v)\,dx = \int_\Omega v^T \beta\,dx + \int_{\Gamma_1} v^T \tau\,ds \tag{5} \]
for all admissible $v$. Note that the appropriate function space for the displacement field $u$ and for the test functions $v$ is the Sobolev space $V := H^1(\Omega)^3$, which consists of all square-integrable functions on the domain $\Omega$ with square-integrable weak first order derivatives. The dual space of $V$ is denoted by $V'$. In order to simplify the notation for the subsequent development, we rewrite the equations of motion in operator form
\[ R\ddot{u} = -Au + l, \tag{6} \]
where the elliptic operator $A$ maps $u$ to $Au \in V'$ while $R$ maps $\ddot{u}$ to $R\ddot{u} \in V'$. This means that the application of a test function is defined by
\[ [Au](v) = \int_\Omega \sigma(u) : \varepsilon(v)\,dx \quad \text{for all admissible } v \in V, \]
and
\[ [R\ddot{u}](v) = \int_\Omega v^T \rho\,\ddot{u}\,dx \quad \text{for all admissible } v \in V, \]
while the linear functional $l \in V'$ is given by
\[ l(v) = \int_\Omega v^T \beta\,dx + \int_{\Gamma_1} v^T \tau\,ds \quad \text{for all admissible } v \in V. \]
Though the operator notation (6) refers to the weak form (5), one could also associate it with the strong form (3), and for the following, it plays no role whether we think of (6) as strong or weak form. Readers unfamiliar with this functional analysis background may view (6) as an infinite-dimensional system of ordinary differential equations for simplicity.
Constrained Motion
In the multibody context, the boundary condition $u = u_0$ on $\Gamma_0$ often represents a constraint equation since $u_0 = u_0(q)$ may stand for the interface to a joint which in turn depends on the motion $q$ of a neighboring body. For this reason, it makes sense to extend Cauchy's equation of motion (3) to a dynamic saddle point problem formulation where the constraint is taken into account via appropriate Lagrange multipliers [25]. Note that this modelling approach is closely related to the so-called dual formulations in contact mechanics [28]. We express the constraint $u = u_0$ in weak form as
\[ \int_{\Gamma_0} \vartheta^T u\,ds = \int_{\Gamma_0} \vartheta^T u_0\,ds \quad \text{for all } \vartheta, \]
where the test functions $\vartheta$ are defined on the boundary $\Gamma_0$. Next, we pass to a dynamic saddle point formulation and introduce a Lagrange multiplier $\lambda(x, t)$ that is also defined on the boundary $\Gamma_0$ and enforces the constraint. The equations of linear elasticity then read: Find displacement $u(\cdot, t)$ and multiplier $\lambda(\cdot, t)$ such that
\[ \int_\Omega v^T \rho\,\ddot{u}\,dx + \int_\Omega \sigma(u) : \varepsilon(v)\,dx + \int_{\Gamma_0} \lambda^T v\,ds = \int_\Omega v^T \beta\,dx + \int_{\Gamma_1} v^T \tau\,ds \quad \forall v, \tag{7a} \]
\[ \int_{\Gamma_0} \vartheta^T u\,ds = \int_{\Gamma_0} \vartheta^T u_0\,ds \quad \forall \vartheta. \tag{7b} \]
In contrast to the weak form (5) above, the displacement field and the test function are here not required to satisfy a boundary condition on $\Gamma_0$, which means that they are both elements of $V$ and we can drop the qualifier ‘admissible’. On the other hand, the dual space $Q$ of the trace space $H^{1/2}(\Gamma_0)^3$ on the boundary serves as natural function space for the Lagrange multiplier [27]. As before, we rewrite the equations of constrained motion (7) in operator form as the PDAE
\[ R\ddot{u} = -Au + l - B^*\lambda, \tag{8a} \]
\[ 0 = Bu - m, \tag{8b} \]
with operators $R$, $A$ and linear functional $l$ defined as in (6). Additionally, the operator $B$ maps to $Q'$ and is derived from the weak constraint (7b) by
\[ [Bu](\vartheta) = \int_{\Gamma_0} \vartheta(s)^T u(\cdot, s)\,ds \quad \text{for all } \vartheta \]
and the linear functional $m \in Q'$ contains the remaining term $\int_{\Gamma_0} \vartheta^T u_0\,ds$ in (7b). The notation $B^*$ indicates the adjoint operator of $B$, which can be viewed as a generalized transpose operator. Associated with $B$, we furthermore define the bilinear form
\[ b(v, \vartheta) := \int_{\Gamma_0} \vartheta^T v\,ds \tag{9} \]
on the product space $V \times Q$. There is a striking resemblance between the finite dimensional equations of constrained mechanical motion (1) and the dynamic saddle point problem (8). In fact, (8) can be interpreted as the infinite-dimensional analogue, and as discussed in [25], the inf-sup condition for the constraint operator $B$ or the bilinear form $b$, respectively, plays a crucial role for the well-posedness of the model equations. This condition reads
\[ \inf_{\vartheta \in Q}\, \sup_{v \in V} \frac{b(v, \vartheta)}{\|v\|_V \|\vartheta\|_Q} \ge \beta > 0. \tag{10} \]
Looking back to the minimax characterization (2) of the finite dimensional case, we see that (10) is a generalized regularity condition on the weak constraint equations. At the end of this section, we give three remarks. First, in practical applications like dynamic contact the boundary segment Γ0 may depend on time t, but for simplicity we assume here that it is fixed and known a priori. Second, to keep the presentation clear we do not consider unilateral constraint equations like the non-penetration condition of contact mechanics. Third, a floating frame of reference (4) is straightforward to include in the saddle point problem (7) but leads to additional nonlinearities which render the mathematical analysis more involved.
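The finite dimensional minimax characterization (2) can be checked numerically. The following is a minimal sketch under stated assumptions: the constraint Jacobian `G` is a random matrix chosen only for illustration, and the helper `inner_max` is a hypothetical name. It uses the fact that the inner maximum over v is attained at v = G^T λ, so that the quotient reduces to the ratio of the norms of G^T λ and λ.

```python
import numpy as np

rng = np.random.default_rng(0)
n_lambda, n_q = 3, 6                          # illustrative sizes, n_lambda < n_q
G = rng.standard_normal((n_lambda, n_q))      # stand-in for a constraint Jacobian

# Singular value decomposition; singular values are returned in descending order.
U, s, Vt = np.linalg.svd(G)
sigma_min = s[-1]

def inner_max(lam):
    """max over v of lam^T G v / (|lam| |v|), attained at v = G^T lam."""
    v = G.T @ lam
    return (lam @ G @ v) / (np.linalg.norm(lam) * np.linalg.norm(v))

# The outer minimum is attained at the left singular vector of sigma_min.
lam_star = U[:, -1]
value_at_minimizer = inner_max(lam_star)

# No sampled multiplier should give a smaller value than sigma_min.
sampled = [inner_max(rng.standard_normal(n_lambda)) for _ in range(1000)]

print("sigma_min           :", sigma_min)
print("value at minimizer  :", value_at_minimizer)
print("smallest sampled    :", min(sampled))
```

A value of `sigma_min` close to zero would indicate nearly redundant constraints, which is the finite dimensional counterpart of a small inf-sup constant in (10).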
3 The Basic Idea
In this section, we introduce the basic idea behind the reverse method of lines. First, however, we have to discuss the well-established procedure of approximating the elastic displacements in flexible multibody dynamics. Some authors call this procedure the Ritz method while others speak of a Galerkin projection or of the method of lines. Whatever we call it, the goal is to reduce the space V for the displacements to a finite dimensional subspace Vh with the subscript h indicating the gridsize. This subspace can be spanned by the nodal basis of certain finite elements or by modal approximations, i.e., eigenfunctions obtained from an eigenvalue analysis. Regardless of its
denomination, this approach can be characterized by “first space then time” since one discretizes first the spatial variable. More specifically, a space discretization for the displacement field is introduced by the Galerkin projection
\[ u(x, t) \doteq u_h(x, t) = N_h(x) \cdot q_e(t), \tag{11} \]
where $N_h(x) \in \mathbb{R}^{3 \times n_e}$ is a matrix of known global shape functions, and $q_e(t) \in \mathbb{R}^{n_e}$ the vector of corresponding elastic displacement coefficients. The equations of motion of a flexible multibody system follow now from the same variational principle as in the rigid body case, and we obtain a differential-algebraic system that has the same structure as (1). The unknowns $q$ split into $q = (q_r, q_e)$ with rigid motion variables $q_r$ and elastic displacements $q_e$. As argued in the Introduction, in flexible multibody dynamics today, good approximation of elastic deformation depends strongly on engineering judgement since the elastic modes $N_h(x)$ are determined a priori, and there is no a posteriori error control that could indicate possible problems. Our goal in this contribution is to advocate such an error control with respect to the elastic deformation, and even more we want to combine it with adaptive grids that may change from time step to time step. To achieve this goal, it is advantageous to reverse the order of the method of lines to “first time then space”. As an example, we consider the operator equation (6) derived from the weak form of the equations of unconstrained motion, which is nothing else than an abstract ordinary differential equation that we may discretize by implicit Euler. To do so, we transform the equation to a system of first order
\[ R\dot{u} = Rw, \tag{12a} \]
\[ R\dot{w} = -Au + l, \tag{12b} \]
with additional velocity variables $w$ that are (with respect to the spatial variable) elements of $L^2(\Omega)^3$, i.e., that are square-integrable over the domain $\Omega$. The next step applies implicit Euler to (12), which yields
\[ R\,\frac{u_{n+1} - u_n}{\Delta t} = Rw_{n+1}, \tag{13a} \]
\[ R\,\frac{w_{n+1} - w_n}{\Delta t} = -Au_{n+1} + l_{n+1}. \tag{13b} \]
This is now a stationary PDE problem for the unknowns $u_{n+1}(x) \in H^1(\Omega)^3$ and $w_{n+1}(x) \in L^2(\Omega)^3$, which approximate $u(x, t_{n+1})$ and $w(x, t_{n+1}) = \dot{u}(x, t_{n+1})$. Additionally, the displacement field has to satisfy the Dirichlet boundary condition $u_{n+1}(x) = u_0(x, t_{n+1})$ on $\Gamma_0$. For completeness, we state the corresponding weak formulation in detail: given $u_n$ and $w_n$ and the stepsize $\Delta t$, find the new displacement and velocity fields $u_{n+1}$ and $w_{n+1}$ such that
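\[ \int_\Omega z^T \rho\,u_{n+1}\,dx = \int_\Omega z^T \rho\,u_n\,dx + \Delta t \int_\Omega z^T \rho\,w_{n+1}\,dx \quad \forall z, \tag{14a} \]
\[ \int_\Omega v^T \rho\,w_{n+1}\,dx = \int_\Omega v^T \rho\,w_n\,dx + \Delta t \left( -\int_\Omega \sigma(u_{n+1}) : \varepsilon(v)\,dx + \int_\Omega v^T \beta_{n+1}\,dx + \int_{\Gamma_1} v^T \tau_{n+1}\,ds \right) \quad \forall v. \tag{14b} \]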
The space discretization is only now applied by projecting both the displacement and velocity fields onto $V_h$, i.e.,
\[ u_{n+1}(x) \doteq u_{n+1,h}(x) = N_h(x) \cdot q_{n+1}, \tag{15a} \]
\[ w_{n+1}(x) \doteq w_{n+1,h}(x) = N_h(x) \cdot p_{n+1}, \tag{15b} \]
with shape functions $N_h$ and corresponding nodal vectors $q_{n+1}, p_{n+1} \in \mathbb{R}^{n_e}$. Following the standard arguments of finite element analysis, we obtain from (14) in the usual way the linear system
\[ M_h\,\frac{q_{n+1} - q_n}{\Delta t} = M_h\,p_{n+1}, \tag{16a} \]
\[ M_h\,\frac{p_{n+1} - p_n}{\Delta t} = -A_h\,q_{n+1} + l_{h,n+1} \tag{16b} \]
for the unknown nodal displacements and velocities. Here, M h and Ah denote the finite element mass and stiffness matrices, respectively, and the subscript h again indicates the relation to a specific spatial grid. Looking at the linear system (16), one might argue that the same result can be obtained by applying the standard method of lines in combination with implicit Euler for the time discretization. This is indeed true: since the equations that we consider here are linear, time and space discretization commute. Thus the question is: why should one use the reverse approach? The answer lies in the ansatz (15) for the spatial discretization. There we still have the freedom to change the shape functions N h (x) from step to step. In other words, we actually have N h (x) = N h,n+1 (x), and in this way a change of the grid from one step to the other is much easier to formulate than in the standard method of lines. Even more, the solution of (14) could be left to a standard stationary adaptive finite element solver. So far, we have outlined the basic idea of the reverse method of lines. There are still a number of issues to address in order to come up with a powerful numerical algorithm for flexible multibody dynamics:
• Appropriate time integrators with time stepsize control for both the unconstrained and the constrained equations of motion
• A posteriori error estimators for the spatial error in each time step
• A concept for controlling the spatial grid and the time stepsize depending on given error tolerances or on other criteria such as a fixed number of available gridpoints per step
• Interpolation procedures between different spatial grids
The last item means that the old unknowns q n and pn in (16) may refer to a different grid than q n+1 and pn+1 and thus need to be interpolated to the new grid. One should be aware of the fact that the reverse method of lines has already been investigated in several application fields. The algorithmic framework, originally designed for parabolic partial differential equations, goes back to Bornemann [5] and was later on generalized by Lang [20]. A further important reference is Bangerth and Rannacher [3] who concentrate mainly on fluid dynamics applications such as the Navier-Stokes equations. In the following, we take up the work of Bornemann and Schemann [6] on solving the wave equation and extend it to the present framework of flexible multibody systems.
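The interplay of the implicit Euler step (16) with a grid that changes from step to step can be sketched for a simple model problem. The sketch below rests on several assumptions that are not part of the text: a one-dimensional bar on [0, 1] with unit density and stiffness, homogeneous Dirichlet conditions at both ends, no body force, linear finite elements on uniform grids, and a prescribed sequence of grids that stands in for the decision of an a posteriori error estimator. The function names `assemble`, `transfer` and `implicit_euler_step` are hypothetical.

```python
import numpy as np

def assemble(n_el, rho=1.0, E=1.0):
    """Linear finite elements on a uniform grid of [0, 1], u = 0 at both ends."""
    h = 1.0 / n_el
    n = n_el - 1                                  # number of interior nodes
    main, off = np.ones(n), np.ones(n - 1)
    M = rho * h / 6.0 * (4.0 * np.diag(main) + np.diag(off, 1) + np.diag(off, -1))
    A = E / h * (2.0 * np.diag(main) - np.diag(off, 1) - np.diag(off, -1))
    x = np.linspace(0.0, 1.0, n_el + 1)
    return x, M, A

def transfer(x_old, values_old, x_new):
    """Interpolate interior nodal values from the old grid to the new grid."""
    full_old = np.concatenate(([0.0], values_old, [0.0]))
    return np.interp(x_new, x_old, full_old)[1:-1]

def implicit_euler_step(x_old, q_old, p_old, n_el_new, dt):
    """One step of (16) on the grid chosen for this step (l = 0 assumed)."""
    x, M, A = assemble(n_el_new)
    q_n = transfer(x_old, q_old, x)
    p_n = transfer(x_old, p_old, x)
    # Eliminate p_{n+1} = (q_{n+1} - q_n) / dt from (16a) and insert into (16b):
    # (M + dt^2 A) q_{n+1} = M (q_n + dt p_n).
    q_new = np.linalg.solve(M + dt**2 * A, M @ (q_n + dt * p_n))
    p_new = (q_new - q_n) / dt
    return x, q_new, p_new

def energy(q, p, M, A):
    return 0.5 * p @ M @ p + 0.5 * q @ A @ q

x, M, A = assemble(8)
q = np.sin(np.pi * x[1:-1])                       # initial displacement
p = np.zeros_like(q)                              # initial velocity
energies = [energy(q, p, M, A)]
for n_el in [8, 16, 16, 32]:                      # prescribed (nested) grid sequence
    x, q, p = implicit_euler_step(x, q, p, n_el, dt=0.05)
    _, M, A = assemble(n_el)
    energies.append(energy(q, p, M, A))

print(energies)
```

The grid sequence is deliberately nested, so that the interpolation reproduces the old finite element functions exactly and the discrete energy can only decrease through the damping of implicit Euler. For non-nested grids the transfer itself changes the discrete state, which is one reason why the interpolation procedure deserves attention.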
4 Time Integrators
In this section, we suggest several time integration methods that can be combined with the adaptive scheme in time and space. Of course, various efficient and robust methods have been developed for the equations of motion in multibody dynamics, see, e.g., [12], but in our context specifically designed low order algorithms seem most appropriate since the error in space usually dominates the overall error. Though the methods will later be applied to the abstract ODE (6) and the PDAE (8), we prefer in this section the more familiar standard notation (1) for the constrained case and
\[ M\ddot{q} = f(q, \dot{q}, t) \tag{17} \]
for the unconstrained case, with the mass matrix $M$ assumed constant for simplicity.
4.1 Unconstrained Equations
One of the standard methods in structural dynamics is the α-method, also called HHT-method due to Hilber, Hughes, and Taylor [17]. It overcomes some of the drawbacks of Newmark's method and reads
\[ \frac{q_{n+1} - q_n}{\Delta t} = p_n + \Delta t \left( \tfrac{1}{2} - \beta \right) a_n + \Delta t\,\beta\,a_{n+1}, \tag{18a} \]
\[ \frac{p_{n+1} - p_n}{\Delta t} = (1 - \gamma)\,a_n + \gamma\,a_{n+1}, \tag{18b} \]
\[ M a_{n+1} = \alpha f_n + (1 - \alpha) f_{n+1}, \tag{18c} \]
with f n := f (q n , pn , tn ), discrete velocities pn and accelerations an . The method reduces to Newmark’s method for the parameter value α = 0. If γ = 1/2 + α, the α-method is of second order. Moreover, unconditional stability is achieved for β = (1 + α)2 /4 and 0 ≤ α ≤ 1/3. In this way, the parameter α can be used to control the numerical dissipation. As extension to the αmethod (18), Chung and Hulbert introduced the generalized α-method [10] where the relation (18c) for the acceleration is replaced by the convex combination (19) αm M an + (1 − αm )M an+1 = αf f n + (1 − αf )f n+1 , with parameters αm and αf . The method coefficients should satisfy the requirements γ = 1/2 − αm + αf
second order,
−1 ≤ αm ≤
1 2
zero-stability.
(20)
Thus, the remaining parameters $\alpha_m$, $\alpha_f$ and $\beta$ can be used to adapt the method to special requirements. Of particular interest is the behavior of the generalized α-method (18a, b), (19) for large stepsizes $\Delta t \gg T$, where T stands for the period of the highest frequency in the system. Numerical dissipation is a desirable property whenever the high frequencies need not be resolved, i.e., when the system is stiff. An attractive feature of the α-method in this context is controllable numerical dissipation, which is mostly expressed in terms of the spectral radius $\rho_\infty$ at infinity. More specifically, we have $\rho_\infty \in [0, 1]$, where $\rho_\infty = 0$ represents asymptotic annihilation of the high frequency response, i.e., the equivalent of L-stability. On the other hand, $\rho_\infty = 1$ stands for the case of no algorithmic dissipation. Unconditional stability is achieved for the parameters

$$\alpha_f = \frac{\rho_\infty}{1+\rho_\infty}, \qquad \alpha_m = \frac{2\rho_\infty - 1}{1+\rho_\infty}, \qquad \beta = \tfrac{1}{4}\,(1 - \alpha_m + \alpha_f)^2 . \tag{21}$$
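The parameter choice (21) together with the order condition (20) can be tried out on a scalar test problem. The following is a minimal sketch, not the authors' implementation: it assumes a linear oscillator with unit mass and force f = −ω²q, so that the step (18a, b), (19) can be solved in closed form, and all function names are hypothetical.

```python
def generalized_alpha_parameters(rho_inf):
    """Return (alpha_m, alpha_f, beta, gamma) from (20) and (21)."""
    if not 0.0 <= rho_inf <= 1.0:
        raise ValueError("rho_inf must lie in [0, 1]")
    alpha_f = rho_inf / (1.0 + rho_inf)
    alpha_m = (2.0 * rho_inf - 1.0) / (1.0 + rho_inf)
    gamma = 0.5 - alpha_m + alpha_f                 # order condition (20)
    beta = 0.25 * (1.0 - alpha_m + alpha_f) ** 2    # stability choice (21)
    return alpha_m, alpha_f, beta, gamma


def step_oscillator(q, p, a, dt, omega, rho_inf):
    """One step of (18a, b), (19) for M = 1 and f(q) = -omega**2 * q (assumed test problem)."""
    alpha_m, alpha_f, beta, gamma = generalized_alpha_parameters(rho_inf)
    k = omega ** 2
    # Part of the position update (18a) that does not depend on the new acceleration.
    q_pred = q + dt * p + dt ** 2 * (0.5 - beta) * a
    # Insert q_new = q_pred + dt^2 * beta * a_new into (19) and solve for a_new.
    lhs = (1.0 - alpha_m) + (1.0 - alpha_f) * k * beta * dt ** 2
    rhs = -alpha_m * a - alpha_f * k * q - (1.0 - alpha_f) * k * q_pred
    a_new = rhs / lhs
    q_new = q_pred + dt ** 2 * beta * a_new
    p_new = p + dt * ((1.0 - gamma) * a + gamma * a_new)   # velocity update (18b)
    return q_new, p_new, a_new


def energy_ratio(rho_inf, omega=100.0, dt=1.0, steps=50):
    """Final energy divided by initial energy for a step size far above the period."""
    q, p = 1.0, 0.0
    a = -omega ** 2 * q                              # consistent initial acceleration
    e0 = 0.5 * p ** 2 + 0.5 * omega ** 2 * q ** 2
    for _ in range(steps):
        q, p, a = step_oscillator(q, p, a, dt, omega, rho_inf)
    return (0.5 * p ** 2 + 0.5 * omega ** 2 * q ** 2) / e0
```

With ρ∞ = 1 the parameters reduce to β = 1/4, γ = 1/2, and the energy of the linear oscillator is preserved up to round-off, whereas ρ∞ = 0 damps the unresolved oscillation almost completely within a few steps.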
Notice that $\gamma = 1/2 - \alpha_m + \alpha_f$ is determined by the order condition (20). For the reverse method of lines, one furthermore requires a time stepsize control mechanism, which is usually based on an embedded solution of higher or lower order. While we have studied such embedded methods of third order and tested them in classical multibody applications, their use in our context below is not advisable because they do not possess sufficient numerical dissipation. Instead, we suggest using the implicit Euler scheme with its strong damping properties in combination with the second order method (18). An alternative to the α-method and also a candidate for the reverse method of lines is the implicit midpoint rule, which is related to the trapezoidal rule (in fact it is the same scheme for linear differential equations). In time-dependent partial differential equations, this method is known as the Crank-Nicolson scheme and served as basic integrator in [6]. Applied to the unconstrained equations (17), one time step reads
\begin{align}
M\,\frac{q_{n+1} - q_n}{\Delta t} &= M p_{n+1/2}, \tag{22a}\\
M\,\frac{p_{n+1} - p_n}{\Delta t} &= f(q_{n+1/2}, p_{n+1/2}, t_{n+1/2}), \tag{22b}
\end{align}
where the midpoint is defined by $q_{n+1/2} = (q_n + q_{n+1})/2$ and $t_{n+1/2} = (t_n + t_{n+1})/2 = t_n + \Delta t/2$. The implicit midpoint rule is A-stable (equivalent to unconditional stability), B-stable, and symmetric in time, which is a favorable property if conservation of momentum and angular momentum is desired. However, it does not have numerical dissipation for spurious high frequencies. To some degree, this can be cured by using the first order implicit Euler method for stepsize control purposes.

4.2 Constrained Equations

Before stating the generalizations of the α-method and the implicit midpoint rule to the constrained case, we briefly review the stabilized formulation that lies underneath. In differential-algebraic systems such as the equations of constrained mechanical motion (1), it is desirable to have an index as small as possible. Simple differentiation lowers the index but may lead to drift-off. The stabilization due to Gear, Gupta, and Leimkuhler [13] starts with the dynamic equations in combination with the constraints at velocity level, where the position constraints are interpreted as invariants. Introducing additional multipliers $\mu(t) \in \mathbb{R}^{n_\lambda}$ and velocity variables $p(t) \in \mathbb{R}^{n_q}$, one obtains an enlarged system

\begin{align}
M \dot q &= M p - G(q)^T \mu, \tag{23a}\\
M \dot p &= f(q, p, t) - G(q)^T \lambda, \tag{23b}\\
0 &= G(q)\, p, \tag{23c}\\
0 &= g(q). \tag{23d}
\end{align}
A straightforward calculation using (23a) and (23c) shows

$$0 = \frac{d}{dt}\, g(q) = G(q)\,\dot q = G(q)\, p - G(q) M^{-1} G^T(q)\,\mu = -G(q) M^{-1} G^T(q)\,\mu,$$

and one concludes $\mu = 0$ since $G(q)$ is of full rank and hence, for a symmetric positive definite mass matrix M, $G(q) M^{-1} G^T(q)$ is invertible. With the additional multipliers µ vanishing, (23) and the original equations of motion (1) coincide along any solution. Yet, the index of the stabilized system (23) is two instead of three. Our choice of the time integration schemes for the reverse method of lines is inspired by the stabilized formulation (23), which means that the more expensive acceleration constraint is not evaluated. As described in [21], the
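following α-RATTLE method extends the α-method to the constrained case and is particularly suited for applications in flexible multibody dynamics and structural dynamics. One time step of this method for position $q_{n+1}$, velocity $p_{n+1}$, and acceleration $a_{n+1}$ is given by

\begin{align}
M\,\frac{q_{n+1} - q_n}{\Delta t} &= M\left(p_n + \Delta t\left(\tfrac{1}{2} - \beta\right) a_n + \Delta t\,\beta\, a_{n+1}\right) - \frac{\Delta t}{2}\, G_{n+1}^T \lambda_{n+1}, \tag{24a}\\
M\,\frac{p_{n+1} - p_n}{\Delta t} &= M\left((1-\gamma)\, a_n + \gamma\, a_{n+1}\right) - \tfrac{1}{2}\, G_n^T \lambda_{n+1} - \tfrac{1}{2}\, G_{n+1}^T \mu_{n+1}, \tag{24b}\\
(1-\alpha_m)\, M a_{n+1} &= \alpha_f f_n + (1-\alpha_f) f_{n+1} - \alpha_m M a_n, \tag{24c}\\
0 &= G_{n+1}\, p_{n+1}, \tag{24d}\\
0 &= g_{n+1}. \tag{24e}
\end{align}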
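Both position and velocity constraints are thus enforced at each step. As above, the method coefficients should satisfy the conditions (20) for second order and zero stability while (21) specifies the numerical dissipation. We remark that a variant of (24) often showed a better performance in our experience. This variant reads

\begin{align}
M\,\frac{q_{n+1} - q_n}{\Delta t} &= M\left(p_n + \Delta t\left(\tfrac{1}{2} - \beta\right) a_n + \Delta t\,\beta\, a_{n+1}\right) - \frac{\Delta t\,\beta}{1-\alpha_m}\, G_n^T \lambda_{n+1}, \tag{25a}\\
M\,\frac{p_{n+1} - p_n}{\Delta t} &= M\left((1-\gamma)\, a_n + \gamma\, a_{n+1}\right) - \frac{\gamma\,\alpha_f}{1-\alpha_m}\, G_n^T \lambda_{n+1} - \frac{\gamma\,(1-\alpha_f)}{1-\alpha_m}\, G_{n+1}^T \mu_{n+1}, \tag{25b}\\
(1-\alpha_m)\, M a_{n+1} + \alpha_m M a_n &= \alpha_f \left(f_n - G_n^T \lambda_{n+1}\right) + (1-\alpha_f)\left(f_{n+1} - G_{n+1}^T \mu_{n+1}\right), \tag{25c}\\
0 &= G_{n+1}\, p_{n+1}, \tag{25d}\\
0 &= g_{n+1}, \tag{25e}
\end{align}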
see [22, Eq.(III.3.6)] for further details. Several remarks on the α-RATTLE scheme (24) deserve attention. A basically equivalent scheme was introduced in the recent work by Jay and Negrut [19] while Arnold and Bruls [1] derived a DAE version of the α-method without index reduction that does not require the velocity constraint and is thus less expensive to apply. Another remark concerns the solution of the
nonlinear system in each time step. Careful scaling of the iteration matrix is absolutely necessary to avoid ill-conditioned iteration matrices, and we apply here a scaling similar to the approach proposed in [7] for problems formulated as DAEs of index 3. In particular, we multiply the second and fourth equation of (24) by the time step size ∆t. Though embedded methods of lower or higher order can be constructed for the α-RATTLE scheme, it is more favorable in our context to use the implicit Euler method as counterpart for stepsize control purposes due to its strong numerical dissipation. This means that another nonlinear system should be solved along with (24) in order to provide a first order approximation $\tilde q_{n+1}$, $\tilde p_{n+1}$, $\tilde\lambda_{n+1}$, $\tilde\mu_{n+1}$. This discretization is also based on the stabilized formulation (23) and reads

\begin{align}
M\,\frac{\tilde q_{n+1} - q_n}{\Delta t} &= M \tilde p_{n+1} - G_{n+1}^T \tilde\mu_{n+1}, \tag{26a}\\
M\,\frac{\tilde p_{n+1} - p_n}{\Delta t} &= f_{n+1} - G_{n+1}^T \tilde\lambda_{n+1}, \tag{26b}\\
0 &= G_{n+1}\, \tilde p_{n+1}, \tag{26c}\\
0 &= g(\tilde q_{n+1}). \tag{26d}
\end{align}
In combination with the reverse method of lines, one first computes the implicit Euler step (26) and then the second order approximation (24), see below. The extension of the midpoint rule (22) to the DAE case should preserve the symmetry, which is accomplished by the scheme

\begin{align}
M\,\frac{q_{n+1} - q_n}{\Delta t} &= M p_{n+1/2} - G_{n+1/2}^T \mu_{n+1}, \tag{27a}\\
M\,\frac{p_{n+1} - p_n}{\Delta t} &= f_{n+1/2} - G_{n+1/2}^T \lambda_{n+1}, \tag{27b}\\
0 &= G_{n+1}\, p_{n+1}, \tag{27c}\\
0 &= g(q_{n+1}). \tag{27d}
\end{align}

While the constraint Jacobians in the right hand side are evaluated at the midpoint, $G_{n+1/2}^T = G^T(q_{n+1/2})$, the constraints are enforced at the endpoint as $g(q_{n+1})$ and $G(q_{n+1})\, p_{n+1}$. With this modification of the standard midpoint rule, the method (27) becomes symmetric and can be interpreted as a certain symmetric projection method as introduced by Hairer [16]. Notably, (27) has also been suggested as time integration scheme in dynamic contact problems by Solberg [26]. Again, the implicit Euler method (26) can be used for error estimation in time.
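To see how the two multipliers of the stabilized implicit Euler step (26) act, the following minimal sketch treats a deliberately simple special case in which the step can be solved in closed form. The assumptions are ours and not part of the chapter: unit mass matrix, a constant force, and a linear constraint g(q) = n·q with constant Jacobian G = nᵀ; the function names are hypothetical.

```python
def dot(a, b):
    """Euclidean inner product of two equally long sequences."""
    return sum(x * y for x, y in zip(a, b))


def stabilized_euler_step(q, p, dt, normal, force):
    """One step of (26) for M = I, g(q) = normal . q and a constant force (assumed setting)."""
    gg = dot(normal, normal)                      # G G^T, a scalar for a single constraint
    # (26b) and (26c): lambda removes the velocity component normal to the constraint.
    p_free = [pi + dt * fi for pi, fi in zip(p, force)]
    lam = dot(normal, p_free) / (dt * gg)
    p_new = [pi - dt * lam * ni for pi, ni in zip(p_free, normal)]
    # (26a) and (26d): mu projects the new position back onto g(q) = 0.
    mu = (dot(normal, q) + dt * dot(normal, p_new)) / (dt * gg)
    q_new = [qi + dt * pi - dt * mu * ni for qi, pi, ni in zip(q, p_new, normal)]
    return q_new, p_new, lam, mu


normal = (0.6, 0.8)            # unit normal of the constraint line through the origin
gravity = (0.0, -9.81)

# Consistent start: the position multiplier mu vanishes up to round-off.
q1, p1, lam1, mu1 = stabilized_euler_step((0.8, -0.6), (0.0, 0.0), 0.01, normal, gravity)

# Inconsistent start (violated position constraint): mu removes the drift in one step.
q2, p2, lam2, mu2 = stabilized_euler_step((1.0, 1.0), (0.0, 0.0), 0.01, normal, gravity)
```

For consistent initial values the multiplier µ is zero up to round-off, in line with the argument following (23), while λ carries the actual constraint force.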
5 Reverse Method of Lines

Though the time integration methods of the last section have been formulated for the finite dimensional equations of motion, we may also apply them formally to the corresponding infinite dimensional problem and reverse the order
of time and space discretization. In order to keep the notation simple, we mostly omit the nonlinearities from the floating frame of reference approach and reconsider the operator equations (6) and (8).

5.1 Time Step

In Section 3 we have described how the implicit Euler method can be applied to the unconstrained problem (6). The extension to the α-method (18) and the midpoint rule (22) is straightforward and follows the lines of the original work of Bornemann and Schemann for the wave equation. What we will consider in more detail here is the constrained case, i.e., the application of α-RATTLE and the implicit midpoint rule. We thus extend the time integration method (24) to the operator DAE (8) and leave the space variable continuous. We distinguish between the displacement field $u_n$ at time $t_n$, its velocity $w_n$, and the acceleration $a_n$ and use as above both the original constraints $0 = B u_n - m_n$ and the constraints at the velocity level $0 = B w_n - \dot m_n$ with corresponding multipliers $\lambda_n$ and $\mu_n$. This yields the basic time step

\begin{align}
R\,\frac{u_{n+1} - u_n}{\Delta t} &= R\left(w_n + \Delta t\left(\tfrac{1}{2} - \beta\right) a_n + \Delta t\,\beta\, a_{n+1}\right) - \frac{\Delta t}{2}\, B^* \lambda_{n+1}, \tag{28a}\\
R\,\frac{w_{n+1} - w_n}{\Delta t} &= R\left((1-\gamma)\, a_n + \gamma\, a_{n+1}\right) - \tfrac{1}{2}\, B^* \lambda_{n+1} - \tfrac{1}{2}\, B^* \mu_{n+1}, \tag{28b}\\
0 &= B w_{n+1} - \dot m_{n+1}, \tag{28c}\\
0 &= B u_{n+1} - m_{n+1}. \tag{28d}
\end{align}
Here, the second-to-last equation stands for the velocity constraint, and the new acceleration is given by the expression

$$(1-\alpha_m)\, R a_{n+1} = \alpha_f\,(-A u_n + l_n) + (1-\alpha_f)\,(-A u_{n+1} + l_{n+1}) - \alpha_m R a_n$$

and can be substituted in (28). The system (28) is a stationary saddlepoint problem for the unknowns $u_{n+1}$, $w_{n+1}$, $\lambda_{n+1}$, $\mu_{n+1}$ that has to be solved in each time step. A similar saddlepoint problem results for the implicit midpoint rule. Applying (27) formally to the PDAE (8), we obtain

\begin{align}
R\,\frac{u_{n+1} - u_n}{\Delta t} &= R w_{n+1/2} - B^* \mu_{n+1}, \tag{29a}\\
R\,\frac{w_{n+1} - w_n}{\Delta t} &= -A u_{n+1/2} + l_{n+1/2} - B^* \lambda_{n+1}, \tag{29b}\\
0 &= B w_{n+1} - \dot m_{n+1}, \tag{29c}\\
0 &= B u_{n+1} - m_{n+1}. \tag{29d}
\end{align}
In order to analyse the structure of these stationary problems, one rearranges the displacement and velocity fields in a new variable $U := (u_{n+1}, \Delta t\, w_{n+1})$ (note the scaling), and likewise the Lagrange multipliers as $\Lambda := (\Delta t^2 \lambda_{n+1}, \Delta t^2 \mu_{n+1})$. Corresponding test functions for U are denoted by $\Upsilon \in V \times L^2(\Omega)^3$, and $\Theta \in Q \times Q$ stands for the test functions with respect to the constraints. In this way, one can transform (28) as well as (29) into an abstract saddle point problem with bilinear forms A and B,

\begin{align}
A(U, \Upsilon) + B(\Upsilon, \Lambda) &= L(\Upsilon), \tag{30a}\\
B(U, \Theta) &= M(\Theta). \tag{30b}
\end{align}
Without going into the details of (30), we remark that the bilinear form A now depends on ∆t and is in general no longer symmetric. Results on the inf-sup condition and on the well-posedness are elaborated in [22].

5.2 Space Discretization and a Posteriori Estimators

For the discretization in space, one applies, as introduced in Section 3, a Galerkin projection both for the displacement-velocity vector $U \to U_h$ and the multipliers $\Lambda \to \Lambda_h$. Standard finite elements would be the method of choice for the displacement and velocity fields, while for the Lagrange multipliers one can take the trace of the ansatz functions on the boundary $\Gamma_0$. This reduces (30) in a straightforward way to a system of linear equations for the discrete approximations $U_h$ and $\Lambda_h$, which again features a saddlepoint structure and can be rewritten as the finite-dimensional α-RATTLE scheme (24) or implicit midpoint rule (27), respectively, with nodal vectors $q_{n+1}$, $p_{n+1}$ for displacement and velocity and corresponding variables for the Lagrange multipliers. If we include additional nonlinear rotations as in (4), the above procedure is still feasible and leads to a nonlinear system of equations in each time step. For adaptivity in space, we also need a so-called a posteriori error estimator that controls the spatial grid. We give next a short overview of these techniques from numerical PDEs, see [9] for more details. In this context, the choice of an appropriate norm for measuring spatial errors of the solution vector $U_h$ requires special care. Duvaut and Lions [11] proved that the displacement u of (6) is an element of $H^1(\Omega)^3$ but the velocity w is in $L^2(\Omega)^3$. Thus we take the norm

$$\|U\|_H^2 := \|u_{n+1}\|^2_{H^1(\Omega)^3} + \|w_{n+1}\|^2_{L^2(\Omega)^3}$$

as proposed in [6]. The goal of a posteriori error estimation is to find an error indicator $\eta_T \in \mathbb{R}$ for each element T of a triangulation $\mathcal{T}$ of the domain Ω. The norm η of these values $\eta_T$ is called an efficient and reliable error estimator of the true error $U_h - U$ if there are two constants c and C such that

$$c\,\eta \le \|U_h - U\|_H \le C\,\eta \quad \text{with} \quad \eta^2 = \sum_{T \in \mathcal{T}} \eta_T^2 .$$

Here the lower bound $c\,\eta \le \|U_h - U\|_H$ expresses efficiency and the upper bound $\|U_h - U\|_H \le C\,\eta$ expresses reliability.
In order to keep the approach simple, we omit the error in $\Lambda_h$. Basically, there exist three approaches for the construction of such an indicator function. The first technique is the residual error estimator. Its main idea is to insert the numerical solution into the infinite-dimensional problem. Then in general one observes jumps with respect to the spatial derivatives on the edges of each finite element. These jumps can be bounded with a suitable inverse inequality, and the indicator $\eta_T$ corresponds to the height of the jumps. After a mesh refinement one expects a decrease of these jumps. Unfortunately, in a time integration of a spatially discretized model small oscillations are very frequent. These oscillations cannot be eliminated by a finer mesh; on the contrary, more and higher eigenfrequencies may be excited if the time stepsize decreases. Therefore the use of a residual estimator is not advisable in our context. The two other approaches start from the saturation assumption that a numerical solution $\hat U_h$ of (30) with enlarged test function space $\hat V_h$ is closer to the true solution than the solution $U_h$ with test functions in $V_h \subset \hat V_h$,

$$\|\hat U_h - U\| \le c\,\|U_h - U\|, \qquad c < 1. \tag{31}$$
A simple triangle inequality shows that the computable difference $\hat U_h - U_h$ of the solutions on the enlarged and on the original space would be an efficient and reliable estimator for the real error $U_h - U$. The averaging techniques, including the ZZ-estimator named after Zhu and Zienkiewicz [29], construct a smoothed stress tensor $\bar\sigma(u_{n+1})$ instead of computing a second solution $\hat U_h$. This estimator projects the computed stress values $\sigma(u_{n+1})$ onto the ansatz function space along so-called recovery points in each finite element. If one considers solely the displacement field and neglects the velocity field of the numerical solution $U_h = (u_{n+1}, \Delta t\, w_{n+1})$, a coercivity and continuity condition results in

$$\tilde c\,\big\|C^{-1/2}\big(\bar\sigma(u_{n+1}) - \sigma(u_{n+1})\big)\big\|_{L^2} \le \|\hat U_h - U_h\|_H \le \tilde C\,\big\|C^{-1/2}\big(\bar\sigma(u_{n+1}) - \sigma(u_{n+1})\big)\big\|_{L^2} .$$

Therefore the norm of the difference of these stress tensors is equivalent to the norm of the error $U_h - U$. The main advantage of the averaging techniques lies in the low computational effort. One needs only means of computed stress tensors and is independent of the specific structure of the saddle point problem (30). However, the disadvantage of the residual based error estimator may also show up in this approach. The estimated spatial error given by the difference between two stress tensors only decreases if the solution becomes
smoother. Additionally, the comparison of the error in space and the error in time requires the use of the same norm. So the assumption of no errors in the velocity components is not only unrealistic; the time stepsize control using the first order implicit Euler and the second order α-RATTLE scheme actually requires an inclusion of the velocities. The third approach employs the saturation assumption (31), too. The so-called hierarchical error estimation considers the solution $\hat U_h$ as a second solution of the discretized saddle point problem (30) with an enlarged function space $\hat V_h = V_h \oplus Z_h$. Conversely, one could expect that the difference $E_h := \hat U_h - U_h$ lies mainly in $Z_h$. Thus one recomputes the saddle point problem (30) for the error $E_h$ with test functions $\Upsilon \in Z_h$. With a proper function space $Z_h$ the dimension of this auxiliary problem is relatively low, and the corresponding matrix has a diagonal form. Here, small oscillations do not matter because the error estimator neglects the contributions across the edges of the FE discretization. As a further advantage of this method, the solution $U_h + E_h \in \hat V_h$ could be used as a starting value in an iterative solver after a spatial mesh refinement. Last but not least, there already exists a theory by Bank and Smith [4] on using hierarchical bases for indefinite problems like the saddlepoint formulation (30).

5.3 Adaptivity in Time and Space

Though at first sight straightforward, an extension of the algorithm described in [6] to our setting of constrained equations of motion turns out to be rather involved and presents several pitfalls. In the following basic algorithm for performing one time step $n \to n+1$ of the reverse method of lines, we assume a given mesh $\mathcal{T}_n$ and given data $M_n$, $f_n$, $G_n$, which means that we work with the finite-dimensional notation of Section 4 and allow the dimension of the variables to change from time step to time step according to the gridsize.
In short, one time step with α-RATTLE as time integrator reads

1. Compute first order solution $\tilde q_{n+1}$, $\tilde p_{n+1}$, $\tilde\lambda_{n+1}$, $\tilde\mu_{n+1}$ by implicit Euler
2. Estimate spatial error $\mathrm{err}_x$ of $\tilde q_{n+1}$, $\tilde p_{n+1}$ (possibly also of $\lambda_{n+1}$, $\mu_{n+1}$)
3. If $\mathrm{err}_x > \mathrm{tol}_x$: refine $\mathcal{T}_n$, go to 1.
4. Solve (24) or (28), respectively, for $q_{n+1}$, $p_{n+1}$, $a_{n+1}$, $\lambda_{n+1}$, $\mu_{n+1}$ (use solution from 1. as starting value)
5. Estimate spatial error $\mathrm{err}_x$ of $q_{n+1}$, $p_{n+1}$ (possibly also of $\lambda_{n+1}$, $\mu_{n+1}$)
6. If $\mathrm{err}_x > \mathrm{tol}_x$: refine mesh $\mathcal{T}_n$, go to 4.
7. Estimate time error $\mathrm{err}_t$; if $\mathrm{err}_t > \mathrm{tol}_t$: go to 7a., else 7b.
7a. Decrease time stepsize ∆t; go to 1.
7b. Accept step, interpolate to coarse mesh $\mathcal{T}_n$, $n \to n+1$, go to 1.
Clearly, this is only a rough sketch of an actual implementation, and several details like the combination of space and time errors require special attention.
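The control flow of steps 1 to 7b can be written down as a short driver routine. The sketch below is only an illustration under our own assumptions: the solvers, the spatial error estimator, the refinement routine and the time error measure are passed in as hypothetical callables, the step size is simply halved in step 7a, the refined mesh is kept when a step is repeated, and coarsening after acceptance is left to the caller. The toy stand-ins at the end merely exercise the loop and carry no mechanical meaning.

```python
def solve_on_adapted_mesh(solve, space_error, refine, mesh, tol_x, max_refine=20):
    """Solve, estimate the spatial error, and refine until tol_x is met (steps 1-3 or 4-6)."""
    for _ in range(max_refine):
        solution = solve(mesh)
        if space_error(solution, mesh) <= tol_x:
            return solution, mesh
        mesh = refine(mesh)
    raise RuntimeError("spatial tolerance not reached")


def reverse_mol_step(state, mesh, dt, ops, tol_x, tol_t, shrink=0.5, max_tries=30):
    """One time step n -> n+1; returns (second order solution, mesh used, dt used)."""
    for _ in range(max_tries):
        # Steps 1-3: first order implicit Euler solution on a sufficiently fine mesh.
        low, mesh = solve_on_adapted_mesh(
            lambda m: ops["euler"](state, m, dt),
            ops["space_error"], ops["refine"], mesh, tol_x)
        # Steps 4-6: second order solution, started from the Euler result.
        high, mesh = solve_on_adapted_mesh(
            lambda m: ops["alpha_rattle"](state, m, dt, low),
            ops["space_error"], ops["refine"], mesh, tol_x)
        # Step 7: compare both solutions to estimate the time error.
        if ops["time_error"](high, low) <= tol_t:
            return high, mesh, dt          # 7b: accept (coarsening is left to the caller)
        dt *= shrink                       # 7a: reject and repeat with a smaller step
    raise RuntimeError("time tolerance not reached")


# Toy stand-ins (assumptions for illustration): the mesh is a node count, the spatial
# error decays like 1/nodes, and the two "solvers" differ by a term of size dt^2 / 2.
toy_ops = {
    "euler": lambda state, mesh, dt: state + dt,
    "alpha_rattle": lambda state, mesh, dt, start: state + dt + 0.5 * dt * dt,
    "space_error": lambda solution, mesh: 1.0 / mesh,
    "refine": lambda mesh: 2 * mesh,
    "time_error": lambda high, low: abs(high - low),
}
result = reverse_mol_step(0.0, 10, 1.0, toy_ops, tol_x=0.01, tol_t=0.01)
```

In the toy run the mesh is refined from 10 to 160 nodes and the step size is halved three times before the step is accepted.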
If we omit the spatial dependence for a moment and just look at the time step, the standard time stepsize formula reads

$$\Delta t_{n+1} = \sqrt{\frac{\mathrm{tol}_t}{\mathrm{err}_t}} \cdot \Delta t_n ,$$

where the error has to be computed from

$$\mathrm{err}_t = \|U - \tilde U\|_H, \qquad U:\ \text{solution of order 2}, \quad \tilde U:\ \text{solution of order 1}.$$

As $U$ and $\tilde U$ are the solutions of a stationary PDE problem given by (30), these field variables are actually not available and are approximated by their numerical counterparts $U_h$ and $\tilde U_h$. This means that we have to take the relation

$$U = U_h + \mathrm{err}_x, \qquad \tilde U = \tilde U_h + \widetilde{\mathrm{err}}_x$$

with the spatial errors into account. We define the computable quantity $\mathrm{err}_{t,h} := \|U_h - \tilde U_h\|_H$ and, from the a posteriori error estimation, $\theta := \mathrm{err}_x + \widetilde{\mathrm{err}}_x$. From the inequality $\mathrm{err}_{t,h} - \theta \le \mathrm{err}_t \le \mathrm{err}_{t,h} + \theta$ and assuming $\theta \le \mathrm{err}_{t,h}/k$, we obtain bounds

$$\frac{k-1}{k}\,\mathrm{err}_{t,h} \le \mathrm{err}_t \le \frac{k+1}{k}\,\mathrm{err}_{t,h} \tag{32}$$
for the unknown error $\mathrm{err}_t$. Actually, one can show that the tolerances must then satisfy $\mathrm{tol}_x < c(k) \cdot \mathrm{tol}_t$ with a constant $c(k)$ depending on the factor k. In practice, we use k = 4 as in [6]. If the condition for the inequality chain (32) is violated, e.g., if the time integration error is much smaller than the spatial discretization error, one has to decrease the tolerances for the spatial mesh and to repeat the current time step. Another important algorithmic detail is the coarsening strategy. If we only allow refinement steps, the finite element mesh is likely to increase from time step to time step and eventually to grow beyond a reasonable size. Thus one has to include a coarsening strategy after each time step. One possibility here is to return to a given coarse mesh and interpolate the solution for the
next time step from the previously performed refinement steps. A second option is to introduce an upper bound for the number of unknowns so that the refinement process always stops when a certain threshold is reached. This means, however, that the error in space might not sink below the desired tolerance. Last, we give some specific remarks concerning the spatial mesh refinement and the mapping of a discrete solution from one mesh to another. From our experience we recommend to freeze the gross motion variables after the first pass of step 1 or 4, respectively, in the algorithm above. Then if step 3 or 6, respectively, fails, one recomputes solely the elastic displacements after spatial mesh refinement. Hence the frozen gross motion variables play the role of additional constraints. This procedure recurs at each time step. If one has computed a solution on a fine mesh and starts the next time step with a coarse mesh, then after a required new mesh refinement one should use the old solution on its fine mesh as input variable. In other words, after each time step one has to save the current finest mesh and the corresponding solution. The interpolation between different meshes requires corresponding routines, which are often provided by standard finite element methods.
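The time stepsize formula and the bounds (32) of Section 5.3 translate into a few lines of code. The following is a minimal sketch with hypothetical function names; returning None when the condition θ ≤ err_{t,h}/k fails is our own convention for signalling that the step should be repeated with a tighter spatial tolerance.

```python
import math


def new_step_size(dt, err_t, tol_t):
    """Standard step size proposal for the order 1 / order 2 pair: sqrt(tol_t / err_t) * dt."""
    return math.sqrt(tol_t / err_t) * dt


def time_error_bounds(err_t_h, theta, k=4.0):
    """Bounds (32) on the unknown time error, or None if theta > err_t_h / k."""
    if theta > err_t_h / k:
        return None                      # spatial error too large relative to the time error
    return (k - 1.0) / k * err_t_h, (k + 1.0) / k * err_t_h
```

For example, an error estimate four times larger than the tolerance halves the step size, and with k = 4 the computable quantity err_{t,h} brackets the true time error to within 25 percent.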
6 Simulation Examples

6.1 Pantograph with Catenary

Finally, we present preliminary computational results of the above reverse method of lines. The first simulation tackles a catenary system with moving pantograph. We take the setting of the simple benchmark problem as described in [2]. A contact wire of length l = 20 m is suspended by two vertical segments, the so-called droppers, from the carrier wire above. The moving pantograph starts at x = 2 m, and we simulated the velocities v = 32 m/s and v = 48 m/s. For simplicity, we modelled the pantograph as a two body system with a constant upward force to keep contact. The moving contact constraint between contact wire and pantograph requires an extension of the constraint formulation (7b) with time-dependent contact segment $\Gamma_0(t)$. Further constraints are the two fixed dropper lengths. While the contact wire of the catenary is discretized by cubic beam elements, the carrier wire is treated by linear finite elements, and the averaging error estimator is used to adapt the spatial grid of both wires. We applied the α-RATTLE method (24) as time integrator with spectral radius $\rho_\infty = 0$. Because of the time stepsize memory in the acceleration terms $a_n$, frequent changes in stepsize are not advantageous here, and only in case $\mathrm{err}_t > \mathrm{tol}_t$ is a step size reduction accepted immediately in the algorithm. Figure 1 shows a snapshot of the dynamic simulation where the dotted line stands for a reference solution with a fixed fine spatial grid. In contrast, the fewer crosses stand for the nodes that are generated by the adaptive algorithm.
[Figure: contact wire displacement (vertical axis scaled by 10^-3) along the wire (horizontal axis from 0 to 20) at pantograph position xp = 3.453, with an enlarged detail of the interval from 14 to 15]
Fig. 1. Pantograph-catenary problem: snapshot of contact wire displacement with uniform grid (dots) versus adapted grid (crosses)
[Figure: time step sizes (vertical axis scaled by 10^-3, range 0 to 7) versus time t (0 to 0.5)]
Fig. 2. Pantograph-catenary problem: time stepsizes
One observes a very good agreement of the reference solution with the one obtained by adaptive mesh refinement. In Fig. 2, the time stepsize history for this simulation is depicted.

6.2 Flexible Slider Crank Mechanism

As a second numerical example, we study the well-known planar slider crank mechanism [18, 25] where crank and sliding block are rigid bodies while the
[Figure: two panels with titles "t = 0.094168, Δt = 9.11e−006, Nodes = 794" and "t = 0.077023, Δt = 1.63e−005, Nodes = 319"]
Fig. 3. Two snapshots of the slider crank simulation and a posteriori refined grids. The displacements of the connecting rod are scaled to magnify the deformation
[Figure: upper panel, number of nodes (0 to 800) versus time t (0 to 0.12); lower panel, time stepsize (vertical axis scaled by 10^-5) versus time t (0 to 0.12)]
Fig. 4. Slider crank: time stepsize history and number of nodes by a posteriori error estimation
connecting rod is elastic and treated under the assumption of plane stress. The frictionless revolute joints of the connecting rod are expressed as equality constraints. As time integrator we applied the α-RATTLE scheme with ρ∞ = 1 and combined it with the first order implicit Euler scheme for stepsize prediction. Three-node linear finite elements are used to discretize the connecting rod, and the error estimator to adapt the spatial grid is based on the hierarchical approach of [4] for non-symmetric indefinite problems. Figure 3 shows two snapshots of the dynamic simulation with corresponding spatial grids and time stepsize. At each time step, an evaluation of the a posteriori error estimator by using the hierarchical basis approach with corresponding refinement steps was performed, and the mesh that meets the spatial error tolerance is shown in the plots. The time stepsize and number of unknowns after refinement are depicted in Fig. 4.
It can be observed that both the time stepsize and the mesh actually change during the simulation in order to meet the error criterion. The number of employed nodes still oscillates somewhat, which indicates that the delicate balance of time and space error controllers can be further improved by additional tuning. Nevertheless, these results demonstrate the potential benefit of adaptation in space as regions with higher error contributions are better resolved whenever necessary.
7 Conclusions

In conclusion, we would like to stress that the reverse method of lines represents a promising approach in flexible multibody dynamics. However, it is clear that one has to pay a price for error control in time and space, as the refinement and coarsening steps for the spatial mesh, the interpolation of solutions, and the reassembly of mass and stiffness matrices at each time step reduce the performance. For well-understood problems it may not be worthwhile paying this price, but in other situations where the discretization of the elastic bodies is crucial and delicate, this approach offers a complement to and an additional safeguard for engineering judgement. Some improvements of the current implementation and the application to more challenging examples are still needed in order to provide more evidence for this reasoning.
References

1. Arnold M, Bruls O (2007) Convergence of the generalized-alpha scheme for constrained mechanical systems. Multibody System Dynamics 18:185–202
2. Arnold M, Simeon B (2000) Pantograph and catenary dynamics: a benchmark problem and its numerical solution. Appl Numer Math 34:345–362
3. Bangerth W, Rannacher R (2003) Adaptive finite element methods for differential equations. Birkhäuser, Basel
4. Bank RE, Smith RK (1993) A posteriori error estimates based on hierarchical bases. SIAM J Numer Anal 30:921–935
5. Bornemann F (1991) An adaptive multilevel approach to parabolic equations. IMPACT 4:279–317
6. Bornemann F, Schemann M (1998) An adaptive Rothe method for the wave equation. Comput Visual Sci 1:137–144
7. Bottasso CL, Bauchau OA, Cardona A (2007) Time-step-size-independent conditioning and sensitivity to perturbations in the numerical solution of index three differential algebraic equations. SIAM J Scient Comput 29:397–414
8. Bremer H, Pfeiffer F (1992) Elastische Mehrkörpersysteme. Teubner, Stuttgart
9. Carstensen C (2005) A unifying theory of a posteriori finite element error control. Numer Math 100:617–637
10. Chung J, Hulbert G (1993) A time integration algorithm for structural dynamics with improved numerical dissipation: the generalized α-method. J Appl Mech 60:371–375
11. Duvaut G, Lions JL (1976) Inequalities in mechanics and physics. Springer, Berlin/Heidelberg/New York
12. Eich-Soellner E, Führer C (1998) Numerical methods in multibody dynamics. Teubner, Stuttgart
13. Gear C, Gupta G, Leimkuhler B (1985) Automatic integration of the Euler-Lagrange equations with constraints. J Comp Appl Math 12:77–90
14. Géradin M, Cardona A (2000) Flexible multibody dynamics. Wiley, New York
15. Golub GH, van Loan CF (1996) Matrix computations. 3rd ed., Johns Hopkins University Press, Baltimore, MD
16. Hairer E (2000) Symmetric projection methods for differential equations on manifolds. BIT 40:726–734
17. Hilber H, Hughes T, Taylor R (1977) Improved numerical dissipation for time integration algorithms in structural dynamics. Earthquake Eng Struct Dyn 5:283–292
18. Jahnke M, Popp K, Dirr B (1993) Approximate analysis of flexible parts in multibody systems using the finite element method. In: Schiehlen W (Ed). Advanced multibody system dynamics. Kluwer, Stuttgart
19. Jay L, Negrut D (2007) Extensions of the HHT-α method to differential-algebraic equations in mechanics. Elect Trans Numer Analysis (ETNA) 26:190–208
20. Lang J (2001) Adaptive multilevel solution of nonlinear parabolic PDE systems: theory, algorithm, and applications. Springer, Berlin/Heidelberg/New York
21. Lunk C, Simeon B (2006) Solving constrained mechanical systems by the family of Newmark and α-methods. ZAMM 86:772–784
22. Lunk C (2008) Adaptive numerische Verfahren zur Lösung partieller differenzial-algebraischer Gleichungen in der Strukturdynamik. Hieronymus, München
23. Schwertassek R, Wallrapp O (1998) Dynamik flexibler Mehrkörpersysteme. Vieweg, Braunschweig
24. Shabana A (1998) Dynamics of multibody systems. Cambridge University Press, Cambridge
25. Simeon B (2006) On Lagrange multipliers in flexible multibody dynamics. Comp Meth Appl Mech Eng 195:6993–7005
26. Solberg J (2000) Finite element methods for frictionless dynamic contact between elastic materials. Ph.D. thesis, University of California, Berkeley, CA
27. Wohlmuth B, Krause R (2003) Monotone methods on non-matching grids for non-linear contact problems. SIAM J Scient Comp 25:324–347
28. Wriggers P (2002) Computational contact mechanics. Wiley, Chichester
29. Zienkiewicz OC, Zhu JZ (1992) The superconvergence patch recovery and a posteriori error estimates, Part I: The recovery technique. Int J Num Meth Eng 33:1331–1364
A Nonlinear Finite Element Framework for Flexible Multibody Dynamics: Rotationless Formulation and Energy-Momentum Conserving Discretization

Peter Betsch and Nicolas Sänger

Chair of Computational Mechanics, University of Siegen, Paul-Bonatz-Straße 9-11, 57068 Siegen, Germany
E-mail: [email protected]

Summary. A uniform framework for rigid body dynamics and nonlinear structural dynamics is presented. The advocated approach is based on a rotationless formulation of rigid bodies, nonlinear beams and shells. In this connection, the specific kinematic assumptions are taken into account by the explicit incorporation of holonomic constraints. This approach facilitates the straightforward extension to flexible multibody dynamics by including additional constraints due to the interconnection of rigid and flexible bodies. We further address the design of energy-momentum schemes for the stable numerical integration of the underlying finite-dimensional mechanical systems.
C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009

1 Introduction

In recent years the extension of finite element methods for nonlinear structural dynamics to the realm of flexible multibody dynamics has attracted a lot of research. This has been facilitated by previous developments of computational methods for nonlinear structural mechanics which have reached a certain state of maturity. In this connection, the main ingredients of contemporary methods for structural dynamics are (i) the use of nonlinear strain measures which may account for both finite strains and arbitrarily large rigid body motions, and (ii) so-called mechanical integrators (e.g. energy-momentum or energy-decaying schemes) which make possible the stable time integration of the 'stiff' nonlinear ODEs resulting from the space discretization. In a flexible multibody framework the components may consist of flexible bodies (e.g. beams, shells or continua) and rigid bodies. These components are typically interconnected by various types of joints (e.g. spherical, revolute, prismatic, ...). The joints impose constraints on the multibody system, thus restricting the relative motion of the components. The presence of
the constraints constitutes the major challenge for the development of finite-element-based computational methods. In essence, three alternative methods have been applied previously to deal with the constraints within a nonlinear finite element framework: (i) The constraints may be enforced by means of Lagrange multipliers. This approach has been used, for example, by Géradin and Cardona [20], Ibrahimbegović et al. [25], Puso [31], Taylor [40], Bottasso et al. [14] and Bauchau et al. [2]. (ii) Alternatively, the constraints may be accounted for by kinematic relationships within the master-slave approach developed by Jelenić and Crisfield [26], see also Ibrahimbegović et al. [25], Göttlicher and Schweizerhof [24] and Muñoz et al. [30]. (iii) The constraints may be imposed by means of the penalty method. This approach has been applied, for example, by Goicolea and Orden [21]. In all of the above cited works rotational parameters are used as an integral part of the description of both structural components and rigid bodies. Similarly, previous works on the dynamics of nonlinear shells make use of rotational parameters, at least within the time discretization, see, for example, Simo et al. [37], Simo and Tarnow [39], Kuhl and Ramm [27], Sansour et al. [36], Romero and Armero [33] and Brank et al. [16]. The present description of rigid bodies relies on the enforcement of the assumption of rigidity by means of holonomic constraints [6, 9]. Specifically, to describe the rotational motion of a rigid body, we enforce the orthonormality of a body-fixed director frame. Similar to the use of natural coordinates (see [19] and references therein) our approach completely circumvents the use of rotational parameters and is thus termed a rotationless formulation (see also [41]). Due to the presence of geometric constraints the motion of the rigid body is governed by differential-algebraic equations (DAEs).
The present description of beams and shells rests on nonlinear theories which rely on objective strain measures. Contemporary finite element methods are used for the discretization in space. In this connection, the interpolation of rotational parameters is completely circumvented. Instead, director frames characterizing the orientation of the cross sections are interpolated. It is worth mentioning that the director interpolation is a characteristic feature of so-called degenerate continuum elements. The kinematic assumptions of the underlying beam/shell theory are enforced in discrete mesh points by means of holonomic constraints. The resulting semi-discrete equations of motion again assume the form of the above mentioned DAEs. In a multibody system the interconnection of rigid bodies, beams and shells (and, of course, general continuum bodies) can be accomplished by introducing additional ‘external’ constraints which can be easily appended to the DAEs. The DAEs thus provide a uniform framework for flexible multibody dynamics [11, 29]. Concerning the numerical integration of the DAEs we advocate a direct discretization as proposed in [10, 22]. In particular, if the underlying flexible multibody system belongs to the class of Hamiltonian systems with symmetry, we obtain energy-momentum conserving integrators. It is worth noting that energy consistent integrators (and appropriate energy decaying variants
thereof) are of utmost importance for the stable numerical integration of the underlying DAEs. Originally, this type of conserving scheme was developed in the realm of nonlinear structural dynamics to meet the specific stability demands of highly oscillatory elastic systems, see, for example, Simo and Tarnow [38], Gonzalez and Simo [23], Bauchau and Bottasso [1], Betsch and Steinmann [8] and the references cited therein. In addition, the advocated energy-momentum scheme is able to reproduce the defining property of worklessness associated with ideal holonomic constraints.
2 Rotationless Formulation of Discrete Mechanical Systems: Equations of Motion

In this section we outline the structure of the equations of motion relevant to the rotationless formulation of discrete mechanical systems. In the case of nonlinear beams and shells these equations emanate from an appropriate discretization in space. For simplicity, we focus here on discrete mechanical systems which are holonomic and scleronomic. Accordingly, the equations of motion assume the form

\[
\begin{aligned}
\dot{\mathbf{q}} - \mathbf{v} &= \mathbf{0}, && \text{(1a)}\\
\mathbf{M}\dot{\mathbf{v}} - \mathbf{f}(\mathbf{q}) + \mathbf{G}^T(\mathbf{q})\boldsymbol{\lambda} &= \mathbf{0}, && \text{(1b)}\\
\boldsymbol{\Phi}(\mathbf{q}) &= \mathbf{0}. && \text{(1c)}
\end{aligned}
\]
Here, $\mathbf{q}(t) \in \mathbb{R}^n$ specifies the configuration of the mechanical system at time $t$, and $\mathbf{v}(t) \in \mathbb{R}^n$ is the velocity vector. Together $(\mathbf{q}, \mathbf{v})$ form the vector of state space coordinates (see, for example, Rosenberg [35]). A superposed dot denotes differentiation with respect to time and $\mathbf{M} \in \mathbb{R}^{n \times n}$ is a constant and symmetric mass matrix, so that the kinetic energy can be written as

\[
T(\mathbf{v}) = \tfrac{1}{2}\, \mathbf{v} \cdot \mathbf{M}\mathbf{v}. \tag{2}
\]

Moreover, $\mathbf{f} \in \mathbb{R}^n$ is a load vector which may be decomposed according to

\[
\mathbf{f} = \mathbf{Q} - \nabla V(\mathbf{q}). \tag{3}
\]

Here, $V(\mathbf{q}) \in \mathbb{R}$ is a potential energy function and $\mathbf{Q} \in \mathbb{R}^n$ accounts for loads which cannot be derived from a potential. Moreover, $\boldsymbol{\Phi}(\mathbf{q}) \in \mathbb{R}^m$ is a vector of geometric constraint functions, $\mathbf{G} = D\boldsymbol{\Phi}(\mathbf{q}) \in \mathbb{R}^{m \times n}$ is the constraint Jacobian and $\boldsymbol{\lambda} \in \mathbb{R}^m$ is a vector of multipliers which specify the relative magnitude of the constraint forces. In the above description it is tacitly assumed that the $m$ constraints are independent. Due to the presence of holonomic (or geometric) constraints (1c), the configuration space of the system is given by

\[
\mathsf{Q} = \{ \mathbf{q}(t) \in \mathbb{R}^n \mid \boldsymbol{\Phi}(\mathbf{q}) = \mathbf{0} \}. \tag{4}
\]
The equations of motion (1) form a set of index-3 differential-algebraic equations (DAEs) (see, for example, Kunkel and Mehrmann [28]). They constitute a uniform framework for the rotationless formulation of rigid bodies, nonlinear beams and shells and, consequently, flexible multibody dynamics. The corresponding discretization in space of beams and shells will be treated in Sections 4 and 5, respectively.

2.1 Discretization of the DAEs

For the discretization of the DAEs (1), we apply a specific approach which yields an energy-momentum conserving scheme. We refer to [5, Section 3] for a summary of the salient features of the present discretization approach.

The Basic Energy-Momentum Scheme

Consider a representative time interval $[t_n, t_{n+1}]$ with time step $\Delta t = t_{n+1} - t_n$, and given state space coordinates $\mathbf{q}_n \in \mathsf{Q}$, $\mathbf{v}_n \in \mathbb{R}^n$ at $t_n$. The discretized version of (1) is given by

\[
\begin{aligned}
\mathbf{q}_{n+1} - \mathbf{q}_n &= \frac{\Delta t}{2}\,(\mathbf{v}_n + \mathbf{v}_{n+1}), && \text{(5a)}\\
\mathbf{M}(\mathbf{v}_{n+1} - \mathbf{v}_n) &= \Delta t\, \mathbf{f}(\mathbf{q}_n, \mathbf{q}_{n+1}) - \Delta t\, \mathsf{G}(\mathbf{q}_n, \mathbf{q}_{n+1})^T \bar{\boldsymbol{\lambda}}, && \text{(5b)}\\
\boldsymbol{\Phi}(\mathbf{q}_{n+1}) &= \mathbf{0}, && \text{(5c)}
\end{aligned}
\]

with

\[
\mathbf{f}(\mathbf{q}_n, \mathbf{q}_{n+1}) = \mathbf{Q}(\mathbf{q}_n, \mathbf{q}_{n+1}) - \bar{\nabla} V(\mathbf{q}_n, \mathbf{q}_{n+1}). \tag{6}
\]

In the sequel, the algorithm (5) will be called the basic energy-momentum (BEM) scheme. The advantageous algorithmic conservation properties of the BEM scheme are linked to the notion of a discrete gradient (or derivative) of a function $f: \mathbb{R}^n \to \mathbb{R}$. In the present work $\bar{\nabla} f(\mathbf{q}_n, \mathbf{q}_{n+1})$ denotes the discrete gradient of $f$. It is worth mentioning that if $f$ is at most quadratic then the discrete gradient coincides with the standard gradient evaluated in the mid-point configuration $\mathbf{q}_{n+\frac{1}{2}} = (\mathbf{q}_n + \mathbf{q}_{n+1})/2$, that is, in this case $\bar{\nabla} f(\mathbf{q}_n, \mathbf{q}_{n+1}) = \nabla f(\mathbf{q}_{n+\frac{1}{2}})$. In (5b) the discrete gradient is applied to the potential energy function $V$ as well as to the constraint functions $\Phi_i$. In particular, the discrete constraint Jacobian is given by

\[
\mathsf{G}(\mathbf{q}_n, \mathbf{q}_{n+1})^T = \left[ \bar{\nabla}\Phi_1(\mathbf{q}_n, \mathbf{q}_{n+1}), \ldots, \bar{\nabla}\Phi_m(\mathbf{q}_n, \mathbf{q}_{n+1}) \right]. \tag{7}
\]

Concerning (6), for the present purposes it suffices to set $\mathbf{Q}(\mathbf{q}_n, \mathbf{q}_{n+1}) = \mathbf{Q}(\mathbf{q}_{n+\frac{1}{2}})$. The BEM scheme can be used to determine $\mathbf{q}_{n+1} \in \mathsf{Q}$, $\mathbf{v}_{n+1} \in \mathbb{R}^n$ and $\bar{\boldsymbol{\lambda}} \in \mathbb{R}^m$. To this end, one may substitute for $\mathbf{v}_{n+1}$ from (5a) into (5b) and then solve the remaining system of nonlinear algebraic equations for the $n+m$ unknowns $(\mathbf{q}_{n+1}, \bar{\boldsymbol{\lambda}})$. We refer to [4] for further details of the implementation.
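The notion of a discrete gradient can be made concrete with a small sketch. The text does not state which discrete gradient is used, so the formula below is an assumption: it is the mid-point gradient plus a correction along the increment, in the spirit of the construction in Gonzalez-type schemes (cf. [23]). The function names (`discrete_gradient`, `quartic`, and so on) are hypothetical and chosen for illustration only.

```python
# Sketch of one possible discrete gradient (an assumed variant, not
# necessarily the one used in the chapter): mid-point gradient plus a
# correction along dq that enforces  dg . (q1 - q0) = f(q1) - f(q0).

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def midpoint(q0, q1):
    return [0.5 * (a + b) for a, b in zip(q0, q1)]

def discrete_gradient(f, grad_f, q0, q1):
    """Return the assumed discrete gradient of f between q0 and q1."""
    qm = midpoint(q0, q1)
    g = grad_f(qm)
    dq = [b - a for a, b in zip(q0, q1)]
    dq2 = dot(dq, dq)
    if dq2 == 0.0:
        return g  # limit q1 -> q0: ordinary gradient
    corr = (f(q1) - f(q0) - dot(g, dq)) / dq2
    return [gi + corr * di for gi, di in zip(g, dq)]

# A quadratic function (unit-length constraint) and a non-quadratic one.
def phi(q):
    return 0.5 * (dot(q, q) - 1.0)

def grad_phi(q):
    return list(q)

def quartic(q):
    return sum(x ** 4 for x in q)

def grad_quartic(q):
    return [4.0 * x ** 3 for x in q]
```

For the quadratic function the correction term vanishes up to round-off, so the result reduces to the mid-point gradient, in line with the remark above. For the quartic function the correction is what secures the increment property that underlies algorithmic energy conservation.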
The Reduced Energy-Momentum Scheme

To reduce the computational costs and improve the conditioning of the algebraic system to be solved we perform a size-reduction of the BEM scheme. For this purpose the so-called discrete null space method (see [4, 6, 29]) has been previously developed. This method essentially rests on the introduction of a discrete null space matrix which is related to the discrete constraint Jacobian (7) via

\[
\operatorname{range} \mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1}) = \ker \mathsf{G}(\mathbf{q}_n, \mathbf{q}_{n+1}). \tag{8}
\]

That is, the columns of $\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})$ span the null space of the discrete constraint Jacobian. Although such a matrix can always be generated by numerical methods (for example, by applying a QR decomposition of $\mathsf{G}^T$, see [4]), in many cases it is feasible to devise appropriate explicit versions by applying a specific velocity analysis in the continuous setting (see [6, 13] for representative examples). In particular, in the continuous case, admissible velocities $\mathbf{v} \in T_{\mathbf{q}}\mathsf{Q} = \ker \mathbf{G}(\mathbf{q})$ may be written in the form

\[
\mathbf{v} = \mathbf{P}\boldsymbol{\nu} \tag{9}
\]
with independent velocities $\boldsymbol{\nu} \in \mathbb{R}^{n-m}$. Once a continuous version of the null space matrix has been found, a suitable discrete version can be designed which has to satisfy the following conditions (see [6] for further details):

(a) In the limit of vanishing time steps, $\Delta t \to 0$, the discrete version has to coincide with the continuous one. That is,

\[
\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1}) \to \mathbf{P}(\mathbf{q}_n) \quad \text{as} \quad \mathbf{q}_{n+1} \to \mathbf{q}_n. \tag{10}
\]

(b) The $n \times (n-m)$ matrix $\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})$ has full rank and satisfies

\[
\mathsf{G}(\mathbf{q}_n, \mathbf{q}_{n+1})\, \mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1}) = \mathbf{0}. \tag{11}
\]
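This condition should be fulfilled at least for $\mathbf{q}_n, \mathbf{q}_{n+1} \in \mathsf{Q}$. Once a discrete null space matrix is at hand, the discrete constraint forces can be eliminated from the BEM scheme. To this end, pre-multiplication of (5b) by $\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})^T$ and use of property (b) yields the reduced scheme

\[
\begin{aligned}
\mathbf{q}_{n+1} - \mathbf{q}_n &= \frac{\Delta t}{2}\,(\mathbf{v}_n + \mathbf{v}_{n+1}), && \text{(12a)}\\
\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})^T \mathbf{M}(\mathbf{v}_{n+1} - \mathbf{v}_n) &= \Delta t\, \mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})^T \mathbf{f}(\mathbf{q}_n, \mathbf{q}_{n+1}), && \text{(12b)}\\
\boldsymbol{\Phi}(\mathbf{q}_{n+1}) &= \mathbf{0}. && \text{(12c)}
\end{aligned}
\]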
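The elimination step can be spelled out in one line. Writing $\mathsf{P}$ and $\mathsf{G}$ as shorthand for $\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})$ and $\mathsf{G}(\mathbf{q}_n, \mathbf{q}_{n+1})$, the constraint force term in (5b) drops out after pre-multiplication because of (11):

\[
\mathsf{P}^T \mathsf{G}^T \bar{\boldsymbol{\lambda}} = (\mathsf{G}\,\mathsf{P})^T \bar{\boldsymbol{\lambda}} = \mathbf{0}
\quad \Longrightarrow \quad
\mathsf{P}^T \mathbf{M}(\mathbf{v}_{n+1} - \mathbf{v}_n) = \Delta t\, \mathsf{P}^T \mathbf{f}(\mathbf{q}_n, \mathbf{q}_{n+1}).
\]

Since $\mathsf{P}$ has $n-m$ columns, (12b) provides $n-m$ equations, which together with the $m$ constraints (12c) match the $n$ unknowns $\mathbf{q}_{n+1}$ that remain once $\mathbf{v}_{n+1}$ has been substituted from (12a).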
A further size-reduction can be accomplished by a reparametrization of the remaining unknowns. For open-loop multibody systems it is generally feasible to choose n − m local coordinates (e.g. joint variables) for the parametrization
of the configuration space. Accordingly, use can be made of a mapping $\mathbf{F}: \mathbb{R}^{n-m} \to \mathsf{Q} \subset \mathbb{R}^n$, such that

\[
\mathbf{q}_{n+1} = \mathbf{F}(\mathbf{u}). \tag{13}
\]

Substitution from (13) into (12) yields $n-m$ algebraic equations for the determination of the new unknowns $\mathbf{u} \in \mathbb{R}^{n-m}$. Application of the two size-reduction steps outlined above yields a scheme which will be referred to as the reduced energy-momentum (REM) scheme in the following.
3 Rotationless Formulation of Rigid Bodies

The configuration of a rigid body in three-dimensional Euclidean space can be characterized by the placement of its center of mass $\boldsymbol{\varphi}(t) \in \mathbb{R}^3$ and a right-handed body frame $\{\mathbf{d}_I\}$, $\mathbf{d}_I(t) \in \mathbb{R}^3$ ($I = 1, 2, 3$), which specifies the orientation of the body (Fig. 1). The vectors $\mathbf{d}_I$ will occasionally be called directors. Let $\mathbf{X} = X_i \mathbf{e}_i$ (see footnote 1) be a material point which belongs to the reference configuration $V \subset \mathbb{R}^3$ of the rigid body. The spatial position of $\mathbf{X} \in V$ at time $t$ relative to an inertial Cartesian basis $\{\mathbf{e}_I\}$ can now be characterized by

\[
\mathbf{x}(\mathbf{X}, t) = \boldsymbol{\varphi}(t) + X_i \mathbf{d}_i(t). \tag{14}
\]
Fig. 1. Spatial rigid body

Footnote 1: In this work the summation convention applies to repeated lower-case Roman indices.

For simplicity we assume that the axes of the body frame are aligned with the principal axes of the body. Then the kinetic energy of the rigid body can be written as

\[
T = \frac{1}{2} M_\varphi \|\mathbf{v}_\varphi\|^2 + \frac{1}{2} \sum_{I=1}^{3} E_I \|\mathbf{v}_I\|^2, \tag{15}
\]

where $\mathbf{v}_\varphi = \dot{\boldsymbol{\varphi}}$, $\mathbf{v}_I = \dot{\mathbf{d}}_I$ and

\[
M_\varphi = \int_V \rho(\mathbf{X})\, dV, \tag{16a}
\]

\[
E_I = \int_V (X_I)^2 \rho(\mathbf{X})\, dV. \tag{16b}
\]

Here, $\rho(\mathbf{X})$ is the mass density at $\mathbf{X} \in V$, $M_\varphi$ is the total mass of the body and $E_I$ are the principal values of the Euler tensor with respect to the center of mass. Note that the spectral decomposition of the current Euler tensor with respect to the center of mass is given by

\[
\mathbf{E} = \sum_{I=1}^{3} E_I\, \mathbf{d}_I \otimes \mathbf{d}_I. \tag{17}
\]

The Euler tensor is symmetric positive definite and can be linked to the customary inertia tensor via the relationship

\[
\mathbf{J} = (\operatorname{tr} \mathbf{E})\, \mathbf{I} - \mathbf{E}. \tag{18}
\]
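Obviously, the configuration of the rigid body can be characterized by the following vector of redundant coordinates ($n = 12$):

\[
\mathbf{q} = \begin{bmatrix} \boldsymbol{\varphi} \\ \mathbf{d}_1 \\ \mathbf{d}_2 \\ \mathbf{d}_3 \end{bmatrix}. \tag{19}
\]

Due to the assumption of rigidity the body frame has to stay orthonormal for all times. Thus there are $m = 6$ independent internal constraints with associated constraint functions

\[
\boldsymbol{\Phi}_{\mathrm{int}}(\mathbf{q}) = \begin{bmatrix}
\tfrac{1}{2}[\mathbf{d}_1^T \mathbf{d}_1 - 1] \\
\tfrac{1}{2}[\mathbf{d}_2^T \mathbf{d}_2 - 1] \\
\tfrac{1}{2}[\mathbf{d}_3^T \mathbf{d}_3 - 1] \\
\mathbf{d}_1^T \mathbf{d}_2 \\
\mathbf{d}_1^T \mathbf{d}_3 \\
\mathbf{d}_2^T \mathbf{d}_3
\end{bmatrix}. \tag{20}
\]

The internal constraints thus give rise to the corresponding $6 \times 12$ constraint Jacobian

\[
\mathbf{G}_{\mathrm{int}}(\mathbf{q}) = \begin{bmatrix}
\mathbf{0}^T & \mathbf{d}_1^T & \mathbf{0}^T & \mathbf{0}^T \\
\mathbf{0}^T & \mathbf{0}^T & \mathbf{d}_2^T & \mathbf{0}^T \\
\mathbf{0}^T & \mathbf{0}^T & \mathbf{0}^T & \mathbf{d}_3^T \\
\mathbf{0}^T & \mathbf{d}_2^T & \mathbf{d}_1^T & \mathbf{0}^T \\
\mathbf{0}^T & \mathbf{d}_3^T & \mathbf{0}^T & \mathbf{d}_1^T \\
\mathbf{0}^T & \mathbf{0}^T & \mathbf{d}_3^T & \mathbf{d}_2^T
\end{bmatrix}. \tag{21}
\]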
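The constraint functions (20) and the Jacobian (21) translate almost literally into code. The following is a minimal sketch in plain Python; the flat-list storage of $\mathbf{q}$ and the helper names (`split`, `phi_int`, `g_int`, `numerical_jacobian`) are assumptions made for illustration. A central finite difference is included only as a check of (21).

```python
# Internal rigid-body constraints (20) and their Jacobian (21) for the
# redundant coordinates q = [phi, d1, d2, d3], stored as a flat list of 12 floats.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def split(q):
    """Return (phi, d1, d2, d3) as 3-element lists."""
    return q[0:3], q[3:6], q[6:9], q[9:12]

def phi_int(q):
    """Six orthonormality constraints, cf. (20)."""
    _, d1, d2, d3 = split(q)
    return [0.5 * (dot(d1, d1) - 1.0),
            0.5 * (dot(d2, d2) - 1.0),
            0.5 * (dot(d3, d3) - 1.0),
            dot(d1, d2), dot(d1, d3), dot(d2, d3)]

def g_int(q):
    """6 x 12 constraint Jacobian, cf. (21), as a list of rows."""
    _, d1, d2, d3 = split(q)
    z = [0.0, 0.0, 0.0]
    return [z + d1 + z + z,
            z + z + d2 + z,
            z + z + z + d3,
            z + d2 + d1 + z,
            z + d3 + z + d1,
            z + z + d3 + d2]

def numerical_jacobian(fun, q, h=1e-3):
    """Central-difference Jacobian, used only to cross-check g_int."""
    cols = []
    for j in range(len(q)):
        qp, qm = list(q), list(q)
        qp[j] += h
        qm[j] -= h
        cols.append([(a - b) / (2.0 * h) for a, b in zip(fun(qp), fun(qm))])
    return [[cols[j][i] for j in range(len(q))] for i in range(len(cols[0]))]
```

Because the constraints are at most quadratic, the central difference reproduces the analytical Jacobian up to round-off, which makes this a convenient sanity check when implementing (21).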
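Moreover, the kinetic energy expression (15) leads to the constant mass matrix

\[
\mathbf{M} = \begin{bmatrix}
M_\varphi \mathbf{I} & \mathbf{0} & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & E_1 \mathbf{I} & \mathbf{0} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & E_2 \mathbf{I} & \mathbf{0} \\
\mathbf{0} & \mathbf{0} & \mathbf{0} & E_3 \mathbf{I}
\end{bmatrix}, \tag{22}
\]

where $\mathbf{I}$ and $\mathbf{0}$ are the $3 \times 3$ identity and zero matrices. The equations of motion of the constrained system at hand can now be written in the form of the DAEs (1).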
4 Space Discretization of Geometrically Exact Beams

We next deal with a space finite element discretization of geometrically exact beams which fits well into the framework of the rotationless description of discrete mechanical systems outlined in Section 2. The kinematical assumptions pertaining to the underlying geometrically exact beam formulation give rise to holonomic constraints. We directly start with the semi-discrete beam formulation developed in [12]. Accordingly, the finite element discretization in space leads to nodal configuration vectors $\mathbf{q}^A \in \mathbb{R}^{12}$ of the form

\[
\mathbf{q}^A = \begin{bmatrix} \boldsymbol{\varphi}^A \\ \mathbf{d}_1^A \\ \mathbf{d}_2^A \\ \mathbf{d}_3^A \end{bmatrix}, \tag{23}
\]

where $A = 1, \ldots, n_{\mathrm{node}}$ (Fig. 2). Here, the position of each node is denoted by $\boldsymbol{\varphi}^A(t) \in \mathbb{R}^3$ and the nodal director vectors $\mathbf{d}_i^A(t) \in \mathbb{R}^3$ are assumed to stay orthonormal for all time. On a nodal basis the kinematics of the semi-discrete beam model is thus in complete analogy to the rotationless formulation of rigid bodies. Accordingly, to each node there are associated six internal constraints $\boldsymbol{\Phi}_{\mathrm{int}}(\mathbf{q}^A)$, where the corresponding vector-valued constraint function $\boldsymbol{\Phi}_{\mathrm{int}}: \mathbb{R}^{12} \to \mathbb{R}^6$ is given by (20).

Fig. 2. Configuration of a semi-discrete beam

In the beam formulation, $\mathbf{d}_1^A$ and $\mathbf{d}_2^A$ are assumed to span a principal basis for the cross-section of the beam at node $A$. The isoparametric finite element interpolations in space are given by

\[
\boldsymbol{\varphi}^h(s, t) = \sum_{A=1}^{n_{\mathrm{node}}} N_A(s)\, \boldsymbol{\varphi}^A(t)
\quad \text{and} \quad
\mathbf{d}_k^h(s, t) = \sum_{A=1}^{n_{\mathrm{node}}} N_A(s)\, \mathbf{d}_k^A(t), \tag{24}
\]

where $N_A(s)$ are Lagrange-type nodal shape functions and $s \in [0, L]$ is the arc-length of the reference curve of the beam in the reference configuration. Note that the interpolation of the director field in general does not preserve orthonormality of the director triad throughout the beam element even though orthonormality is enforced in the nodal points. This is a salient property of the present finite element approximation in space (see Bottasso et al. [15] and Romero [32] for related investigations).

With regard to (24), the configuration of the semi-discrete finite-dimensional mechanical system at hand is characterized by the configuration vector $\mathbf{q} \in \mathbb{R}^{12 n_{\mathrm{node}}}$, into which all the $A = 1, \ldots, n_{\mathrm{node}}$ nodal contributions (23) are assembled:

\[
\mathbf{q} = \begin{bmatrix} \mathbf{q}^1 \\ \vdots \\ \mathbf{q}^{n_{\mathrm{node}}} \end{bmatrix}. \tag{25}
\]

In the present case the potential energy function can be written in the form

\[
V(\mathbf{q}) = V_{\mathrm{int}}(\mathbf{q}) + V_{\mathrm{ext}}(\mathbf{q}), \tag{26}
\]
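where $V_{\mathrm{ext}}$ takes into account conservative external loads and $V_{\mathrm{int}}$ accounts for elastic deformations of the beam. In particular, the strain energy of the semi-discrete beam is assumed to be of the form

\[
V_{\mathrm{int}}(\mathbf{q}) = \int_0^L W\!\left( \boldsymbol{\Gamma}(\boldsymbol{\varphi}^h(s,t), \mathbf{d}_k^h(s,t)),\; \mathbf{K}(\boldsymbol{\varphi}^h(s,t), \mathbf{d}_k^h(s,t)) \right) ds, \tag{27}
\]

where $W(\boldsymbol{\Gamma}, \mathbf{K})$ is a strain energy density function expressed in terms of material strain measures. The strain measures of the underlying geometrically exact beam theory are given by

\[
\boldsymbol{\Gamma}(\boldsymbol{\varphi}, \mathbf{d}_k) = \Gamma_i \mathbf{e}_i, \quad
\mathbf{K}(\boldsymbol{\varphi}, \mathbf{d}_k) = K_i \mathbf{e}_i,
\quad \text{with} \quad
\Gamma_i = \mathbf{d}_i \cdot \boldsymbol{\varphi}_{,s} - \delta_{i3}, \quad
K_i = \tfrac{1}{2}\, \varepsilon_{ijk} \left[ \mathbf{d}_k \cdot \mathbf{d}_{j,s} - (\mathbf{d}_k \cdot \mathbf{d}_{j,s})\big|_{t=0} \right], \tag{28}
\]

where $\varepsilon_{ijk}$ is the alternating symbol. The kinetic energy of the semi-discrete beam can be written in the form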
\[
T(\dot{\mathbf{q}}) = \frac{1}{2} \sum_{A,B=1}^{n_{\mathrm{node}}} \left( M_{AB}\, \dot{\boldsymbol{\varphi}}^A \cdot \dot{\boldsymbol{\varphi}}^B + M_{AB}^1\, \dot{\mathbf{d}}_1^A \cdot \dot{\mathbf{d}}_1^B + M_{AB}^2\, \dot{\mathbf{d}}_2^A \cdot \dot{\mathbf{d}}_2^B \right), \tag{29}
\]

where the components of the consistent mass matrix are given by

\[
M_{AB} = \int_0^L A_\rho N_A N_B\, ds
\quad \text{and} \quad
M_{AB}^\alpha = \int_0^L M_\rho^\alpha N_A N_B\, ds. \tag{30}
\]

Here, $A_\rho$ is the mass density per reference length and $M_\rho^1$, $M_\rho^2$ can be interpreted as principal mass-moments of inertia of the cross-section. Note that due to the time-independence of the inertial quantities $A_\rho$ and $M_\rho^\alpha$, the components of the mass matrix in (30) are constant, thus leading to the constant mass matrix in (1). It is worth noting that in the present case the mass matrix is symmetric but not positive-definite. This degenerate feature is caused by the fact that there is no inertia associated with the nodal director velocities $\dot{\mathbf{d}}_3^A$. We remark, however, that the degenerate mass matrix does not disturb the present time discretization. To summarize, the motion of the present finite-dimensional mechanical system with holonomic constraints is governed by the DAEs in (1). Consequently, for the free beam, the number of unknowns amounts to $18 n_{\mathrm{node}}$.

4.1 Application of the Discrete Null Space Method

To apply the REM scheme outlined in Section 2.1, we introduce the nodal discrete null space matrix

\[
\mathsf{P}^A(\mathbf{q}_n, \mathbf{q}_{n+1}) = \begin{bmatrix}
\mathbf{I} & \mathbf{0} \\
\mathbf{0} & \widehat{(\mathbf{d}_1^A)}_{n+\frac{1}{2}} \\
\mathbf{0} & \widehat{(\mathbf{d}_2^A)}_{n+\frac{1}{2}} \\
\mathbf{0} & \widehat{(\mathbf{d}_3^A)}_{n+\frac{1}{2}}
\end{bmatrix}. \tag{31}
\]

Due to the aforementioned similarity on a nodal basis with the rotationless formulation of rigid bodies, the above matrix coincides with that of the free rigid body devised in [6]. The assembly of the $A = 1, \ldots, n_{\mathrm{node}}$ nodal discrete null space matrices yields in a straightforward way the discrete null space matrix of the discrete beam. To perform the second size-reduction step, the nodal directors are expressed in terms of nodal incremental rotations $\mathbf{u}^A \in \mathbb{R}^3$ such that $(\mathbf{d}_i^A)_{n+1} = \mathbf{R}(\mathbf{u}^A)(\mathbf{d}_i^A)_n$. Consequently, the nodal configuration variables $\mathbf{q}^A_{n+1} \in \mathbb{R}^{12}$ are replaced by the new nodal degrees of freedom $(\boldsymbol{\varphi}^A_{n+1}, \mathbf{u}^A) \in \mathbb{R}^6$. To summarize, the size-reduction outlined above leads to a significant reduction of the number of unknowns. Specifically, the reduced scheme for the free beam employs $6 n_{\mathrm{node}}$ degrees of freedom.
We further remark that the reduced scheme for the free beam can be shown to be equivalent to the energy-momentum method proposed by Romero and Armero [34].
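Returning to the director interpolation (24), the remark that orthonormality is lost between the nodes can be illustrated with a small experiment. The sketch below assumes a two-node element with linear shape functions and a relative rotation of 90 degrees about $\mathbf{e}_3$ between the nodal frames; these choices, like the function names, are illustrative assumptions and not taken from the text.

```python
import math

# Linear interpolation of two orthonormal nodal director frames, cf. (24),
# and the resulting violation of the orthonormality constraints (20).

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def frame_rotated_about_e3(angle):
    """Director frame [d1, d2, d3] obtained by rotating the inertial basis about e3."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, s, 0.0], [-s, c, 0.0], [0.0, 0.0, 1.0]]

def interpolate_frame(frame_a, frame_b, xi):
    """Two-node interpolation with N1 = 1 - xi, N2 = xi, xi in [0, 1]."""
    n1, n2 = 1.0 - xi, xi
    return [[n1 * a + n2 * b for a, b in zip(da, db)]
            for da, db in zip(frame_a, frame_b)]

def orthonormality_defect(frame):
    """Largest absolute value among the six constraint functions (20)."""
    d1, d2, d3 = frame
    phi = [0.5 * (dot(d1, d1) - 1.0), 0.5 * (dot(d2, d2) - 1.0),
           0.5 * (dot(d3, d3) - 1.0),
           dot(d1, d2), dot(d1, d3), dot(d2, d3)]
    return max(abs(c) for c in phi)
```

In this setup the defect vanishes at both nodes, while at the element mid-point the interpolated directors $\mathbf{d}_1^h$ and $\mathbf{d}_2^h$ have squared length 1/2, so the defect is 0.25. The constraints are therefore only meaningful at the nodes, which is where the formulation enforces them.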
5 Space Discretization of Geometrically Exact Shells

We next aim at a specific space discretization of geometrically nonlinear shells which yields semi-discrete equations of motion that fit into the framework of the rotationless formulation outlined in Section 2. In essence, the proposed approach is in complete analogy to the treatment of geometrically exact beams presented in the last section. We start from the stress resultant shell model used in Simo et al. [37] which relies on classical Reissner-Mindlin kinematics. We remark, however, that the present approach can be directly applied to any degenerate continuum (or continuum-based) $C^0$ shell element. The present approach essentially relies on the interpolation of the inextensible director field $\mathbf{t} \in S^2$ (the unit sphere) which may be written as

\[
\mathbf{t}^h(\xi_1, \xi_2, t) = \sum_{A=1}^{N_{\mathrm{node}}} N_A(\xi_1, \xi_2)\, \mathbf{t}_A(t). \tag{32}
\]

Here, $N_A$ are standard Lagrange-type nodal shape functions and $\xi_1, \xi_2$ are curvilinear coordinates which provide a parametrization of the reference surface of the shell. Similarly, the interpolation of the reference surface of the shell is given by

\[
\boldsymbol{\varphi}^h(\xi_1, \xi_2, t) = \sum_{A=1}^{N_{\mathrm{node}}} N_A(\xi_1, \xi_2)\, \boldsymbol{\varphi}_A(t), \tag{33}
\]

where $\boldsymbol{\varphi}_A$ is the placement of node $A$ on the reference surface of the shell. It is worth mentioning that the director interpolation is a characteristic feature of degenerate shell elements (cf. Büchter and Ramm [17]). In this connection the kinematic assumption of inextensibility is typically imposed directly on the nodal directors by introducing rotational parameters for $S^2$ (see, for example, Crisfield [18, Section 8.2]). An inherent property of the director interpolation (32) is that the constraints on the director field are relaxed to the nodes of the mesh.

If rotational degrees of freedom are introduced for the description of $\mathbf{t}_A \in S^2$, it is natural to make use of nodal angular velocities and accelerations in the design of time-stepping schemes. Representative examples can be found in Simo et al. [37] and Belytschko et al. [3, Section 9.5.20]. In contrast to that, the present discretization approach for nonlinear shells does not rely on the use of rotational parameters. Rather, redundant coordinates are used which have to satisfy discrete constraint equations. Relaxing the kinematic shell constraints to the nodes $A \in \{1, \ldots, N_{\mathrm{node}}\}$, the equations of motion pertaining to the semi-discrete shell formulation can again be written in the form of the DAEs (1). With regard to (32) and (33), the finite element discretization in space leads to nodal configuration vectors $\mathbf{q}_A \in \mathbb{R}^6$ of the form
\[
\mathbf{q}_A = \begin{bmatrix} \boldsymbol{\varphi}_A \\ \mathbf{t}_A \end{bmatrix}. \tag{34}
\]

Similar to (25), the configuration vector corresponding to the semi-discrete shell model results from the assembly of all nodal contributions. For hyperelastic shells the potential energy function can again be written in the form (26) with

\[
V_{\mathrm{int}}(\mathbf{q}) = \int_{\mathcal{A}} W\!\left( \boldsymbol{\varphi}^h_{,\alpha} \cdot \boldsymbol{\varphi}^h_{,\beta},\; \boldsymbol{\varphi}^h_{,\alpha} \cdot \mathbf{t}^h_{,\beta},\; \boldsymbol{\varphi}^h_{,\alpha} \cdot \mathbf{t}^h \right) dA, \tag{35}
\]

where $W$ is a strain energy density function depending on the frame-indifferent shell strain measures. The kinetic energy of the semi-discrete shell is given by

\[
T(\dot{\mathbf{q}}) = \frac{1}{2} \sum_{A,B=1}^{N_{\mathrm{node}}} \left( M_{AB}^{\varphi}\, \dot{\boldsymbol{\varphi}}_A \cdot \dot{\boldsymbol{\varphi}}_B + M_{AB}^{t}\, \dot{\mathbf{t}}_A \cdot \dot{\mathbf{t}}_B \right), \tag{36}
\]

where the components of the consistent mass matrix are given by

\[
M_{AB}^{\varphi} = \int_{\mathcal{A}} A_\rho N_A N_B\, dA
\quad \text{and} \quad
M_{AB}^{t} = \int_{\mathcal{A}} I_\rho N_A N_B\, dA. \tag{37}
\]

Here, $A_\rho$ is the (time-independent) nominal surface density and $I_\rho$ is the (time-independent) nominal rotational inertia of the shell. Accordingly, the components in (37) are constant, thus leading to the constant mass matrix in (1). In the context of semi-discrete shells the holonomic constraints in (1c) can be classified according to the following geometric restrictions:

1. For smooth shell domains the nodal inextensibility condition needs to be imposed. Let $\eta$ be the index set of nodes belonging to smooth shell domains. Then the constraint function corresponding to a node $A \in \eta$ is given by

\[
\Phi_A = \tfrac{1}{2}\left( \|\mathbf{t}_A\|^2 - 1 \right). \tag{38}
\]

2. In the case of shell intersections a rigid connection between the adjacent shells is assumed. Let $\bar{\eta}$ denote the index set of nodes belonging to shell intersections and $e_1, e_2$ be the numbers of two adjacent elements which share node $A \in \bar{\eta}$ with directors $\mathbf{t}_A^{e_1}, \mathbf{t}_A^{e_2} \in \mathbb{R}^3$. Then the intersection gives rise to the following vector of constraint functions

\[
\boldsymbol{\Phi}_A = \begin{bmatrix}
\tfrac{1}{2}\left( \|\mathbf{t}_A^{e_1}\|^2 - 1 \right) \\
\tfrac{1}{2}\left( \|\mathbf{t}_A^{e_2}\|^2 - 1 \right) \\
\mathbf{t}_A^{e_1} \cdot \mathbf{t}_A^{e_2} - \cos \alpha_0
\end{bmatrix}, \tag{39}
\]

where $\alpha_0$ is the angle between $\mathbf{t}_A^{e_1}$ and $\mathbf{t}_A^{e_2}$ in the initial configuration.

3. Standard Dirichlet boundary conditions imposed on the boundary of the shell structure lead to linear constraint equations so that the corresponding constraint Jacobian is constant. Standard finite element techniques can be applied to account for this type of constraints.
4. In the context of flexible multibody systems a shell component may be connected to other components (e.g. rigid bodies, beams and shells) by various types of joints. The DAEs (1) can easily accommodate the additional joint constraints. In the case of lower kinematic pairs (e.g. spherical, revolute or prismatic joints) it can be shown that the corresponding constraint functions are at most quadratic (see [6] for a detailed treatment of kinematic pairs in the context of the rotationless formulation). This advantageous property is a consequence of the advocated uniform description of all multibody components by means of the DAEs (1).

5.1 Application of the Discrete Null Space Method

We next aim at the application of the REM scheme outlined in Section 2.1 to semi-discrete shells. We will illustrate this approach for the case of smooth shells (constraints of the above type 1). On a nodal basis, the relevant kinematical relationships are in complete analogy to the spherical pendulum. Thus, for simplicity of exposition, we shall provide the details for the example of the rotationless formulation of a spherical pendulum.

Rotationless Formulation of a Spherical Pendulum

To demonstrate the application of the discrete null space method we first consider the simple example of a spherical pendulum. As outlined above, the application of the discrete null space method to the pendulum is closely related to the treatment of the nodal directors in the case of smooth shells (constraint type 1). Thus the approach to the pendulum can be directly transferred to semi-discrete shells on a nodal basis. Consider a point mass $m$ suspended by a massless rigid rod of unit length (Fig. 3). In an inertial frame, the placement of $m$ in three-dimensional space is characterized by the position vector $\mathbf{q} \in \mathbb{R}^3$, subject to the constraint
\[
\Phi(\mathbf{q}) = \tfrac{1}{2}\left( \|\mathbf{q}\|^2 - 1 \right) = 0. \tag{40}
\]

Fig. 3. Spherical pendulum
Thus, due to the rigidity of the rod the configuration space of the pendulum is the unit sphere, i.e. $\mathsf{Q} = S^2$. At this point one may resort to local coordinates for the parametrization of the unit sphere. Then Lagrange’s equations (of the second kind) may be applied to yield the equations of motion in the form of second-order ODEs. In contrast to that, the rotationless formulation is based on the DAEs (1). In this connection the mass matrix is given by $\mathbf{M} = \operatorname{diag}(m, m, m)$, and the potential due to a gravitational field reads $V(\mathbf{q}) = -m\, \mathbf{g}^T \mathbf{q}$, where $\mathbf{g} \in \mathbb{R}^3$ is the vector of gravitational acceleration. The constraint gives rise to the constraint Jacobian

\[
\mathbf{G} = D\Phi(\mathbf{q}) = \mathbf{q}^T. \tag{41}
\]
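With these ingredients the BEM scheme (5) for the pendulum can be written down explicitly. Since $V$ is linear and $\Phi$ is quadratic, both discrete gradients reduce to mid-point evaluations. The sketch below is an illustration under assumptions, not the implementation of [4]: it eliminates $\mathbf{v}_{n+1}$ via (5a), which leaves $(\alpha + \mu)\,\mathbf{q}_{n+1} = \mathbf{c} - \mu\, \mathbf{q}_n$ with $\alpha = 2m/\Delta t$, $\mu = \Delta t\, \bar{\lambda}/2$ and $\mathbf{c} = \alpha \mathbf{q}_n + 2m \mathbf{v}_n + \Delta t\, m \mathbf{g}$, and then determines the scalar $\mu$ from the constraint (5c) by Newton's method. The initial data, step size and function names are hypothetical.

```python
# Basic energy-momentum (BEM) step (5) specialised to the spherical pendulum.
# Assumed setup for illustration: unit rod length, constant gravity g.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def bem_step(q, v, dt, m, g):
    """Advance (q, v) by one step; returns (q_new, v_new, multiplier)."""
    alpha = 2.0 * m / dt
    c = [alpha * qi + 2.0 * m * vi + dt * m * gi for qi, vi, gi in zip(q, v, g)]
    s, cq, cc = dot(q, q), dot(c, q), dot(c, c)
    mu = 0.0  # mu = dt * multiplier / 2
    for _ in range(20):
        # residual of |c - mu*q|^2 = (alpha + mu)^2, i.e. |q_new| = 1
        r = (s - 1.0) * mu * mu - 2.0 * (cq + alpha) * mu + cc - alpha * alpha
        dr = 2.0 * (s - 1.0) * mu - 2.0 * (cq + alpha)
        step = r / dr
        mu -= step
        if abs(step) < 1e-14 * (1.0 + abs(mu)):
            break
    q_new = [(ci - mu * qi) / (alpha + mu) for ci, qi in zip(c, q)]
    v_new = [2.0 * (qn - qi) / dt - vi for qn, qi, vi in zip(q_new, q, v)]  # (5a)
    return q_new, v_new, 2.0 * mu / dt

def energy(q, v, m, g):
    return 0.5 * m * dot(v, v) - m * dot(g, q)

def simulate(n_steps=2000, dt=0.05, m=1.0, g=(0.0, 0.0, -9.81)):
    """Return the largest energy drift and constraint violation over the run."""
    q, v = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
    e0 = energy(q, v, m, g)
    max_de = max_dphi = 0.0
    for _ in range(n_steps):
        q, v, _lam = bem_step(q, v, dt, m, g)
        max_de = max(max_de, abs(energy(q, v, m, g) - e0))
        max_dphi = max(max_dphi, abs(0.5 * (dot(q, q) - 1.0)))
    return max_de, max_dphi
```

Energy conservation follows from a short calculation: the change in kinetic energy equals $(\mathbf{q}_{n+1} - \mathbf{q}_n) \cdot (m\mathbf{g} - \bar{\lambda}\, \mathbf{q}_{n+\frac{1}{2}})$, and the multiplier term vanishes because $\mathbf{q}_{n+\frac{1}{2}} \cdot (\mathbf{q}_{n+1} - \mathbf{q}_n) = \tfrac{1}{2}(\|\mathbf{q}_{n+1}\|^2 - \|\mathbf{q}_n\|^2) = 0$. This is the discrete counterpart of the worklessness of the constraint force.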
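Consequently, due to the consistency condition $\frac{d}{dt}\Phi = \mathbf{G}\mathbf{v} = 0$, admissible velocities $\mathbf{v} \in \mathbb{R}^3$ are restricted to the tangent space $T_{\mathbf{q}}\mathsf{Q}$ which coincides with the null space of $\mathbf{G} = \mathbf{q}^T$, i.e.

\[
T_{\mathbf{q}}\mathsf{Q} = \operatorname{null}(\mathbf{G}(\mathbf{q})). \tag{42}
\]

Since $\dim(\operatorname{null}(\mathbf{q}^T)) = 2$, two independent ‘generalized’ speeds $\nu_1, \nu_2 \in \mathbb{R}$ can be used to characterize $\mathbf{v} \in T_{\mathbf{q}}\mathsf{Q}$. Let $\mathbf{q}_1, \mathbf{q}_2 \in \mathbb{R}^3$ span a basis for $T_{\mathbf{q}}\mathsf{Q}$. Then the $3 \times 2$ matrix

\[
\mathbf{P}(\mathbf{q}) = [\mathbf{q}_1, \mathbf{q}_2] \tag{43}
\]

constitutes a null space matrix for the problem at hand. Accordingly, admissible velocities may be written as $\mathbf{v} = \mathbf{P}(\mathbf{q})\boldsymbol{\nu}$. Moreover, it is obvious that $\mathbf{G}\mathbf{P} = \mathbf{0}$. A viable choice of $\mathbf{P}$ results from choosing $\mathbf{q}_1, \mathbf{q}_2 \in \mathbb{R}^3$ in such a way that $\mathbf{q}_1, \mathbf{q}_2, \mathbf{q} \equiv \mathbf{q}_3$ are mutually orthonormal vectors. Accordingly, there exists a proper orthogonal matrix $\boldsymbol{\Lambda} \in SO(3)$, such that

\[
\mathbf{q}_i = \boldsymbol{\Lambda}\mathbf{e}_i, \quad \text{and} \quad \boldsymbol{\Lambda} = [\mathbf{P}, \mathbf{q}]. \tag{44}
\]

Note that the above introduced independent ‘generalized’ speeds may be used to express admissible velocities in the form $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{q}$, with angular velocity vector $\boldsymbol{\omega} = \omega_\alpha \mathbf{q}_\alpha$, and $[\omega_1, \omega_2] = [-\nu_2, \nu_1]$.

Steps Towards the REM Scheme

To perform a first size-reduction step within the discrete setting of the BEM scheme (5), we seek a proper discrete version $\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})$ of the null space matrix. In particular we aim at an explicit representation of $\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1})$. To achieve this, we require the fulfillment of the following design conditions:

\[
\begin{aligned}
\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1}) &\to \mathbf{P}(\mathbf{q}_n) \quad \text{as} \quad \mathbf{q}_{n+1} \to \mathbf{q}_n, && \text{(45a)}\\
\mathsf{G}(\mathbf{q}_n, \mathbf{q}_{n+1})\, \mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1}) &= \mathbf{0}. && \text{(45b)}
\end{aligned}
\]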
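Before turning to the discrete version, the continuous null space matrix (43) can be checked numerically. The sketch below builds one possible orthonormal tangent basis $\mathbf{q}_1, \mathbf{q}_2$ with $\mathbf{q}_1 \times \mathbf{q}_2 = \mathbf{q}$, so that $[\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}]$ is proper orthogonal as in (44). The particular construction via a helper axis is an assumption made for illustration; any basis with these properties would do.

```python
import math

# Continuous null space matrix P(q) = [q1, q2] for the spherical pendulum, cf. (43)-(44).

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def tangent_basis(q):
    """Orthonormal q1, q2 tangent to the unit sphere at q, with q1 x q2 = q."""
    k = min(range(3), key=lambda i: abs(q[i]))  # axis least aligned with q
    h = [0.0, 0.0, 0.0]
    h[k] = 1.0
    q1 = normalize(cross(h, q))
    q2 = cross(q, q1)
    return q1, q2

def admissible_velocity(q, nu):
    """v = P(q) nu with generalized speeds nu = (nu1, nu2)."""
    q1, q2 = tangent_basis(q)
    return [nu[0] * a + nu[1] * b for a, b in zip(q1, q2)]

def angular_velocity(q, nu):
    """omega = omega_alpha q_alpha with [omega1, omega2] = [-nu2, nu1]."""
    q1, q2 = tangent_basis(q)
    return [-nu[1] * a + nu[0] * b for a, b in zip(q1, q2)]
```

For any unit vector `q` and speeds `nu`, the velocity returned by `admissible_velocity` is orthogonal to `q`, which is the statement $\mathbf{G}\mathbf{P} = \mathbf{0}$, and it coincides with `cross(angular_velocity(q, nu), q)`, in line with the relation $\mathbf{v} = \boldsymbol{\omega} \times \mathbf{q}$ stated above.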
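Note that in the present example the discrete constraint Jacobian is given by $\mathsf{G}(\mathbf{q}_n, \mathbf{q}_{n+1}) = \mathbf{q}_{n+\frac{1}{2}}^T$. It can be easily verified that a proper choice of the discrete null space matrix reads

\[
\mathsf{P}(\mathbf{q}_n, \mathbf{q}_{n+1}) = [\tilde{\mathbf{q}}_1, \tilde{\mathbf{q}}_2], \tag{46}
\]

with

\[
\tilde{\mathbf{q}}_\alpha = (\mathbf{q}_\alpha)_n - \frac{(\mathbf{q}_\alpha)_n^T\, \mathbf{q}_{n+\frac{1}{2}}}{\mathbf{q}_n^T\, \mathbf{q}_{n+\frac{1}{2}}}\; \mathbf{q}_n \tag{47}
\]

(cf. [4, Section 4.2.3]). We refer to [4, Table 3] for details of the implementation of the resulting scheme. A further size-reduction of the algebraic system to be solved can be achieved by introducing new incremental unknowns $\mathbf{u} \in U \subset \mathbb{R}^{n-m}$ via a reparametrization of the form

\[
\mathbf{q}_{n+1} = \mathbf{F}_{\mathbf{q}_n}(\mathbf{u}). \tag{48}
\]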
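Returning to the discrete null space matrix (46)-(47), both design conditions (45) can be verified numerically with a few lines of code. The sketch below is self-contained; the tangent basis construction and all function names are assumptions made for illustration, while `discrete_null_space` follows (47) term by term.

```python
import math

# Discrete null space matrix (46)-(47) for the spherical pendulum.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def cross(a, b):
    return [a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0]]

def normalize(a):
    n = math.sqrt(dot(a, a))
    return [x / n for x in a]

def tangent_basis(q):
    """Orthonormal q1, q2 tangent to the unit sphere at q (assumed construction)."""
    k = min(range(3), key=lambda i: abs(q[i]))
    h = [0.0, 0.0, 0.0]
    h[k] = 1.0
    q1 = normalize(cross(h, q))
    return q1, cross(q, q1)

def discrete_null_space(q_n, q_np1, basis_n):
    """Columns q~_alpha of P(q_n, q_{n+1}) according to (47), plus the mid-point."""
    q_mid = [0.5 * (a + b) for a, b in zip(q_n, q_np1)]
    denom = dot(q_n, q_mid)  # nonzero unless q_{n+1} = -q_n
    cols = []
    for qa in basis_n:
        factor = dot(qa, q_mid) / denom
        cols.append([x - factor * y for x, y in zip(qa, q_n)])
    return cols, q_mid
```

Condition (45b) holds because $\mathbf{q}_{n+\frac{1}{2}}^T \tilde{\mathbf{q}}_\alpha = (\mathbf{q}_\alpha)_n^T \mathbf{q}_{n+\frac{1}{2}} - (\mathbf{q}_\alpha)_n^T \mathbf{q}_{n+\frac{1}{2}} = 0$ by construction, and condition (45a) holds because $(\mathbf{q}_\alpha)_n^T \mathbf{q}_n = 0$, so the correction term vanishes as $\mathbf{q}_{n+1} \to \mathbf{q}_n$.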
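The mapping $\mathbf{F}_{\mathbf{q}_n}: U \to \mathsf{Q}$ represents a local parametrization of the configuration space in a neighborhood of $\mathbf{q}_n \in \mathsf{Q}$. Note that (48) indicates that $\mathbf{q}_n = \mathbf{F}_{\mathbf{q}_n}(\mathbf{0})$. As a consequence of (48) the discrete constraint equations (5c) are identically satisfied. For the spherical pendulum two rotational parameters $u_1, u_2 \in \mathbb{R}$ can be chosen for the parametrization of the unit sphere. Specifically, we make use of the exponential map $\exp_{\mathbf{q}}: T_{\mathbf{q}}S^2 \to S^2$, such that

\[
\mathbf{q}_{n+1} = \exp_{\mathbf{q}_n}(\mathbf{P}(\mathbf{q}_n)\mathbf{u}) \tag{49}
\]

or

\[
\mathbf{q}_{n+1} = \cos(\|\mathbf{P}(\mathbf{q}_n)\mathbf{u}\|)\, \mathbf{q}_n + \frac{\sin(\|\mathbf{P}(\mathbf{q}_n)\mathbf{u}\|)}{\|\mathbf{P}(\mathbf{q}_n)\mathbf{u}\|}\, \mathbf{P}(\mathbf{q}_n)\mathbf{u}. \tag{50}
\]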
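It is worth mentioning that the update of the incremental unknowns within the iterative solution procedure can be performed in a standard additive manner, i.e.

\[
\mathbf{u}^{(k)} = \mathbf{u}^{(k-1)} + \Delta\mathbf{u}^{(k)}. \tag{51}
\]

Finally, once convergence has been attained after $\tilde{k}$ iterations, the update of the null space matrix $\mathbf{P}(\mathbf{q}_n)$ can be performed via

\[
\mathbf{P}(\mathbf{q}_{n+1}) = \exp(\hat{\boldsymbol{\theta}}^{(\tilde{k})})\, \mathbf{P}(\mathbf{q}_n), \tag{52}
\]

where $\hat{\boldsymbol{\theta}}^{(\tilde{k})}$ is a skew-symmetric matrix with associated axial vector $\boldsymbol{\theta}^{(\tilde{k})} \in \mathbb{R}^3$, which represents the incremental rotation given by

\[
\boldsymbol{\theta}^{(\tilde{k})} = \mathbf{P}(\mathbf{q}_n)\, \mathbf{J}\, \mathbf{u}^{(\tilde{k})}
\quad \text{with} \quad
\mathbf{J} = \begin{bmatrix} 0 & -1 \\ 1 & 0 \end{bmatrix}. \tag{53}
\]

Moreover, in (52), $\exp(\hat{\boldsymbol{\theta}}^{(\tilde{k})}) \in SO(3)$ can be calculated via the well-known Rodrigues formula

\[
\exp(\hat{\boldsymbol{\theta}}) = \mathbf{I} + \frac{\sin \|\boldsymbol{\theta}\|}{\|\boldsymbol{\theta}\|}\, \hat{\boldsymbol{\theta}} + \frac{1}{2} \left( \frac{\sin(\|\boldsymbol{\theta}\|/2)}{\|\boldsymbol{\theta}\|/2} \right)^{2} \hat{\boldsymbol{\theta}}^2. \tag{54}
\]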
We finally remark that the reduced scheme thus obtained can be implemented as described in [4, Table 4].
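The update formulas (50) and (52)-(54) fit together in a way that is easy to check numerically: rotating $\mathbf{q}_n$ with the Rodrigues formula for the incremental rotation (53) gives the same point as the exponential map (50), and the rotated basis vectors remain tangent at $\mathbf{q}_{n+1}$. The following sketch illustrates this under stated assumptions. It is not the implementation of [4, Table 4], the function names are hypothetical, and the null space matrix is passed as its two columns `q1`, `q2`.

```python
import math

# Exponential map on the unit sphere (50), incremental rotation (53)
# and Rodrigues formula (54) for the spherical pendulum.

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def hat(t):
    """Skew-symmetric matrix with axial vector t."""
    return [[0.0, -t[2], t[1]], [t[2], 0.0, -t[0]], [-t[1], t[0], 0.0]]

def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)] for i in range(3)]

def matvec(a, x):
    return [dot(row, x) for row in a]

def rodrigues(theta):
    """exp(hat(theta)) according to (54), with the small-angle limit handled."""
    th = math.sqrt(dot(theta, theta))
    if th > 1e-12:
        a = math.sin(th) / th
        b = 0.5 * (math.sin(0.5 * th) / (0.5 * th)) ** 2
    else:
        a, b = 1.0, 0.5
    k = hat(theta)
    k2 = matmul(k, k)
    return [[(1.0 if i == j else 0.0) + a * k[i][j] + b * k2[i][j] for j in range(3)]
            for i in range(3)]

def exp_map_sphere(q, q1, q2, u):
    """q_{n+1} according to (50), with P(q_n) = [q1, q2]."""
    w = [u[0] * a + u[1] * b for a, b in zip(q1, q2)]
    nw = math.sqrt(dot(w, w))
    if nw < 1e-12:
        return list(q)
    return [math.cos(nw) * qi + math.sin(nw) / nw * wi for qi, wi in zip(q, w)]

def incremental_rotation(q1, q2, u):
    """theta = P J u according to (53); J u = (-u2, u1)."""
    return [-u[1] * a + u[0] * b for a, b in zip(q1, q2)]
```

The agreement rests on the identity $\boldsymbol{\theta} \times \mathbf{q}_n = \mathbf{P}(\mathbf{q}_n)\mathbf{u}$ for $\boldsymbol{\theta}$ from (53), which holds when $[\mathbf{q}_1, \mathbf{q}_2, \mathbf{q}_n]$ is a right-handed orthonormal triad. Since $\boldsymbol{\theta}$ is orthogonal to $\mathbf{q}_n$, the Rodrigues formula applied to $\mathbf{q}_n$ then collapses to (50).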
6 Numerical Examples

6.1 Planar Parallel Manipulator

We first consider the example of a free-floating parallel manipulator (Fig. 4). The two platforms of the manipulator are considered to be rigid, whereas the three legs are discretized by means of elastic beam elements. Three internal torques act on the revolute joints attached to the lower (larger) platform to initiate the motion. In particular, starting at rest, the torques are applied in the form of a hat function over time. Since no external loads are present, the angular momentum is a conserved quantity. Moreover, after the initial load period the total energy of the system is a first integral of the motion too.

The first calculation deals with the rigid body limit. That is, the beam stiffness parameters are chosen to be quite large. This approach can be regarded as a penalty method for the enforcement of the rigidity of the two parts of each leg. Comparison is made to a second model of the manipulator consisting uniformly of rigid bodies (cf. [13]). Our simulations confirm that the present energy-momentum scheme is able to reproduce the rigid body limit for any step size. Snapshots of the calculated motion are depicted in Fig. 5. To illustrate the robustness of the present method we next choose the beam stiffness parameters to get a highly flexible response. The resulting large deformations of the legs can be observed from Fig. 5.
Fig. 4. Schematic of the planar parallel manipulator (labels: rigid bodies, flexible beams)

Fig. 5. Planar parallel manipulator: snapshots of the motion for the case of the rigid body limit and for the highly flexible case

Fig. 6. Planar parallel manipulator: illustration of the algorithmic conservation properties: energy (left) and angular momentum (right)
Fig. 7. Tumbling cylinder: snapshots of the motion
Finally, we note that the relevant conservation laws are exactly reproduced by the energy-momentum scheme, independent of the step size. The algorithmic conservation properties are exemplified in Fig. 6.

6.2 Tumbling Cylinder

This example of a (smooth) cylindrical shell has been taken from Simo and Tarnow [39]. Starting at rest, the motion is initialized by applying external loads in the form of a hat function over time. The motion is illustrated with some snapshots in Fig. 7. For t > 10 no external forces act on the cylinder any longer, so that the algorithmic conservation properties can be checked. The present energy-momentum scheme does indeed conserve the total energy as well as the total angular momentum. For example, Fig. 8 shows algorithmic conservation of the energy (solid red line). In contrast to that, the mid-point rule shows a typical blow-up behavior (dash-dotted curves). Similar examples are often used in nonlinear structural dynamics to demonstrate the superior numerical stability properties of conserving time-stepping schemes.
Fig. 8. Tumbling cylinder: illustration of the algorithmic energy conservation of the present energy-momentum scheme (black lines). The grey curves show the blow-up behavior of the standard mid-point rule. For the simulations a step-size of ∆t = 0.5 has been used. (Plot title: midpoint rule vs. energy-momentum scheme; total, kinetic and potential energy E over time t.)
Fig. 9. Schematic of the flexible multibody system (legend: rigid body, beam elements, shell elements, actuated revolute joints, free revolute joints, applied forces F)
6.3 Flexible Multibody System The last example deals with the flexible multibody model of a satellite (Fig. 9). The satellite is comprised of a rigid body as well as nonlinear beam and shell components. The complete system is modelled by applying the present rotationless approach. A specific coordinate augmentation technique developed in [13] (see also [41]) is used to actuate the revolute joints connecting the flexible panels of the satellite with its center part consisting of beams and a rigid body (cf. Fig. 9). Starting at rest, a pair of forces is applied for t ∈ [0, 0.3] (cf. Fig. 10, left diagram). After t = 0.3 no external forces act on the system anymore such that the total angular momentum is a conserved quantity. For t ∈ [2.5, 3.5] joint-torques are applied to effect the evolution of the corresponding joint-angles depicted in Fig. 10 (right diagram). This causes a relative motion of the panels relative to the center part of the satellite. The
[Figure 10 plots: external loads F versus t (left); angle of panels Θ versus t (right); t ∈ [0, 10]]
Fig. 10. Flexible multibody system: application of external forces (left) and actuated joint angles (right)
Fig. 11. Flexible multibody system: snapshots of deformed configurations
[Figure 12 plots: energy E versus t (Etot, Ekin, Epot for the mid-point rule and the EM scheme) and angular momentum l versus t (components Ix, Iy, Iz for both schemes); t ∈ [0, 10]]
Fig. 12. Flexible multibody system: energy and angular momentum plot of both the mid-point rule (dash-dotted curves) and the energy momentum scheme (solid curves)
simulated motion is illustrated with some snapshots of deformed configurations in Fig. 11. It can be observed from Fig. 12 that the present energy-momentum scheme correctly conserves the total energy (for 0.3 ≤ t ≤ 2.5 and t ≥ 3.5) as well as total angular momentum (for t ≥ 0.3). For the simulation a step-size of ∆t = 0.05 has been used.
7 Conclusions
We have presented a rotationless formulation of flexible multibody dynamics. The main feature of the proposed formulation is the simple and uniform structure of the underlying DAEs. Indications of the inherent simplicity of the rotationless formulation are the absence of transcendental functions, a constant mass matrix and momentum maps that are at most quadratic in the state space coordinates. These features facilitate the design of energy-momentum conserving schemes. Energy-momentum schemes not only provide enhanced numerical stability properties but also yield numerical results of enhanced physical quality. The advantages of the rotationless formulation come at the expense of a high level of redundancy. However, as has been outlined in the present work, this drawback can be mitigated by applying the discrete null space method which yields a significant size-reduction along with an improved conditioning of the algebraic system to be solved (see [4, 6, 29] for further details). Finally, it is worth mentioning that the advocated rotationless formulation makes possible a uniform algorithmic treatment of displacements and rotations. This brings about additional advantages over the traditional use of rotational parameters, see [7] for further investigations in this direction.
Acknowledgments Support for this research was provided by the Deutsche Forschungsgemeinschaft (DFG) under grant BE 2285/5-1. This support is gratefully acknowledged.
References
1. Bauchau OA, Bottasso CL (1999) On the design of energy preserving and decaying schemes for flexible, nonlinear multi-body systems. Comput Meth Appl Mech Eng 169:61–79
2. Bauchau OA, Choi JY, Bottasso CL (2002) On the modeling of shells in multibody dynamics. Mult Syst Dyn 8:459–489
3. Belytschko T, Liu WK, Moran B (2000) Nonlinear finite elements for continua and structures. Wiley, New York
4. Betsch P (2005) The discrete null space method for the energy consistent integration of constrained mechanical systems. Part I: Holonomic constraints. Comput Methods Appl Mech Eng 194(50–52):5159–5190
5. Betsch P, Hesch C (2007) Energy-momentum conserving schemes for frictionless dynamic contact problems. Part I: NTS method. In: Wriggers P, Nackenhorst U (eds) IUTAM Symposium on Computational Methods in Contact Mechanics. Volume 3 of IUTAM Bookseries, pp 77–96. Springer
6. Betsch P, Leyendecker S (2006) The discrete null space method for the energy consistent integration of constrained mechanical systems. Part II: Multibody dynamics. Int J Numer Meth Eng 67(4):499–552
7. Betsch P, Sänger N (2007) On the use of geometrically exact shells in a conserving framework for flexible multibody dynamics. In preparation
8. Betsch P, Steinmann P (2001) Conservation properties of a time FE method. Part II: Time-stepping schemes for nonlinear elastodynamics. Int J Numer Meth Eng 50:1931–1955
9. Betsch P, Steinmann P (2001) Constrained integration of rigid body dynamics. Comput Methods Appl Mech Eng 191:467–488
10. Betsch P, Steinmann P (2002) Conservation properties of a time FE method. Part III: Mechanical systems with holonomic constraints. Int J Numer Meth Eng 53:2271–2304
11. Betsch P, Steinmann P (2002) A DAE approach to flexible multibody dynamics. Mult Syst Dyn 8:367–391
12. Betsch P, Steinmann P (2002) Frame-indifferent beam finite elements based upon the geometrically exact beam theory. Int J Numer Meth Eng 54:1775–1788
13. Betsch P, Uhlar S (2007) Energy-momentum conserving integration of multibody dynamics. Mult Syst Dyn 17(4):243–289
14. Bottasso CL, Borri M, Trainelli L (2001) Integration of elastic multibody systems by invariant conserving/dissipating algorithms. II. Numerical schemes and applications. Comput Methods Appl Mech Eng 190:3701–3733
15. Bottasso CL, Borri M, Trainelli L (2002) Geometric invariance. Comput Mech 29(2):163–169
16. Brank B, Korelc J, Ibrahimbegović A (2003) Dynamics and time-stepping schemes for elastic shells undergoing finite rotations. Comput & Struct 81(12):1193–1210
17. Büchter N, Ramm E (1992) Shell theory versus degeneration – A comparison in large rotation finite element analysis. Int J Numer Methods Eng 34:39–59
18. Crisfield MA (1991) Non-linear finite element analysis of solids and structures. Volume 1: Essentials. Wiley, New York
19. García de Jalón J (2007) Twenty-five years of natural coordinates. Mult Syst Dyn 18(1):15–33
20. Géradin M, Cardona A (2001) Flexible multibody dynamics: A finite element approach. Wiley, New York
21. Goicolea JM, Garcia Orden JC (2000) Dynamic analysis of rigid and deformable multibody systems with penalty methods and energy-momentum schemes. Comput Meth Appl Mech Eng 188:789–804
22. Gonzalez O (1999) Mechanical systems subject to holonomic constraints: Differential-algebraic formulations and conservative integration. Physica D 132:165–174
23. Gonzalez O, Simo JC (1996) On the stability of symplectic and energy-momentum algorithms for non-linear Hamiltonian systems with symmetry. Comput Methods Appl Mech Eng 134:197–222
24. Göttlicher B, Schweizerhof K (2005) Analysis of flexible structures with occasionally rigid parts under transient loading. Comput & Struct 83:2035–2051
25. Ibrahimbegović A, Mamouri S, Taylor RL, Chen AJ (2000) Finite element method in dynamics of flexible multibody systems: Modeling of holonomic constraints and energy conserving integration schemes. Mult Syst Dyn 4(2–3):195–223
26. Jelenić G, Crisfield MA (2001) Dynamic analysis of 3D beams with joints in presence of large rotations. Comput Methods Appl Mech Engrg 190:4195–4230
27. Kuhl D, Ramm E (1999) Generalized energy-momentum method for non-linear adaptive shell dynamics. Comput Meth Appl Mech Eng 178:343–366
28. Kunkel P, Mehrmann V (2006) Differential-algebraic equations. European Mathematical Society, Zurich
29. Leyendecker S, Betsch P, Steinmann P (2008) The discrete null space method for the energy consistent integration of constrained mechanical systems. Part III: Flexible multibody dynamics. Mult Syst Dyn 19(1–2):45–72
30. Muñoz J, Jelenić G, Crisfield MA (2003) Master-slave approach for the modelling of joints with dependent degrees of freedom in flexible mechanisms. Commun Numer Meth Eng 19:689–702
31. Puso MA (2002) An energy and momentum conserving method for rigid-flexible body dynamics. Int J Numer Meth Eng 53:1393–1414
32. Romero I (2004) The interpolation of rotations and its application to finite element models of geometrically exact rods. Comput Mech 34:121–133
33. Romero I, Armero F (2002) Numerical integration of the stiff dynamics of geometrically exact shells: An energy-dissipative momentum-conserving scheme. Int J Numer Meth Eng 54:1043–1086
34. Romero I, Armero F (2002) An objective finite element approximation of the kinematics of geometrically exact rods and its use in the formulation of an energy-momentum conserving scheme in dynamics. Int J Numer Meth Eng 54:1683–1716
35. Rosenberg RM (1977) Analytical dynamics of discrete systems. Plenum, New York
36. Sansour C, Wagner W, Wriggers P, Sansour J (2002) An energy-momentum integration scheme and enhanced strain finite elements for the non-linear dynamics of shells. Non-linear Mech 37:951–966
37. Simo JC, Rifai MS, Fox DD (1992) On a stress resultant geometrically exact shell model. Part VI: Conserving algorithms for non-linear dynamics. Int J Numer Meth Eng 34:117–164
38. Simo J, Tarnow N (1992) The discrete energy-momentum method. Conserving algorithms for nonlinear elastodynamics. Z Angew Math Phys (ZAMP) 43:757–792
39. Simo JC, Tarnow N (1994) A new energy and momentum conserving algorithm for the nonlinear dynamics of shells. Int J Num Meth Eng 37:2527–2549
40. Taylor RL (2001) Finite element analysis of rigid-flexible systems. In: Ambrósio JAC, Kleiber M (eds) Computational aspects of nonlinear structural systems with large rigid body motion. Volume 179 of NATO Science Series: Computer & Systems Sciences, pp 63–84. IOS Press, Amsterdam
41. Uhlar S, Betsch P (2007) On the rotationless formulation of multibody dynamics and its conserving numerical integration. In: Bottasso CL, Masarati P, Trainelli L (eds) Proceedings of the ECCOMAS Thematic Conference on Multibody Dynamics, Politecnico di Milano, Milano
A Second Order Extension of the Generalized–α Method for Constrained Systems in Mechanics
Laurent O. Jay1 and Dan Negrut2
1 Department of Mathematics, The University of Iowa, 14 MacLean Hall, Iowa City, IA 52242-1419, USA. E-mails: [email protected], [email protected]
2 Department of Mechanical Engineering, University of Wisconsin-Madison, 2035ME, 1513 University Avenue, Madison, WI 53706-1572, USA. E-mail: [email protected]
Summary. We present a new second order extension of the generalized–α method of Chung and Hulbert for systems in mechanics having nonconstant mass matrix, holonomic constraints, and/or nonholonomic constraints. Such systems are frequently encountered in multibody dynamics. For variable step–sizes, a new adjusting formula preserving the second order of the method is proposed.
1 Introduction
The generalized–α method of Chung and Hulbert [2] was originally developed for second order systems of differential equations in structural dynamics of the form $M y'' = f(t, y, y')$. In mechanics $M \in \mathbb{R}^{n \times n}$ is a constant mass matrix, $y \in \mathbb{R}^n$ is a vector of generalized coordinates, $y' \in \mathbb{R}^n$ is a vector of generalized velocities, $y'' \in \mathbb{R}^n$ is a vector of generalized accelerations, and $f(t, y, y') \in \mathbb{R}^n$ represents forces. Introducing the new variables $z := y' \in \mathbb{R}^n$ and $a := z' = y'' \in \mathbb{R}^n$, these equations are equivalent to the semi–explicit system of differential–algebraic equations (DAEs)
\[ y' = z, \qquad z' = a, \qquad 0 = M a - f(t, y, z). \tag{1} \]
Assuming the mass matrix $M$ to be nonsingular, this system of DAEs is of index 1 since one can obtain explicitly $a = M^{-1} f(t, y, z)$. The generalized–α method of Chung and Hulbert [2] for $M y'' = f(t, y, y')$ or equivalently for (1) is a non-standard implicit one-step method. One step of this method $(t_0, y_0, z_0, a_\alpha) \mapsto (t_1 = t_0 + h, y_1, z_1, a_{1+\alpha})$ with step–size $h$ can be expressed as follows
C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009
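\begin{align*}
y_1 &= y_0 + h z_0 + \frac{h^2}{2}\bigl((1 - 2\beta) a_\alpha + 2\beta a_{1+\alpha}\bigr), \tag{2a} \\
z_1 &= z_0 + h\bigl((1 - \gamma) a_\alpha + \gamma a_{1+\alpha}\bigr), \tag{2b} \\
(1 - \alpha_m) M a_{1+\alpha} + \alpha_m M a_\alpha &= (1 - \alpha_f) f(t_1, y_1, z_1) + \alpha_f f(t_0, y_0, z_0), \tag{2c}
\end{align*}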
see Section 2 for a justification of the notation $a_\alpha$, $a_{1+\alpha}$. The generalized–α method has free coefficients $\alpha_m \neq 1$, $\alpha_f$, $\beta$, $\gamma$. For specific choices of these coefficients we obtain well-known methods:
• Newmark's family: $\alpha_m = 0$, $\alpha_f = 0$
  – The trapezoidal rule: $\beta = 1/4$, $\gamma = 1/2$
  – Störmer's rule: $\beta = 0$, $\gamma = 1/2$
• The Hilber-Hughes-Taylor α (HHT-α) method [3, 4]:
\[ \alpha_m = 0, \qquad \alpha := -\alpha_f \in \left[-\tfrac{1}{3}, 0\right], \qquad \beta = \frac{(1 - \alpha)^2}{4}, \qquad \gamma = \frac{1}{2} - \alpha \]
The coefficients $\alpha_m \neq 1$, $\alpha_f$, $\beta$, $\gamma$ of the generalized–α method (2) are usually chosen according to
\[ \alpha_m = \frac{2\rho_\infty - 1}{1 + \rho_\infty}, \qquad \alpha_f = \frac{\rho_\infty}{1 + \rho_\infty}, \qquad \beta = \frac{(1 - \alpha)^2}{4}, \qquad \gamma = \frac{1}{2} - \alpha, \]
where $\alpha := \alpha_m - \alpha_f$ and $\rho_\infty \in [0, 1]$ is a parameter controlling numerical dissipation ($\rho_\infty = 0$ for maximal dissipation [2]). In this paper we present extensions of the generalized–α method (2) for systems having
• Nonconstant mass matrix $M(t, y)$, see Section 3
• Holonomic constraints $g(t, y) = 0$, see Section 4
• Nonholonomic constraints $k(t, y, y') = 0$, see Section 5
Such systems are frequently encountered in multibody dynamics [10]. A general extension and a convergence result are given in Section 6. For variable step–sizes, a new adjusting formula preserving the second order of the method is proposed in Section 7. Some numerical experiments are given in Section 8. A short conclusion is finally given in Section 9.
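As a minimal sketch of the usual coefficient choice quoted above, the following Python function maps the dissipation parameter ρ∞ to (αm, αf, β, γ). The function name and the input check are illustrative assumptions, not part of the original paper.

```python
def generalized_alpha_coefficients(rho_inf):
    """Return (alpha_m, alpha_f, beta, gamma) for a dissipation parameter rho_inf in [0, 1].

    Hypothetical helper: it only evaluates the formulas stated in the text.
    """
    if not 0.0 <= rho_inf <= 1.0:
        raise ValueError("rho_inf must lie in [0, 1]")
    alpha_m = (2.0 * rho_inf - 1.0) / (1.0 + rho_inf)
    alpha_f = rho_inf / (1.0 + rho_inf)
    alpha = alpha_m - alpha_f          # shift used in t_alpha = t0 + alpha*h
    beta = (1.0 - alpha) ** 2 / 4.0
    gamma = 0.5 - alpha
    return alpha_m, alpha_f, beta, gamma


if __name__ == "__main__":
    for rho in (0.0, 0.2, 1.0):
        print(rho, generalized_alpha_coefficients(rho))
```

For ρ∞ = 1 these formulas give α = 0, β = 1/4 and γ = 1/2, while ρ∞ = 0.2 (the value used in the numerical experiments of Section 8) gives αm = −1/2 and αf = 1/6.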
2 About the Notation $a_\alpha$, $a_{1+\alpha}$
We use the notation $a_\alpha$ and $a_{1+\alpha}$ instead of $a_0$ and $a_1$ to emphasize the fact that these quantities should not be considered as approximations to the acceleration vector $a(t)$ at $t_0$ and $t_1$ respectively, but at $t_\alpha := t_0 + \alpha h$ and $t_{1+\alpha} := t_1 + \alpha h = t_0 + (1 + \alpha)h$ respectively, where $\alpha := \alpha_m - \alpha_f$. The reason is that for a solution $(y(t), z(t), a(t))$ and values $(y_0, z_0)$ satisfying $y_0 - y(t_0) = O(h^2)$, $z_0 - z(t_0) = O(h^2)$, we have
\[ a_{1+\alpha} - a(t_{1+\alpha}) = O(h^2) \quad \text{when} \quad a_\alpha - a(t_\alpha) = O(h^2), \tag{3} \]
whereas we only have $a_{1+\alpha} - a(t_1) = O(h)$ for $\alpha \neq 0$ and when $a_\alpha - a(t_0) = O(h^2)$ or $a_\alpha - a(t_\alpha) = O(h^2)$. This can be seen as follows. We rewrite (2c) as
\[ (1 - \alpha_m) a_{1+\alpha} + \alpha_m a_\alpha = (1 - \alpha_f) M^{-1} f(t_1, y_1, z_1) + \alpha_f M^{-1} f(t_0, y_0, z_0). \tag{4} \]
Since $a(t) = M^{-1} f(t, y(t), z(t))$, $y_1 - y(t_1) = O(h^2)$, and $z_1 - z(t_1) = O(h^2)$ we have
\[ M^{-1} f(t_1, y_1, z_1) = a(t_0) + h a'(t_0) + O(h^2), \qquad M^{-1} f(t_0, y_0, z_0) = a(t_0) + O(h^2). \]
Hence, for the right-hand side of (4) we obtain
\[ (1 - \alpha_f) M^{-1} f(t_1, y_1, z_1) + \alpha_f M^{-1} f(t_0, y_0, z_0) = a(t_0) + h(1 - \alpha_f) a'(t_0) + O(h^2). \tag{5} \]
Since
\[ a(t_{1+\alpha}) = a(t_0) + h(1 + \alpha) a'(t_0) + O(h^2), \qquad a(t_\alpha) = a(t_0) + h\alpha a'(t_0) + O(h^2), \]
we have
\[ (1 - \alpha_m) a(t_{1+\alpha}) + \alpha_m a(t_\alpha) = a(t_0) + h(1 - \alpha_m + \alpha) a'(t_0) + O(h^2). \tag{6} \]
Thus, from (4–6), we obtain
\[ (1 - \alpha_m)\bigl(a_{1+\alpha} - a(t_{1+\alpha})\bigr) + \alpha_m\bigl(a_\alpha - a(t_\alpha)\bigr) = h(-\alpha_f + \alpha_m - \alpha) a'(t_0) + O(h^2). \tag{7} \]
Hence, (3) is satisfied for $\alpha = \alpha_m - \alpha_f$.

Choosing the Initial Value of $a_\alpha$ for the First Integration Step
Here we give two possible choices for the initial value of $a_\alpha$ to be used for the first integration step. For $\alpha_m = 0$, for example for the HHT-α method, we see from (7) that taking $a_\alpha := a_0$ where $M a_0 = f(t_0, y_0, z_0)$ still leads to the estimate $a_{1+\alpha} - a(t_{1+\alpha}) = O(h^2)$. For $\alpha_m \neq 0$ it is better to define $a_\alpha$ such that $a_\alpha - a(t_\alpha) = O(h^2)$, for example implicitly by
\[ M a_\alpha = (1 - \alpha) f(t_0, y_0, z_0) + \alpha f(t_1, y_1, z_1), \tag{8} \]
as proposed by Lunk and Simeon [7]. Nevertheless, taking $a_\alpha := a_0$ in fact does not affect the order of global convergence of the $y$ and $z$ components, see Theorem 1 in Section 6.
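The estimate (3) can be checked numerically on a small example. The sketch below is an illustration under stated assumptions, not the authors' code: it applies one step of (2) to the scalar oscillator y'' = −y (so M = 1 and f = −y, a test case chosen here), starting from exact data for y(t) = cos t, and compares a_{1+α} with a(t_{1+α}) and with a(t_1). All function names are hypothetical.

```python
import math


def coefficients(rho_inf):
    """Usual generalized-alpha coefficients for a given rho_inf."""
    alpha_m = (2.0 * rho_inf - 1.0) / (1.0 + rho_inf)
    alpha_f = rho_inf / (1.0 + rho_inf)
    alpha = alpha_m - alpha_f
    return alpha_m, alpha_f, (1.0 - alpha) ** 2 / 4.0, 0.5 - alpha


def step_oscillator(y0, z0, a_alpha, h, rho_inf):
    """One step of (2) for y'' = -y, i.e. M = 1 and f(t, y, z) = -y."""
    alpha_m, alpha_f, beta, gamma = coefficients(rho_inf)
    # (2a) reads y1 = y_pred + beta*h^2*a1; substituting into (2c)
    # gives a scalar linear equation for a1 = a_{1+alpha}.
    y_pred = y0 + h * z0 + 0.5 * h * h * (1.0 - 2.0 * beta) * a_alpha
    rhs = -(1.0 - alpha_f) * y_pred - alpha_f * y0 - alpha_m * a_alpha
    a1 = rhs / ((1.0 - alpha_m) + (1.0 - alpha_f) * beta * h * h)
    y1 = y_pred + beta * h * h * a1
    z1 = z0 + h * ((1.0 - gamma) * a_alpha + gamma * a1)
    return y1, z1, a1


def acceleration_errors(h, rho_inf=0.2, t0=0.3):
    """Return (|a1 - a(t_{1+alpha})|, |a1 - a(t_1)|) after one step from exact data."""
    alpha_m, alpha_f, _, _ = coefficients(rho_inf)
    alpha = alpha_m - alpha_f

    def a_exact(t):
        return -math.cos(t)  # acceleration of y(t) = cos(t)

    _, _, a1 = step_oscillator(math.cos(t0), -math.sin(t0),
                               a_exact(t0 + alpha * h), h, rho_inf)
    t1 = t0 + h
    return abs(a1 - a_exact(t1 + alpha * h)), abs(a1 - a_exact(t1))


if __name__ == "__main__":
    for h in (0.02, 0.01, 0.005):
        print(h, acceleration_errors(h))
```

Halving h should reduce the first error by roughly a factor of four and the second by roughly a factor of two, in line with (3) and the O(h) remark that follows it.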
3 Nonconstant Mass Matrix $M(t, y)$
We consider $M(t, y) y'' = f(t, y, y')$ where $M(t, y)$ is a nonconstant mass matrix assumed to be nonsingular. These equations are equivalent to the semi–explicit system of index 1 DAEs
\[ y' = z, \qquad z' = a, \qquad 0 = M(t, y) a - f(t, y, z). \]
A natural extension of the generalized–α method of (2) is to replace (2c) with
\[ (1 - \alpha_m) M_{1+\alpha} a_{1+\alpha} + \alpha_m M_\alpha a_\alpha = (1 - \alpha_f) f(t_1, y_1, z_1) + \alpha_f f(t_0, y_0, z_0), \]
where
\[ M_{1+\alpha} \approx M(t_{1+\alpha}, y(t_{1+\alpha})), \qquad M_\alpha \approx M(t_\alpha, y(t_\alpha)). \]
For example we can take explicitly
\[ M_{1+\alpha} := M(t_{1+\alpha}, y_0 + h(1 + \alpha) z_0), \qquad M_\alpha := M_{(1+\alpha)-1} \ \text{or} \ M(t_\alpha, y_0 + h\alpha z_0), \]
where $M_{(1+\alpha)-1}$ denotes the matrix $M_{1+\alpha}$ used at the previous time–step. Second order of convergence is a consequence of Theorem 1 in Section 6.
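The explicit choice of M_{1+α} and M_α described above can be sketched as follows. This is a minimal illustration with a scalar mass function; the helper name, its argument order and the `m_previous` option are assumptions made here.

```python
def predicted_masses(mass, t0, y0, z0, h, alpha, m_previous=None):
    """Evaluate M_{1+alpha} and M_alpha at explicitly predicted states.

    mass       -- callable mass(t, y) returning the (here scalar) mass
    m_previous -- M_{1+alpha} from the previous step, if it should be reused as M_alpha
    """
    t_1a = t0 + (1.0 + alpha) * h
    m_1a = mass(t_1a, y0 + h * (1.0 + alpha) * z0)
    if m_previous is not None:
        m_a = m_previous                      # reuse the matrix of the previous step
    else:
        m_a = mass(t0 + alpha * h, y0 + h * alpha * z0)
    return m_1a, m_a


if __name__ == "__main__":
    def mass(t, y):
        return 2.0 + t + y * y                # made-up mass function for illustration

    print(predicted_masses(mass, 0.0, 1.0, 0.5, 0.1, -0.5))
```

Both evaluations use only data available at the start of the step, so the mass matrices do not enter the nonlinear solve for the new state.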
4 Holonomic Constraints $g(t, y) = 0$
We extend now the generalized–α method to systems having holonomic constraints $g(t, y) = 0$. More precisely we consider
\[ M(t, y) y'' = f(t, y, y', \lambda), \qquad 0 = g(t, y). \]
Using the notation $g_y(t, y) := \frac{\partial g}{\partial y}(t, y)$, usually $f(t, y, y', \lambda) = f_0(t, y, y') - g_y^T(t, y)\lambda$ and the term $-g_y^T(t, y)\lambda$ containing algebraic variables $\lambda$ represents reaction forces due to the holonomic constraints $g(t, y) = 0$. Differentiating $0 = g(t, y)$ once with respect to $t$ we obtain
\[ 0 = (g(t, y))' = g_t(t, y) + g_y(t, y) y'. \]
Hence, we consider overdetermined systems of index 2 differential–algebraic equations (ODAEs) of the form
\begin{align*}
y' &= z, \\
z' &= a, \\
0 &= M(t, y) a - f(t, y, z, \lambda), \\
0 &= g(t, y), \\
0 &= g_t(t, y) + g_y(t, y) z,
\end{align*}
where we assume the matrix
\[ \begin{pmatrix} M(t, y) & -f_\lambda(t, y, z, \lambda) \\ g_y(t, y) & O \end{pmatrix} \ \text{is nonsingular.} \]
When $f(t, y, z, \lambda) = f_0(t, y, z) - g_y^T(t, y)\lambda$, this matrix becomes
\[ \begin{pmatrix} M(t, y) & g_y^T(t, y) \\ g_y(t, y) & O \end{pmatrix} \]
and it is symmetric when $M(t, y)$ is symmetric. At $t_0$ we consider consistent initial conditions $(y_0, z_0, a_0, \lambda_0)$, i.e.,
\begin{align*}
0 &= M(t_0, y_0) a_0 - f(t_0, y_0, z_0, \lambda_0), \\
0 &= g(t_0, y_0), \\
0 &= g_t(t_0, y_0) + g_y(t_0, y_0) z_0, \\
0 &= g_{tt}(t_0, y_0) + 2 g_{ty}(t_0, y_0) z_0 + g_{yy}(t_0, y_0)(z_0, z_0) + g_y(t_0, y_0) a_0.
\end{align*}
Several extensions of the HHT-α method have been proposed. Cardona and Géradin [1] analyze a direct extension of the HHT-α method to linear index 3 DAEs. They show that a direct application of the HHT-α method is inconsistent and suffers from instabilities. Yen, Petzold, and Raha [11] propose a first order extension of the HHT-α method based on projecting the solution of the underlying ODEs onto the constraints (including the index 1 acceleration level constraints) after each step. More recently, second order extensions of the HHT-α method and generalized–α method have been proposed independently by Jay and Negrut [5] and by Lunk and Simeon [7] assuming additivity of $f(t, y, z, \lambda) = f_0(t, y, z) + f_1(t, y, \lambda)$. Here, we propose a different extension of the generalized–α method without making this assumption:
\begin{align*}
y_1 &= y_0 + h z_0 + \frac{h^2}{2}\bigl((1 - 2\beta) a_\alpha + 2\beta \tilde{a}_{1+\alpha}\bigr), \tag{9a} \\
z_1 &= z_0 + h\bigl((1 - \gamma) a_\alpha + \gamma a_{1+\alpha}\bigr), \tag{9b} \\
(1 - \alpha_m) M_{1+\alpha} \tilde{a}_{1+\alpha} + \alpha_m M_\alpha a_\alpha &= (1 - \alpha_f) f(t_1, y_1, z_1, \tilde{\lambda}_1) + \alpha_f f(t_0, y_0, z_0, \lambda_0), \tag{9c} \\
(1 - \alpha_m) M_{1+\alpha} a_{1+\alpha} + \alpha_m M_\alpha a_\alpha &= (1 - \alpha_f) f(t_1, y_1, z_1, \lambda_1) + \alpha_f f(t_0, y_0, z_0, \lambda_0), \tag{9d} \\
0 &= g(t_1, y_1), \tag{9e} \\
0 &= g_t(t_1, y_1) + g_y(t_1, y_1) z_1. \tag{9f}
\end{align*}
When $f(t, y, z, \lambda) = f_0(t, y, z) - g_y^T(t, y)\lambda$ we can replace (9c) by
\[ (1 - \alpha_m) M_{1+\alpha}(\tilde{a}_{1+\alpha} - a_{1+\alpha}) = (1 - \alpha_f) g_y^T(t_1, y_1)(\lambda_1 - \tilde{\lambda}_1). \]
Second order of convergence is a consequence of Theorem 1 in Section 6.
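The replacement for (9c) follows by subtracting (9d) from (9c). The short derivation below is a sketch in which a tilde marks the auxiliary quantities of (9c) (the accent is a notational assumption of this sketch); the terms in $a_\alpha$, $f_0$ and $f(t_0, y_0, z_0, \lambda_0)$ are identical in both equations and cancel.

```latex
\begin{align*}
(1-\alpha_m)\, M_{1+\alpha}\,(\tilde{a}_{1+\alpha} - a_{1+\alpha})
  &= (1-\alpha_f)\,\bigl[\, f(t_1,y_1,z_1,\tilde{\lambda}_1) - f(t_1,y_1,z_1,\lambda_1) \,\bigr] \\
  &= (1-\alpha_f)\,\bigl[\, -g_y^T(t_1,y_1)\,\tilde{\lambda}_1 + g_y^T(t_1,y_1)\,\lambda_1 \,\bigr] \\
  &= (1-\alpha_f)\, g_y^T(t_1,y_1)\,(\lambda_1 - \tilde{\lambda}_1).
\end{align*}
```

The force term $f_0(t_1, y_1, z_1)$ therefore needs to be evaluated only once per step when the forces have this additive form.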
5 Nonholonomic Constraints $k(t, y, y') = 0$
We extend now the generalized–α method to systems having nonholonomic constraints $k(t, y, y') = 0$. More precisely we consider
\[ M(t, y) y'' = f(t, y, y', \psi), \qquad 0 = k(t, y, y'). \]
Usually $f(t, y, y', \psi) = f_0(t, y, y') - k_{y'}^T(t, y, y')\psi$ and the term $-k_{y'}^T(t, y, y')\psi$ containing algebraic variables $\psi$ represents reaction forces due to the nonholonomic constraints $k(t, y, y') = 0$. Hence, we consider systems of index 2 DAEs of the form
\[ y' = z, \qquad z' = a, \qquad 0 = M(t, y) a - f(t, y, z, \psi), \qquad 0 = k(t, y, z), \]
and we assume the matrix
\[ \begin{pmatrix} M(t, y) & -f_\psi(t, y, z, \psi) \\ k_z(t, y, z) & O \end{pmatrix} \ \text{is nonsingular.} \]
When $f(t, y, z, \psi) = f_0(t, y, z) - k_z^T(t, y, z)\psi$, this matrix becomes
\[ \begin{pmatrix} M(t, y) & k_z^T(t, y, z) \\ k_z(t, y, z) & O \end{pmatrix} \]
and it is symmetric when $M(t, y)$ is symmetric. At $t_0$ we consider consistent initial conditions $(y_0, z_0, a_0, \psi_0)$, i.e.,
\begin{align*}
0 &= M(t_0, y_0) a_0 - f(t_0, y_0, z_0, \psi_0), \\
0 &= k(t_0, y_0, z_0), \\
0 &= k_t(t_0, y_0, z_0) + k_y(t_0, y_0, z_0) z_0 + k_z(t_0, y_0, z_0) a_0.
\end{align*}
We propose the following extension of the generalized–α method
\begin{align*}
y_1 &= y_0 + h z_0 + \frac{h^2}{2}\bigl((1 - 2\beta) a_\alpha + 2\beta a_{1+\alpha}\bigr), \tag{10a} \\
z_1 &= z_0 + h\bigl((1 - \gamma) a_\alpha + \gamma a_{1+\alpha}\bigr), \tag{10b} \\
(1 - \alpha_m) M_{1+\alpha} a_{1+\alpha} + \alpha_m M_\alpha a_\alpha &= (1 - \alpha_f) f(t_1, y_1, z_1, \psi_1) + \alpha_f f(t_0, y_0, z_0, \psi_0), \tag{10c} \\
0 &= k(t_1, y_1, z_1). \tag{10d}
\end{align*}
Second order of convergence is a consequence of Theorem 1 in Section 6.
6 General Extension and Convergence
We extend now the generalized–α method to systems having a nonconstant mass matrix $M(t, y)$, holonomic constraints $g(t, y) = 0$, and nonholonomic constraints $k(t, y, y') = 0$. The algebraic variables $\lambda$ are associated with the holonomic constraints $g(t, y) = 0$ and $g_t(t, y) + g_y(t, y) y' = 0$ which results from differentiating $g(t, y) = 0$ with respect to $t$. The algebraic variables $\psi$ are associated with the nonholonomic constraints $k(t, y, y') = 0$. Hence, we consider overdetermined systems of index 2 differential–algebraic equations (ODAEs) of the form
\begin{align*}
y' &= z, \tag{11a} \\
M(t, y) z' &= f(t, y, z, \lambda, \psi), \tag{11b} \\
0 &= g(t, y), \tag{11c} \\
0 &= g_t(t, y) + g_y(t, y) z, \tag{11d} \\
0 &= k(t, y, z), \tag{11e}
\end{align*}
and we assume the matrix
\[ \begin{pmatrix} M(t, y) & -f_\lambda(t, y, z, \lambda, \psi) & -f_\psi(t, y, z, \lambda, \psi) \\ g_y(t, y) & O & O \\ k_z(t, y, z) & O & O \end{pmatrix} \ \text{is nonsingular.} \tag{12} \]
When $f(t, y, z, \lambda, \psi) = f_0(t, y, z) - g_y^T(t, y)\lambda - k_z^T(t, y, z)\psi$, this matrix becomes
\[ \begin{pmatrix} M(t, y) & g_y^T(t, y) & k_z^T(t, y, z) \\ g_y(t, y) & O & O \\ k_z(t, y, z) & O & O \end{pmatrix} \]
and it is symmetric when $M(t, y)$ is symmetric. At $t_0$ we consider consistent initial conditions $(y_0, z_0, a_0, \lambda_0, \psi_0)$, i.e.,
\begin{align*}
0 &= M(t_0, y_0) a_0 - f(t_0, y_0, z_0, \lambda_0, \psi_0), \\
0 &= g(t_0, y_0), \\
0 &= g_t(t_0, y_0) + g_y(t_0, y_0) z_0, \\
0 &= k(t_0, y_0, z_0), \\
0 &= g_{tt}(t_0, y_0) + 2 g_{ty}(t_0, y_0) z_0 + g_{yy}(t_0, y_0)(z_0, z_0) + g_y(t_0, y_0) a_0, \\
0 &= k_t(t_0, y_0, z_0) + k_y(t_0, y_0, z_0) z_0 + k_z(t_0, y_0, z_0) a_0.
\end{align*}
We propose an extension of the generalized–α method which does not use any additive structure of $f(t, y, z, \lambda, \psi)$. We call it the generalized–α–SOI2 method (SOI2 stands for Stabilized Overdetermined Index 2). One step $(t_0, y_0, z_0, a_\alpha, \lambda_0, \psi_0) \mapsto (t_1, y_1, z_1, a_{1+\alpha}, \lambda_1, \psi_1)$ with step–size $h$ of the generalized–α–SOI2 method for (11) can be expressed as follows
\begin{align*}
y_1 &= y_0 + h z_0 + \frac{h^2}{2}\bigl((1 - 2\beta) a_\alpha + 2\beta \tilde{a}_{1+\alpha}\bigr), \tag{13a} \\
\tilde{z}_1 &= z_0 + h\bigl((1 - \gamma) a_\alpha + \gamma \tilde{a}_{1+\alpha}\bigr), \tag{13b} \\
z_1 &= z_0 + h\bigl((1 - \gamma) a_\alpha + \gamma a_{1+\alpha}\bigr), \tag{13c} \\
(1 - \alpha_m) M_{1+\alpha} \tilde{a}_{1+\alpha} + \alpha_m M_\alpha a_\alpha &= (1 - \alpha_f) f(t_1, y_1, z_1, \tilde{\lambda}_1, \tilde{\psi}_1) + \alpha_f f(t_0, y_0, z_0, \lambda_0, \psi_0), \tag{13d} \\
(1 - \alpha_m) M_{1+\alpha} a_{1+\alpha} + \alpha_m M_\alpha a_\alpha &= (1 - \alpha_f) f(t_1, y_1, z_1, \lambda_1, \psi_1) + \alpha_f f(t_0, y_0, z_0, \lambda_0, \psi_0), \tag{13e} \\
0 &= g(t_1, y_1), \tag{13f} \\
0 &= g_t(t_1, y_1) + g_y(t_1, y_1) z_1, \tag{13g} \\
0 &= k(t_1, y_1, \tilde{z}_1), \tag{13h} \\
0 &= k(t_1, y_1, z_1), \tag{13i}
\end{align*}
where $M_{1+\alpha} := M(t_{1+\alpha}, y_0 + h(1 + \alpha) z_0)$ and $M_\alpha := M_{(1+\alpha)-1}$ or $M(t_\alpha, y_0 + h\alpha z_0)$. The auxiliary variables $\tilde{z}_1$, $\tilde{a}_{1+\alpha}$, $\tilde{\lambda}_1$, $\tilde{\psi}_1$ are just local to the current step; they are not propagated. The possibility to replace $\tilde{\psi}_1$ by $\psi_1$ in (13d) and to suppress the equations (13b), (13h) and the auxiliary variables $\tilde{z}_1$, $\tilde{\psi}_1$ remains to be investigated. When $f(t, y, z, \lambda, \psi) = f_0(t, y, z) - g_y^T(t, y)\lambda - k_z^T(t, y, z)\psi$ we can replace (13d) by
\[ (1 - \alpha_m) M_{1+\alpha}(\tilde{a}_{1+\alpha} - a_{1+\alpha}) = (1 - \alpha_f) g_y^T(t_1, y_1)(\lambda_1 - \tilde{\lambda}_1) + (1 - \alpha_f) k_z^T(t_1, y_1, z_1)(\psi_1 - \tilde{\psi}_1). \]
Convergence analysis of the generalized–α–SOI2 method is not straightforward. We have the following convergence result:

Theorem 1. Consider the overdetermined system of DAEs (11) under the assumption (12) with consistent initial conditions $(y_0, z_0, a_0, \lambda_0, \psi_0)$ at $t_0$ and exact solution $(y(t), z(t), a(t), \lambda(t), \psi(t))$. Suppose that $a_\alpha - a(t_0 + \alpha h) = O(h)$ (e.g., $a_\alpha := a_0$), $\alpha_m \neq 1$, $\alpha_f \neq 1$, $\beta \neq 0$, $\gamma \neq 0$, and $r < 1$ where $r := |\alpha_m / (1 - \alpha_m)|$. Then the generalized–α–SOI2 numerical approximation $(y_n, z_n, a_{n+\alpha}, \lambda_n, \psi_n)$, see (13), satisfies for $0 \le h \le h_{\max}$ and $t_n - t_0 = nh \le \mathrm{Const}$ the following global error estimates
\begin{align*}
y_n - y(t_n) &= O(h^2), \\
z_n - z(t_n) &= O(h^2), \\
a_{n+\alpha} - a(t_n + \alpha h) &= O(h^2 + r^n \delta_0), \\
\lambda_n - \lambda(t_n) &= O(h^2 + r^n \delta_0), \\
\psi_n - \psi(t_n) &= O(h^2 + r^n \delta_0),
\end{align*}
where $\delta_0 := a_\alpha - a(t_0 + \alpha h) = O(h)$. If $\alpha_m = 0$ or $a_\alpha - a(t_0 + \alpha h) = O(h^2)$ we have
\[ a_{n+\alpha} - a(t_n + \alpha h) = O(h^2), \qquad \lambda_n - \lambda(t_n) = O(h^2), \qquad \psi_n - \psi(t_n) = O(h^2). \]
Theorem 1 remains valid for variable step–sizes, see Section 7. A proof of this theorem will be given in a forthcoming paper [6]. It is long and technical and is thus omitted here.
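To make the hypothesis r < 1 of Theorem 1 concrete, the sketch below evaluates r = |αm/(1 − αm)| when αm is chosen from ρ∞ by the usual formula of Section 1. The function name is an assumption of this sketch; the values are plain arithmetic.

```python
def transient_decay_factor(rho_inf):
    """Return r = |alpha_m / (1 - alpha_m)| for alpha_m = (2*rho_inf - 1)/(1 + rho_inf)."""
    alpha_m = (2.0 * rho_inf - 1.0) / (1.0 + rho_inf)
    return abs(alpha_m / (1.0 - alpha_m))


if __name__ == "__main__":
    for rho in (0.0, 0.2, 0.5, 1.0):
        print(rho, transient_decay_factor(rho))
```

With ρ∞ = 0.2 this gives r = 1/3, so the term r^n δ0 in the error estimates decays geometrically with the step count n. With ρ∞ = 1 the same formula gives r = 1, so under this coefficient choice the hypothesis r < 1 is not met in that limiting case.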
7 Adjustments for Variable Step–Sizes $h_n$
When applying the generalized–α method with variable step–sizes, the values $a_{n+\alpha}$ and $M_{n+\alpha} a_{n+\alpha}$ must be adjusted before each new step in order to preserve the second order of the method for all components. Consider a previous step starting at $t_{n-1}$ with step–size $h_{n-1}$ and a new step starting at $t_n = t_{n-1} + h_{n-1}$ with step–size $h_n$. The value $a_{n-1+\alpha}$ used in the previous step is an approximation of $a(t)$ at $t_{n-1} + \alpha h_{n-1}$, i.e., $a_{n-1+\alpha} \approx a(t_{n-1} + \alpha h_{n-1})$. The value $a_{n+\alpha}$ obtained in the previous step is an approximation of $a(t)$ at $t_{n-1} + (1 + \alpha) h_{n-1} = t_n + \alpha h_{n-1}$, i.e., $a_{n+\alpha} \approx a(t_n + \alpha h_{n-1})$. For the current time–step starting at $t_n$ with step–size $h_n$ we need the value $a_{n+\alpha}$ to be an approximation of $a(t)$ at $t_n + \alpha h_n$, i.e., $a_{n+\alpha} \approx a(t_n + \alpha h_n)$. By linearly interpolating $a_{n-1+\alpha}$ at $t_{n-1} + \alpha h_{n-1}$ and $a_{n+\alpha}$ at $t_n + \alpha h_{n-1}$ and by extrapolating at $t_n + \alpha h_n$, $a_{n+\alpha}$ can be replaced by
\[ a_{n+\alpha} := a_{n+\alpha} + \alpha\left(\frac{h_n}{h_{n-1}} - 1\right)(a_{n+\alpha} - a_{n-1+\alpha}). \tag{14a} \]
Analogously we can replace $M_{n+\alpha} a_{n+\alpha}$ by
\[ M_{n+\alpha} a_{n+\alpha} := M_{n+\alpha} a_{n+\alpha} + \alpha\left(\frac{h_n}{h_{n-1}} - 1\right)(M_{n+\alpha} a_{n+\alpha} - M_{n-1+\alpha} a_{n-1+\alpha}). \tag{14b} \]
These adjusting formulas (14) have several advantages:
• They are simple to implement.
• Their computational cost is almost negligible.
• They are valid for ODEs and DAEs.
• They preserve second order of convergence.
These modifications are not necessary to preserve the second order of convergence for the y and z components. However, they are recommended since their computational cost is almost negligible and they allow second order of convergence for the components a, λ, and ψ.
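The adjusting formula (14a) is a one-line linear extrapolation. The sketch below applies it to scalar values (the same expression works componentwise for vectors, and for the products in (14b)); the function and argument names are assumptions of this sketch.

```python
def adjust_for_new_step(v_curr, v_prev, h_new, h_old, alpha):
    """Apply (14a)/(14b): shift a value from t_n + alpha*h_old to t_n + alpha*h_new.

    v_curr -- value obtained in the previous step (approximation at t_n + alpha*h_old)
    v_prev -- value used in the previous step (approximation at t_{n-1} + alpha*h_old)
    """
    return v_curr + alpha * (h_new / h_old - 1.0) * (v_curr - v_prev)


if __name__ == "__main__":
    # For a linear function the extrapolation is exact, which gives a quick check.
    def a(t):
        return 3.0 + 2.0 * t

    t_prev, h_old, h_new, alpha = 0.0, 0.1, 0.2, -2.0 / 3.0
    t_n = t_prev + h_old
    adjusted = adjust_for_new_step(a(t_n + alpha * h_old), a(t_prev + alpha * h_old),
                                   h_new, h_old, alpha)
    print(adjusted, a(t_n + alpha * h_new))
```

When the step–size does not change (h_new equal to h_old) the correction vanishes and the value is passed on unchanged.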
8 Numerical Experiments 8.1 A Nonlinear Mathematical Test Problem To illustrate Theorem 1 numerically we first consider the following nonlinear mathematical test problem
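\begin{align*}
\begin{pmatrix} y_1' \\ y_2' \end{pmatrix} &= \begin{pmatrix} z_1 \\ z_2 \end{pmatrix}, \tag{15a} \\
\begin{pmatrix} y_1 & y_2 - e^{-2t} \\ \sin(y_1 - e^t) & y_1 y_2 \end{pmatrix} \begin{pmatrix} z_1' \\ z_2' \end{pmatrix} &= \begin{pmatrix} e^t (y_1 z_2 + 2 y_2 z_1) + e^{2t} y_1 \lambda_1 - y_1 z_2 \psi_1 - 2 \\ e^{-t}(0.5\, y_2 z_2 - 2 y_1 z_1 y_2 z_2 + y_2 \lambda_1^2) - y_1 y_2 z_1 \psi_1^3 + e^{3t} \end{pmatrix}, \tag{15b} \\
0 &= g(t, y) = y_1^2 y_2 - 1, \tag{15c} \\
0 &= g_t(t, y) + g_y(t, y) z = 2 y_1 y_2 z_1 + y_1^2 z_2, \tag{15d} \\
0 &= k(t, y, z) = y_1 z_1 z_2 + 2. \tag{15e}
\end{align*}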
Observe that this problem is nonlinear in the algebraic variables λ1 and ψ1 . The following initial conditions at t0 = 0 have been used: y1 (0) = 1, y2 (0) = 1, z1 (0) = 1, z2 (0) = −2, λ1 (0) = 1, ψ1 (0) = 1. The exact solution is given explicitly as follows: y1 (t) = et , y2 (t) = e−2t , z1 (t) = et , z2 (t) = −2e−2t , λ1 (t) = e−t , ψ1 (t) = et . We have applied the generalized–α–SOI2 method, see (13), with damping parameter ρ∞ = 0.2 and variable step–sizes alternating between h/3 and 2h/3 for various values of h. Using the adjusting formulas (14) for an+α and Mn+α an+α we observe global convergence of order 2 at tn = 1 in Fig. 1. Without these modifications a reduction of the order of convergence to 1 for the components a, λ, and ψ can be observed in Fig. 2.
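As a quick sanity check of the stated exact solution, the sketch below evaluates the constraint equations (15c)–(15e) along y1 = e^t, y2 = e^{-2t}, z1 = e^t, z2 = −2e^{-2t}. The helper names are assumptions of this sketch; only the formulas come from the text.

```python
import math


def exact_solution(t):
    """Exact solution of the test problem as stated in the text: (y, z, lambda_1, psi_1)."""
    y = (math.exp(t), math.exp(-2.0 * t))
    z = (math.exp(t), -2.0 * math.exp(-2.0 * t))
    return y, z, math.exp(-t), math.exp(t)


def constraint_residuals(t):
    """Residuals of (15c), (15d) and (15e) along the exact solution."""
    (y1, y2), (z1, z2), _, _ = exact_solution(t)
    g = y1 * y1 * y2 - 1.0                        # (15c)
    g_dot = 2.0 * y1 * y2 * z1 + y1 * y1 * z2     # (15d)
    k = y1 * z1 * z2 + 2.0                        # (15e)
    return g, g_dot, k


if __name__ == "__main__":
    for t in (0.0, 0.5, 1.0):
        print(t, constraint_residuals(t))
```

All three residuals vanish up to rounding, and at t = 0 the exact solution reproduces the initial conditions listed above.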
[Figure 1 plot: "error of generalized-alpha-SOI2 with variable stepsizes"; errors in y, z, a, lambda, and psi versus h on logarithmic axes]
Fig. 1. Global errors $\|y_n - y(t_n)\|_2$ (), $\|z_n - z(t_n)\|_2$ (◦), $\|a_{n+\alpha} - a(t_n + \alpha h)\|_2$ (×), $\|\lambda_n - \lambda(t_n)\|_2$ (+), $\|\psi_n - \psi(t_n)\|_2$ (∗) of the generalized–α–SOI2 method ($\rho_\infty = 0.2$) at $t_n = 1$ for the test problem (15) with variable step–sizes alternating between $h/3$ and $2h/3$ using the adjusting formulas (14) for $a_{n+\alpha}$ and $M_{n+\alpha} a_{n+\alpha}$
[Figure 2 plot: "error of generalized-alpha-SOI2 with variable stepsizes"; errors in y, z, a, lambda, and psi versus h on logarithmic axes]
Fig. 2. Global errors $\|y_n - y(t_n)\|_2$ (), $\|z_n - z(t_n)\|_2$ (◦), $\|a_{n+\alpha} - a(t_n + \alpha h)\|_2$ (×), $\|\lambda_n - \lambda(t_n)\|_2$ (+), $\|\psi_n - \psi(t_n)\|_2$ (∗) of the generalized–α–SOI2 method ($\rho_\infty = 0.2$) at $t_n = 1$ for the test problem (15) with variable step–sizes alternating between $h/3$ and $2h/3$ without using the adjusting formulas (14) for $a_{n+\alpha}$ and $M_{n+\alpha} a_{n+\alpha}$
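8.2 A Pendulum Model
As a second numerical experiment we consider the pendulum model in Fig. 3 where we denote $y_1 := x$, $y_2 := y$, $y_3 := \theta$. The constrained equations of motion associated with this model are
\begin{align*}
\begin{pmatrix} y_1' \\ y_2' \\ y_3' \end{pmatrix} &= \begin{pmatrix} z_1 \\ z_2 \\ z_3 \end{pmatrix}, \tag{16a} \\
\begin{pmatrix} m & 0 & 0 \\ 0 & m & 0 \\ 0 & 0 & m\frac{L^2}{3} \end{pmatrix} \begin{pmatrix} z_1' \\ z_2' \\ z_3' \end{pmatrix} &= \begin{pmatrix} 0 \\ -mg \\ -c z_3 - k \cdot \left(y_3 - \frac{3\pi}{2}\right) \end{pmatrix} - \begin{pmatrix} 1 & 0 \\ 0 & 1 \\ L\sin(y_3) & -L\cos(y_3) \end{pmatrix} \begin{pmatrix} \lambda_1 \\ \lambda_2 \end{pmatrix}, \tag{16b} \\
\begin{pmatrix} 0 \\ 0 \end{pmatrix} &= g(t, y) = \begin{pmatrix} y_1 - L\cos(y_3) \\ y_2 - L\sin(y_3) \end{pmatrix}, \tag{16c} \\
\begin{pmatrix} 0 \\ 0 \end{pmatrix} &= g_t(t, y) + g_y(t, y) z = \begin{pmatrix} z_1 + L\sin(y_3) z_3 \\ z_2 - L\cos(y_3) z_3 \end{pmatrix}. \tag{16d}
\end{align*}
Fig. 3. A pendulum model. Parameters used (in SI units): mass $m = 5$, length $L = 2$, spring stiffness $k = 3{,}000$, damping coefficient $c = 100$, gravitational acceleration $g = 9.81$. Initial conditions used correspond to $\theta(0) = 3\pi/2$, $\theta'(0) = 10$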
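The initial angle and angular velocity quoted in the caption of Fig. 3 determine the remaining position and velocity components through the constraints (16c) and (16d). The sketch below computes them; the function name and return layout are assumptions of this sketch.

```python
import math


def pendulum_initial_state(theta0, theta_dot0, length):
    """Return (y, z) satisfying the position constraint (16c) and velocity constraint (16d)."""
    y = (length * math.cos(theta0), length * math.sin(theta0), theta0)
    z = (-length * math.sin(theta0) * theta_dot0,   # z1 from the first row of (16d)
         length * math.cos(theta0) * theta_dot0,    # z2 from the second row of (16d)
         theta_dot0)
    return y, z


if __name__ == "__main__":
    y0, z0 = pendulum_initial_state(1.5 * math.pi, 10.0, 2.0)
    print("y0 =", y0)
    print("z0 =", z0)
```

For θ(0) = 3π/2, θ'(0) = 10 and L = 2 this gives, up to rounding, y1 = 0, y2 = −2, z1 = 20 and z2 = 0.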
The pendulum is started from consistent initial conditions corresponding to $y_3(0) = 3\pi/2$, $z_3(0) = 10$. The parameters used are given in the caption of Fig. 3. We have applied the generalized–α–SOI2 method, see (13), with damping parameter $\rho_\infty = 0.2$ and variable step–sizes alternating between $h/3$ and $2h/3$ for various values of $h$. Using the adjusting formula (14a) for $a_{n+\alpha}$ we observe global convergence of order 2 at $t_n = 2$ in Fig. 4.
8.3 $\alpha_m = 0$ and Holonomic Constraints: HHT–SOI2
As mentioned in Section 1 the generalized–α method for $\alpha_m = 0$ corresponds to the HHT–α method. In this section we only consider the generalized–α–SOI2 method for $\alpha_m = 0$ and systems for which
• The constraints are holonomic, $0 = g(t, y)$
• Forces are of the form $f(t, y, z, \lambda) = f_0(t, y, z) - g_y^T(t, y)\lambda$
Note that these two conditions are satisfied by a vast number of multibody systems. In this situation, denoting $\alpha := -\alpha_f$, $\tilde{\delta} := \tilde{a}_{1+\alpha} - a_{1+\alpha}$, and $\tilde{\mu} := (1 + \alpha)(\lambda_1 - \tilde{\lambda}_1)$, the generalized–α–SOI2 method given in (9) becomes
[Figure 4 plot: "error of generalized-alpha-SOI2 with variable stepsizes"; errors in y, z, and lambda versus h/2 on logarithmic axes]
Fig. 4. Global errors $\|y_n - y(t_n)\|_2$ (), $\|z_n - z(t_n)\|_2$ (◦), $\|\lambda_n - \lambda(t_n)\|_2$ (+) of the generalized–α–SOI2 method ($\rho_\infty = 0.2$) at $t_n = 2$ for the pendulum test problem (16) with variable step–size alternating between $h/3$ and $2h/3$ using the adjusting formula (14a) for $a_{n+\alpha}$
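\begin{align*}
y_1 &= y_0 + h z_0 + \frac{h^2}{2}\Bigl[(1 - 2\beta) a_\alpha + 2\beta (a_{1+\alpha} + \tilde{\delta})\Bigr], \tag{17a} \\
z_1 &= z_0 + h\bigl((1 - \gamma) a_\alpha + \gamma a_{1+\alpha}\bigr), \tag{17b} \\
M_{1+\alpha} \tilde{\delta} &= g_y^T(t_1, y_1) \tilde{\mu}, \tag{17c} \\
M_{1+\alpha} a_{1+\alpha} &= (1 + \alpha) f(t_1, y_1, z_1, \lambda_1) - \alpha f(t_0, y_0, z_0, \lambda_0), \tag{17d} \\
0 &= g(t_1, y_1), \tag{17e} \\
0 &= g_t(t_1, y_1) + g_y(t_1, y_1) z_1. \tag{17f}
\end{align*}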
Numerical experiments have been carried out based on this method which is called HHT-SOI2 hereafter. The method HHT-SOI2 is similar in form to the HHT–I3 method proposed in [9] with the following differences:
• The velocity kinematic constraints (17f) have been added to provide constraint stabilization.
• The equation (17a) has an acceleration correction term $\tilde{\delta}$.
• The additional equation (17c) relates the acceleration correction $\tilde{\delta}$ to the algebraic variables $\tilde{\mu}$ associated with the velocity kinematic constraints (17f).
A rigid-body slider crank model shown in Fig. 5 is used here to illustrate the velocity constraint stabilization. The equations of motion are formulated using the floating frame of reference formulation [10]. A description of this model along with initial conditions used in its analysis is provided in [8]. We have monitored the velocity of the pin connecting the crank to the ground (point O in Fig. 5) using a step–size $h = 2^{-10} = 0.0009765625$. Ideally, the
Fig. 5. Slider crank mechanism
Fig. 6. Velocity kinematic constraints violation for HHT–I3
Fig. 7. Velocity kinematic constraints satisfaction for HHT–SOI2
drift of the velocity constraints should be zero. When plotted in a phase plot one against the other, for the HHT–I3 integrator a limit cycle of magnitude approximately $10^{-6}$ can be observed in Fig. 6, while for the HHT–SOI2, as expected, the plot of Fig. 7 displays a collection of random points that are within machine precision.
9 Conclusions

The generalized–α method of Chung and Hulbert [2] is extended in this work to handle the case of nonlinear differential–algebraic equations associated, for example, with the time evolution of systems of rigid and/or flexible bodies. The proposed method, called generalized–α–SOI2 method, where SOI2 stands for Stabilized Overdetermined Index 2, is second order convergent for systems having nonconstant mass matrix, holonomic constraints, and/or nonholonomic constraints. For variable step–sizes, a new adjusting formula preserving the second order of the method is proposed. Numerical experiments have been carried out to verify these claims. The new extension has the same user-adjustable numerical damping parameters associated with the original generalized–α method. Early numerical results suggest that, owing to its semi–implicit formulation, the new method is more efficient for large mechanical systems simulation when compared to the current state of the art in numerical integration of constrained systems in mechanics.
Acknowledgements

This material is based upon work supported by the National Science Foundation under Grant No. 0654044.
References

1. Cardona A, Géradin M (1989) Time integration of the equations of motion in mechanism analysis. Comput & Struct 33:801–820
2. Chung J, Hulbert GM (1993) A time integration algorithm for structural dynamics with improved numerical dissipation: The generalized–α method. J Appl Mech 60:371–375
3. Hilber HM, Hughes TJR, Taylor RL (1977) Improved numerical dissipation for time integration algorithms in structural dynamics. Earthquake Eng Struct Dyn 5:283–292
4. Hughes TJR, Taylor RL (1987) Finite element method – Linear static and dynamic finite element analysis. Prentice–Hall, Englewood Cliffs, NJ
5. Jay LO, Negrut D (2007) Extensions of the HHT-α method to differential-algebraic equations in mechanics. ETNA 26:190–208
6. Jay LO, Negrut D (2008) A second order extension of the generalized–α method for constrained systems in mechanics. Technical report, Department of Mathematics, University of Iowa, USA. In progress
7. Lunk C, Simeon B (2006) Solving constrained mechanical systems by the family of Newmark and α–methods. ZAMM 86:772–784
8. Negrut D, Jay LO, Khude N, Heyn T (2007) A discussion of low order numerical integration formulas for rigid and flexible multibody dynamics. In: Bottasso CL, Masarati P, Trainelli L (eds) Conference Proceedings of Multibody Dynamics 2007, ECCOMAS Thematic Conference, Milano, Italy, 25–28 June 2007
9. Negrut D, Rampalli R, Ottarsson G, Sajdak A (2007) On an implementation of the HHT method in the context of index 3 differential algebraic equations of multibody dynamics. ASME J Comp Nonlin Dyn 2:73–85
10. Shabana AA (2005) Dynamics of multibody systems. Cambridge University Press, third edition
11. Yen J, Petzold LR, Raha S (1998) A time integration algorithm for flexible mechanism dynamics: The DAE–alpha method. Comput Meth Appl Mech Eng 158:341–355
Kinematical Optimization of Closed-Loop Multibody Systems

Jean-François Collard¹, Pierre Duysinx², and Paul Fisette¹

¹ Centre for Research in Mechatronics, Université Catholique de Louvain (UCL), Place du Levant 2, 1348 Louvain-la-Neuve, Belgium. E-mails: [email protected], [email protected]
² Department of Mechanics and Aerospace, Université de Liège (ULg), Chemin des Chevreuils 1, 4000 Liège, Belgium. E-mail: [email protected]
Summary. Applying optimization techniques in the field of multibody systems (MBS) has become increasingly attractive as computer resources have grown. One of the main issues in the optimization of MBS concerns closed-loop systems, which involve non-linear assembly constraints that must be solved before any analysis of the system. The addressed question is: how to optimize such closed-loop topologies when the objective evaluation relies on the assembly of the system? The authors have previously proposed to artificially penalize the objective function when those assembly constraints cannot be exactly fulfilled. However, that method suffers from some limitations, especially the difficulty of obtaining a differentiable objective function. Therefore, the key idea of this paper is to improve the penalty approach. Practically, instead of solving the assembly constraints, their norm is minimized and the residue is taken as a penalty term instead of an artificial value. Hence, the penalized objective function becomes differentiable throughout the design space, which enables the use of efficient gradient-based optimization methods such as the sequential quadratic programming (SQP) method. To illustrate the reliability and generality of the method, two applications are presented. They are related to the isotropy of parallel manipulators. The first optimization problem concerns a three-dof Delta robot with five design parameters and the second one deals with a more complex six-dof model of the Hunt platform involving ten design variables.
1 Introduction

Most present issues in the field of multibody system (MBS) dynamics involve other scientific disciplines to enlarge and enrich the results of the MBS analysis. Combining multibody analysis and optimization techniques has sometimes been exploited in the literature (e.g. [6, 10, 11, 17]), but the few existing results are still not able to cope completely with some limitations and/or conflicts. Several issues are addressed in recent research: applicability of optimization methods for some families of MBS, problem formulation in terms of cost function, especially for constrained systems, computer efficiency and parallel computation algorithms, etc. However, current research mainly produces guidelines rather than rigid rules regarding the coupling of multibody dynamics and optimization in a particular context, i.e. a family of applications and a given type of objective function. As examples, one can cite the optimal design of a vehicle transmission [10], of 3D manipulators [4, 7, 16, 19, 20], the optimal synthesis of mechanisms [3, 11, 14, 21, 23] and parallel robots [1, 8, 12], as well as the optimization of suspensions and railway vehicles [6, 18], or machine tools [15].

The present research tackles the optimization problem of MBS containing 3D kinematic loops involving:
• Geometrical design parameters
• A single scalar objective function whose nature may be geometrical, kinematic or dynamical, but which does not involve an integration scheme to simulate the system

In this case, the question that is addressed is the following: how to build a robust cost function for those systems such that a classical optimization method can iterate with good convergence properties and without trouble in terms of loop closure (or "system assembly")? For simple systems, one can obviously circumvent the problem by expressing explicitly some optimization constraints on the geometrical parameters (length of segments, amplitude of motion, etc.), but, as soon as complex multi-loop systems like 3D parallel manipulators are involved, this is not possible anymore. The original approach that is used here is based on a penalty technique in which the norm of the assembly constraints is properly minimized and the residue of this minimization plays the role of penalty term. This provides us with a continuous and differentiable objective function.
Hence, this enables the use of efficient gradient-based optimization methods such as the SQP method. This paper is organized as follows. In Section 2, the different issues arising from closed-loop MBS are addressed. A penalty strategy is then proposed in Section 3 to optimize such closed-loop systems. The corresponding sensitivity analysis is developed in Section 4 using two different methods: the direct method in Section 4.1 and the adjoint method in Section 4.2. Finally, the reader can find two applications about the kinetostatic optimization of parallel manipulators in Section 5. First, a three-dof Delta robot is optimized in Section 5.1 and then a six-dof Hunt platform in Section 5.2.
2 Dealing with Assembly Constraints

In general, the assembly constraints are highly non-linear and can be expressed by a set of implicit algebraic equations:

$$h(q) = 0 , \tag{1}$$
where, in our case, q is the vector of relative or joint variables. The technique we use to solve these assembly constraints is the Coordinate Partitioning method [22]. Joint variables q are first partitioned into independent variables u and dependent ones v. Then, v may be computed by means of the classic Newton-Raphson iterative algorithm. At iteration k + 1, this algorithm provides:

$$v^{k+1} = v^k - J_v^{-1}\, h(u, v^k) , \tag{2}$$

where $J_v = \partial h / \partial v^T$ and the values of u correspond to the instantaneous system configuration.

What should be emphasized here is that the MBS simulation, or more precisely the performance evaluation of a closed-loop MBS, is strongly dependent on these assembly constraints. To simulate the behavior of such a mechanism or compute its inverse dynamics for instance, the mechanism should be assembled; otherwise, the evaluation has no physical meaning. The resolution of (1) is thus crucial to analyze the system and further optimize its performance. Therefore, we will discuss some issues arising when using the Newton-Raphson algorithm (2) to assemble the closed-loop MBS. Let us point out that optimization processes involving the integration of the equations of motion to simulate the dynamical behavior will not be investigated in this paper; only kinematic performance will be optimized in order to focus on the assembly constraints.

During the optimization process, the optimizer calls the objective function to evaluate the performance. In some cases, according to the parameters given by the optimizer (e.g. dimensions of the mechanism, values of the generalized coordinates, ...), the Newton-Raphson algorithm (2) may fail to converge when solving the constraints (1). Such problems occur frequently in geometrical optimization, i.e. when the optimization variables are the dimensions of the MBS.
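The iteration (2) can be sketched in a few lines. The example below is only an illustration under stated assumptions: it uses a hypothetical slider-crank closure $h(u, v) = R \sin u - L \sin v$ with a single dependent angle v (this loop, the names `newton_dependent`, `R` and `L`, and the numerical values are not taken from the text), so that $J_v$ reduces to a scalar.

```python
import math

def newton_dependent(h, dh_dv, u, v0, tol=1e-12, max_iter=50):
    """Iterate v_{k+1} = v_k - h(u, v_k) / J_v(u, v_k) for one dependent variable."""
    v = v0
    for _ in range(max_iter):
        residual = h(u, v)
        if abs(residual) < tol:
            return v, True            # loop closed within tolerance
        jac = dh_dv(u, v)
        if abs(jac) < 1e-14:
            return v, False           # singular Jacobian: the update is undefined
        v -= residual / jac
    return v, False                   # no convergence within max_iter

# Hypothetical loop closure: crank radius R, rod length L (assumed values).
R, L = 0.1, 0.3

def h(u, v):
    return R * math.sin(u) - L * math.sin(v)

def dh_dv(u, v):
    return -L * math.cos(v)

u = math.radians(40.0)                # independent coordinate, kept fixed
v, converged = newton_dependent(h, dh_dv, u, v0=0.0)
```

The returned flag matters in an optimization loop: when it is false, the objective cannot be evaluated in a physically meaningful way, which is the situation the following discussion addresses.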
To illustrate these convergence problems, let us consider a four-bar mechanism in relative coordinates with a point-to-point constraint as assembly constraint (see Fig. 1) with a given partition of the variables. On the basis of this example, three particular cases can be identified when evaluating the mechanism during optimization:
Fig. 1. Particular cases when solving assembly constraints during optimization: (a) multiple solutions; (b) singularity, $|\partial h / \partial v^T| = 0$; (c) impossibility, $h(u, v) \neq 0 \;\; \forall v$
1. A first case may occur when the solution in terms of v is not unique for a given vector u (or scalar u in the case of Fig. 1a). These multiple assembly solutions are obviously inconvenient since the performance is usually different for each assembly case. In most cases however, the problem requirements favor one assembly solution over the others. One possible way to reach that preferable assembly configuration is to start the Newton-Raphson algorithm with initial dependent variables close to the expected solution. But good initial values are not always easy to guess if the dimensions vary during optimization. Another possibility is to add constraints to the assembly problem, using optimization techniques. This will be applied in the optimization strategy of Section 3.

2. A second problem may happen if the mechanism reaches a singular configuration, the constraint Jacobian matrix $J_v = \partial h / \partial v^T$ becoming singular (see Fig. 1b). These singularities have both a mathematical and a physical interpretation. Mathematically, they are related to the chosen partitioning and correspond to an ill-conditioned Jacobian matrix. Physically, those singularities correspond to a loss of mobility of the MBS by locking one or more actuators associated with the independent joint variables. Let us note that purely mathematical Jacobian singularities may arise for instance:
   • From a wrong choice of the sequence of rotation variables in three dimensions
   • When redundant constraints are present
   • Or even if an ignorable variable is taken into account in the set of dependent variables (e.g. axial rotation of a connecting rod)
   In all these cases, the Newton-Raphson algorithm runs into trouble. This singularity problem is clearly dependent on the choice of partitioning and also on the type of formalism used to model the geometry (relative, absolute or natural coordinates). For a given formalism, whatever the choice of partition, singular configurations always exist. They can be avoided by modifying the partition of the generalized coordinates during the simulation if needed. Practically, the proximity to the singularity should be detected from the conditioning of the constraint Jacobian matrix and a new partition should be chosen accordingly. This possibility has been studied for redundantly actuated MBS by [9]. In this work, the particular case of overactuated MBS models is not considered and the partition is fixed according to the problem requirements.

3. Finally, it can be impossible to close the mechanism simply because a constraint $h_i$ has no root: the Newton-Raphson algorithm cannot converge toward a solution (see Fig. 1c). This situation occurs for instance when the optimizer evaluates the objective function with incompatible dimensional parameters (e.g. bars that are too short in Fig. 1c). In that case, the closed mechanism does not exist but, instead of rejecting it, we will keep it and penalize the cost function accordingly. The penalty strategy is presented in the next section.
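The three cases above can be made concrete on a closure simple enough to solve by hand. The sketch below assumes the hypothetical scalar constraint $h(u, v) = R \sin u - L \sin v = 0$ (not the four-bar mechanism of Fig. 1; the function name and the values are illustrative only) and classifies a given set of dimensions.

```python
import math

def assembly_case(u, R, L, eps=1e-9):
    """Classify the closure R*sin(u) - L*sin(v) = 0 for the dependent angle v."""
    s = R * math.sin(u) / L                    # value that sin(v) must take
    if abs(s) > 1.0 + eps:
        return "impossible", []                # case 3: h has no root in v
    if abs(abs(s) - 1.0) <= eps:
        v = math.copysign(math.pi / 2.0, s)
        return "singular", [v]                 # case 2: dh/dv = -L*cos(v) = 0
    v = math.asin(s)
    return "multiple", [v, math.pi - v]        # case 1: two assembly modes

regular = assembly_case(math.radians(40.0), R=0.1, L=0.3)
limit = assembly_case(math.pi / 2.0, R=0.3, L=0.3)
too_short = assembly_case(math.pi / 2.0, R=0.4, L=0.3)
```

In this toy closure the singular configuration is exactly the boundary between the region with two assembly modes and the region with none, which is why an optimizer that changes R or L can cross from one case to another between two evaluations.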
3 Best-Assembly Penalty Optimization Strategy

Suppose that we want to optimize the performance of an MBS given by the objective function f with respect to the vector of p design variables l. This objective f also depends on the n generalized coordinates q that are constrained by m independent non-linear assembly constraint equations $h_i(q) = 0$, $i = 1, \ldots, m$. To be complete, the set of design variables l can also be constrained by r design constraints $b_j(l, q)$, $j = 1, \ldots, r$. Typically, these design constraints are bound constraints on the design variables but may become more complex according to the problem requirements. The objective function is time-independent but may be evaluated for s different configurations or sets of generalized coordinates $q^k$, $k = 1, \ldots, s$. The considered problem can generally be formulated as follows:

$$\min_{l} \; f(l, q^1, \ldots, q^s) , \tag{3a}$$

subject to

$$h_i(q^k) = 0 \quad \text{for } i = 1, \ldots, m \text{ and } k = 1, \ldots, s , \tag{3b}$$

and

$$b_j(l, q^k) \ge 0 \quad \text{for } j = 1, \ldots, r \text{ and } k = 1, \ldots, s . \tag{3c}$$
The proposed strategy to solve (3) is an alternative to the artificial penalty strategy developed in [5]. It is based on the minimization of the assembly constraints instead of their resolution. This is the main difference, and it makes possible the use of gradient-based optimization methods such as quasi-Newton methods or the sequential quadratic programming (SQP) technique. The key idea of the method is developed in the following.

Practically, after the partitioning of the generalized coordinates, the resolution of the constraint equations (1) is replaced by the following constrained optimization problem:

$$\min_{v} \; \tfrac{1}{2}\, h(l, u, v)^T h(l, u, v) , \tag{4a}$$

subject to

$$c(l, u, v) \ge 0 . \tag{4b}$$

This replacement can be seen as an extension of the assembly problem. Indeed, if the assembly constraints can be satisfied, the solution of problem (4) reaches a zero value, as with the Newton-Raphson resolution. Otherwise, the squared norm of the assembly constraint vector h is minimized, whereas the Newton-Raphson algorithm would not have provided a consistent result to the assembly problem. This new formulation offers a direct solution to the assembly issue 3 described in Section 2, for which the dimensions of the mechanism are not compatible with the assembly constraints.

Now, if we consider assembly issues due to multiple solutions and singular configurations (issues 1 and 2 in Section 2), formulation (4) can also help us to avoid these problems, using the additional optimization inequality constraints $c \in \mathbb{R}^{m_c}$. If function c is well chosen, the assembly problem can be restricted to a regular and unique solution. The definition of these additional constraints therefore requires knowing in advance the different solutions and singularities of the assembly problem. This prerequisite can sometimes be difficult to establish when complex topologies are considered. However, for many applications involving several simple loops of bodies such as parallel manipulators (see applications in Section 5), the construction of function c remains affordable.

Fig. 2. Design parameters of a four-bar mechanism in relative coordinates (joint angles θ1, θ2, θ3; bar lengths l1, l2, l3, l4; points A and B)

The following planar example will provide more details on the method. Assume that we want to assemble a four-bar mechanism modeled in relative coordinates (see Fig. 2). As shown in Fig. 1a, two assembly solutions exist. The additional requirement is to keep the "elbow-up" closed configuration (i.e. when points A or B remain above the straight line between θ2 and θ3 in Fig. 2) and to avoid the singular configurations when bars are aligned. This model contains three generalized coordinates θ1, θ2, and θ3. The four bars have lengths l1, l2, l3 and l4. To make points A and B coincide, the two assembly equations are:

$$h_1(\theta_1, \theta_2, \theta_3) = -l_1 \sin\theta_1 + l_2 \cos(\theta_1 + \theta_2) + l_3 \sin\theta_3 - l_4 = 0 , \tag{5a}$$

$$h_2(\theta_1, \theta_2, \theta_3) = l_1 \cos\theta_1 + l_2 \sin(\theta_1 + \theta_2) - l_3 \cos\theta_3 = 0 . \tag{5b}$$
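To make problem (4a) concrete for the four-bar constraints (5), the sketch below minimizes $\tfrac{1}{2} h^T h$ over v = (θ2, θ3) for a fixed u = θ1. It is a simplified illustration, not the authors' implementation: the inequality constraint (4b) is deliberately left out, and plain steepest descent with backtracking stands in for the SQP solver; function names, bar lengths and starting points are assumptions.

```python
import math

def constraints(v, u, lengths):
    """Loop-closure residuals (5a)-(5b) with u = theta1 and v = (theta2, theta3)."""
    l1, l2, l3, l4 = lengths
    t2, t3 = v
    h1 = -l1 * math.sin(u) + l2 * math.cos(u + t2) + l3 * math.sin(t3) - l4
    h2 = l1 * math.cos(u) + l2 * math.sin(u + t2) - l3 * math.cos(t3)
    return h1, h2

def half_sq_norm(v, u, lengths):
    h1, h2 = constraints(v, u, lengths)
    return 0.5 * (h1 * h1 + h2 * h2)

def gradient(v, u, lengths):
    """Gradient J_v^T h of the objective (4a) with respect to v."""
    l1, l2, l3, l4 = lengths
    t2, t3 = v
    h1, h2 = constraints(v, u, lengths)
    g2 = -l2 * math.sin(u + t2) * h1 + l2 * math.cos(u + t2) * h2
    g3 = l3 * math.cos(t3) * h1 + l3 * math.sin(t3) * h2
    return g2, g3

def best_assembly(v0, u, lengths, tol=1e-6, max_iter=5000):
    """Steepest descent with Armijo backtracking; returns (v*, residue)."""
    v = tuple(v0)
    phi = half_sq_norm(v, u, lengths)
    for _ in range(max_iter):
        g = gradient(v, u, lengths)
        gnorm2 = g[0] ** 2 + g[1] ** 2
        if gnorm2 < tol ** 2:
            break                                  # stationary point reached
        step = 1.0
        while True:
            trial = (v[0] - step * g[0], v[1] - step * g[1])
            phi_trial = half_sq_norm(trial, u, lengths)
            if phi_trial <= phi - 1e-4 * step * gnorm2 or step < 1e-16:
                break
            step *= 0.5                            # shrink until sufficient decrease
        v, phi = trial, phi_trial
    return v, phi

# Bars too short to close the loop: the residue stays strictly positive.
v_open, residue_open = best_assembly((0.0, 0.0), 0.0, (1.0, 1.0, 1.0, 3.0))
# Parallelogram dimensions: the loop closes and the residue drops to zero.
v_closed, residue_closed = best_assembly((-0.2, 0.2), 0.3, (1.0, 2.0, 1.0, 2.0))
```

For the first set of lengths the two bar tips can approach each other no closer than $\sqrt{10} - 2$, so the residue converges to $\tfrac{1}{2}(\sqrt{10} - 2)^2$ instead of the iteration failing; this residue is the quantity later reused as a penalty term.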
To solve the initial position problem or assembly problem, the solution of the nonlinear system (5) has to be found. This implies the partitioning of the three variables θ1, θ2 and θ3 into one independent variable u and two dependent ones v. Assume this partition is chosen as: $u = \theta_1$, $v = [\theta_2 \;\; \theta_3]^T$. To detect singular configurations, we compute the determinant of $J_v$ which, up to the constant factor $-l_2 l_3$ obtained when differentiating (5), is:

$$|J_v| \propto \cos(\theta_1 + \theta_2 - \theta_3) . \tag{6}$$
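The determinant can be checked numerically. The sketch below differentiates (5a) and (5b) by hand with respect to v = (θ2, θ3) and compares the resulting 2×2 determinant with the closed form $-l_2 l_3 \cos(\theta_1 + \theta_2 - \theta_3)$; the angles and lengths are arbitrary test values chosen for illustration.

```python
import math

def jacobian_v(theta1, theta2, theta3, l2, l3):
    """J_v = dh/dv^T for v = (theta2, theta3), differentiated from (5a)-(5b)."""
    a = theta1 + theta2
    return [[-l2 * math.sin(a), l3 * math.cos(theta3)],
            [l2 * math.cos(a), l3 * math.sin(theta3)]]

def det2(m):
    return m[0][0] * m[1][1] - m[0][1] * m[1][0]

theta = (0.4, -0.9, 0.2)      # arbitrary test angles (assumed values)
l2, l3 = 1.5, 0.8             # arbitrary bar lengths (assumed values)

det = det2(jacobian_v(*theta, l2, l3))
closed_form = -l2 * l3 * math.cos(theta[0] + theta[1] - theta[2])
```

Since $l_2$ and $l_3$ are strictly positive, the zeros and the sign changes of the determinant are those of the cosine alone, which is why the cosine is sufficient for detecting the singularity.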
This determinant vanishes at the singularity and its sign defines two regions of the joint space corresponding to the "elbow-up" and "elbow-down" configurations. To safely choose the "elbow-up" configuration during the assembly process, the additional inequality constraint c ≥ 0 will be:

$$c(\theta_1, \theta_2, \theta_3) = \cos(\theta_1 + \theta_2 - \theta_3) - c_{th} \ge 0 , \tag{7}$$
where $c_{th}$ is a positive threshold constant that defines a safety gap from the singularity. The value of $c_{th}$ (typically $c_{th} = 0.01$) will keep the assembly process far from the singularity even if the mechanism cannot be assembled. This will ensure a regular consistent optimal solution to the design optimization problem (3).

Concerning the objective function, the MBS performance can now be evaluated everywhere in the design space. Even if the MBS is not assembled, its performance can be computed but the result may have no physical meaning. That is why the objective function also has to be penalized outside the assembly domain. The penalty term will be composed of the residue of the assembly constraint minimization (4) whose value is zero inside the closed-loop domain. The extended objective function g will be defined as follows:

$$g(l, q) = f(l, q) + w\, f_{pen}(l, q) , \tag{8a}$$

with

$$f_{pen}(l, q) = \sum_{k=1}^{s} \min_{v^k} \; \tfrac{1}{2}\, h(l, u^k, v^k)^T h(l, u^k, v^k) , \tag{8b}$$

subject to

$$c(l, u^k, v^k) \ge 0 , \quad k = 1, \ldots, s , \tag{8c}$$
where w is a weighting factor between the actual performance f and the penalty term $f_{pen}$. This definition is illustrated in one dimension in Fig. 3. The main advantage of this new formulation is that g is always continuous and always differentiable over the design space. This property enables us to use gradient-based algorithms, which are local methods but often more efficient than direct searches or stochastic algorithms.

Fig. 3. Example of best-assembly penalization (curves g, f and $f_{pen} = \min_v \tfrac{1}{2} h^T h$ versus X)

This new penalty strategy is roughly described in the flowchart of Fig. 4. For better efficiency of the algorithm, we can make the distinction between design constraints $b_l(l)$ involving only the design parameters and those also involving the generalized coordinates $b_{lq}(l, q)$. The first ones are only computed at the beginning of each optimization iteration, while the second ones are evaluated after each "assembled" configuration, as is the performance of the MBS. An important step in the diagram is the sensitivity analysis. This computation is needed to take advantage of the continuity and differentiability of the objective function and its gradient. More details about sensitivity analysis can be found in Section 4. All these computations are performed for each configuration and all the partial evaluations are accumulated to form the complete objective value returned to the optimizer.

Fig. 4. Best assembly penalization of the objective function (flowchart blocks: Optimizer; Computation of $b_l(l)$; $\min_v \tfrac{1}{2} h(l, u, v)^T h(l, u, v)$ such that $c(l, u, v) \ge 0$; Performance evaluation; Computation of $b_{lq}(l, q)$; Other config.? yes/no; Sensitivity Analysis)

To conclude, some limitations of the method will be highlighted. Probably the most inconvenient feature is the need to choose the weighting factor w. Its value depends on the range of the performance values, which can be normalized to be compared to the assembly residue in the penalty term. This is similar to multiobjective optimization problems whose various objectives can be summed with different weights to compose the final function. However, the final optimal solution is usually independent of w because the penalty term, which vanishes when the MBS is assembled, only affects the process behavior outside the assembly domain. Another difficulty could be to find the additional constraints c and to choose $c_{th}$ (e.g. in (7)). This parameter influences the safety gap with respect to the singularity. However, it is possible to give a geometrical meaning to the constant $c_{th}$.
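The behavior of the penalty term at the edge of the assembly domain can be illustrated in one dimension, in the spirit of Fig. 3. The sketch below assumes a hypothetical closure $h(R, v) = R - L \sin v$ with design variable R, for which the best-assembly residue is known in closed form; the performance function passed to `penalized_objective` is a stand-in, and the constraint c is omitted.

```python
def best_assembly_residue(R, L):
    """min over v of 0.5*(R - L*sin(v))**2, written in closed form."""
    gap = max(0.0, abs(R) - L)        # zero whenever the loop can be closed
    return 0.5 * gap * gap

def penalized_objective(R, L, performance, w=1.0):
    """g = f + w * f_pen as in (8a), for a single configuration."""
    return performance(R) + w * best_assembly_residue(R, L)

L = 0.3
inside = best_assembly_residue(0.2, L)     # assembled: no penalty
outside = best_assembly_residue(0.4, L)    # not assembled: quadratic penalty
g_outside = penalized_objective(0.4, L, lambda R: R * R, w=10.0)

# Central-difference slope of the penalty across the boundary R = L.
eps = 1e-6
slope = (best_assembly_residue(L + eps, L) - best_assembly_residue(L - eps, L)) / (2 * eps)
```

The penalty grows quadratically with the closure gap, so both its value and its slope vanish at the boundary of the assembly domain. An artificial constant penalty would instead introduce a jump there.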
4 Sensitivity Analysis

Before using gradient-based optimization techniques, it is necessary to analyze the sensitivity of the penalized objective function g (see (8a)) with respect to the vector of design parameters l. To alleviate the notation, let us consider only one set of generalized coordinates q. We can define the reduced objective function $g^*$ for which the assembly problem (4) is solved:

$$g^*(l, u) \equiv g(l, u, v^*(u, l)) , \tag{9}$$

where the independent variables u are constant for the optimization problem and $v^*$ is the solution of problem (4). When the optimal assembly solution is found, the Kuhn-Tucker first-order necessary condition of this constrained optimization problem can be applied. This means that the gradient of the Lagrangian L of the constrained problem (4) with respect to v vanishes at the optimal solution. Practically, L is defined in our case as follows:

$$L(l, u, v, \lambda) = \tfrac{1}{2}\, h(l, u, v)^T h(l, u, v) - \lambda^T c(l, u, v) , \tag{10}$$

where $\lambda \in \mathbb{R}^{m_c}$ is the vector of Lagrange multipliers. These multipliers can be considered as additional dependent variables of the assembly problem. We therefore define an extended set $\bar{v}$ of dependent variables:

$$\bar{v} \equiv \begin{bmatrix} v \\ \lambda \end{bmatrix} . \tag{11}$$

According to the Kuhn-Tucker theorem [13], we can write the following necessary conditions:

$$\frac{\partial L}{\partial v^T}(l, u, v^*, \lambda^*) = 0 , \tag{12a}$$

$$\lambda_i^* \ge 0 , \quad i = 1, \ldots, m_c , \tag{12b}$$

$$\lambda_i^*\, c_i(l, u, v^*) = 0 , \quad i = 1, \ldots, m_c , \tag{12c}$$

where the asterisk symbol refers to a local constrained minimizer of problem (4). Note that these conditions assume that the gradients $\partial c / \partial v^T$ of all active constraints are linearly independent. The new extended function $\bar{h}$ is then defined as:

$$\bar{h}(l, u, \bar{v}) \equiv \begin{bmatrix} \dfrac{\partial L}{\partial v}(l, u, v, \lambda) \\[2mm] \mathrm{diag}(\lambda)\, c(l, u, v) \end{bmatrix} , \tag{13}$$

where diag(λ) is a square matrix containing the elements of vector λ on its diagonal. This extended function $\bar{h}$ reaches a zero value when the best-assembly problem (4) is solved.
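A scalar toy problem shows how the extended residual (13) vanishes whether or not the inequality constraint is active. The sketch assumes $h(v) = v - a$ and $c(v) = v - b$, which are illustrative choices and not constraints of any mechanism in the text, so that the minimizer and its multiplier are known in closed form.

```python
def best_assembly_1d(a, b):
    """Minimize 0.5*(v - a)**2 subject to c(v) = v - b >= 0; returns (v*, lambda*)."""
    if a >= b:
        return a, 0.0             # constraint inactive: h = 0 is reachable
    return b, b - a               # constraint active: positive multiplier

def h_bar(v, lam, a, b):
    """Extended residual (13): [dL/dv, lambda*c] with L = 0.5*h**2 - lambda*c."""
    h = v - a
    c = v - b
    dL_dv = h - lam               # dh/dv = 1 and dc/dv = 1 in this toy problem
    return dL_dv, lam * c

v_free, lam_free = best_assembly_1d(a=1.0, b=0.0)       # assembled case
v_bound, lam_bound = best_assembly_1d(a=0.0, b=1.0)     # constraint blocks h = 0
```

In the second case the residue $\tfrac{1}{2} h^2$ is not zero, yet both components of $\bar{h}$ are, which is what allows $\bar{h} = 0$ to be differentiated in the sensitivity analysis below.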
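For further developments, the Jacobian matrix of $\bar{h}$ with respect to the design variables is given by:

$$\frac{\partial \bar{h}}{\partial l^T} = \begin{bmatrix} \dfrac{\partial^2 L}{\partial l^T \partial v} \\[2mm] \mathrm{diag}(\lambda)\, \dfrac{\partial c}{\partial l^T} \end{bmatrix} \tag{14a}$$

$$= \begin{bmatrix} \displaystyle\sum_{i=1}^{m} h_i \frac{\partial^2 h_i}{\partial l^T \partial v} + \frac{\partial h^T}{\partial v} \frac{\partial h}{\partial l^T} - \sum_{i=1}^{m_c} \lambda_i \frac{\partial^2 c_i}{\partial l^T \partial v} \\[2mm] \mathrm{diag}(\lambda)\, \dfrac{\partial c}{\partial l^T} \end{bmatrix} , \tag{14b}$$

and the Jacobian matrix of $\bar{h}$ with respect to the extended dependent variables is:

$$\frac{\partial \bar{h}}{\partial \bar{v}^T} = \begin{bmatrix} \dfrac{\partial \bar{h}}{\partial v^T} & \dfrac{\partial \bar{h}}{\partial \lambda^T} \end{bmatrix} \tag{15a}$$

$$= \begin{bmatrix} \dfrac{\partial^2 L}{\partial v^T \partial v} & \dfrac{\partial^2 L}{\partial \lambda^T \partial v} \\[2mm] \mathrm{diag}(\lambda)\, \dfrac{\partial c}{\partial v^T} & \mathrm{diag}(c) \end{bmatrix} \tag{15b}$$

$$= \begin{bmatrix} \displaystyle\sum_{i=1}^{m} h_i \frac{\partial^2 h_i}{\partial v^T \partial v} + \frac{\partial h^T}{\partial v} \frac{\partial h}{\partial v^T} - \sum_{i=1}^{m_c} \lambda_i \frac{\partial^2 c_i}{\partial v^T \partial v} & -\dfrac{\partial c^T}{\partial v} \\[2mm] \mathrm{diag}(\lambda)\, \dfrac{\partial c}{\partial v^T} & \mathrm{diag}(c) \end{bmatrix} , \tag{15c}$$

where diag(c) is a square matrix containing the elements of vector c on its diagonal. In the following, two well-known techniques are presented to calculate the derivatives of the objective function g subject to the constraints $\bar{h} = 0$.

4.1 Direct Method

The first method that comes to mind to analyze the sensitivity of g is called the direct method. Using the chain rule applied to function $g^*$, we obtain:

$$\frac{\partial g^*}{\partial l^T} = \frac{\partial g}{\partial l^T} + \frac{\partial g}{\partial \bar{v}^T} \frac{\partial \bar{v}^*}{\partial l^T} . \tag{16}$$

The remaining unknowns are the partial derivatives of $\bar{v}^*$ with respect to the design parameters. These can be found by differentiating the $m + m_c$ constraint equations $\bar{h}(l, u, \bar{v}^*(u, l)) = 0$. This gives us:

$$\frac{\partial \bar{h}}{\partial l^T} + \frac{\partial \bar{h}}{\partial \bar{v}^T} \frac{\partial \bar{v}^*}{\partial l^T} = 0 , \tag{17}$$

from which the following equation can be easily deduced:

$$\frac{\partial \bar{v}^*}{\partial l^T} = - \left( \frac{\partial \bar{h}}{\partial \bar{v}^T} \right)^{-1} \frac{\partial \bar{h}}{\partial l^T} , \tag{18}$$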
if the Jacobian matrix $\bar{J}_v = \partial \bar{h} / \partial \bar{v}^T$ is invertible. Note that $\bar{J}_v$ is always a square matrix since there must be as many constraint equations as dependent variables.

In conclusion, if $\bar{J}_v$ is regular, (18) can be inserted in (16) to yield the sensitivity of $g^*$ with respect to the design variables:

$$\frac{\partial g^*}{\partial l^T} = \frac{\partial g}{\partial l^T} - \frac{\partial g}{\partial \bar{v}^T} \left( \frac{\partial \bar{h}}{\partial \bar{v}^T} \right)^{-1} \frac{\partial \bar{h}}{\partial l^T} . \tag{19}$$

4.2 Adjoint Method

Another well-known technique to derive the sensitivity of the objective function g is to use the adjoint method. This method is considered as the dual of the direct method. The basic idea is to artificially adjoin a linear combination of the constraint equations to the objective function. As these constraint equations are assumed to be satisfied or identically equal to zero, this does not modify the value of $g^*$. We can thus define $\bar{g}^*$:

$$\bar{g}^*(l, u) \equiv g^*(l, u) + \nu^T \bar{h}(l, u, \bar{v}^*(u, l)) = g^*(l, u) , \quad \forall \nu \in \mathbb{R}^{m + m_c} . \tag{20}$$
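This extended objective function $\bar{g}^*$ can then be differentiated with respect to the design parameters. This provides us with:

$$\frac{\partial \bar{g}^*}{\partial l^T} = \frac{\partial g}{\partial l^T} + \nu^T \frac{\partial \bar{h}}{\partial l^T} + \left( \frac{\partial g}{\partial \bar{v}^T} + \nu^T \frac{\partial \bar{h}}{\partial \bar{v}^T} \right) \frac{\partial \bar{v}^*}{\partial l^T} . \tag{21}$$

Now, the trick to avoid the computation of $\partial \bar{v}^* / \partial l^T$ is to choose the adjoint variables ν such that:

$$\frac{\partial g}{\partial \bar{v}^T} + \nu^T \frac{\partial \bar{h}}{\partial \bar{v}^T} = 0 , \tag{22a}$$

$$\Leftrightarrow \quad \frac{\partial g}{\partial \bar{v}} + \frac{\partial \bar{h}^T}{\partial \bar{v}}\, \nu = 0 , \tag{22b}$$

$$\Leftrightarrow \quad \nu = - \left( \frac{\partial \bar{h}^T}{\partial \bar{v}} \right)^{-1} \frac{\partial g}{\partial \bar{v}} . \tag{22c}$$

This last line is consistent if $\partial \bar{h}^T / \partial \bar{v}$ is regular, i.e. if $\bar{J}_v$ is invertible, since $\partial \bar{h}^T / \partial \bar{v} = (\bar{J}_v)^T$. Finally, after the insertion of (22c) into (21), we obtain:

$$\frac{\partial \bar{g}^*}{\partial l^T} = \frac{\partial g}{\partial l^T} - \frac{\partial g}{\partial \bar{v}^T} \left( \frac{\partial \bar{h}}{\partial \bar{v}^T} \right)^{-1} \frac{\partial \bar{h}}{\partial l^T} , \tag{23}$$

which is strictly equivalent to (19). This remark is obvious in our case because the number of independent constraints $\bar{h}$ is the same as the number of dependent variables $\bar{v}$.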
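The equivalence of (19) and (23) can be verified on small matrices. The sketch below assumes that the four partial-derivative arrays are already available (the numerical values are made up for the check) and uses a plain Gaussian elimination as the linear solver; all function names are illustrative.

```python
def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(map(float, A[i])) + [float(b[i])] for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("singular extended Jacobian")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x

def sensitivity_direct(dg_dl, dg_dv, dh_dv, dh_dl):
    """Eqs. (16)-(19): one linear solve per design variable."""
    n, p = len(dh_dv), len(dg_dl)
    grad = []
    for j in range(p):
        column = [dh_dl[i][j] for i in range(n)]
        dv_dl = [-x for x in solve(dh_dv, column)]        # column j of (18)
        grad.append(dg_dl[j] + sum(dg_dv[i] * dv_dl[i] for i in range(n)))
    return grad

def sensitivity_adjoint(dg_dl, dg_dv, dh_dv, dh_dl):
    """Eqs. (22c)-(23): a single solve with the transposed Jacobian."""
    n, p = len(dh_dv), len(dg_dl)
    transposed = [list(row) for row in zip(*dh_dv)]
    nu = solve(transposed, [-g for g in dg_dv])           # adjoint variables
    return [dg_dl[j] + sum(nu[i] * dh_dl[i][j] for i in range(n)) for j in range(p)]

# Made-up derivatives: two extended dependent variables, three design variables.
dh_dv = [[2.0, 1.0], [0.0, 3.0]]
dh_dl = [[1.0, 0.0, 2.0], [0.0, 1.0, 1.0]]
dg_dv = [1.0, 2.0]
dg_dl = [0.5, -1.0, 0.0]

grad_direct = sensitivity_direct(dg_dl, dg_dv, dh_dv, dh_dl)
grad_adjoint = sensitivity_adjoint(dg_dl, dg_dv, dh_dv, dh_dl)
```

Both routines return the same gradient. In this sketch the direct version performs one solve per design variable and the adjoint version a single solve for the scalar objective, which is the usual practical reason for choosing between them.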
5 Applications: Kinetostatic Optimization of Parallel Robots

Isotropy of a manipulator is a kinetostatic performance that can be measured from the condition number κ of its forward kinematics Jacobian matrix $J_f$ [2]. This Jacobian is defined by:

$$\dot{x} = J_f\, \dot{q}^a , \tag{24}$$

where $\dot{q}^a$ is the velocity vector of the actuated joints and $\dot{x}$ the end-effector twist defined by the Cartesian (position) and angular (orientation) velocity vectors of the end-effector. The isotropy index κ can be written as the ratio of the largest singular value $\sigma_l$ of $J_f$ to its smallest one $\sigma_s$:

$$\kappa(J_f) = \frac{\sigma_l}{\sigma_s} . \tag{25}$$

Defined as in (24), the forward Jacobian can be seen as a mapping between the actuated joint space and the end-effector position and orientation space (see Fig. 5). This means that if we consider a unit ball in the actuator space, this ball is mapped into an ellipsoid whose semiaxis lengths bear the ratios of the singular values of $J_f$. If $J_f$ is singular, then the ellipsoid degenerates into one with at least one vanishing semiaxis. On the other hand, if matrix $J_f$ is isotropic, i.e. if all its singular values are identical, then it maps the unit ball into another ball, either enlarged or shrunken. Definition (25) assumes that all entries of $J_f$ have the same units. Otherwise, this dimensional inhomogeneity can be resolved by introducing a normalizing characteristic length as suggested in [2]. The latter is used to divide the positioning rows of $J_f$, making it dimensionally homogeneous. As explained in [2], let us note that the value of the characteristic length itself comes from the minimization of the condition number over all the reachable configurations.

Fig. 5. Jacobian mapping between actuated coordinates $q_i$ and tool coordinates $x_i$
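Definition (25) can be evaluated in closed form for a 2×2 matrix, which is enough to see why singular values rather than eigenvalues are the relevant quantities. The sketch below is an illustration only; the helper name and the test matrices are not taken from the text.

```python
import math

def condition_number_2x2(J):
    """kappa = sigma_l / sigma_s via the eigenvalues of J^T J (2x2 case only)."""
    (a, b), (c, d) = J
    frob2 = a * a + b * b + c * c + d * d     # sigma_l^2 + sigma_s^2
    det = a * d - b * c                       # |det| = sigma_l * sigma_s
    disc = math.sqrt(max(frob2 * frob2 - 4.0 * det * det, 0.0))
    sigma_l = math.sqrt(0.5 * (frob2 + disc))
    if sigma_l == 0.0 or det == 0.0:
        return math.inf                       # degenerate ellipsoid
    sigma_s = abs(det) / sigma_l
    return sigma_l / sigma_s

kappa_isotropic = condition_number_2x2([[1.0, 0.0], [0.0, 1.0]])
kappa_stretched = condition_number_2x2([[2.0, 0.0], [0.0, 1.0]])
kappa_shear = condition_number_2x2([[1.0, 1.0], [0.0, 1.0]])
kappa_singular = condition_number_2x2([[1.0, 2.0], [2.0, 4.0]])
```

The shear matrix has both eigenvalues equal to 1 but a condition number of about 2.618, so it is not isotropic even though its eigenvalues coincide.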
In the following applications, the goal is to optimize a global posture-independent performance index which is the mean of the inverse of the condition number κ over a volume V in the Cartesian space of the end-effector, also called Global Dexterity Index (GDI) [8]:

$$\mathrm{GDI} = \frac{1}{V} \int_V \frac{1}{\kappa}\, dV . \tag{26}$$

In the case of positioning and orienting manipulators (for instance, the Hunt platform described in Section 5.2), the value of κ is obviously computed after normalizing the Jacobian matrix, as previously explained. By analogy with the optimization proposed in [2], we suggest that the above-mentioned characteristic length becomes an additional parameter of our optimization problem, which initially only deals with purely design parameters. The optimization of this global dexterity index (26) will first be performed on the Delta parallel robot, whose moving platform has only three translational dof. In the second subsection, a more complex model will be considered where the moving platform has six dof (three positions and three orientations): the Hunt platform.

5.1 Optimization of the Translational Delta Robot

The Delta robot (see Fig. 6) is a three-dof parallel manipulator whose moving platform has only translational motions. The model is composed of three legs, each containing a parallelogram to keep the platform in a horizontal plane. The isotropy of the Delta will be evaluated over a 2 cm-sided cube as shown in Fig. 6. The design parameters of this optimization problem are: the lengths of the upper leg $l_A$ and the lower leg $l_B$, the characteristic radii of the platform $r_p$ and the base $r_b$, and also the distance $z_c$ between the base and the center of the workspace volume (i.e. the cube).

Fig. 6. Delta robot model

In the objective definition, the volumetric integral of (26) is discretized into eight points corresponding to the eight vertices of the cube. The performance f to optimize is thus:

$$f(l_A, l_B, z_c, r_b, r_p) = \frac{1}{8} \sum_{i=1}^{8} \frac{1}{\kappa(J_f)_i} . \tag{27}$$
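The discretization (27) amounts to averaging $1/\kappa$ over the eight cube vertices. In the sketch below the function `kappa_at` is a placeholder that the caller must supply (for the Delta robot it would assemble the mechanism at the given platform position and evaluate (25)); the synthetic formula used here is not a robot model, and all names are assumptions.

```python
from itertools import product

def cube_vertices(center, side):
    """The eight vertices of an axis-aligned cube."""
    half = 0.5 * side
    return [tuple(c + s * half for c, s in zip(center, signs))
            for signs in product((-1.0, 1.0), repeat=3)]

def discrete_gdi(kappa_at, center, side):
    """Eq. (27): mean of 1/kappa over the eight vertices of the cube."""
    points = cube_vertices(center, side)
    return sum(1.0 / kappa_at(p) for p in points) / len(points)

center = (0.0, 0.0, -0.10)       # cube centered at z_c = -10 cm, in meters
side = 0.02                      # 2 cm-sided cube

def synthetic_kappa(p):
    """Placeholder condition number, growing away from the z axis (not a robot model)."""
    x, y, _ = p
    return 1.0 + 100.0 * (x * x + y * y)

gdi_uniform = discrete_gdi(lambda p: 2.0, center, side)
gdi_synthetic = discrete_gdi(synthetic_kappa, center, side)
```

Because κ ≥ 1, the index lies between 0 and 1, and a value of 1 would correspond to perfect isotropy at every sampled vertex.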
The matrix $J_f$ is obtained by differentiating the constraint equations $h_i = 0$, which involve the joint coordinates of the three actuators $q^a = (q_1^a, q_2^a, q_3^a)$ between the base and the legs as well as the Cartesian coordinates $x = (x, y, z)$ of the moving platform. Note that this computation is only valid inside the assembly domain, where $h = 0$. In practice, after the coordinate partitioning, differentiating $h$ yields the relation between the independent and dependent velocity vectors:

$$\dot v = B_{vu}\,\dot u, \tag{28}$$

where $B_{vu} = -(J_v)^{-1} J_u$ is the coupling matrix. In our case, the independent variables correspond to the platform coordinates ($u = x$), while the actuated variables belong to the dependent variable set. A second partition of the dependent variables gives:

$$v = \begin{pmatrix} q^a \\ q^f \end{pmatrix}, \tag{29}$$

where $q^f$ is the vector of the free (non-actuated) dependent variables. Accordingly, relation (28) becomes:

$$\begin{pmatrix} \dot q^a \\ \dot q^f \end{pmatrix} = \begin{pmatrix} B_{q^a u} \\ B_{q^f u} \end{pmatrix} \dot x, \tag{30}$$

from which we can deduce, by comparison with (24), that:

$$J_f = B_{q^a u}^{-1}. \tag{31}$$
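The chain from (28) to (31) can be checked on a small closed loop. The sketch below assumes a planar slider-crank (chosen only because it fits in a few lines; it is not one of the robots studied here) with independent coordinate u = x (slider position) and dependent coordinates v = (θ, φ), where the crank angle θ plays the role of the actuated variable and the rod angle φ that of the free variable. All names and dimensions are hypothetical.

```python
import math


def slider_crank_state(r, l, theta):
    """Close the loop: rod angle phi and slider position x for crank angle theta."""
    phi = math.asin(r * math.sin(theta) / l)
    x = r * math.cos(theta) + l * math.cos(phi)
    return phi, x


def coupling_matrix(r, l, theta, phi):
    """B_vu = -(J_v)^-1 J_u for the constraints
    h1 = r cos(theta) + l cos(phi) - x = 0,
    h2 = r sin(theta) - l sin(phi)     = 0.
    """
    # J_v = dh/dv with v = (theta, phi)
    a, b = -r * math.sin(theta), -l * math.sin(phi)
    c, d = r * math.cos(theta), -l * math.cos(phi)
    # J_u = dh/du with u = x
    ju = (-1.0, 0.0)
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("J_v is singular: the partitioning is not valid here")
    # explicit 2x2 inverse applied to J_u, with the leading minus sign
    b_theta = -(d * ju[0] - b * ju[1]) / det
    b_phi = -(-c * ju[0] + a * ju[1]) / det
    return b_theta, b_phi


r, l, theta = 0.1, 0.3, 0.7           # hypothetical dimensions (m) and crank angle (rad)
phi, x = slider_crank_state(r, l, theta)
b_theta, b_phi = coupling_matrix(r, l, theta, phi)
forward_jacobian = 1.0 / b_theta      # scalar counterpart of (31)
print(forward_jacobian)
```

Here the forward Jacobian is a scalar, so its condition number is trivially one; the point of the sketch is only the sequence constraint Jacobians, coupling matrix, actuated block, inverse.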
Therefore, in (27), the condition number of J f is equal to the condition number of B qa u . This optimization problem is also subject to some restrictions. Bound constraints have been specified to obtain a reasonable solution; the minimum and maximum values are reported in Table 2. Concerning the assembly problem, we must avoid closing the mechanism in an “inside” position of the legs, as shown in Fig. 7. Therefore, some inequality constraints c have been formulated in terms of sines and cosines of the dependent variables. All these considerations lead to the extended objective function g (see (8a)). The optimization was first performed with only two of the five parameters in order to visualize the process. The leg lengths are optimized, while the radii rp and rb and the distance zc are fixed: rp = 2 cm, rb = 5 cm, and zc = −10 cm. Two different starting points, Start1 and Start2, are proposed to highlight the existence of two different local minima. The optimization process can be observed in Fig. 8. The constrained optimization algorithm is the well-known
Fig. 7. Wrong assembly of Delta robot model
(a) Evolution in the variable space (lB [m] versus lA [m], starting points Start1 and Start2)
(b) Evolution of the function values (g(x, z, l) versus iteration i, for Start1 and Start2)
Fig. 8. Isotropy maximization of the Delta robot with two optimization variables and the best-assembly penalty

Table 1. Isotropy maximization of the Delta robot with two optimization variables

          GDIopt (%)   lA,opt (cm)   lB,opt (cm)   GDI0 (%)   lA,0 (cm)   lB,0 (cm)
Start1    91.36        9.91          15.66         36.11      10.00       10.00
Start2    67.64        20.00         14.57         23.16      15.00       12.00
sequential quadratic programming method (SQP) [13]. From the two starting points, the process converges after 22 and 21 evaluations of the objective function, respectively. The corresponding optimization results are reported in Table 1 and lead to an optimal isotropy index of 91.36% from the first starting point.
Table 2. Isotropy maximization of the Delta robot with five optimization variables

           GDI (%)   lA (cm)   lB (cm)   zc (cm)   rb (cm)   rp (cm)
Minimum    -         2.00      2.00      −20.00    1.00      1.00
Initial    36.11     10.00     10.00     −10.00    5.00      2.00
Maximum    -         20.00     20.00     −5.00     10.00     10.00
Optimal    98.88     16.39     20.00     −11.53    6.02      6.05

(a) Initial design   (b) Optimal design

Fig. 9. Optimal designs for the isotropy maximization of the Delta robot
With five optimization variables, the solution is improved and the average isotropy over the cube vertices reaches 98.98%. The optimization results can be found in Table 2. The SQP algorithm needs 54 evaluations of the objective function. It should be noted that the isotropy index does not depend on the radii themselves but rather on their ratio, which tends to unity as shown in Table 2. Initial and optimal designs are sketched in Fig. 9.

5.2 Optimization of the Six-Dof Hunt Platform

The Hunt platform (see Fig. 10) has three position and three orientation degrees of freedom, which requires normalizing J f before computing κ each time the parameters change, i.e. at each call of the objective function. As explained above, this involves an additional optimization parameter: the characteristic length LC . The nine other parameters of this optimization problem (see Fig. 10) are the leg lengths LI and LS, the characteristic radii of the platform RP and of the base RB, the gauge H between adjoining actuators on the base, the angle α (around a vertical axis), followed by angle β (around a horizontal axis), angle ψ, and finally the vertical distance zc between the base and the center of the desired workspace volume (small cube in Fig. 10). The objective function is similar to (27) for the Delta robot, except that the forward Jacobian matrix J f now has six rows for the three positions and the three orientations of the platform and six columns for the six actuated
Fig. 10. Hunt platform model
joint variables between the base and the six legs. Therefore, the average value of isotropy has to be computed over a 6-dimension hyper-cube, or more precisely over its 32 vertices. According to the research project specifications (for a surgical application), this cube has a side length of 50 mm (i.e. ±25 mm from the center point) in the Cartesian space and the angular range is ±10° around each orthogonal axis (note that the sequence of rotation is around x-axis, y-axis and z-axis successively). As for the Delta robot, after coordinate partitioning, we obtain the coupling matrix $B_{vu}$. Then, the dependent variables are also partitioned into actuated $q^a$ and free $q^f$ variables, as well as the independent variables into positions $x^p$ and orientations $x^o$. The velocity relation (28) becomes:

$$\begin{pmatrix} \dot q^a \\ \dot q^f \end{pmatrix} = \begin{pmatrix} B_{q^a x^p} & B_{q^a x^o} \\ B_{q^f x^p} & B_{q^f x^o} \end{pmatrix} \begin{pmatrix} \dot x^p \\ \dot x^o \end{pmatrix}, \tag{32}$$

from which we can extract:

$$J_f = \begin{pmatrix} B_{q^a x^p} & B_{q^a x^o} \end{pmatrix}^{-1}. \tag{33}$$

Actually, as explained previously, this Jacobian matrix is inhomogeneous since linear and angular variables are involved. Thus, we make use of the characteristic length $L_C$ to divide the “position” columns of $B_{vu}$, defining the homogeneous Jacobian $\tilde J_f$ involved in the optimization as:

$$\tilde J_f = \begin{pmatrix} \dfrac{B_{q^a x^p}}{L_C} & B_{q^a x^o} \end{pmatrix}^{-1}. \tag{34}$$

The objective function f of this 6-dimension problem is finally:

$$f(z_c, LI, LS, RB, RP, H, \alpha, \beta, \psi, L_C) = \frac{1}{32}\sum_{i=1}^{32}\frac{1}{\kappa\big(\tilde J_f\big)_i}. \tag{35}$$
Table 3. Isotropy maximization of the Hunt platform with ten optimization variables

            Minimum   Initial   Maximum   Optimal
GDI (%)     -         26.45     -         58.12
zc (mm)     100.0     300.0     500.0     267.8
LI (mm)     50.0      200.0     300.0     220.6
LS (mm)     50.0      200.0     300.0     300.0
RB (mm)     50.0      100.0     200.0     50.0
RP (mm)     50.0      100.0     200.0     162.2
H (mm)      10.0      50.0      100.0     10.0
α (°)       0.0       10.0      20.0      0.0
β (°)       0.0       10.0      50.0      6.7
ψ (°)       5.0       10.0      50.0      5.0
LC (µm)     0.0001    10.0      1000.0    6.2

(a) Initial design   (b) Optimal design

Fig. 11. Initial and optimal designs for the isotropy maximization of the Hunt platform
Also for practical reasons, the design parameters of this robot are limited by bounds (see minimum and maximum values in Table 3). The same restrictions are imposed on the assembly problem to avoid “inside” configurations as in Fig. 7. This leads to the final extended function g (see (8a)). The results obtained with the SQP method are presented in Table 3, and the initial and optimal designs can be compared in Fig. 11. The graphical results are shown in Fig. 12. It takes 94 iterations and 110 evaluations of the objective function to optimize the 10 parameters. In Fig. 12(a), we can observe the evolution of the objective function in three main steps. These steps correspond to “sequences” of the SQP method for which active constraints are defined. For example, we can observe in Fig. 12(c) that the active bound constraint on variable β is relaxed at iteration 22, from which point β begins to increase.
(a) Objective function: g(x, z, l) versus iteration i
(b) Linear variables: lengths (mm) zc, LI, LS, RB, RP and H versus iteration i
(c) Angular variables: angles (°) α, β and ψ versus iteration i
Fig. 12. Isotropy maximization of the Hunt platform with ten optimization variables and the SQP method
The “validation” of this result is made by applying stochastic optimization algorithms on the same optimization problem: on the one hand, we apply a classical genetic algorithm and on the other hand, an evolutionary strategy is used. Further details can be found in [5]. The comparison between the results obtained with these stochastic methods and our deterministic SQP method tends to show that the optimal solution is a global one.
6 Conclusions and Prospects

In this paper, a reliable optimization strategy has been proposed to deal with the optimization of closed-loop MBS. The key idea is to replace the solution of the assembly constraints with the minimization of their norm. In addition, the residual of the constraint minimization is used to penalize the objective function when the system cannot be properly assembled. On the basis of 3D
examples of parallel manipulators, the method has shown its efficiency and reliability. However, it may suffer from some limitations, mainly due to the choice of parameters. For example, if the performance to optimize has a wide range, it may be difficult to fix the weighting factor w between the actual objective and the penalty term. If more complex topologies are concerned, it may also be tricky to choose a good additional assembly constraint c to avoid singular configurations and multiple assembly solutions. As further prospects, we intend to improve the definition of the weighting factor w using techniques similar to those of multi-objective optimization. We also intend to find a way to build the function c automatically and to define the safety gap cth that avoids singularity. The next challenge of this research could be the adaptation of the proposed penalty technique to more complex problems involving the numerical integration of the equations of motion, using for instance backtracking techniques to analyze the sensitivity of the objective function.
Acknowledgements This research has been sponsored by the Belgian Program on Interuniversity Attraction Poles initiated by the Belgian State – Prime Minister’s Office – Science Policy Programming (IUAP V/6). The scientific responsibility is assumed by its authors.
References

1. Alici G, Shirinzadeh B (2004) Topology optimisation and singularity analysis of a 3-SPS parallel manipulator with a passive constraining spherical joint. Mech Mach Th 39:215–235
2. Angeles J (1997) Fundamentals of robotic mechanical systems: theory, methods, and algorithms. Springer, New York
3. Cabrera J, Simon A, Prado M (2002) Optimal synthesis of mechanisms with genetic algorithms. Mech Mach Th 37:1165–1177
4. Ceccarelli M, Lanni C (2004) A multi-objective optimum design of general 3R manipulators for prescribed workspace limits. Mech Mach Th 39:119–132
5. Collard JF, Fisette P, Duysinx P (2005) Contribution to the optimization of closed-loop multibody systems: application to parallel manipulators. Mult Syst Dyn 13:69–84
6. Datoussaïd S, Verlinden O, Conti C (2002) Application of evolutionary strategies to optimal design of multibody systems. Mult Syst Dyn 8(4):393–408
7. Fattah A, Hasan Ghasemi A (2002) Isotropic design of spatial parallel manipulators. Int J Robotics Res 21(9):811–824
8. Gallant-Boudreau M, Boudreau R (2000) An optimal singularity-free planar parallel manipulator for a prescribed workspace using a genetic algorithm. In: Proceedings of the 3rd International Conference on Integrated Design and Manufacturing in Mechanical Engineering. Presses Universitaires Polytechniques Montreal, Montreal
9. Ganovski L (2007) Modeling of redundantly actuated parallel manipulators: actuator solutions and corresponding control applications. Ph.D. thesis, Université Catholique de Louvain, Louvain-la-Neuve, Belgium
10. Haj-Fraj A, Pfeiffer F (2001) Optimization of automatic gearshifting. Vehicle Syst Dyn Suppl 35:207–222
11. Jiménez JM, Alvarez G, Cardenal J, Cuadrado J (1997) A simple and general method for kinematic synthesis of spatial mechanisms. Mech Mach Th 32(3):323–341
12. Lemay J, Notash L (2004) Configuration engine for architecture planning of modular parallel robots. Mech Mach Th 39:101–117
13. Madsen K, Nielsen HB, Tingleff O (2001) Optimization with constraints. Lecture notes for the course “Optimization and data fitting”, Technical University of Denmark, Lyngby
14. Marín FTS, González AP (2003) Global optimization in path synthesis based on design space reduction. Mech Mach Th 38:579–594
15. Németh I (1998) A CAD tool for the preliminary design of 3-axis machine tools: synthesis, analysis and optimisation. Ph.D. thesis, Katholieke Universiteit Leuven, Leuven, Belgium
16. Ryu J, Cha J (2003) Volumetric error analysis and architecture optimization for accuracy of hexaslide type parallel manipulators. Mech Mach Th 38:227–240
17. Sedlaczek K, Eberhard P (2006) Using augmented Lagrangian particle swarm optimization for constrained problems in engineering. Struct Mult Opt 32:277–286
18. Simionescu P, Beale D (2002) Optimum synthesis of the four-bar function generator in its symmetric embodiment: the Ackermann steering linkage. Mech Mach Th 37:1487–1504
19. Stocco L, Salcudean S, Sassani F (1998) Fast constrained global minimax optimization of robot parameters. Robotica 16:595–605
20. Su Y, Duan B, Zheng C (2001) Genetic design of kinematically optimal fine tuning Stewart platform. Mechatronics 11:821–835
21. Vallejo J, Avilés R, Hernández A, Amezua E (1995) Nonlinear optimization of planar linkages for kinematic syntheses. Mech Mach Th 30(4):501–518
22. Wehage RA, Haug EJ (1982) Generalized coordinate partitioning for dimension reduction in analysis of constrained dynamic systems. J Mech Design 134:247–255
23. Zhou H, Cheung E (2001) Optimal synthesis of crank-rocker linkages for path generation using the orientation structural error of the fixed link. Mech Mach Th 36:973–982
A Comparison of Three Different Linear Order Multibody Dynamics Algorithms in Limited Parallel Computing Environments

Adarsh Binani (1), James H. Critchley (2), and Kurt S. Anderson (3)

(1) The Math Works, Natick, Massachusetts 01760, USA. E-mail: [email protected]
(2) Multibody.org, Lake Orion, Michigan 48360, USA. E-mail: [email protected]
(3) Rensselaer Polytechnic Institute, Troy, New York 12180-3590, USA. E-mail: [email protected]
Summary. A comparison of three multibody dynamics algorithms, Featherstone’s Divide and Conquer Algorithm (DCA), a fast sequential recursive algorithm, and an efficient version of the DCA (DCAe), in a parallel computing environment is presented. The comparison is made both in terms of theoretical (ideal) performance and with an implementation evaluated on a parallel computer. The DCAe was motivated by the promise of practical performance gains, and its development as a combination of DCA and efficient recursive methods is reviewed. An implementation of each method in C++ using an object oriented architecture for a shared memory parallel computer is described and results are reported for a computer with four processors. Benchmark results agree closely with the predicted algorithm performances. The default DCA is shown to require four processors to obtain results similar to the recursive sequential results and the DCAe is verified to outperform the efficient recursive solution with just two processors when load balancing is considered.
Nomenclature

$A^k$   Spatial (6 dimensional) acceleration of body k reference point in the Newtonian reference frame.
$\check{A}^k$   Backward recursive spatial acceleration of body k.
$\hat{A}^k$   Portion of the spatial acceleration of body k which is explicit in unknown state derivatives $\dot u$.
$\hat{A}_i^k$   Portion of spatial acceleration of handle i of body or subsystem k which is explicit in unknown state derivatives $\dot u$.
$A_T^k$   Spatial state explicit acceleration of body k.
$b$   Base body of a terminal subsystem.
$b_i^k$   State explicit term in the equation for handle i of DCA subsystem k.

C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009
$C^k_{local}$   Orientation transformation which relates the orientation of the child end to the parent end of subsystem/body k.
${}^N C^{sub_1[k]}$   Orientation transformation which relates the orientation of the child end of subsystem sub1[k] to N.
${}^N C^{sub_2[k]}$   Orientation transformation which relates the orientation of the child end of subsystem sub2[k] to N.
$ch[k]$   Topological child body index of body k.
$\check{ch}[k]$   Child body index of body k in a backwards topological representation (same as p[k] for chain systems).
$d$   An intermediate DCA term formed during subsystem combination.
$\hat{E}^k$   Recursive coefficient of propagated spatial handle force for body k.
$\check{E}^k$   Backward recursive coefficient of propagated spatial handle force for body k.
$f_i^k$   Spatial joint constraint force at handle i of DCA subsystem k.
$F^k$   Spatial applied and state explicit inertia force (6 dimensional) associated with body k.
$F^k_{RHS}$   Spatial right hand side force (6 dimensional) associated with Newton-Euler equations of body k.
$\hat{F}^k$   Recursive spatial force of body k.
$\check{F}^k$   Backward recursive spatial force of body k.
$I^k$   Spatial (6 dimensional) inertia of body k, associated with body's reference point.
$\hat{I}^k$   Recursive spatial inertia (articulated body inertia) of body k.
$\check{I}^k$   Backward recursive spatial inertia of body k.
$k$   Index associated with representative body k.
$M^k$   Diagonal block k of a triangularized generalized mass matrix associated with forward recursion inertia $\hat{I}^k$.
$\check{M}^k$   Diagonal block k of a triangularized generalized mass matrix associated with backward recursion inertia $\check{I}^k$.
$n$   Number of system degrees-of-freedom.
$N$   Newtonian reference frame.
$p[k]$   Topological parent body index of body k.
$\check{p}[k]$   Parent body index of body k in a backwards topological representation (same as ch[k] for chain systems).
$P_r^k$   Spatial (6 dimensional) partial velocity of body k with respect to the coordinates defining joint r.
$\check{P}_r^k$   Backward spatial partial velocity of body k with respect to the coordinates defining joint r.
$q_k$   Relative generalized coordinates associated with joint k.
RHS   Right Hand Side.
$S$   The system.
$S^k$   Spatial position vector cross product required to compute accelerations and forces on adjacent bodies p[k] and k.
$\check{S}^k$   Backward spatial position vector cross product required to compute accelerations and forces on adjacent bodies $\check{p}[k]$ and k.
$sub_1[k]$   Parent subsystem used in generation of assembly k.
$sub_2[k]$   Child subsystem used in generation of assembly k.
$T^k$   Spatial triangularization matrix associated with body k.
$\check{T}^k$   Backward spatial triangularization matrix associated with body k.
$U$   An identity matrix.
$u_k$   Generalized speeds (quasi-coordinate velocities) of joint k.
$W$   An intermediate DCA term formed during subsystem combination.
$\Phi^k$   A handle constraint force coefficient matrix in the DCA equation for subsystem k.
$\Psi^k$   A handle constraint force coefficient matrix in the recursive equation for DCA subsystem k.
1 Introduction

The need for fast solutions in dynamic systems simulation and analysis has motivated many in the computational multibody field to pursue efficient alternatives to classical formulations. Those interested in operator- or hardware-in-the-loop simulation, or in model-based predictive control, often require faster than real-time solutions. Designers may seek to perform comprehensive design optimization on high-fidelity dynamic models, which may require an unbounded number of iterations, while others may simply wish to work with prohibitively large systems. Regardless of motivation, the development of time efficient multibody solutions extends the domain of realizable analysis for problems involving multibody systems. If n is the number of degrees-of-freedom of a system, sequential algorithms using a state space formulation have been developed whose computational cost is a linear function of n; these are commonly referred to as O(n) algorithms. Theoretically, these are the most efficient algorithms and require the fewest numerical computations for articulated body systems involving large n and few kinematic loops. Owing to their recursive nature, however, the parallel computing performance of these algorithms is topologically limited. In 1999 Featherstone introduced the Divide and Conquer Algorithm (DCA) for parallel computation of multi-rigid-body dynamics [9, 10]. This algorithm boasts a logarithmic order of time complexity for the solution of the unknown generalized coordinate accelerations when the number of processors of the parallel computer scales linearly with n. In other words, the algorithm is O(log2 n) in the presence of O(n) processors, which is a theoretical minimum order time/resource solution. Prior to the DCA, optimal order time/resource methods were either topologically limited to chain systems (Fijany et al. [11]) or iterative and therefore
not truly logarithmic (Anderson and Duan [4]). The DCA represents a milestone in the efficient computation of multibody dynamics; however, it does not approach the efficiency of the fastest parallel algorithms until the number of available processors becomes very large [10]. Increasing the number of processors, in turn, leads to an undesirable increase in communication overhead, which degrades performance. This tradeoff is captured in Amdahl's Law [13], according to which, for a given problem, speedup does not increase linearly with the number of processors, whereas a larger instance of the same problem yields higher speedup and efficiency for the same number of processors. This arises from serial bottlenecks and interprocessor communication costs. It inspires one to work towards parallel algorithms that have a coarse grain communication structure, requiring less interprocessor communication per time step, while performing the serial portions of the calculations efficiently. With this motivation, Critchley [6] introduced a modified approach to the DCA, called the DCAe, which uses computationally more efficient serial algorithms, such as the state space O(n), to create the basic subsystems used in the DCA. Specifically, the algorithm may be viewed as a hybrid between the DCA and the state space O(n). The DCAe offers a simplified solution and a significantly lower operations count in a limited multi-processor environment. Being a hybrid algorithm, on a single processor machine the DCAe and O(n) are the same, while the DCAe operations count equals that of the DCA when the number of processors used becomes greater than or equal to the number of system bodies. The DCAe thus promises competitive and scalable performance, and gives the user increased flexibility to construct subsystems that are better suited to time optimal use of the available parallel computational resources.
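The saturation of speedup referred to above can be made concrete with the classical form of Amdahl's Law, in which a fraction of the work is perfectly parallel and the remainder is serial. The sketch below is illustrative only: the 90% parallel fraction is an assumed value, not a measurement from this work, and the classical formula does not model interprocessor communication, which would lower the figures further.

```python
def amdahl_speedup(parallel_fraction, processors):
    """Classical Amdahl speedup: 1 / (serial + parallel / processors)."""
    if not 0.0 <= parallel_fraction <= 1.0:
        raise ValueError("parallel_fraction must lie in [0, 1]")
    if processors < 1:
        raise ValueError("processors must be at least 1")
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / processors)


# Assumed 90% parallel fraction, evaluated for a few processor counts.
for count in (1, 2, 4, 1024):
    print(count, round(amdahl_speedup(0.9, count), 3))
```

With a fixed serial fraction the speedup is bounded by the reciprocal of that fraction regardless of the processor count, which is one way to see why efficient serial portions matter on machines with only two or four processors.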
This work presents a comparison between the O(n), DCA, and DCAe on the basis of their performance when implemented in parallel using a limited number of processors. All the algorithms were implemented in C++ using an object oriented architecture for a shared memory parallel computer, and a detailed description of the implementation is presented herein. As the DCAe is a hybrid between the DCA and O(n) algorithms, the generic objects used for implementing the DCA and O(n) were reused through interface inheritance to construct the generic implementation of the DCAe, demonstrating code reusability and reduced development time. This exposition begins by briefly presenting a state space O(n) algorithm [1], which is used as the sequential state space algorithm in the formulation of the DCAe. This version of the O(n) algorithm utilizes Kane's Method to solve the dynamics. The DCA and DCAe are then presented in a nomenclature consistent with this O(n) algorithm. The paper also presents a formulation based on the Divide and Conquer approach to evaluate the kinematic quantities needed in the DCA and DCAe. Simulation results for the dynamic behavior of articulated spatial chain systems involving 128, 256 and 1,024 bodies are obtained and presented for these algorithms running in serial mode and in two- and four-processor shared memory parallel modes. Furthermore, as the DCAe gives the user the flexibility to change the computational load on each of the processors, a static loading scheme on each of the processors that gives
the best performance was determined for the two- and four-processor modes by running simulations with varying load sizes.
2 Preliminaries

Before introducing the Divide and Conquer equations, it is necessary to establish some fundamental notations and relationships. This paper assumes that the reader is either familiar with “spatial” notations or otherwise capable of backing out the necessary details of the compact equations which are presented. The unfamiliar reader may simply regard spatial quantities as linear algebraically augmented linear and angular quantities, or refer to [8] or similar texts. In spatial matrix form, the Newton-Euler equations of motion for a single rigid body k can be represented as

$$I^k A^k = F^k_{RHS}. \tag{1}$$

In Eq. (1), $I^k$ is the 6 × 6 spatial inertia matrix associated with this body's reference point. This partitioned matrix contains the 3 × 3 rotational inertia matrix of body k with respect to its reference point; a 3 × 3 diagonal matrix of the mass of body k; and off-diagonal 3 × 3 matrices containing all necessary first moment terms. The body's reference point may be the body's mass center, or any other point of interest, such as a point which connects an articulated body to its parent (proximal body). The term $A^k$ is the spatial acceleration (6 × 1 matrix) of body k, containing the stacked matrix representations of the body's angular acceleration and the translational acceleration of the body's reference point. Finally, $F^k_{RHS}$ is the spatial force (6 × 1 matrix) containing the stacked matrix representations of the moments and forces which must be present on the right hand side (constraint, applied and gyroscopic inertial forces). For simplicity, only multibody chain systems (Fig. 1) will be studied here. Extension of all presented methods to tree and looped systems follows exactly as in the original formulation of Featherstone [10]. In a multibody chain system, each body k has exactly one child (distal or outboard) body ch[k], with the exception of the terminal (last or outboard most) body, which has no children. Each body also has a single parent (proximal or inboard) body p[k], and the first body in the system has a parent which is the inertial reference frame N.
Fig. 1. A chain system
Relative joint coordinates q are used to describe the position and orientation of each body relative to its parent. Commonly the kinematic relationships between bodies are simplified by exploiting some form of quasi-coordinate velocities u (or generalized speeds [14]), which may be any invertible linear combination of the system $\dot q$:

$$u = X(t, q)\,\dot q + Y(t, q). \tag{2}$$

For the purposes of recursive methods, it is common, if not required, to restrict the generalized speeds to be functions of only the local joint coordinates and coordinate velocities:

$$u_k = X(t, q_k)\,\dot q_k + Y(t, q_k). \tag{3}$$

The spatial accelerations may be partitioned as

$$A^k = \hat A^k + A_T^k, \tag{4}$$

with $A_T$ representing those parts which are functions only of the system state (and time), and $\hat A$ the parts which are linear in the unknown $\dot u$. The spatial recursive acceleration $\hat A^k$ is kinematically related to that of its parent body p[k], by definition of the relative coordinates and generalized speeds, as

$$\hat A^k = \left(S^k\right)^T \hat A^{p[k]} + P_k^k\,\dot u_k. \tag{5}$$

This decomposition and notation is essentially that of Anderson [3]. The term $\hat A$ contains the unknown acceleration variables $\dot u$, which are to be determined and integrated. $A_T^k$, by comparison, may be immediately determined from the known state of the system. $S^k$ is the spatial shift matrix, which encapsulates the necessary position vector cross product (shift) operation (in transpose) required to evaluate the acceleration of that point of body p[k] which is instantaneously coincident with the reference point of its outboard (child) body k. The term $P_k^k$ is exactly the matrix of state dependent spatial partial velocities [14] of body k with respect to the joint local quasi-coordinate velocities $u_k$. These partial velocities characterize the motion of the system by defining the free modes of motion [16]. Similarly, pre-multiplying the spatial force associated with the body k reference point by the spatial shift matrix $S^k$ transforms the spatial force into an equivalent system now associated with the point which is instantaneously coincident with the reference point of parent body p[k].
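The two roles of the shift matrix described above (shifting a force with S and an acceleration with its transpose) can be sketched in a few lines. The block layout below is an assumption made for illustration: spatial vectors are stacked as (angular, linear), r is the position vector from the parent reference point to the child reference point, all quantities are expressed in one common frame, and the joint has a single degree of freedom. The function names are hypothetical and the paper's own implementation may differ.

```python
def skew(r):
    """Cross-product matrix: skew(r) applied to v equals r x v."""
    x, y, z = r
    return [[0.0, -z, y],
            [z, 0.0, -x],
            [-y, x, 0.0]]


def shift_matrix(r):
    """6x6 shift matrix S = [[1, skew(r)], [0, 1]] (assumed block layout)."""
    S = [[1.0 if i == j else 0.0 for j in range(6)] for i in range(6)]
    rx = skew(r)
    for i in range(3):
        for j in range(3):
            S[i][3 + j] = rx[i][j]
    return S


def transpose(M):
    return [list(col) for col in zip(*M)]


def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]


def child_acceleration(S, A_parent, P, udot):
    """Eq. (5) for a one-dof joint: A_child = S^T A_parent + P * udot."""
    shifted = matvec(transpose(S), A_parent)
    return [a + p * udot for a, p in zip(shifted, P)]


# Parent spinning up about z; child reference point 0.5 along x; revolute joint about z.
S = shift_matrix((0.5, 0.0, 0.0))
A_parent = [0.0, 0.0, 2.0, 0.0, 0.0, 0.0]
P = [0.0, 0.0, 1.0, 0.0, 0.0, 0.0]
print(child_acceleration(S, A_parent, P, 3.0))
```

With this layout, S applied to a spatial force adds the moment r × f when the force is moved to the parent point, while the transpose adds α × r to the linear acceleration, so that the pairing between accelerations and forces is preserved by the shift.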
3 The Divide and Conquer Algorithm

This section presents a brief derivation of an equivalent form of Featherstone's Divide and Conquer Algorithm for chain systems and illustrates that such a method achieves O(log2 n) time complexity of solutions in the presence of a parallel computer with resources which scale with problem size (O(n) processors).
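The origin of the logarithmic depth can be seen by pairing adjacent subsystems of a chain level by level. The following sketch only builds such a pairing schedule and counts its levels; it performs no dynamics, and the function name and the tuple representation (first body, last body) are assumptions made for illustration. All assemblies within one level are independent of each other, which is what allows them to be processed concurrently.

```python
import math


def assembly_schedule(n_bodies):
    """Pair adjacent subsystems level by level; return the list of levels.

    Each subsystem is represented by the tuple (first body, last body).
    """
    level = [(k, k) for k in range(n_bodies)]    # leaf nodes: single bodies
    levels = []
    while len(level) > 1:
        merged = []
        for i in range(0, len(level) - 1, 2):
            merged.append((level[i][0], level[i + 1][1]))
        if len(level) % 2:                       # an odd subsystem passes through
            merged.append(level[-1])
        levels.append(merged)
        level = merged
    return levels


for n in (128, 256, 1024):
    print(n, len(assembly_schedule(n)), math.ceil(math.log2(n)))
```

For a chain of n bodies the schedule has ceil(log2 n) levels, and the same tree is traversed once inward (assembly) and once outward (disassembly).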
Fig. 2. A binary assembly tree for a chain system
In what follows, notational consistency with the original work of Featherstone has been maintained. While many of the symbols used by him are given different definitions here, they represent analogous quantities and the resulting form of the solution is changed only slightly. This abuse of notation is designed to clearly illustrate that the underlying assembly operations are the prior work of Featherstone, while permitting the inclusion of efficient subsystems. The DCA begins by computing all necessary kinematic information for the entire system via two sweeps of a binary assembly tree ($O(\log_2 n)$), as illustrated in Fig. 2. A binary tree solution for kinematic problems can be found in [15]. In the DCA, the kinematic quantities of interest are $P_k^k$, $S^k$, and the state dependent inertia and applied forces $F^k$. The $P_k^k$ and $S^k$ are functions of the local joint geometry and are evaluated concurrently at the leaf nodes (single bodies) of the binary tree. The $F^k$ requires the evaluation of the absolute and relative positions, orientations, and angular and linear velocities needed for the computation of general applied forces, as well as the state dependent linear and angular acceleration terms $A_T^k$. The assembly tree recursively combines subsystems at the connecting joint to form a single assembled subsystem. In an assembled subsystem k, sub1[k] refers to the parent subsystem of the assembly joint and sub2[k] refers to the child subsystem, where each subsystem contains its parent connecting joint. Using this notation, a complete set of local subsystem transformations $C^k_{local}$, relating the orientation of the terminal (child) end of the assembly k to
its base (parent) end, and inertial orientation transformations ${}^N C^k$ are constructed. This is accomplished via an inward sweep of the assembly tree using

$$C^k_{local} = C^{sub_1[k]}_{local}\, C^{sub_2[k]}_{local}, \tag{6}$$

and an outward sweep with

$${}^N C^{sub_1[k]} = {}^N C^k \left(C^{sub_2[k]}_{local}\right)^T \quad\text{and} \tag{7a}$$

$${}^N C^{sub_2[k]} = {}^N C^k, \tag{7b}$$

where for the root subsystem r, $C^r_{local}$ is exactly ${}^N C^r$. Having solved for the kinematic based information, the two handle [9] spatial Newton-Euler equation of a single rigid body p[k] is introduced as

$$I^{p[k]} \hat A_1^{p[k]} = f_1^{p[k]} + S^k f_2^{p[k]} + F^{p[k]}. \tag{8}$$
In this equation subscripts 1 and 2 refer to the first and second handles of parent body p[k]. Handles 1 and 2 are those points of p[k] connecting it to its +p[k] is the spatial recursive parent p[p[k]] and child k bodies, respectively. A 1 +p[k] ), while f p[k] and f p[k] are the spatial acceleration of handle 1 (identically A 1 2 joint constraint forces of the inboard (parent) and outboard (child) joints, respectively (which must be applied at the handle 1 of body p[k], necessitating p[k] the S k on f2 ). Equation (8) can be solved for the unknown spatial recursive acceleration as #−1 " p[k] p[k] +p[k] = I p[k] f1 + S k f2 + F p[k] (9) A 1 and the acceleration of the second handle is then related by +p[k] = S k T A 2 T = Sk
+p[k] , A 1 " #−1 p[k] p[k] I p[k] f1 + S k f2 + F p[k] .
(10)
These equations are valid for any two handle rigid body and are re-written as

$$\hat{A}_1^{k} = \Phi_{11}^{k} f_1^{k} + \Phi_{12}^{k} f_2^{k} + b_1^{k} \quad \text{and} \tag{11a}$$

$$\hat{A}_2^{k} = \Phi_{21}^{k} f_1^{k} + \Phi_{22}^{k} f_2^{k} + b_2^{k}, \tag{11b}$$

where the $\Phi$ and b are determined through collection of coefficients. At this point it is the objective of a divide and conquer method to take the equations associated with two rigid bodies and combine them to form a new subsystem of identical structure. To this end it is recognized that the common joint spatial constraint forces are equal and opposite, giving

$$f_2^{p[k]} = -f_1^{k}. \tag{12}$$

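For reference, the collection of coefficients mentioned above can be written out explicitly. Comparing Eqs. (9) and (10) term by term with the two handle form of Eqs. (11a, b) suggests the following single-body coefficients for body p[k]. This is a sketch derived here from the equations of this section, not a quotation of [9]:

```latex
\begin{aligned}
\Phi_{11}^{p[k]} &= \left(I^{p[k]}\right)^{-1}, &
\Phi_{12}^{p[k]} &= \left(I^{p[k]}\right)^{-1} S^k, &
b_1^{p[k]} &= \left(I^{p[k]}\right)^{-1} F^{p[k]}, \\
\Phi_{21}^{p[k]} &= \left(S^k\right)^T \left(I^{p[k]}\right)^{-1}, &
\Phi_{22}^{p[k]} &= \left(S^k\right)^T \left(I^{p[k]}\right)^{-1} S^k, &
b_2^{p[k]} &= \left(S^k\right)^T \left(I^{p[k]}\right)^{-1} F^{p[k]}.
\end{aligned}
```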
The kinematic definition of Eq. (5) is written in two handle form as

$$\hat{A}_1^{k} = \left(S^k\right)^T \hat{A}_1^{p[k]} + P_k^k \dot{u}_k, \tag{13}$$

which simplifies to

$$\hat{A}_1^{k} = \hat{A}_2^{p[k]} + P_k^k \dot{u}_k, \tag{14}$$

which may be used to isolate $\dot{u}_k$ as

$$P_k^k \dot{u}_k = \hat{A}_1^{k} - \hat{A}_2^{p[k]}. \tag{15}$$

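Equation (15) is expanded and simplified using Eqs. (11a, b) and Eq. (12), giving

$$\begin{aligned}
P_k^k \dot{u}_k &= \Phi_{11}^{k} f_1^{k} + \Phi_{12}^{k} f_2^{k} + b_1^{k} - \Phi_{21}^{p[k]} f_1^{p[k]} - \Phi_{22}^{p[k]} f_2^{p[k]} - b_2^{p[k]} \\
&= \left(\Phi_{11}^{k} + \Phi_{22}^{p[k]}\right) f_1^{k} + \Phi_{12}^{k} f_2^{k} + b_1^{k} - \Phi_{21}^{p[k]} f_1^{p[k]} - b_2^{p[k]}.
\end{aligned} \tag{16}$$

From Eq. (16) a solution for $\dot{u}_k$ is available in terms of only $f_2^{k}$ and $f_1^{p[k]}$ when the orthogonality of the constraint force $f_1^{k}$ and the admissible joint motion $P_k^k$ is exploited, yielding

$$\dot{u}_k = \left[\left(P_k^k\right)^T \Phi^{-1} P_k^k\right]^{-1} \left(P_k^k\right)^T \Phi^{-1} \left\{ \Phi_{12}^{k} f_2^{k} - \Phi_{21}^{p[k]} f_1^{p[k]} + b_1^{k} - b_2^{p[k]} \right\} \quad \text{and} \tag{17}$$

$$\Phi = \Phi_{11}^{k} + \Phi_{22}^{p[k]}. \tag{18}$$
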
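The orthogonality argument behind Eq. (17) can be spelled out in two steps. The sketch below assumes, as stated in the text, that the constraint force does no work along the admissible joint motion, i.e. $(P_k^k)^T f_1^k = 0$, and that $\Phi$ of Eq. (18) is invertible. The shorthand r for the remaining terms of Eq. (16) is introduced here only for brevity:

```latex
\begin{aligned}
P_k^k \dot{u}_k &= \Phi f_1^{k} + r,
\qquad r = \Phi_{12}^{k} f_2^{k} - \Phi_{21}^{p[k]} f_1^{p[k]} + b_1^{k} - b_2^{p[k]}, \\
\left(P_k^k\right)^T \Phi^{-1} P_k^k \dot{u}_k
&= \underbrace{\left(P_k^k\right)^T f_1^{k}}_{=\,0} + \left(P_k^k\right)^T \Phi^{-1} r
\;\;\Longrightarrow\;\;
\dot{u}_k = \left[\left(P_k^k\right)^T \Phi^{-1} P_k^k\right]^{-1} \left(P_k^k\right)^T \Phi^{-1} r .
\end{aligned}
```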
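Equation (16) may also provide for the determination of $f_1^{k}$, which together with Eq. (17) yields

$$\begin{aligned}
f_1^{k} &= \Phi^{-1} \left[ P_k^k \dot{u}_k - \Phi_{12}^{k} f_2^{k} + \Phi_{21}^{p[k]} f_1^{p[k]} - b_1^{k} + b_2^{p[k]} \right] \\
&= W \Phi_{12}^{k} f_2^{k} - W \Phi_{21}^{p[k]} f_1^{p[k]} + d,
\end{aligned} \tag{19}$$

with

$$W = \Phi^{-1} \left\{ P_k^k \left[\left(P_k^k\right)^T \Phi^{-1} P_k^k\right]^{-1} \left(P_k^k\right)^T \Phi^{-1} - U \right\} \quad \text{and} \tag{20a}$$

$$d = W \left[ b_1^{k} - b_2^{p[k]} \right], \tag{20b}$$

where U is an identity matrix. A new subsystem j of identical structure can then be constructed using Eqs. (11a, b) for a body k and its parent p[k] together with Eqs. (12) and (19). Together these yield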
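
$$\hat{A}_1^{j} = \Phi_{11}^{j} f_1^{j} + \Phi_{12}^{j} f_2^{j} + b_1^{j}, \tag{21a}$$

$$\hat{A}_2^{j} = \Phi_{21}^{j} f_1^{j} + \Phi_{22}^{j} f_2^{j} + b_2^{j}, \tag{21b}$$

$$\Phi_{11}^{j} = \Phi_{11}^{p[k]} + \Phi_{12}^{p[k]} W \Phi_{21}^{p[k]}, \tag{21c}$$

$$\Phi_{12}^{j} = -\Phi_{12}^{p[k]} W \Phi_{12}^{k}, \tag{21d}$$

$$b_1^{j} = b_1^{p[k]} - \Phi_{12}^{p[k]} d, \tag{21e}$$

$$\Phi_{22}^{j} = \Phi_{22}^{k} + \Phi_{21}^{k} W \Phi_{12}^{k}, \tag{21f}$$

$$\Phi_{21}^{j} = -\Phi_{21}^{k} W \Phi_{21}^{p[k]} \quad \text{and} \tag{21g}$$

$$b_2^{j} = b_2^{k} + \Phi_{21}^{k} d. \tag{21h}$$
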
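The assembly step of Eqs. (17)–(21h) can be checked numerically. The following is a minimal sketch in Python using exact rational arithmetic. The helper names, the dictionary keys, and the two-dimensional toy data are assumptions made only for illustration (real spatial quantities are six-dimensional), and the single-body coefficients are read off Eqs. (9) and (10).

```python
from fractions import Fraction as Fr

# Tiny exact matrix helpers; a matrix is a list of rows, a vector is an n x 1 matrix.
def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def add(A, B, sign=1):
    return [[a + sign * b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def neg(A):
    return [[-a for a in row] for row in A]

def identity(n):
    return [[Fr(int(i == j)) for j in range(n)] for i in range(n)]

def inverse(A):
    """Gauss-Jordan inverse with exact rational arithmetic."""
    n = len(A)
    aug = [[Fr(x) for x in row] + unit for row, unit in zip(A, identity(n))]
    for c in range(n):
        pivot = next(r for r in range(c, n) if aug[r][c] != 0)
        aug[c], aug[pivot] = aug[pivot], aug[c]
        scale = aug[c][c]
        aug[c] = [x / scale for x in aug[c]]
        for r in range(n):
            if r != c:
                factor = aug[r][c]
                aug[r] = [x - factor * y for x, y in zip(aug[r], aug[c])]
    return [row[n:] for row in aug]

def single_body(I, S, F):
    """Two-handle coefficients of one body, read off Eqs. (9)-(10)."""
    Ii, St = inverse(I), transpose(S)
    return {"phi11": Ii, "phi12": matmul(Ii, S), "b1": matmul(Ii, F),
            "phi21": matmul(St, Ii), "phi22": matmul(matmul(St, Ii), S),
            "b2": matmul(matmul(St, Ii), F)}

def assemble(par, chi, P):
    """Combine parent and child subsystems across the joint with motion map P."""
    Pt = transpose(P)
    phi_inv = inverse(add(chi["phi11"], par["phi22"]))               # Eq. (18)
    joint = inverse(matmul(matmul(Pt, phi_inv), P))
    proj = matmul(matmul(P, joint), matmul(Pt, phi_inv))
    W = matmul(phi_inv, add(proj, identity(len(P)), -1))             # Eq. (20a)
    d = matmul(W, add(chi["b1"], par["b2"], -1))                     # Eq. (20b)
    new = {
        "phi11": add(par["phi11"], matmul(matmul(par["phi12"], W), par["phi21"])),  # (21c)
        "phi12": neg(matmul(matmul(par["phi12"], W), chi["phi12"])),                # (21d)
        "b1": add(par["b1"], matmul(par["phi12"], d), -1),                          # (21e)
        "phi22": add(chi["phi22"], matmul(matmul(chi["phi21"], W), chi["phi12"])),  # (21f)
        "phi21": neg(matmul(matmul(chi["phi21"], W), par["phi21"])),                # (21g)
        "b2": add(chi["b2"], matmul(chi["phi21"], d)),                              # (21h)
    }
    return new, W, d

# Hypothetical 2-D toy data: inertia, shift matrix, and force for each body.
parent = single_body([[2, 0], [0, 3]], [[1, 1], [0, 1]], [[1], [2]])
child = single_body([[1, 0], [0, 4]], [[1, 0], [2, 1]], [[0], [-1]])
P = [[1], [0]]                      # one admissible joint motion direction
combined, W, d = assemble(parent, child, P)
```

For any choice of the outer forces $f_1^{p[k]}$ and $f_2^{k}$, the joint force recovered from Eq. (19) should be orthogonal to P, and the assembled coefficients should reproduce the handle accelerations of the two original bodies.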
Similar binary tree assembly of all subsystems produces a parallel assembly of global equations Eqs. (21a, b) in logarithmic ($O(\log_2 n)$) time complexity. In terms of the subsystem notation, assembly Eqs. (21c–h) are rewritten as

$$\Phi_{11}^{k} = \Phi_{11}^{sub_1[k]} + \Phi_{12}^{sub_1[k]} W \Phi_{21}^{sub_1[k]}, \tag{22a}$$

$$\Phi_{12}^{k} = -\Phi_{12}^{sub_1[k]} W \Phi_{12}^{sub_2[k]}, \tag{22b}$$

$$b_1^{k} = b_1^{sub_1[k]} - \Phi_{12}^{sub_1[k]} d, \tag{22c}$$

$$\Phi_{22}^{k} = \Phi_{22}^{sub_2[k]} + \Phi_{21}^{sub_2[k]} W \Phi_{12}^{sub_2[k]}, \tag{22d}$$

$$\Phi_{21}^{k} = -\Phi_{21}^{sub_2[k]} W \Phi_{21}^{sub_1[k]} \quad \text{and} \tag{22e}$$

$$b_2^{k} = b_2^{sub_2[k]} + \Phi_{21}^{sub_2[k]} d. \tag{22f}$$

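The logarithmic depth of the binary assembly can be illustrated with a short sketch. The tree layout below (splitting a chain of consecutively numbered bodies at its midpoint) and the dictionary field names are assumptions chosen for illustration; the text does not prescribe how the chain is partitioned.

```python
def build_tree(first, last):
    """Assembly tree over bodies first..last (inclusive); leaves are single bodies."""
    if first == last:
        return {"bodies": (first, last), "sub1": None, "sub2": None}
    mid = (first + last) // 2
    return {
        "bodies": (first, last),
        "sub1": build_tree(first, mid),      # parent-side subsystem
        "sub2": build_tree(mid + 1, last),   # child-side subsystem
    }

def depth(node):
    """Number of assembly levels between this node and its deepest leaf."""
    if node["sub1"] is None:
        return 0
    return 1 + max(depth(node["sub1"]), depth(node["sub2"]))

def leaves(node):
    """Number of single-body leaf subsystems under this node."""
    if node["sub1"] is None:
        return 1
    return leaves(node["sub1"]) + leaves(node["sub2"])

tree = build_tree(1, 1024)
print(depth(tree), leaves(tree))
```

Each level of such a tree can apply Eqs. (22a–f) to all of its nodes concurrently, which is where the $O(\log_2 n)$ sweep count comes from when enough processors are available.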
Trivial boundary data is then obtained for the entire system S through the connection to the inertial frame and the lack of a connection at the second handle ($\hat{A}^{N} = 0$ and $f_2^{S} = 0$). These identities solve for the local joint $\dot{u}_1$ and $\hat{A}_1^{S}$, which in turn gives $f_1^{S}$. The constraint forces are then propagated back up the assembly tree to give solutions for the unknown coordinate accelerations in a time complexity of $O(\log_2 n)$.

To summarize, the DCA first computes all required kinematic information and state dependent inertia forces in a backward (leaf to root) and forward pass of a binary assembly tree (Fig. 2). Local equations Eqs. (11a, b) are then constructed on the leaf nodes and a backward sweep begins. Equations (21c–h) are applied at each level of the tree during this backward sweep to generate identical forms of Eqs. (21a, b) for each subsystem. When all subsystems have been combined, the definition of the connection to the inertial frame allows the solution of the root joint spatial constraint force (using Eq. (21a)), then the root joint generalized acceleration (using Eq. (21b)). The results are then propagated forward through the assembly tree and the complete solution is
achieved in $O(\log_2 n)$ time complexity provided that the computations associated with branches of the tree are delegated to O(n) processors. To avoid confusion during implementation, the reader is again reminded that symbols and entire equations appearing here which are common to [9] do in fact have different numerical values. Mixing definitions from the two will most certainly yield incorrect results.

Although derived here for chain systems, Featherstone presents a complete algorithm for trees and loops [10]. A limited connectivity of general subsystems is also desired because the straightforward application of the multi-handle equations exhibits a local cubic growth in complexity, $O(h_k^3)$, with the handle number $h_k$ associated with body/subsystem k. The concept of link splitting is introduced to transform mechanisms with high inter-body connectivity into equivalent mechanisms of lower connectivity through the subdivision of individual bodies and the addition of equivalent rigid joints. The loop solution is a direct extension of the tree formulation. Within each loop a single joint is stripped of its generalized coordinates and represented only by its constraint forces. This equivalent constraint force representation constitutes acceleration level enforcement of the joint constraint, which requires the addition of constraint stabilization to remove displacement level violations that accrue during temporal integration.
4 Theoretical DCA Performance

This section describes the performance and related issues associated with Featherstone’s DCA (and therefore the DCA form presented to this point). Unless otherwise stated, all observations and results quoted in this section can be found in [9, 10]. Featherstone presents a detailed operations count and comparison of the DCA with other algorithms. An effective operations count of (928 mult + 812 add) $\log_2(n)$ is given for the DCA in the presence of exactly n processors and (1150 mult + 937 add)n with one processor (an O(n) algorithm). These operations counts are optimized for chain systems through the use of Denavit-Hartenberg transformations, and a similarly optimized operations count for the efficient recursive O(n) Articulated Body Algorithm (ABA) is also given as (300 mult + 279 add)n. Thus the theoretical minimum number of processors required to demonstrate any sort of speed increase over the serial O(n) method is four. However, in the presence of small parallel computers (small numbers of processors) there is a host of O(n) based parallelization schemes which can achieve nearly ideal speed increases (particularly in the case of branched systems). A review of many such methods is presented in [5]. It can be concluded from the existence of available concurrency within the recursive order n algorithms for ∼4 processors that the DCA will not demonstrate a speed-up in the presence of fewer than ∼16 processors, a number which becomes even larger (potentially unbounded) for branching topologies. Furthermore, an $O(p^3)$ subsysteming method which implements the same O(n) algorithm local to each subsystem [2, 4, 12] potentially outperforms the DCA for computer systems with hundreds of processors.

In the presence of a variable number of parallel resources p, the operations counts of Featherstone can be used to obtain the following DCA operations count

$$\begin{aligned}
&\left[ 928 \log_2\!\left(\frac{p}{2}\right) + 1150 \left(\frac{n}{p}\right) - 1041 \right] \text{multiplications} \\
&+ \left[ 812 \log_2\!\left(\frac{p}{2}\right) + 937 \left(\frac{n}{p}\right) - 845 \right] \text{additions}.
\end{aligned} \tag{23}$$

A plot of theoretical performance versus number of processors for a 1,024 body chain is illustrated in Fig. 3, along with the strictly serial performance of the O(n) ABA as a reference. Featherstone also notes that approximately one third of the computations involved in the DCA may be done independently, constituting a significant “overlap”. This overlap is used to produce another operations count which is two thirds of the total but requires exactly n processors to realize. It should be noted that all parallel algorithms (and many sequential algorithms) possess some form of overlap, but for the purposes of straightforward comparison these properties will be ignored. One should also consider that exploiting overlap typically requires additional interprocessor communications (or synchronization), which for present day computers may very well outweigh potential benefits.

[Fig. 3. DCA performance for a 1,024 body chain. Axes: effective flops versus number of processors (both logarithmic); curves: DCA, Serial Order-N.]
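The break-even claim for the DCA can be reproduced directly from the quoted counts. The sketch below evaluates Eq. (23) and the serial ABA count for a chain of n bodies. Treating "effective flops" as the plain sum of multiplications and additions is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def dca_ops(n, p):
    """Total operations (mult + add) of Eq. (23) for n bodies on p processors."""
    mults = 928 * math.log2(p / 2) + 1150 * n / p - 1041
    adds = 812 * math.log2(p / 2) + 937 * n / p - 845
    return mults + adds

def aba_ops(n):
    """Serial ABA count quoted in the text: (300 mult + 279 add) n."""
    return (300 + 279) * n

def min_processors_to_beat_aba(n, p_max=1024):
    """Smallest p for which the Eq. (23) count drops below the serial ABA count."""
    for p in range(1, p_max + 1):
        if dca_ops(n, p) < aba_ops(n):
            return p
    return None

if __name__ == "__main__":
    n = 1024
    for p in (1, 2, 4, 8, 16):
        print(p, round(dca_ops(n, p)), aba_ops(n))
    print("break-even processors:", min_processors_to_beat_aba(n))
```

For the 1,024 body chain this search returns four processors, in agreement with the minimum stated in the text.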
Another issue which adversely affects the DCA’s efficiency is that its accuracy is significantly worse than that of the ABA. To remedy highly inaccurate results for large (n > 1,000) system chains, Featherstone introduces a pivoting scheme which successfully avoids ill-conditioned calculations present in the formulation. The pivoted version of the DCA remains less accurate than the O(n) algorithm, which should be expected of any method which introduces additional calculations in the form of inversions and propagations to induce parallelism. The additional computational cost of the pivoting scheme is said to make the equations “more complicated” and “less efficient” than the original, but is not otherwise quantified by an operations count. This pivoting scheme is also shown to recast the equations in a form which is the ABA in the case of a single rigid body.

In parallel algorithms one should expect the accuracy relative to sequential methods to be a function of available parallelism rather than system size. In this respect one of the chief problems with the DCA is that the concurrency is always generated for exactly n virtual processors, which are mapped to the actual number of available processors. This is in contrast to other subsysteming methods which generate concurrency only as required (preserving both accuracy and efficiency for small parallel computers). Ideally, the DCA should only use a number of subsystems which is equal to the number of available parallel processors and use an efficient algorithm within them. Such a change would guarantee speed improvements over the ABA and other efficient recursive O(n) algorithms in the presence of as few as two parallel processors, rendering it highly competitive with methods which have shown useful application.
5 An O(n) Algorithm

There are many efficient recursive O(n) algorithms to choose from, but in general they all have more in common than they have differences. The DCA has been recast to work with one such algorithm, that of Anderson [3]; however, it should be possible to do the same for most of the others as well. For completeness Anderson’s algorithm for chain systems is briefly presented here in a notation consistent with that of this paper. For more detail refer to [3].

Given relative joint coordinates $q_k$ and relative quasi-coordinate velocities $u_k$, a linear order forward topological solution for all position and velocity kinematics is automatic. For the purposes of this algorithm, this means that the terms $S^k$, $P_k^k$, and $F^k$ are computed for all bodies k. Next a backwards sweep of the topology is used to compute quantities $\hat{I}$ and $\hat{F}$. These terms are also known as the recursive spatial inertia and recursive spatial force (or articulated body inertia and articulated body force [7]), respectively, and are given as

$$\hat{I}^{k} = I^{k} + T^{ch[k]} \hat{I}^{ch[k]} \left(S^{ch[k]}\right)^T \quad \text{and} \tag{24a}$$

$$\hat{F}^{k} = F^{k} + T^{ch[k]} \hat{F}^{ch[k]}, \tag{24b}$$

with the following intermediate quantities

$$T^{k} = S^{k} \left[ U - \hat{I}^{k} P_k^k \left(M^{k}\right)^{-1} \left(P_k^k\right)^T \right] \quad \text{and} \tag{25a}$$

$$M^{k} = \left(P_k^k\right)^T \hat{I}^{k} P_k^k. \tag{25b}$$

The boundary relationships $\hat{I}^{t} = I^{t}$ and $\hat{F}^{t} = F^{t}$ are retrieved by observing that the chain system terminal body t has no children. It is useful to point out that at the completion of the backwards sweep, the equation of motion associated with body k, projected onto the admissible motion, is

$$0 = \left(P_k^k\right)^T \left\{ \hat{I}^{k} \hat{A}^{k} - \hat{F}^{k} \right\}. \tag{26}$$

In a final forward topological sweep, the solution for $\dot{u}_k$ is obtained from Eq. (26) together with Eq. (5) as

$$\dot{u}_k = -\left(M^{k}\right)^{-1} \left(P_k^k\right)^T \left[ \hat{I}^{k} \left(S^{k}\right)^T \hat{A}^{p[k]} - \hat{F}^{k} \right]. \tag{27}$$

The required boundary relationship is that the spatial recursive acceleration of the inertial frame (parent of the first body in the chain) is zero ($\hat{A}^{N} = 0$).
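The two sweeps of Eqs. (24a)–(27) can be sketched for a chain of one degree of freedom joints as follows. This is a toy illustration under stated assumptions, not Anderson's implementation: vectors are two-dimensional (pure translation), all state dependent terms are taken to be already folded into the forces F, and the function and variable names are hypothetical.

```python
from fractions import Fraction as Fr

def matvec(A, x):
    return [sum(a * b for a, b in zip(row, x)) for row in A]

def dot(x, y):
    return sum(a * b for a, b in zip(x, y))

def matmul(A, B):
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    return [list(col) for col in zip(*A)]

def chain_accelerations(I, S, P, F):
    """Joint accelerations of a chain; body 0 is attached to the inertial frame
    and body k+1 is the child of body k. I[k]: inertia matrix, S[k]: shift matrix,
    P[k]: joint axis of a 1-DOF joint, F[k]: force vector."""
    n, dim = len(I), len(P[0])
    I_hat, F_hat, M, T = [None] * n, [None] * n, [None] * n, [None] * n
    for k in reversed(range(n)):                       # backward sweep
        I_hat[k] = [row[:] for row in I[k]]
        F_hat[k] = list(F[k])
        if k + 1 < n:                                  # Eqs. (24a) and (24b)
            c = k + 1
            TI = matmul(matmul(T[c], I_hat[c]), transpose(S[c]))
            I_hat[k] = [[a + b for a, b in zip(r1, r2)] for r1, r2 in zip(I_hat[k], TI)]
            F_hat[k] = [a + b for a, b in zip(F_hat[k], matvec(T[c], F_hat[c]))]
        IP = matvec(I_hat[k], P[k])
        M[k] = Fr(dot(P[k], IP))                       # Eq. (25b), scalar here
        proj = [[Fr(int(i == j)) - IP[i] * P[k][j] / M[k] for j in range(dim)]
                for i in range(dim)]
        T[k] = matmul(S[k], proj)                      # Eq. (25a)
    A = [Fr(0)] * dim                                  # inertial frame: zero acceleration
    udot = []
    for k in range(n):                                 # forward sweep
        A_in = matvec(transpose(S[k]), A)
        rhs = [a - b for a, b in zip(matvec(I_hat[k], A_in), F_hat[k])]
        udot.append(-dot(P[k], rhs) / M[k])            # Eq. (27)
        A = [a + p * udot[-1] for a, p in zip(A_in, P[k])]
    return udot

# Planar gantry: body 0 (mass 2) slides along x, body 1 (mass 3) slides along y on body 0.
eye = [[1, 0], [0, 1]]
udot = chain_accelerations(
    I=[[[2, 0], [0, 2]], [[3, 0], [0, 3]]],
    S=[eye, eye],
    P=[[1, 0], [0, 1]],
    F=[[1, 0], [4, 6]],
)
print(udot)
```

In this gantry the x joint carries both masses, so its acceleration is (1 + 4)/(2 + 3) = 1, while the y joint only moves the second body, giving 6/3 = 2.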
6 Terminal Subsystems

An O(n) subsystem can be used within any terminal subsystem appearing in the DCA. This is accomplished by replacing the equation of a single rigid body with the articulated body equation for the base body b of the terminal subsystem. The articulated body equation for the base body which replaces Eq. (8) is given as

$$\hat{I}^{b} \hat{A}^{b} = f_1 + \Psi_2^{b} f_2 + \hat{F}^{b}. \tag{28}$$

This equation was obtained directly from Eq. (26) by definition of the admissible motion (velocity) projection. Within terminal subsystems the second handle does not contain a connection, therefore the force $f_2^{b}$ is identically zero and the unknown coefficient matrix $\Psi_2^{b}$ is of no consequence. In fact the analyst is free to choose the second handle as coincident with the first. In any case the form of

$$\hat{A}_1^{b} = \Phi_{11}^{b} f_1^{b} + \Phi_{12}^{b} f_2^{b} + b_1^{b} \quad \text{and} \tag{29a}$$

$$\hat{A}_2^{b} = \Phi_{21}^{b} f_1^{b} + \Phi_{22}^{b} f_2^{b} + b_2^{b} \tag{29b}$$

is required and can be arrived at by computing $\hat{I}^{b}$ and $\hat{F}^{b}$ with the efficient recursive O(n) operations of Eqs. (24a)–(25b). Equations (29a, b) are exactly those appearing in the DCA formulation, and proper selection of subsystems and the order in which they are combined (load balancing) ensures a speed increase relative to the O(n) algorithm in the presence of as few as two parallel processors. Locally, this equation is both formed and solved at the same cost as the O(n) algorithm and provides more accurate results without the pivoted form of the DCA. Exploiting this identity and load balancing the processors reveals a significant increase in throughput. Figure 4 illustrates the theoretical performance benefits of this terminal O(n) DCA method (referred to as DCAt) applied to a sequential chain system of 1,024 bodies with small parallel computers.

[Fig. 4. Improvement using O(n) terminal subsystems on a 1,024 body chain. Axes: effective flops (×10^6) versus number of processors; curves: DCA, DCAt, Serial Order-N.]

This optimization has further significance for branching systems, which may be subdivided such that the identity holds for terminal subsystems of each branch. From this one may now conclude that the worst case modified DCA system is actually a chain system and not a tree. There still remains, however, additional complexity introduced at the branching bodies, which should be treated as per the discussion of Featherstone [10].
7 Non-terminal Subsystems

Non-terminal subsystems present a twofold problem with respect to obtaining the two handle subsystem equations required for the fundamental form of the DCA. First, in the articulated body equation for the subsystem base, Eq. (28), the $f_2^{b}$ are neither zero nor, in general, applied to the base body. This means that $\Psi_2^{b}$ is a nontrivial dynamic quantity which cannot be obtained directly from the geometry. Second, the equation associated with the acceleration of handle 2 cannot be formed by the same O(n) traversal as that of handle 1 because the articulated body equation for the body at handle 2 cannot contain ancestral body forces such as $f_1^{b}$. This information is instead embedded in the forward solution phase of the O(n) algorithm.

7.1 Handle 1 Equations

To resolve the issues associated with $\Psi_2^{b}$ appearing in the equation for handle 1, a first step is to write out the articulated body equation for the subsystem terminal body t (containing handle 2). This is given by

$$\hat{I}^{t} \hat{A}^{t} = S^{ch[t]} f_2 + \hat{F}^{t}, \tag{30}$$

which further simplifies to

$$I^{t} \hat{A}^{t} = S^{ch[t]} f_2 + F^{t} \tag{31}$$

by definition of a terminal body. It should be noted that while the recursive computations of $\hat{I}$ and $\hat{F}$ follow exactly as in Eqs. (24a)–(25b) and ignore bodies beyond the local subsystem terminal body, the shift operation to one body beyond the terminal body (handle 1 on the adjacent subsystem, or ch[t]) is required. The systematic application of recursive O(n) procedures can be used to generate recursive equations of motion for the other bodies in the local subsystem. The recursive assumption is that these equations exist in the form of

$$\hat{I}^{k} \hat{A}^{k} = \hat{E}^{k} S^{ch[t]} f_2 + \hat{F}^{k} \tag{32}$$

with the trivial boundary data

$$\hat{I}^{t} = I^{t}, \qquad \hat{E}^{t} = U, \qquad \hat{F}^{t} = F^{t} \tag{33}$$

obtained on the subsystem terminal body. Solving for the assumed form of the recursive quantities verifies that Eqs. (24a)–(25b) are unchanged and further requires

$$\dot{u}_k = -\left(M^{k}\right)^{-1} \left(P_k^k\right)^T \left[ \hat{I}^{k} \left(S^{k}\right)^T \hat{A}^{p[k]} - \hat{E}^{k} S^{ch[t]} f_2 - \hat{F}^{k} \right] \quad \text{and} \tag{34}$$

$$\hat{E}^{p[k]} = T^{k} \hat{E}^{k}. \tag{35}$$

The coefficient matrices for the subsystem base body equation of motion

$$\hat{I}^{b} \hat{A}^{b} = f_1 + \hat{E}^{b} S^{ch[t]} f_2 + \hat{F}^{b} \tag{36}$$

may now be obtained in linear time.
7.2 Handle 2 Equations

Forming the necessary subsystem equation associated with the acceleration of handle 2 may be accomplished using a backwards subsystem sweep and almost exactly the equations of handle 1. To facilitate the justification of these equations, the system is given a backwards ordering and new quantities associated with the backwards solution are denoted with a check accent ($\check{\ }$). The O(n) algorithms exploit the forward propagation of Eq. (5) and the corresponding definition of the partial velocities. The first observation fundamental to the O(n) derivation reduces the summation over all bodies in D’Alembert’s variational form (or in this case Kane’s Method) such that only one term appears at terminal bodies t. In the backwards system the summation at the terminal body ($\check{t} = b$) does not readily reduce to one term because the variational displacements are coupled by the subsystem generalized coordinates.

To resolve this issue one observes that the DCA equations require only the matrices relating the accelerations and constraint forces and are therefore free from any assumptions about a uniform coordinate description between the handle 1 and handle 2 equations of motion. In this case a free joint (six degrees of freedom) may be added at the handle 2 subsystem base body ($\check{b} = t$) and the kinematics recomputed using this definition. Because the analyst is also free to choose a specified base motion upon which to attach the free joint, the calculation of a second set of kinematic data associated with this backwards solution can be completely avoided. This is done by specifying the backwards solution base motion such that the state dependent acceleration terms absent from $\check{A}^{\check{b}}$ and $\hat{A}^{t}$ are identical (and therefore $\check{A}^{\check{b}} = \hat{A}^{\check{b}} = \hat{A}^{t}$) and the spatial forces F also remain unchanged.

The backwards kinematic equation of

$$\hat{A}^{k} = \left(\check{S}^{k}\right)^T \hat{A}^{\check{p}[k]} + \check{P}_k^k \dot{u}_k \tag{37}$$

allows the equation of motion for body $\check{t}$ to be written as

$$\check{I}^{\check{t}} \hat{A}^{\check{t}} = f_1 + \check{F}^{\check{t}}. \tag{38}$$

Equation (38) is then simplified by the definition of a terminal body as

$$I^{\check{t}} \hat{A}^{\check{t}} = f_1 + F^{\check{t}}. \tag{39}$$

The assumed recursive form for the equations of motion of all bodies in the local backwards subsystem is written as

$$\check{I}^{k} \hat{A}^{k} = \check{E}^{k} f_1 + \check{F}^{k}, \tag{40}$$

where

$$\check{I}^{k} = I^{k} + \check{T}^{\check{ch}[k]} \check{I}^{\check{ch}[k]} \left(\check{S}^{\check{ch}[k]}\right)^T, \tag{41a}$$

$$\check{F}^{k} = F^{k} + \check{T}^{\check{ch}[k]} \check{F}^{\check{ch}[k]}, \tag{41b}$$

$$\check{T}^{k} = \check{S}^{k} \left[ U - \check{I}^{k} \check{P}_k^k \left(\check{M}^{k}\right)^{-1} \left(\check{P}_k^k\right)^T \right], \tag{41c}$$

$$\check{M}^{k} = \left(\check{P}_k^k\right)^T \check{I}^{k} \check{P}_k^k, \tag{41d}$$

$$\dot{u}_k = -\left(\check{M}^{k}\right)^{-1} \left(\check{P}_k^k\right)^T \left[ \check{I}^{k} \left(\check{S}^{k}\right)^T \hat{A}^{\check{p}[k]} - \check{E}^{k} f_1 - \check{F}^{k} \right] \quad \text{and} \tag{41e}$$

$$\check{E}^{\check{p}[k]} = \check{T}^{k} \check{E}^{k}. \tag{41f}$$

The trivial boundary data obtained on body $\check{t}$ permits the solution for the unknown recursive quantities, which are analogous to those required for handle 1. Thus the coefficients for the subsystem terminal body equation of motion

$$\check{I}^{\check{b}} \hat{A}^{\check{b}} = \check{E}^{\check{b}} f_1 + S^{ch[\check{b}]} f_2 + \check{F}^{\check{b}}, \tag{42a}$$

$$\check{I}^{t} \hat{A}^{t} = \check{E}^{t} f_1 + S^{ch[t]} f_2 + \check{F}^{t}, \tag{42b}$$

are obtained in linear time (where the forward shift and child body definitions must be used exactly as shown).
8 Theoretical Performance

Utilizing the efficient terminal and non-terminal subsystem solutions within the DCA defines the efficient Divide and Conquer Algorithm (DCAe). An effective operations count for the DCAe given an arbitrary computer system with p processors is obtained as

$$\begin{aligned}
&\left[ 928 \log_2\!\left(\frac{p}{2}\right) + 637 \left(\frac{n}{p}\right) - 495 \right] \text{multiplications} \\
&+ \left[ 812 \log_2\!\left(\frac{p}{2}\right) + 548 \left(\frac{n}{p}\right) - 446 \right] \text{additions}.
\end{aligned} \tag{43}$$

In the operations count, the coefficient of $\log_2(p/2)$ is exactly the same as for the DCA assembly, for which the linear algebraic assembly operations of Featherstone have not changed. The coefficients of $(n/p)$ are the processor local O(n) operations, which include a single forward kinematics pass, double the usual articulated body inertia and force computations, and the two additional E terms. The constant term accounts for the triviality of all operations at the boundaries. Further improvements to this count are realized through load balancing, which simply means that twice as many bodies are allocated to terminal subsystems, within which the O(n) algorithm is applied without modification. In the presence of a single processor, the entire system is terminal and the O(n) solution is retrieved.

Figure 5 illustrates the theoretical performance of the balanced algorithm applied to a 1,024 body chain system. It can be seen that the DCAe widely outperforms the traditional DCA. For today’s parallel computers, the region of practical importance involves small parallel computers and is shown in Fig. 6.
[Fig. 5. DCAe, DCA, O(n) applied to a 1,024 body chain with large number of processors. Axes: effective flops versus number of processors (both logarithmic); curves: DCA, DCAe, Serial Order-N.]

[Fig. 6. DCAe, DCA, O(n) applied to a 1,024 body chain with limited number of processors. Axes: effective flops (×10^6) versus number of processors; curves: DCA, DCAe, Serial Order-N.]
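The gap between the two theoretical counts can be tabulated with a few lines of code. The sketch below evaluates Eq. (23) and Eq. (43) with equal loading of the processors. Summing multiplications and additions into a single figure is an assumption made here for illustration, and the function names are hypothetical.

```python
import math

def count(n, p, log_coeff, linear_coeff, constant):
    """Generic form shared by Eqs. (23) and (43)."""
    return log_coeff * math.log2(p / 2) + linear_coeff * n / p - constant

def dca_ops(n, p):
    """Eq. (23), multiplications plus additions."""
    return count(n, p, 928 + 812, 1150 + 937, 1041 + 845)

def dcae_ops(n, p):
    """Eq. (43), multiplications plus additions, equal loading."""
    return count(n, p, 928 + 812, 637 + 548, 495 + 446)

if __name__ == "__main__":
    n = 1024
    for p in (2, 4, 8, 16):
        print(p, round(dca_ops(n, p)), round(dcae_ops(n, p)))
```

Because both counts share the same logarithmic term, the difference between them is governed by the per-body coefficient of n/p, which is where the O(n) subsystems reduce the work.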
9 Implementation

The three multibody solution algorithms, DCA, O(n), and DCAe, were implemented in the object oriented programming language C++. Mathematically the DCAe is the combination of the DCA and O(n) solutions, and through interface inheritance the generic forms of both algorithms can be used to construct a software implementation. The benefits of this approach are not only a reduction in the development time of the three solutions, but also the ability to compare algorithm performance against common code instead of three independent implementations.

The implementation of the DCA solution algorithm operates on DCA subsystem interface objects to assemble and solve the equations of motion. At the finest level a subsystem is an individual rigid body which satisfies an interface for DCA subsystems. The difference in the DCAe implementation is that at the finest level the subsystem is a collection of bodies with internal motions that are governed by O(n) equations. In the software, the DCAe is achieved by defining an O(n) based DCA subsystem which uses the same interface. This way the DCA and DCAe use the default code (common code) in the assembly of subsystems and the solution of root node equations. A common DCA subsystem interface also enables DCAe and individual DCA body subsystems to be used together in the same solution. The hierarchy of objects is shown in Fig. 7.

The efficient O(n) subsystems in the DCAe encapsulate a modified form of the traditional O(n) solution. This O(n) subsystem solution is obtained by augmenting the O(n) solution with the DCA constraint force recursion term $\hat{E}$ and configuring a second O(n) system solution with reversed topology. In software, the O(n) solution elements used in the DCAe subsystems are derived from a complete implementation of the more traditional O(n) algorithm, which is also available for comparison.
[Fig. 7. Articulated subsystem class inheritance diagram. Classes shown: DCA Subsystem Interface, Default DCA Subsystem, Root Subsystem, Rigid Body Subsystem, O(n) Subsystem.]

The software, like the formulation shown here, is a chain system solver and is further simplified by considering only revolute (pin rotation) joints. These simplifications enable the grouping of joint properties and the adjacent outboard body into a single object. The resulting systems are, however, completely three dimensional, as the joint axes may be assigned to any direction.

The nature of the DCA, O(n), and DCAe solutions detailed here requires an interface to numerical temporal integration schemes through the system state (integrator output) and state derivatives (integrator input). The implementation is therefore configured to compute the solution time for the state derivatives given the system state (commonly referred to as a function evaluation) and is independent of the integration scheme. It is useful to point out, however, that single-step (e.g. Runge-Kutta) and multi-step explicit integrators compute the state update on each state independently and can easily be run in parallel.

Parallelism is obtained using individual execution threads in a shared memory environment. Specifically, the POSIX threads library (available on nearly all UNIX based operating systems) is used. All threads are created during the initialization phase of parallel solutions and remain in use until the program terminates. In an effort to minimize the impact of operating system thread scheduler performance, the number of threads is always less than or equal to the number of processors. Individual DCA subsystems are allocated to processors in an outward sweep (starting from the root subsystem of the binary tree). In a subsystem k, $sub_1[k]$ will execute on the same processor as k, while $sub_2[k]$, depending on the availability of processors, is either allocated to an unused processor or is on the same processor as k. In the implementation of the DCAe, $sub_2[k]$ is always allocated to an unused processor. The one to one mapping of processors in the DCAe is ensured by requiring the number of subsystems at the finest level to be equal to the total number of available processors.

In the shared memory environment, communications amount to synchronization and are implemented using POSIX semaphores. Child threads wait for their parent thread to signal that data is available for computations to begin. Upon completion of its task, the child thread signals the parent and waits for a reply before beginning computation again. In the course of both DCA and DCAe computations these signal hand-offs occur three times. First, execution begins at the root node on the root thread.
The assembly tree is traversed signaling any child threads to do the same. Upon reaching the leaf nodes, the local assembly kinematics are computed and passed inward to parent nodes and so on back to the root node (one complete hand-off). The system wide kinematics information is then propagated out the tree and local dynamics are computed on the way back in (two complete hand-offs). The dynamic solution is computed on the following outward sweep and notification that computation is completed is propagated back to the root (three complete hand-offs). The implementation accommodates an interface with the general case of adaptive step integrators. In the case of fixed step integrators the final dynamics step can include the integration rule and compute the new coordinate value to begin the local kinematics computations on the way back to the root of the tree. So doing eliminates one full communications sweep per function evaluation in every step after the initial time step.
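The signal hand-off pattern described above can be sketched with counting semaphores. The example below uses Python's threading module as a stand-in for POSIX threads and semaphores; the function name, the single child thread, and the increment used in place of subsystem computations are all assumptions made for illustration, not the chapter's C++ implementation.

```python
import threading

def run_handoffs(values, n_handoffs=3):
    """Parent and one child thread each process half of `values` per hand-off."""
    start = threading.Semaphore(0)   # parent -> child: data is ready
    done = threading.Semaphore(0)    # child -> parent: task finished
    mid = len(values) // 2
    log = []

    def child():
        for _ in range(n_handoffs):
            start.acquire()                  # wait for the parent's signal
            for i in range(mid, len(values)):
                values[i] += 1               # stand-in for child subsystem work
            done.release()                   # report completion to the parent

    worker = threading.Thread(target=child)
    worker.start()                           # thread is created once and reused
    for _ in range(n_handoffs):
        start.release()                      # signal the child to begin
        for i in range(mid):
            values[i] += 1                   # parent works on its own subsystem
        done.acquire()                       # wait for the child before continuing
        log.append(list(values))
    worker.join()
    return log

if __name__ == "__main__":
    print(run_handoffs([0, 0, 0, 0]))
```

The child thread is created once and blocks between hand-offs rather than being respawned, mirroring the statement that all threads are created during initialization and remain in use until the program terminates.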
9.1 Load Balancing

An optimized computational load distribution scheme across the processing threads for the DCAe would complete all the computations on each of the threads during the inward traversal without any time delay between them. Furthermore, as the coefficients $\Phi_{12}^{b}$ and $\Phi_{22}^{b}$ associated with terminal subsystems in the DCAe are of no significance, evaluation of the two handle coefficients for the terminal subsystem in the DCAe involves fewer numerical computations than for non-terminal subsystems. Hence, a balanced processor loading scheme can be achieved by allocating an O(n) subsystem with a larger number of bodies to the thread handling the terminal subsystem. The implementation gives the flexibility to control the computational load on each of the threads by changing the number of bodies in the O(n) subsystems, and this is used to determine the optimized loading scheme that gives the best performance under different processing environments.
10 Results An operations count predicts improved performance and scalability of the DCAe when compared with DCA in a limited parallel computing environment. To verify this, the software implementation was applied to a set of articulated spatial chain systems involving 128, 256, and 1,024 bodies connected by revolute joints and timing results obtained in a parallel environment. Table 1 shows timing results for 200 function evaluations of the DCAe, DCA, and O(n) implementations on a 2.2 GHz AMD Opteron-848 four processor shared memory workstation. The DCA and DCAe were both evaluated using two and four threads (two and four processors) and single threaded runs were also obtained for the DCA and O(n) implementations. The DCAe uses O(n) subsystems each comprised of an equal number of bodies in obtaining these results. The results demonstrate that the computational cost is in close proportion to the predicted operations count. The nearly linear scaling of parallel Table 1. Time required for simulating 200 function evaluations (in secs) using equal thread loading scheme Algorithm
# threads
Number of bodies 128
DCAe DCA DCAe DCA DCA O(n)
4 4 2 2 1 1
256
1,024
3.29 6.41 26.32 5.20 10.08 39.88 6.33 12.41 49.96 9.84 20.57 79.87 20.89 41.94 169.46 5.09 9.97 40.11
Multibody Algorithms in Limited Parallel Computing Environments
203
execution times versus problem size also indicates that the communication costs (data sharing) of the hardware are insignificant for problems of over 128 bodies.

10.1 Load Balanced Results

In order to determine the optimized static loading scheme for the DCAe, simulation results were obtained with varying load factors on the terminal subsystem using two and four threads. These "terminal load factors" give the fraction of the total system computational load (per function evaluation) which is assigned to the thread (processor) performing the terminal subsystem computation. The remaining fraction of the overall computational load is distributed evenly between the remaining (non-terminal subsystem) threads. As an example, a 0.5 (or 50%) terminal load factor indicates that sufficiently many bodies have been assigned to the terminal subsystem thread that the DCAe O(n) computation performed there accounts for half of the overall computational load. Table 2 gives the optimized loading on the terminal subsystem that provides the best performance. The results show that the optimized loading scheme offsets the higher communication cost associated with systems of smaller scale and that the DCAe has the best performance when compared with the other algorithms in terms of time required for simulation in all cases. Figure 8 illustrates the improved performance of the balanced DCAe applied to a 1,024 body chain system when compared to the other algorithms in a limited parallel computing environment. Figures 9–11 give the plots of the timing results with varying load factors for 128, 256 and 1,024 body chain mechanisms using two and four processing threads.

Table 2. DCAe timing results for simulating 200 function evaluations (in secs) using optimized static loading scheme

| # Threads | Optimal terminal system load (%) | 128 bodies | 256 bodies | 1,024 bodies |
|---|---|---|---|---|
| 4 | 50 | 3.04 | 5.94 | 22.95 |
| 2 | 75 | 4.42 | 8.56 | 34.35 |

Fig. 8. Timing results of DCAe, DCA and O(n) applied to a 1,024 body chain with limited number of processors
[Plot: average time required for simulation per body (x 10^-4) versus number of processors (1 to 4); curves: DCAe Balanced, DCAe Unbalanced, DCA, Serial Order-N]

Fig. 9. 128 body performance results for varied terminal subsystem loads with DCAe using two and four threads
[Plot: average simulation time per body [sec.] (x 10^-3) versus terminal system loading (0 to 1); curves: DCAe 4 Threads, DCAe 2 Threads, DCA 4 Threads, DCA 2 Threads, DCA 1 Thread, O(n) 1 Thread]

Fig. 10. 256 body system performance results for varied terminal subsystem loads with DCAe using two and four threads
[Plot: same axes and curves as Fig. 9]

Fig. 11. 1,024 body system performance results for varied terminal subsystem loads with DCAe using two and four threads
[Plot: same axes and curves as Fig. 9]
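As a rough illustration of the terminal load factor defined above, the following Python sketch splits a normalized per-evaluation load among the threads. The function name and the normalization of the total load to 1 are assumptions made here for illustration; they are not taken from the DCAe implementation.

```python
def thread_loads(terminal_load_factor, n_threads):
    """Return the fraction of the total load assigned to each thread.

    Index 0 is the thread handling the terminal subsystem; the remaining
    fraction is shared evenly by the other (non-terminal) threads.
    """
    if n_threads < 2:
        raise ValueError("at least two threads are needed to split the load")
    if not 0.0 <= terminal_load_factor <= 1.0:
        raise ValueError("the terminal load factor must lie in [0, 1]")
    share = (1.0 - terminal_load_factor) / (n_threads - 1)
    return [terminal_load_factor] + [share] * (n_threads - 1)


# Optimal factors reported in Table 2: 50% with four threads, 75% with two.
print(thread_loads(0.50, 4))
print(thread_loads(0.75, 2))
```

With four threads and a 50% terminal load, each of the three remaining threads carries one sixth of the total load under this even split.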
11 Conclusions

The DCA is shown to require four processors to obtain results similar to the recursive sequential results, and the DCAe is verified to outperform the efficient recursive solution with just two processors when load balancing is considered. The DCAe thus offers a multibody solution scheme with optimal $O(\log_2 n)$ time complexity, and is highly competitive with efficient higher order methods in regions of practical interest (small parallel computers).
Linear Dual Algebra Algorithms and their Application to Kinematics

Ettore Pennestrì and Pier Paolo Valentini

Università di Roma Tor Vergata, via del Politecnico 1, 00133 Roma, Italy
Summary. Mathematical and mechanical entities such as line vectors, screws and wrenches can be conveniently represented within the framework of dual algebra. Despite the applications this type of algebra has received, numerical linear algebra algorithms within the field of dual numbers appear less developed. This paper summarizes different basic algorithms for handling vectors and matrices of dual numbers, and proposes an original application to finite and infinitesimal rigid body motion analysis.
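1 Introduction

A dual number $\hat{a}$ is an ordered pair of real numbers $(a, a^{o})$ associated with a real unit $+1$ and the dual unit, or operator, $\varepsilon$, where $\varepsilon^2 = \varepsilon^3 = \ldots = 0$, $0\varepsilon = \varepsilon 0 = 0$, $1\varepsilon = \varepsilon 1 = \varepsilon$. A dual number is usually denoted in the form

$$\hat{a} = a + \varepsilon a^{o}. \tag{1}$$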
A pure dual number has the dual unit only. The algebra of dual numbers was originally conceived by W.K. Clifford (1873) [9], but its first applications to mechanics are due to A.P. Kotelnikov (1895)¹ and E. Study (1901) [26]. Dual vector algebra provides a convenient tool for handling mathematical entities such as screws and wrenches. In fact, helicoidal infinitesimal and finite rigid body displacements can be easily composed under the framework of dual vector algebra. Another distinctive feature of dual algebra is conciseness of notation. For these reasons it has often been used in the search for closed form solutions in the fields of displacement analysis, kinematic synthesis and dynamic analysis of spatial mechanisms.

¹ The original paper of A.P. Kotelnikov, published in the Annals of Imperial University of Kazan (1895), is reputed to have been destroyed during the Russian revolution [24].

C.L. Bottasso (ed.), Multibody Dynamics: Computational Methods and Applications, © Springer Science+Business Media B.V. 2009

Dual numbers and their algebra have proved to be a powerful tool for the analysis of mechanical systems. Textbooks and monographs entirely dedicated to engineering applications of dual numbers have been authored by F.M. Dimentberg [10], R. Beyer [5] and I.S. Fischer [13]. Books with chapters or sections on dual algebra and its applications have been authored by L. Brand [6], M.A. Yaglom [29], J.S. Beggs [4], J. Duffy [11], Gonzáles-Palacios and J. Angeles [15], and J. McCarthy [20]. A more extensive list of references is given by E. Pennestrì and R. Stefanelli [22].

One of the purposes of this investigation is the development and implementation of algorithms for the solution of linear algebra problems using dual numbers. Although the algorithms discussed are mainly related to the solution of kinematic problems, we believe they are potentially useful also for dynamic analyses of mechanisms. The linear algebra algorithms herein presented can be split into the following categories:

1. Simple operations involving dual vectors
2. Dual version of basic algorithms of linear algebra

The kinematic applications discussed in the numerical examples are typical problems in the fields of robotics and biomechanics:

1. Given the initial and final positions of a given number of points on a body, compute the screw motion parameters of such body.
2. Given the velocities of a set of points of a body, compute the instantaneous screw motion parameters.

Regarding the first problem, the solution herein proposed appears original and computationally more attractive than the one reported in [27]. Our algorithm for computing the screw axis of an infinitesimal motion, given the velocities of a set of points, is entirely developed within the field of dual numbers and is an alternative to the one proposed by Page et al. [21].
In this last reference, the equations are split into real and dual parts before their numerical solution. The kinematic algorithms, implemented using the Ch programming language², can be applied also when numerical data are affected by measurement errors and rigid body properties are not strictly satisfied. In this case, the screw motion parameters of a pseudo rigid body motion are computed through the application of the least squares criterion.

² The Ch programming language has been developed by Professor H.H. Cheng of the University of California at Davis. More information about its main features is available at the web page www.softintegration.com.
2 Dual Numbers

Dual numbers can be represented as follows:

– Gaussian representation: $\hat{a} \equiv a + \varepsilon a^{o}$
– Polar representation: $\hat{a} \equiv \rho(1 + \varepsilon t)$, where $\rho = a$ and $t = a^{o}/a$
– Exponential representation: $\hat{a} \equiv \rho e^{\varepsilon t}$, where $\rho = a$, $t = a^{o}/a$ and $e^{\varepsilon t} = 1 + \varepsilon t$

The adoption of one representation instead of another depends on the context.
3 Dual Functions

A function $F$ of a dual variable $\hat{x} = x + \varepsilon x^{o}$ can be represented in the form

$$F(\hat{x}) = f(x, x^{o}) + \varepsilon g(x, x^{o}),$$

where $f$ and $g$ are real functions of the real variables $x$ and $x^{o}$. The necessary and sufficient conditions for $F$ to be analytic are [10]

$$\frac{\partial f}{\partial x^{o}} = 0, \qquad \frac{\partial g}{\partial x^{o}} = \frac{\partial f}{\partial x}. \tag{2}$$

Therefore, the following equality must hold (see also Table 1)

$$f(\hat{x}) = f(x + \varepsilon x^{o}) = f(x) + \varepsilon x^{o} \frac{\partial f}{\partial x}. \tag{3}$$
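Equation (3) is what makes dual numbers straightforward to implement: a real function is extended to dual arguments by carrying its derivative along in the dual part. The Python sketch below is an illustration only (the authors' implementation is written in Ch); the `Dual` class and the `lift` helper are hypothetical names introduced here.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    """Dual number a + eps*a_o, with eps**2 = 0."""
    re: float          # primary (real) part a
    du: float = 0.0    # dual part a_o

    def __add__(self, other):
        return Dual(self.re + other.re, self.du + other.du)

    def __mul__(self, other):
        # (a + eps a_o)(b + eps b_o) = ab + eps (a_o b + a b_o)
        return Dual(self.re * other.re, self.du * other.re + self.re * other.du)


def lift(f, dfdx):
    """Extend a real function f with derivative dfdx to dual arguments, Eq. (3)."""
    def dual_f(x):
        return Dual(f(x.re), x.du * dfdx(x.re))
    return dual_f


dual_sin = lift(math.sin, math.cos)
dual_cos = lift(math.cos, lambda x: -math.sin(x))

theta = Dual(0.5, 2.0)                  # theta + eps*d
s, c = dual_sin(theta), dual_cos(theta)
print(s)                                # sin(theta) + eps*d*cos(theta)
print(s * s + c * c)                    # real part 1, dual part 0 up to rounding
```

The last line checks that the identity sin² + cos² = 1 carries over to dual arguments, as expected for analytic functions.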
4 Algebra of Dual Vectors

A line vector is a vector bound to a definite line L in space. The dual vector

$$\hat{\mathbf{V}} = \mathbf{v} + \varepsilon \mathbf{v}^{o} \tag{4}$$

is a combination of two vectors which specifies the position of L with respect to an arbitrary origin O. The primary part is a vector $\mathbf{v}$ parallel to L and the dual part is $\mathbf{v}^{o} = \overrightarrow{OP} \times \mathbf{v}$, where P is an arbitrary point on L.

4.1 Scalar and Cross Products of Dual Vectors

With reference to Fig. 1, let

$$\hat{\mathbf{A}} = \mathbf{a} + \varepsilon(\mathbf{r}_1 \times \mathbf{a}), \qquad \hat{\mathbf{B}} = \mathbf{b} + \varepsilon(\mathbf{r}_2 \times \mathbf{b})$$

be two dual vectors representing two distinct line vectors, and let $\mathbf{s}^*$ be the direction versor of the minimum distance between these line vectors, directed from $\mathbf{a}$ to $\mathbf{b}$.
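Table 1. Computational cost of operations with dual numbers

| Dual operation | Mathematical expression | Mult. and Div. | Sums | Trig. |
|---|---|---|---|---|
| Sum | $\hat{a} \pm \hat{b} = (a \pm b) + \varepsilon(a^{o} \pm b^{o})$ | – | 2 | – |
| Product | $\hat{a}\hat{b} = ab + \varepsilon(a^{o}b + ab^{o})$ | 3 | 1 | – |
| Division (a) | $\hat{a}/\hat{b} = a/b + \varepsilon\,(a^{o}b - ab^{o})/b^{2}$ | 5 | 1 | – |
| Square root | $\sqrt{\hat{a}} = \sqrt{a} + \varepsilon\,a^{o}/(2\sqrt{a})$ | 2 | – | – |
| Vector scaling | $\hat{a}\{\hat{n}\}$ | 9 | 3 | – |
| Dot product (b) | $\{\hat{A}\}^{T}\{\hat{B}\} = \{A\}^{T}\{B\} + \varepsilon(\{A\}^{T}\{B^{o}\} + \{A^{o}\}^{T}\{B\})$ | 9 | 6 | – |
| Cross product (c) | $[\tilde{\hat{A}}]\{\hat{B}\} = [\tilde{A}]\{B\} + \varepsilon([\tilde{A}]\{B^{o}\} + [\tilde{A}^{o}]\{B\})$ | 27 | 18 | – |
| Dual sin | $\sin\hat{\theta} = \sin\theta + \varepsilon d\cos\theta$ | 1 | 0 | 2 |
| Dual cos | $\cos\hat{\theta} = \cos\theta - \varepsilon d\sin\theta$ | 1 | 0 | 2 |
| Dual tan | $\tan\hat{\theta} = \tan\theta + \varepsilon\,d/\cos^{2}\theta$ | 2 | 0 | 2 |
| Dual asin | $\arcsin\hat{a} = \arcsin a + \varepsilon\,a^{o}/\sqrt{1 - a^{2}}$ | 2 (d) | 1 | 1 |
| Dual acos | $\arccos\hat{a} = \arccos a - \varepsilon\,a^{o}/\sqrt{1 - a^{2}}$ | 2 (d) | 1 | 1 |
| Dual atan | $\arctan\hat{a} = \arctan a + \varepsilon\,a^{o}/(1 + a^{2})$ | 2 | 1 | 1 |

(a) Division by a pure dual number $\varepsilon b^{o}$ is not defined.
(b) We assume vectors of three elements.
(c) $\tilde{(\cdot)}$ denotes a skew-symmetric matrix.
(d) The computational cost of a square root operation on real numbers, not included in this table, is usually considered equivalent to about eight multiplications/divisions.

Fig. 1. Product of dual vectors: nomenclature

Table 2. Noteworthy cases of dual vectors products (adapted from [10])

| Line vector | $\hat{\mathbf{A}} \cdot \hat{\mathbf{B}}$ | $\hat{\mathbf{A}} \times \hat{\mathbf{B}}$ |
|---|---|---|
| Skew | $ab\cos\hat{\theta}$ | $ab\hat{\mathbf{S}}\sin\hat{\theta}$ |
| Incident ($s = 0$) | $ab\cos\theta$ | $ab\hat{\mathbf{S}}\sin\theta$ |
| Parallel ($\theta = 0$) | $ab$ | $\varepsilon abs\hat{\mathbf{S}}$ |
| Coaxial ($\theta = s = 0$) | $ab$ | $0$ |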
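In such a context, it is necessary to introduce the concept of dual angle [26]

$$\hat{\theta} = \theta + \varepsilon s \tag{5}$$

as a variable required to characterize the relative position and orientation of the line vectors $\hat{\mathbf{A}}$ and $\hat{\mathbf{B}}$. The angle $\theta$ is measured counterclockwise about $\mathbf{s}^*$. The scalar and cross products of two dual vectors are respectively defined as follows [10]:

$$\begin{aligned}
\hat{\mathbf{A}} \cdot \hat{\mathbf{B}} &= \mathbf{a} \cdot \mathbf{b} + \varepsilon\left[\mathbf{a} \cdot (\mathbf{r}_2 \times \mathbf{b}) + \mathbf{b} \cdot (\mathbf{r}_1 \times \mathbf{a})\right] \\
&= \mathbf{a} \cdot \mathbf{b} + \varepsilon\left[(\mathbf{r}_1 - \mathbf{r}_2) \cdot (\mathbf{a} \times \mathbf{b})\right] \\
&= ab\cos\theta - \varepsilon\left[(s\mathbf{s}^*) \cdot (ab\sin\theta\,\mathbf{s}^*)\right] \\
&= ab\left[\cos\theta - \varepsilon s\sin\theta\right] = ab\cos\hat{\theta},
\end{aligned} \tag{6}$$

$$\begin{aligned}
\hat{\mathbf{A}} \times \hat{\mathbf{B}} &= \mathbf{a} \times \mathbf{b} + \varepsilon\left[\mathbf{a} \times (\mathbf{r}_2 \times \mathbf{b}) + (\mathbf{r}_1 \times \mathbf{a}) \times \mathbf{b}\right] \\
&= \mathbf{a} \times \mathbf{b} + \varepsilon\left[(\mathbf{a} \cdot \mathbf{b})(\mathbf{r}_2 - \mathbf{r}_1) + \mathbf{r}_1 \times (\mathbf{a} \times \mathbf{b})\right] \\
&= ab\left\{\mathbf{s}^*\sin\theta + \varepsilon\left[s\cos\theta\,\mathbf{s}^* + \sin\theta\,(\mathbf{r}_1 \times \mathbf{s}^*)\right]\right\} \\
&= ab\hat{\mathbf{S}}(\sin\theta + \varepsilon s\cos\theta) = ab\hat{\mathbf{S}}\sin\hat{\theta}.
\end{aligned} \tag{7}$$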
Table 2 summarizes the results of the dual vector products for different relative positions of the line vectors.

4.2 Dual Angle Between Line Vectors

The dual angle between the two line vectors

$$\hat{\mathbf{A}}_i = \mathbf{a}_i + \varepsilon(\mathbf{s}_i \times \mathbf{a}_i), \qquad (i = 1, 2)$$

must be computed. The computational steps are described in the following and are justified by the geometry depicted in Fig. 2.

Fig. 2. Computation of dual angle between two line vectors
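1. Compute the dual vectors

$$\hat{\mathbf{E}}_i = \frac{\hat{\mathbf{A}}_i}{\lVert\hat{\mathbf{A}}_i\rVert}, \qquad (i = 1, 2). \tag{8}$$

2. Compute their cross product

$$\hat{\mathbf{E}}_3 = \frac{\hat{\mathbf{E}}_1 \times \hat{\mathbf{E}}_2}{\lVert\hat{\mathbf{E}}_1 \times \hat{\mathbf{E}}_2\rVert}. \tag{9}$$

3. Compute cosine and sine of the dual angle $\hat{\theta}$ between the two line vectors

$$\cos\hat{\theta} = \hat{\mathbf{E}}_1 \cdot \hat{\mathbf{E}}_2, \tag{10a}$$

$$\sin\hat{\theta} = (\hat{\mathbf{E}}_1 \times \hat{\mathbf{E}}_2) \cdot \hat{\mathbf{E}}_3. \tag{10b}$$

4. Compute the dual angle

$$\hat{\theta} = \operatorname{atan2}(\sin\hat{\theta}, \cos\hat{\theta}) = \theta + \varepsilon h. \tag{11}$$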
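The four steps above can be sketched in Python with a small dual-number class. This is an illustrative sketch, not the authors' Ch code: the names `Dual`, `line_vector` and `dual_angle` are hypothetical, the dual norm is taken as the dual square root of the dual dot product, and the dual part of `atan2` is obtained by differentiating the real function in the sense of Eq. (3).

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    re: float
    du: float = 0.0

    def __add__(self, o):
        return Dual(self.re + o.re, self.du + o.du)

    def __sub__(self, o):
        return Dual(self.re - o.re, self.du - o.du)

    def __mul__(self, o):
        return Dual(self.re * o.re, self.du * o.re + self.re * o.du)

    def __truediv__(self, o):
        return Dual(self.re / o.re, (self.du * o.re - self.re * o.du) / o.re ** 2)


def dsqrt(a):
    r = math.sqrt(a.re)
    return Dual(r, a.du / (2.0 * r))


def datan2(s, c):
    # Dual part: differential of atan2(s, c) applied to the dual parts.
    return Dual(math.atan2(s.re, c.re),
                (c.re * s.du - s.re * c.du) / (s.re ** 2 + c.re ** 2))


def dot(u, v):
    return u[0] * v[0] + u[1] * v[1] + u[2] * v[2]


def cross(u, v):
    return [u[1] * v[2] - u[2] * v[1],
            u[2] * v[0] - u[0] * v[2],
            u[0] * v[1] - u[1] * v[0]]


def unit(u):
    n = dsqrt(dot(u, u))            # dual norm
    return [x / n for x in u]


def line_vector(a, p):
    """Dual vector a + eps*(p x a) of the line through point p with direction a."""
    m = (p[1] * a[2] - p[2] * a[1], p[2] * a[0] - p[0] * a[2], p[0] * a[1] - p[1] * a[0])
    return [Dual(a[i], m[i]) for i in range(3)]


def dual_angle(A1, A2):
    E1, E2 = unit(A1), unit(A2)                 # step 1, Eq. (8)
    C = cross(E1, E2)
    E3 = unit(C)                                # step 2, Eq. (9); fails for parallel lines
    return datan2(dot(C, E3), dot(E1, E2))      # steps 3 and 4, Eqs. (10) and (11)


# x axis through the origin, and a line in the plane z = 2 rotated by 0.6 rad about z.
A1 = line_vector((1.0, 0.0, 0.0), (0.0, 0.0, 0.0))
A2 = line_vector((3.0 * math.cos(0.6), 3.0 * math.sin(0.6), 0.0), (0.0, 0.0, 2.0))
print(dual_angle(A1, A2))   # primary part near 0.6, dual part near 2.0
```

For parallel line vectors the real part of the cross product vanishes and `unit` raises a `ZeroDivisionError`.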
The procedure is not valid if the line vectors are parallel. In this case, there is an infinite set of dual vectors $\hat{\mathbf{E}}_3$.

4.3 Sum of Two Dual Vectors

With reference to the geometry of Fig. 3, we want to compute the sum

$$\hat{\mathbf{A}} = \hat{\mathbf{A}}_1 + \hat{\mathbf{A}}_2. \tag{12}$$

Fig. 3. Sum of dual vectors
One can observe that the direction of $\hat{\mathbf{A}}$ is obtained by prescribing to $\hat{\mathbf{A}}_1$ a screw motion defined by the screw axis $\hat{\mathbf{E}}_{12}$ and the dual angle $\hat{\alpha}_1$. On the basis of this observation, the following algorithm can be stated:

1. Compute the dual vectors

$$\hat{\mathbf{E}}_1 = \frac{\hat{\mathbf{A}}_1}{\lVert\hat{\mathbf{A}}_1\rVert}, \qquad \hat{\mathbf{E}}_2 = \frac{\hat{\mathbf{A}}_2}{\lVert\hat{\mathbf{A}}_2\rVert},$$

where $\lVert\cdot\rVert$ denotes the Euclidean norm.

2. Compute the dual angle $\hat{\theta}$ and the dual vector $\hat{\mathbf{E}}_{12}$ perpendicular to both $\hat{\mathbf{A}}_1$ and $\hat{\mathbf{A}}_2$ (see previous section).

3. Compute the module of the dual vector sum

$$\hat{A} = \sqrt{\hat{\mathbf{A}}_1 \cdot \hat{\mathbf{A}}_1 + \hat{\mathbf{A}}_2 \cdot \hat{\mathbf{A}}_2 + 2\hat{\mathbf{A}}_1 \cdot \hat{\mathbf{A}}_2}.$$

4. Compute sine and cosine of the dual angle $\hat{\alpha}_1$

$$\sin\hat{\alpha}_1 = \frac{\sqrt{\hat{\mathbf{A}}_2 \cdot \hat{\mathbf{A}}_2}\,\sin\hat{\theta}}{\hat{A}}, \qquad \cos\hat{\alpha}_1 = \frac{\sqrt{\hat{\mathbf{A}}_1 \cdot \hat{\mathbf{A}}_1}}{\hat{A}} + \frac{\cos\hat{\theta}}{\sin\hat{\theta}}\sin\hat{\alpha}_1,$$

$$\hat{\alpha}_1 = \operatorname{atan2}(\sin\hat{\alpha}_1, \cos\hat{\alpha}_1).$$

5. Let

$$\hat{t} = \tan\frac{\hat{\alpha}_1}{2}$$

and compute

$$\hat{\mathbf{T}} = \hat{t}\,\hat{\mathbf{E}}_{12}. \tag{13}$$

6. Apply the formula³

$$\hat{\mathbf{E}} = \hat{\mathbf{E}}_1 + \frac{2\hat{\mathbf{T}}}{1 + \hat{t}^2} \times \left(\hat{\mathbf{E}}_1 + \hat{\mathbf{T}} \times \hat{\mathbf{E}}_1\right)$$

for the definition of $\hat{\mathbf{E}}_1$ after the screw motion.

7. Finally compute

$$\hat{\mathbf{A}} = \hat{A}\,\hat{\mathbf{E}}.$$

Example: If we assume

$$\hat{\mathbf{A}}_1 = \{0 \;\; 0 \;\; 1\}^T, \qquad \hat{\mathbf{A}}_2 = \{0 \;\; 1 \;\; \varepsilon\}^T,$$

the previous algorithm gives the following numerical values

$$\hat{\theta} = \frac{\pi}{2} - \varepsilon, \qquad \hat{\mathbf{E}}_{12} = \{-1 \;\; 0 \;\; 0\}^T, \qquad \hat{\alpha}_1 = \frac{\pi}{4} - \frac{\varepsilon}{2}, \qquad \hat{\mathbf{A}} = \{0 \;\; 1 \;\; 1 + \varepsilon\}^T.$$

The loci formed by all possible $\hat{\mathbf{A}}$ form a ruled surface named cylindroid. Ready to use computer-aided solutions of this problem have also been proposed by A. Perez [23].

³ As will be explained in Section 7.1, this formula represents the extension to dual algebra of the well known Rodrigues' formula.

5 Solution of Matrix Equations

When adapting linear algebra numerical algorithms to matrices of dual numbers, one may encounter the following matrix equation

$$[X][L]^T + [L][X]^T = [A^{o}], \tag{14}$$

where $[A^{o}]$ and $[L]$ are prescribed whereas the elements of matrix $[X]$ are unknown. Equation (14) can be transformed into the following

$$([L] \otimes [I])\operatorname{vec}([X]) + ([I] \otimes [L])\operatorname{vec}([X]^T) = \operatorname{vec}([A^{o}]), \tag{15}$$

where $\otimes$ and $\operatorname{vec}(\cdot)$ denote the Kronecker product between two matrices and the vectorization of a matrix, respectively. If we let $[H_1] = [L] \otimes [I]$ and $[H_2]$ be the matrix obtained from $([I] \otimes [L])$ by exchanging columns appropriately, then (15) can be rewritten in the form

$$[H]\operatorname{vec}([X]) = \operatorname{vec}([A^{o}]), \tag{16}$$

where $[H] = [H_1] + [H_2]$. Since $[H]$ is a singular matrix, further conditions need to be added for a unique solution of the problem. For instance, in the case of Cholesky decomposition one must delete the redundant rows of $[H]$ and require that the elements of $[X]$ above the main diagonal are all zero.
6 Algebra of Dual Matrices

A dual matrix $[\hat{A}]$ is a matrix whose components are dual numbers. It can be split into a real part $[A]$ and a dual part $[A^{o}]$ such that

$$[\hat{A}] = [A] + \varepsilon[A^{o}]. \tag{17}$$

6.1 Product of Two Dual Matrices

Assuming that $[\hat{A}] = [A] + \varepsilon[A^{o}]$ and $[\hat{B}] = [B] + \varepsilon[B^{o}]$ have the appropriate dimensions, their dual product is defined as follows:

$$[\hat{A}][\hat{B}] = [A][B] + \varepsilon([A][B^{o}] + [A^{o}][B]). \tag{18}$$

6.2 Inverse of a Dual Matrix

The inverse of a square dual matrix is defined as

$$[\hat{A}][\hat{A}]^{-1} = [I]. \tag{19}$$

By letting $[\hat{B}] = [\hat{A}]^{-1}$, from (18) one obtains [3, 16]

$$[\hat{A}]^{-1} = [A]^{-1} - \varepsilon\left([A]^{-1}[A^{o}][A]^{-1}\right). \tag{20}$$
6.3 QR Decomposition of a Dual Matrix

A dual matrix $[\hat{A}]$ can be decomposed as follows

$$[\hat{A}] = [\hat{Q}][\hat{R}], \tag{21}$$

where $[\hat{Q}] = [Q] + \varepsilon[Q^{o}]$ is an orthogonal matrix and $[\hat{R}] = [R] + \varepsilon[R^{o}]$ is an upper triangular matrix. For the QR decomposition of matrix $[\hat{A}] = [A] + \varepsilon[A^{o}]$ the authors could not find a formula for obtaining directly the matrices $[\hat{Q}]$ and $[\hat{R}]$ as a function of $[A]$ and $[A^{o}]$. Thus, for the computation of $[\hat{Q}]$ and $[\hat{R}]$, the modified Gram-Schmidt orthogonalization procedure [14], adapted to dual numbers, has been applied to the rows of $[\hat{A}]$.

Example: The matrix

$$[\hat{A}] = \begin{bmatrix} 1 + \varepsilon & 2 + \varepsilon 3 \\ 3 + \varepsilon 9 & 3 + \varepsilon \end{bmatrix}$$

can be decomposed in the following matrices⁴:

$$[\hat{Q}] = \begin{bmatrix} 0.316 - \varepsilon 0.569 & 0.949 + \varepsilon 0.190 \\ 0.949 + \varepsilon 0.190 & -0.316 + \varepsilon 0.569 \end{bmatrix}, \qquad [\hat{R}] = \begin{bmatrix} 3.162 + \varepsilon 8.854 & 3.478 + \varepsilon 1.328 \\ 0.000 + \varepsilon 0.000 & 0.948 + \varepsilon 4.617 \end{bmatrix}.$$

⁴ Numerical results are displayed with three decimal digits only.

6.4 Cholesky Decomposition

A symmetric dual matrix $[\hat{A}]$ can be decomposed as

$$[\hat{A}] = [\hat{L}][\hat{L}]^T, \tag{22}$$

where $[\hat{L}] = [L] + \varepsilon[L^{o}]$ is a lower triangular matrix whose diagonal entries are not pure dual numbers, i.e. a linear combination of two lower triangular matrices $[L]$ and $[L^{o}]$.

Example: The matrix

$$[\hat{A}] = \begin{bmatrix} 2 + \varepsilon & 1 + \varepsilon 4 \\ 1 + \varepsilon 4 & 3 + \varepsilon \end{bmatrix}$$

can be decomposed as follows

$$[A] + \varepsilon[A^{o}] = [L][L]^T + \varepsilon\left([L^{o}][L]^T + [L][L^{o}]^T\right). \tag{23}$$

The matrix

$$[L] = \begin{bmatrix} 1.414 & 0.00 \\ 0.710 & 1.58 \end{bmatrix}$$

is immediately computed, whereas the matrix $[L^{o}]$ is obtained by solving an equation similar to (15), with $[X] = [L^{o}]$. In this case

$$[H_1] = \begin{bmatrix} 1.414 & 0.00 & 0.00 & 0.00 \\ 0.71 & 1.58 & 0.00 & 0.00 \\ 0.00 & 0.00 & 1.414 & 0.00 \\ 0.00 & 0.00 & 0.71 & 1.58 \end{bmatrix}, \qquad [H_2] = \begin{bmatrix} 1.414 & 0.00 & 0.00 & 0.00 \\ 0.71 & 0.00 & 1.58 & 0.00 \\ 0.00 & 1.414 & 0.00 & 0.00 \\ 0.00 & 0.71 & 0.00 & 1.58 \end{bmatrix}.$$

The system to be solved is

$$\begin{bmatrix} 2.828 & 0.00 & 0.00 & 0.00 \\ 0.71 & 1.414 & 1.58 & 0.00 \\ 0.00 & 1.42 & 0 & 3.16 \\ 0 & 0 & 1 & 0 \end{bmatrix} \begin{Bmatrix} x_{11} \\ x_{21} \\ x_{12} \\ x_{22} \end{Bmatrix} = \begin{Bmatrix} 1 \\ 4 \\ 1 \\ 0 \end{Bmatrix},$$

hence

$$[L^{o}] = \begin{bmatrix} 0.35 & 0 \\ 2.65 & -0.87 \end{bmatrix}.$$
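The numbers of this example can be cross-checked with a different route from the one described in the text. Instead of assembling the Kronecker-product system (15)–(16), the Python sketch below runs the ordinary Cholesky recursion directly in dual arithmetic, using the division and square root rules of Table 1. This is an alternative illustration, not the authors' algorithm; the `Dual` class and function names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    re: float
    du: float = 0.0

    def __sub__(self, o):
        return Dual(self.re - o.re, self.du - o.du)

    def __mul__(self, o):
        return Dual(self.re * o.re, self.du * o.re + self.re * o.du)

    def __truediv__(self, o):
        # Not defined when o is a pure dual number (o.re == 0).
        return Dual(self.re / o.re, (self.du * o.re - self.re * o.du) / o.re ** 2)


def dual_sqrt(a):
    r = math.sqrt(a.re)
    return Dual(r, a.du / (2.0 * r))


def dual_cholesky(A):
    """Lower triangular L with A = L L^T, computed entry by entry in dual arithmetic."""
    n = len(A)
    L = [[Dual(0.0) for _ in range(n)] for _ in range(n)]
    for j in range(n):
        acc = A[j][j]
        for k in range(j):
            acc = acc - L[j][k] * L[j][k]
        L[j][j] = dual_sqrt(acc)
        for i in range(j + 1, n):
            acc = A[i][j]
            for k in range(j):
                acc = acc - L[i][k] * L[j][k]
            L[i][j] = acc / L[j][j]
    return L


# Matrix of the example: real part [[2, 1], [1, 3]], dual part [[1, 4], [4, 1]].
A = [[Dual(2.0, 1.0), Dual(1.0, 4.0)],
     [Dual(1.0, 4.0), Dual(3.0, 1.0)]]
for row in dual_cholesky(A):
    print([(round(x.re, 3), round(x.du, 3)) for x in row])
```

The dual parts obtained this way agree, to the displayed precision, with the matrix $[L^{o}]$ reported above.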
6.5 Pseudoinverse of a Dual Matrix

Let $[\hat{A}]$ be a matrix with $m$ rows and $n$ columns. Its dual Moore-Penrose pseudoinverse can be computed as follows:

$$[\hat{A}]^{+} = \begin{cases} \left([\hat{A}]^T[\hat{A}]\right)^{-1}[\hat{A}]^T & \text{if } m \geq n, \\ [\hat{A}]^T\left([\hat{A}][\hat{A}]^T\right)^{-1} & \text{if } m < n, \end{cases} \tag{24}$$

when the inverse of $[\hat{A}]^T[\hat{A}]$ (or $[\hat{A}][\hat{A}]^T$) exists. If $m > n$, then the left pseudoinverse exists such that $[\hat{A}]^{+}[\hat{A}] = [I]$. If $m < n$, then the right pseudoinverse exists such that $[\hat{A}][\hat{A}]^{+} = [I]$. Adopting a reasoning similar to the one used for the definition of the inverse matrix, one can demonstrate that

$$[\hat{A}]^{+} = [A]^{+} - \varepsilon\left([A]^{+}[A^{o}][A]^{+}\right). \tag{25}$$

The definition of the Moore-Penrose matrix is often associated with the least squares solution of a linear system, such as (26), but with the number $m$ of equations different from the number $n$ of unknowns. It is well known that $[A]^T[A]$ (or $[A][A]^T$) may not have an inverse and, even when it is invertible, one could obtain large numerical errors using (24) directly. Thus, for reliable numerical answers, more sophisticated techniques are recommended.

Example: The Moore-Penrose pseudoinverses of

$$[\hat{A}_1] = \begin{bmatrix} 1 + \varepsilon 4 & 3 + \varepsilon 0 \\ 9 + \varepsilon 2 & 22 + \varepsilon 4 \\ 4 + \varepsilon 4 & 4 + \varepsilon 1 \end{bmatrix}, \qquad [\hat{A}_2] = \begin{bmatrix} 1 + \varepsilon 4 & 3 + \varepsilon 0 & 4 + \varepsilon \\ 9 + \varepsilon 2 & 22 + \varepsilon 4 & 4 + \varepsilon 4 \end{bmatrix}$$

are, respectively,

$$[\hat{A}_1]^{+} = \begin{bmatrix} -0.051 + \varepsilon 0.064 & -0.069 + \varepsilon 0.082 & 0.418 - \varepsilon 0.533 \\ 0.028 - \varepsilon 0.025 & 0.073 - \varepsilon 0.038 & -0.170 + \varepsilon 0.199 \end{bmatrix},$$

$$[\hat{A}_2]^{+} = \begin{bmatrix} -0.035 - \varepsilon 0.014 & 0.021 + \varepsilon 0.000 \\ -0.038 - \varepsilon 0.035 & 0.044 - \varepsilon 0.001 \\ 0.287 - \varepsilon 0.007 & -0.038 - \varepsilon 0.011 \end{bmatrix}.$$

6.6 Solution of a System of Dual Linear Equations

Let us denote with

$$[\hat{A}]\{\hat{x}\} = \{\hat{b}\} \tag{26}$$

a system of linear dual equations [8], where $[\hat{A}] = [A] + \varepsilon[A^{o}]$ and $\{\hat{b}\} = \{b\} + \varepsilon\{b^{o}\}$. Assuming $[A]$ nonsingular, the solution $\{\hat{x}\} = \{x\} + \varepsilon\{x^{o}\}$ is computed by solving the systems

$$[A]\{x\} = \{b\}, \tag{27a}$$

$$[A]\{x^{o}\} = \{b^{o}\} - [A^{o}]\{x\}. \tag{27b}$$

To improve the overall computational efficiency it is convenient to factor matrix $[A]$ only once. Thus, we suggest proceeding as follows:

1. Apply QR decomposition to matrix $[A]$ so that $[A] = [Q][R]$.
2. Solve system $[R]\{x\} = [Q]^T\{b\}$.
3. Solve system $[R]\{x^{o}\} = [Q]^T(\{b^{o}\} - [A^{o}]\{x\})$.
4. Form the dual vector $\{\hat{x}\} = \{x\} + \varepsilon\{x^{o}\}$.

Since $[R]$ is an upper triangular matrix, at steps 2 and 3 only simple back substitution procedures are executed.
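The two real solves (27a) and (27b) can be sketched as follows. For brevity this Python sketch swaps the QR factorization suggested above for plain Gaussian elimination with partial pivoting and simply calls the solver twice; a production code would factor $[A]$ once and reuse the factors. The function names and the numerical data are arbitrary test values introduced here, not taken from the paper.

```python
def solve(A, b):
    """Solve the real system A x = b by Gaussian elimination with partial pivoting."""
    n = len(A)
    M = [list(A[i]) + [b[i]] for i in range(n)]        # augmented copy
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        if abs(M[pivot][col]) < 1e-14:
            raise ValueError("[A] is singular: no unique dual solution")
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def matvec(A, x):
    return [sum(a * xi for a, xi in zip(row, x)) for row in A]


def solve_dual(A, Ao, b, bo):
    """Return (x, xo) such that (A + eps*Ao)(x + eps*xo) = b + eps*bo."""
    x = solve(A, b)                                          # Eq. (27a)
    rhs = [bi - ci for bi, ci in zip(bo, matvec(Ao, x))]     # {bo} - [Ao]{x}
    xo = solve(A, rhs)                                       # Eq. (27b)
    return x, xo


A, Ao = [[2.0, 1.0], [1.0, 3.0]], [[1.0, 0.0], [0.0, 1.0]]
b, bo = [3.0, 5.0], [1.0, 2.0]
x, xo = solve_dual(A, Ao, b, bo)
print(x, xo)
```

Only the real part $[A]$ is ever inverted, which is why the dual system is solvable exactly when $[A]$ is nonsingular.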
7 The Principle of Transference

The Principle of Transference has proved to be a powerful tool for the kinematic analysis of spatial linkages. The Principle was first stated by A.P. Kotelnikov, but the correspondence between equivalent spherical and spatial configurations is due to V.V. Dobrovolski (1947). A thoughtful discussion of the Principle of Transference is given by J. Rooney [24] and by L.M. Hsia and A.T. Yang [18]. The Principle of Transference can be stated as follows [10, 18, 24]:

All valid laws and formulae relating to a system of intersecting unit line vectors (and hence involving real variables) are equally valid when applied to an equivalent system of skew vectors, if each variable $a$ in the original formulae is replaced by the corresponding dual variable $\hat{a} = a + \varepsilon a^{o}$.

7.1 Applications of the Principle of Transference

By virtue of the Principle of Transference, formulas for the composition of spherical motions can be extended to the general helicoidal motion case by simply substituting the angle of rotation $\theta$ with the dual angle $\hat{\theta} = \theta + \varepsilon s$, where $s$ is the displacement of the body along the screw axis.

Extension of Rodrigues' Formula

With reference to the geometry of Fig. 5, let $\mathbf{r}_2$ be the position of vector $\mathbf{r}_1$ after a rotation of an angle $\theta$ about the axis of versor $\mathbf{u}$. The well known Rodrigues' formula, in vector notation, can be rewritten in the form

$$\mathbf{r}_2 = \mathbf{r}_1 + \frac{2\boldsymbol{\tau}}{1 + t^2} \times (\mathbf{r}_1 + \boldsymbol{\tau} \times \mathbf{r}_1), \tag{28}$$

where $t = \tan\frac{\theta}{2}$ and $\boldsymbol{\tau} = t\mathbf{u}$. When $\theta = \pi$ the previous expression cannot be applied and the following should be adopted

$$\mathbf{r}_2 = 2(\mathbf{u} \cdot \mathbf{r}_1)\mathbf{u} - \mathbf{r}_1. \tag{29}$$
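Formulas (28) and (29) translate almost literally into code. The following Python sketch is illustrative only; the function names are chosen here, and the test used to detect the half-turn case is an assumption of this sketch.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def rodrigues(r1, u, theta):
    """Rotate r1 by the angle theta about the unit vector u, Eqs. (28) and (29)."""
    if math.isclose(abs(theta), math.pi):
        # Half turn: tan(theta/2) is unbounded, so Eq. (29) is used instead.
        d = sum(ui * ri for ui, ri in zip(u, r1))
        return tuple(2.0 * d * ui - ri for ui, ri in zip(u, r1))
    t = math.tan(theta / 2.0)
    tau = tuple(t * ui for ui in u)
    inner = tuple(ri + ci for ri, ci in zip(r1, cross(tau, r1)))
    k = 2.0 / (1.0 + t * t)
    return tuple(ri + k * ci for ri, ci in zip(r1, cross(tau, inner)))


# Quarter turn and half turn of the x unit vector about the z axis.
print(rodrigues((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi / 2))
print(rodrigues((1.0, 0.0, 0.0), (0.0, 0.0, 1.0), math.pi))
```

The quarter turn maps the x unit vector onto the y unit vector (up to rounding), and the half turn maps it onto its opposite.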
By applying the Principle of Transference, Rodrigues' formula (28) can be generalised to define a screw motion about the line vector defined by $\hat{\mathbf{E}}$ as follows

$$\hat{\mathbf{R}}_2 = \hat{\mathbf{R}}_1 + \frac{2\hat{\mathbf{T}}}{1 + \tan^2\frac{\hat{\theta}}{2}} \times \left(\hat{\mathbf{R}}_1 + \hat{\mathbf{E}}\tan\frac{\hat{\theta}}{2} \times \hat{\mathbf{R}}_1\right), \tag{30}$$

where

– $\hat{\theta} = \theta + \varepsilon s$ is the dual angle whose primary part is the angle of rotation and $s$ the displacement along the screw axis (see Fig. 4);
– $\hat{\mathbf{R}}_1$ and $\hat{\mathbf{R}}_2$ are the initial and final positions of a line vector framed to the rigid body, respectively (see Fig. 5);
– $\hat{\mathbf{T}} = \hat{\mathbf{E}}\tan\frac{\hat{\theta}}{2}$, in analogy with $\boldsymbol{\tau} = t\mathbf{u}$ in (28).

Fig. 4. Nomenclature

Fig. 5. Spherical motion (left) and screw motion about axis h (right)
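Composition of Finite Screw Motions

Let us specify two spherical rotations: the first one of an angle $\theta_1$ about the axis $\mathbf{u}_1$ and the second of an angle $\theta_2$ about the axis $\mathbf{u}_2$. Hence, the components of the vectors

$$\boldsymbol{\tau}_i = \tan\frac{\theta_i}{2}\,\mathbf{u}_i, \qquad (i = 1, 2), \tag{31}$$

can be computed. It can be demonstrated [19, 20] that the resultant spherical motion is defined by the following vector

$$\boldsymbol{\tau}_3 = \frac{\boldsymbol{\tau}_1 + \boldsymbol{\tau}_2 - \boldsymbol{\tau}_1 \times \boldsymbol{\tau}_2}{1 - \boldsymbol{\tau}_1 \cdot \boldsymbol{\tau}_2}. \tag{32}$$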
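A quick numerical check of (32) for real rotations is sketched below in Python. It assumes that the first rotation is applied first and that both axes are expressed in the same fixed frame; the function names are hypothetical. The rotation of a vector is computed from (28) written directly in terms of $\boldsymbol{\tau}$, using $t^2 = \boldsymbol{\tau} \cdot \boldsymbol{\tau}$ for a unit axis.

```python
import math


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


def rodrigues_vector(u, theta):
    """tau = tan(theta/2) u, Eq. (31); u must be a unit vector."""
    t = math.tan(theta / 2.0)
    return tuple(t * ui for ui in u)


def compose(tau1, tau2):
    """Resultant of rotation 1 followed by rotation 2, Eq. (32)."""
    den = 1.0 - dot(tau1, tau2)     # vanishes when the resultant is a half turn
    c = cross(tau1, tau2)
    return tuple((a + b - ci) / den for a, b, ci in zip(tau1, tau2, c))


def rotate(r, tau):
    """Eq. (28) with t**2 = tau . tau."""
    inner = tuple(ri + ci for ri, ci in zip(r, cross(tau, r)))
    k = 2.0 / (1.0 + dot(tau, tau))
    return tuple(ri + k * ci for ri, ci in zip(r, cross(tau, inner)))


tau1 = rodrigues_vector((1.0, 0.0, 0.0), math.pi / 2)   # quarter turn about x
tau2 = rodrigues_vector((0.0, 1.0, 0.0), math.pi / 2)   # then quarter turn about y
tau3 = compose(tau1, tau2)
r = (0.0, 1.0, 0.0)
print(rotate(rotate(r, tau1), tau2))   # sequential application
print(rotate(r, tau3))                 # single resultant rotation, same vector
```

Replacing real numbers with dual numbers in the same expressions would give the screw composition discussed next, by the Principle of Transference.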
Consider two finite screw motions about the axes located by the unit line dual vectors $\hat{\mathbf{E}}_i$ $(i = 1, 2)$. The rotation angles and the displacements along the axes are denoted with $\theta_i$ and $h_i$ $(i = 1, 2)$, respectively. The formula for the composition of these finite motions is obtained by substituting vectors with dual vectors. Therefore Eq. (32) changes into the following

$$\hat{\mathbf{T}}_3 = \frac{\hat{\mathbf{T}}_1 + \hat{\mathbf{T}}_2 - \hat{\mathbf{T}}_1 \times \hat{\mathbf{T}}_2}{1 - \hat{\mathbf{T}}_1 \cdot \hat{\mathbf{T}}_2}, \tag{33}$$

where

$$\hat{\mathbf{T}}_i = \hat{\mathbf{E}}_i\tan\frac{\hat{\theta}_i}{2}, \qquad (i = 1, 2). \tag{34}$$

The Dual Screw Matrix

Let $\hat{\mathbf{E}} = \{\hat{h}_1 \;\; \hat{h}_2 \;\; \hat{h}_3\}^T$ and $\hat{\theta}$ be, respectively, the line versor representing the screw axis of the finite rigid body motion and the dual rotation angle. Therefore, the orthogonal transform matrix representing this motion can be written in the form

$$[\hat{A}] = \begin{bmatrix} \hat{a}_{11} & \hat{a}_{12} & \hat{a}_{13} \\ \hat{a}_{21} & \hat{a}_{22} & \hat{a}_{23} \\ \hat{a}_{31} & \hat{a}_{32} & \hat{a}_{33} \end{bmatrix} = 2\begin{bmatrix} \hat{e}_0^2 + \hat{e}_1^2 - \frac{1}{2} & \hat{e}_1\hat{e}_2 - \hat{e}_0\hat{e}_3 & \hat{e}_1\hat{e}_3 + \hat{e}_0\hat{e}_2 \\ \hat{e}_1\hat{e}_2 + \hat{e}_0\hat{e}_3 & \hat{e}_0^2 + \hat{e}_2^2 - \frac{1}{2} & \hat{e}_2\hat{e}_3 - \hat{e}_0\hat{e}_1 \\ \hat{e}_1\hat{e}_3 - \hat{e}_0\hat{e}_2 & \hat{e}_2\hat{e}_3 + \hat{e}_0\hat{e}_1 & \hat{e}_0^2 + \hat{e}_3^2 - \frac{1}{2} \end{bmatrix}, \tag{35}$$

where

$$\hat{a}_{ij} = a_{ij} + \varepsilon a^{o}_{ij}, \qquad (i, j = 1, 2, 3),$$

and

$$\hat{e}_0 = \cos\frac{\hat{\theta}}{2}, \qquad \hat{e}_1 = \hat{h}_1\sin\frac{\hat{\theta}}{2}, \qquad \hat{e}_2 = \hat{h}_2\sin\frac{\hat{\theta}}{2}, \qquad \hat{e}_3 = \hat{h}_3\sin\frac{\hat{\theta}}{2}, \tag{36}$$

are the dual Euler parameters. The matrix (35) transforms the dual coordinates of a line vector attached to a rigid body undergoing a screw motion from its initial to the final position.

Computing Screw Motion Parameters from the Dual Screw Matrix

This section discusses how to obtain screw motion parameters such as the position and orientation of the screw axis versor $\hat{\mathbf{E}}$ and the dual angle $\hat{\theta} = \theta + \varepsilon s$ about such axis. The problem has been solved by different authors (e.g. [1]); the approaches of I.S. Fischer [12] and H.H. Cheng [7] rely upon dual numbers. In the algorithm herein proposed, however, all computations are made entirely in the field of dual numbers, i.e. no splitting of matrix elements into a real and a dual part is required.
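1. Compute

$$\hat{e}_0 = \pm\frac{\sqrt{1 + \hat{a}_{11} + \hat{a}_{22} + \hat{a}_{33}}}{2}, \qquad \hat{e}_1 = \pm\frac{\sqrt{1 + \hat{a}_{11} - \hat{a}_{22} - \hat{a}_{33}}}{2},$$

$$\hat{e}_2 = \pm\frac{\sqrt{1 - \hat{a}_{11} + \hat{a}_{22} - \hat{a}_{33}}}{2}, \qquad \hat{e}_3 = \pm\frac{\sqrt{1 - \hat{a}_{11} - \hat{a}_{22} + \hat{a}_{33}}}{2}.$$

The choice of the algebraic sign affects only the sense of the versor of the screw axis.

2. Avoiding the division by pure dual numbers, choose the appropriate $k$th case for the computation of the remaining dual Euler parameters.

| Case 0 | Case 1 | Case 2 | Case 3 |
|---|---|---|---|
| $\hat{e}_1 = \dfrac{\hat{a}_{32} - \hat{a}_{23}}{4\hat{e}_0}$ | $\hat{e}_2 = \dfrac{\hat{a}_{12} + \hat{a}_{21}}{4\hat{e}_1}$ | $\hat{e}_3 = \dfrac{\hat{a}_{23} + \hat{a}_{32}}{4\hat{e}_2}$ | $\hat{e}_2 = \dfrac{\hat{a}_{32} + \hat{a}_{23}}{4\hat{e}_3}$ |
| $\hat{e}_2 = \dfrac{\hat{a}_{13} - \hat{a}_{31}}{4\hat{e}_0}$ | $\hat{e}_3 = \dfrac{\hat{a}_{13} + \hat{a}_{31}}{4\hat{e}_1}$ | $\hat{e}_0 = \dfrac{\hat{a}_{13} - \hat{a}_{31}}{4\hat{e}_2}$ | $\hat{e}_1 = \dfrac{\hat{a}_{13} + \hat{a}_{31}}{4\hat{e}_3}$ |
| $\hat{e}_3 = \dfrac{\hat{a}_{21} - \hat{a}_{12}}{4\hat{e}_0}$ | $\hat{e}_0 = \dfrac{\hat{a}_{32} - \hat{a}_{23}}{4\hat{e}_1}$ | $\hat{e}_1 = \dfrac{\hat{a}_{12} + \hat{a}_{21}}{4\hat{e}_2}$ | $\hat{e}_0 = \dfrac{\hat{a}_{21} - \hat{a}_{12}}{4\hat{e}_3}$ |

3. Compute

$$\hat{\theta} = 2\cos^{-1}\hat{e}_0, \tag{37a}$$

$$\hat{h}_i = \frac{\hat{e}_i}{\sin\frac{\hat{\theta}}{2}}, \qquad (i = 1, 2, 3). \tag{37b}$$

Our procedure is still valid for half-turn screw motion, whereas the one proposed by I.S. Fischer [12] needs to be modified. For the case of pure translation (i.e. $\theta = 0°$), (37b) cannot be applied. It must be observed that under a pure translation there is a unique motion direction, but not a unique screw axis. Hence, the dual part of the screw versor is meaningless and need not be calculated. If a pure translation occurs, then the direction of the motion axis is given by [12]

$$h_1 = \frac{a^{o}_{32}}{s}, \qquad h_2 = \frac{a^{o}_{13}}{s}, \qquad h_3 = \frac{a^{o}_{21}}{s}. \tag{38}$$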
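As a small consistency check of (35), (36) and (37a), the Python sketch below builds the dual screw matrix for the simplest possible screw, a rotation $\theta$ and translation $s$ about the z axis through the origin, and then recovers the dual angle from the trace. It is an illustration under these restrictive assumptions (fixed axis, positive sign in step 1, $0 < \theta < \pi$), not the authors' general routine; all names are hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class Dual:
    re: float
    du: float = 0.0

    def __add__(self, o):
        return Dual(self.re + o.re, self.du + o.du)

    def __sub__(self, o):
        return Dual(self.re - o.re, self.du - o.du)

    def __mul__(self, o):
        return Dual(self.re * o.re, self.du * o.re + self.re * o.du)


def dsqrt(a):
    r = math.sqrt(a.re)
    return Dual(r, a.du / (2.0 * r))


def dacos(a):
    # Table 1: arccos(a + eps a_o) = arccos(a) - eps a_o / sqrt(1 - a**2)
    return Dual(math.acos(a.re), -a.du / math.sqrt(1.0 - a.re ** 2))


def screw_matrix_about_z(theta, s):
    """Matrix (35) for the screw axis h = {0, 0, 1} through the origin."""
    c, sn = math.cos(theta / 2.0), math.sin(theta / 2.0)
    e0 = Dual(c, -0.5 * s * sn)        # cos of the dual half angle, Eq. (36)
    e3 = Dual(sn, 0.5 * s * c)         # sin of the dual half angle times h3 = 1
    two, half, zero = Dual(2.0), Dual(0.5), Dual(0.0)
    d = two * (e0 * e0 - half)         # a11 = a22, since e1 = e2 = 0
    return [[d, zero - two * e0 * e3, zero],
            [two * e0 * e3, d, zero],
            [zero, zero, two * (e0 * e0 + e3 * e3 - half)]]


def dual_angle_from_matrix(A):
    """Step 1 (e0 from the trace, positive sign) followed by Eq. (37a)."""
    e0 = dsqrt(Dual(1.0) + A[0][0] + A[1][1] + A[2][2]) * Dual(0.5)
    half_angle = dacos(e0)
    return half_angle + half_angle


print(dual_angle_from_matrix(screw_matrix_about_z(0.8, 0.3)))
```

The primary part of the result is the rotation angle and the dual part is the displacement along the axis. For $\theta = 0$ the dual arccosine divides by zero, which mirrors the remark above that pure translations need separate treatment.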
8 Computing Screw Finite Motion Parameters from Redundant Noisy Landmark Data

In the fields of biomechanics and robotics, the computation of body screw motion parameters from the Cartesian coordinates of tracked markers attached to the body is often required [2, 17, 25, 28]. Since these coordinates are affected by experimental errors of different nature, the condition of constant distance between the centers of the markers is not satisfied. Therefore, the tracked object cannot be strictly considered a rigid body. It is well known that the trajectories of three non aligned points are required to uniquely specify a rigid body motion. With the purpose of reducing the influence of experimental errors, the number of tracked points is usually more than three. Such redundancy, although beneficial for the accuracy of computations, even in the case of data affected by errors, rules out algorithms based on the hypothesis of rigid body motion (e.g. [1, 19]). The technical literature reports several techniques which reduce the motion of the tracked object to a pseudo rigid body motion. Error smoothing is usually based on the least squares criterion. In biomechanics, the use of dual numbers for the solution of the problem is not new [27].

Let us denote with $\{r_{0i}\}$ and $\{r_i\}$ $(i = 1, 2, \ldots, n)$, respectively, the initial and final coordinates of points attached to a body subjected to a screw motion. These coordinates are collected through different experimental techniques such as photogrammetry, magnetic sensors, laser sensors, etc. The initial and final positions of the centroid of the points are given by

$$\{c_{0i}\} = \frac{1}{n}\sum_i \{r_{0i}\}, \tag{39a}$$

$$\{c_i\} = \frac{1}{n}\sum_i \{r_i\}. \tag{39b}$$

The initial and final positions of line vectors attached to the moving body are expressed, respectively, by the following dual vectors

$$\{\hat{r}_{0i}\} = \{r_{0i} - c_{0i}\} + \varepsilon[\tilde{c}_{0i}]\{r_{0i} - c_{0i}\}, \tag{40a}$$

$$\{\hat{r}_i\} = \{r_i - c_i\} + \varepsilon[\tilde{c}_i]\{r_i - c_i\}, \tag{40b}$$

where the symbol $\tilde{(\cdot)}$ denotes the skew-symmetric matrix of a vector. In the absence of errors, the following equality would hold:

$$[\hat{A}]\{\hat{r}_{0i}\} = \{\hat{r}_i\}, \tag{41}$$

with $[\hat{A}]$ expressed by (35). However, due to the presence of errors

$$[\hat{A}_1]\{\hat{r}_{0i}\} \approx \{\hat{r}_i\}, \tag{42}$$

where $[\hat{A}_1]$ is an unknown matrix to be computed trying to minimize the differences with the least squares optimality criterion.
E. Pennestr`ı and P.P. Valentini
In this section a two step method is proposed: 1. After forming the matrices + 0 = r+01 r+02 . . . r+0n , R + 1 = r+1 r+2 . . . r+n , R
(43a) (43b)
+1 matrix is simply obtained as follows a dual transform A
+ +0 . +1 R +1 = R A
(44)
Then the dual QR decomposition is applied + R + +1 = Q A
(45)
+ = Q + . A (46) + will result into an identity matrix. Under ideal conditions the matrix R Hence, its elements can be used as a rough estimate of the deviation from rigid body condition. 2. By means of the algorithm discussed in the previous section, from [A] the + are then retrieved. screw motion parameters θ+ and E and we let
The present method offers the following advantages: • •
It is entirely based on dual equations and this greatly reduces the number of equations and the order of matrices involved. No iterative optimization solver is required.
9 Computing Screw Infinitesimal Motion Parameters

Let $\{p_i\}$, $\{\dot{p}_i\}$ $(i = 1, 2, \ldots, n)$ be the positions and velocities of a set of points of a rigid body and

$$\{g\} = \frac{1}{n}\sum_i \{p_i\}, \tag{47a}$$

$$\{\dot{g}\} = \frac{1}{n}\sum_i \{\dot{p}_i\}, \tag{47b}$$

$$\{\hat{r}_i\} = \{p_i - g\} + \varepsilon[\tilde{g}]\{p_i - g\}, \tag{47c}$$

$$\frac{d}{dt}\{\hat{r}_i\} = \{\dot{p}_i - \dot{g}\} + \varepsilon[\tilde{\dot{g}}]\{p_i - g\} + \varepsilon[\tilde{g}]\{\dot{p}_i - \dot{g}\}. \tag{47d}$$

From the rule of differentiation of vectors, the following equality follows

$$\{\hat{v}_i\} = \frac{d}{dt}\{\hat{r}_i\} = [\tilde{\hat{\omega}}]\{\hat{r}_i\}, \tag{48}$$

where $\{\hat{\omega}\} = \{\hat{\omega}_x \;\; \hat{\omega}_y \;\; \hat{\omega}_z\}^T$ is the dual angular velocity vector. If $\{\hat{u}\}$ is the dual vector which defines the spatial position of the instantaneous screw axis of a rigid body, $\omega$ the angular velocity, and $v$ the velocity of the points on such axis, then the dual angular velocity vector is defined as follows

$$\{\hat{\omega}\} = (\omega + \varepsilon v)\{\hat{u}\}. \tag{49}$$

This dual vector characterizes the entire field of velocities of all the points on the rigid body. In some applications, given the velocities of some points on a rigid body, it is necessary to compute $\{\hat{\omega}\}$. For $n \geq 3$, from (48) one obtains

$$\begin{Bmatrix} \hat{v}_{1x} \\ \hat{v}_{1y} \\ \hat{v}_{1z} \\ \hat{v}_{2x} \\ \hat{v}_{2y} \\ \hat{v}_{2z} \\ \vdots \\ \hat{v}_{nx} \\ \hat{v}_{ny} \\ \hat{v}_{nz} \end{Bmatrix} = -\begin{bmatrix} 0 & -\hat{r}_{1z} & \hat{r}_{1y} \\ \hat{r}_{1z} & 0 & -\hat{r}_{1x} \\ -\hat{r}_{1y} & \hat{r}_{1x} & 0 \\ 0 & -\hat{r}_{2z} & \hat{r}_{2y} \\ \hat{r}_{2z} & 0 & -\hat{r}_{2x} \\ -\hat{r}_{2y} & \hat{r}_{2x} & 0 \\ \vdots & \vdots & \vdots \\ 0 & -\hat{r}_{nz} & \hat{r}_{ny} \\ \hat{r}_{nz} & 0 & -\hat{r}_{nx} \\ -\hat{r}_{ny} & \hat{r}_{nx} & 0 \end{bmatrix} \begin{Bmatrix} \hat{\omega}_x \\ \hat{\omega}_y \\ \hat{\omega}_z \end{Bmatrix}. \tag{50}$$

This can be interpreted as a redundant system of linear equations with the dual unknowns $\hat{\omega}_x$, $\hat{\omega}_y$ and $\hat{\omega}_z$. Its solution can be obtained using the pseudoinverse matrix of the matrix of coefficients. However, in order to obtain meaningful results, one must ensure that the points whose velocities are prescribed do not all lie on a single line or belong to a plane parallel to the infinitesimal screw axis. Equation (50) can be applied also for the case of velocity data affected by measurement errors. Obviously, the computed $\{\hat{\omega}\}$ approximates the real one according to the least squares criterion.
10 Numerical Applications

Three different examples will be discussed in this section. The first one regards the computation of the screw parameters of a rigid body motion using the initial and final coordinates of four points. The second example also deals with the computation of screw motion parameters. In this case, the values of the coordinates are perturbed. Under the new conditions, the hypothesis of rigidity is no longer valid. In the third example, the results obtained by solving Eq. (50) are shown.

I. Screw Parameters for a Rigid Body Finite Motion

The prescribed initial and final positions of four points attached to the body are:
\[ \{r_{01}\} = \{1 \;\; 1 \;\; 1\}^T, \quad \{r_{02}\} = \{1 \;\; 2 \;\; 1\}^T, \quad \{r_{03}\} = \{0 \;\; 2 \;\; 3\}^T, \quad \{r_{04}\} = \{3 \;\; 6 \;\; 7\}^T, \]
\[ \{r_1\} = \{2.612370 \;\; 0.387620 \;\; 1.500000\}^T, \quad \{r_2\} = \{2.862370 \;\; 1.137620 \;\; 2.113720\}^T, \]
\[ \{r_3\} = \{3.337110 \;\; {-0.337110} \;\; 3.724740\}^T, \quad \{r_4\} = \{9.036607 \;\; 0.963393 \;\; 6.337117\}^T. \]
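Once the matrices (43) are formed, from (44) it follows that
\[
\hat{A} = \hat{A}_1 =
\begin{bmatrix}
0.750 - \varepsilon 0.250 & 0.250 - \varepsilon 0.750 & 0.612 + \varepsilon 0.612 \\
0.250 + \varepsilon 1.365 & 0.750 - \varepsilon 0.368 & -0.612 + \varepsilon 0.117 \\
-0.613 + \varepsilon 0.243 & 0.614 + \varepsilon 0.750 & 0.499 - \varepsilon 0.609
\end{bmatrix}.
\]
The method herein discussed gives
\[ \hat{\theta} = 1.048 + \varepsilon 0.712, \qquad \hat{E} = \{0.707 + \varepsilon 0.078 \;\;\; 0.707 - \varepsilon 0.077 \;\;\; 0.000 + \varepsilon 1.218\}^T. \]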
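II. Screw Parameters for a Pseudo Rigid Body Finite Motion

Let us assume that the final coordinates are as follows
\[ \{r_1\} = \{2.600 \;\; 0.380 \;\; 1.500\}^T, \quad \{r_2\} = \{2.800 \;\; 1.130 \;\; 2.100\}^T, \]
\[ \{r_3\} = \{3.300 \;\; {-0.330} \;\; 3.700\}^T, \quad \{r_4\} = \{9.000 \;\; 0.960 \;\; 6.300\}^T. \]
In this case from (44) and (45) it follows that:
\[
\hat{A}_1 =
\begin{bmatrix}
0.780 + \varepsilon 0.016 & 0.200 - \varepsilon 0.689 & 0.640 + \varepsilon 0.443 \\
0.242 + \varepsilon 1.382 & 0.750 - \varepsilon 0.488 & -0.609 + \varepsilon 0.236 \\
-0.600 + \varepsilon 0.229 & 0.600 + \varepsilon 0.787 & 0.500 - \varepsilon 0.637
\end{bmatrix},
\]
\[
\hat{Q} =
\begin{bmatrix}
0.770 - \varepsilon 0.141 & 0.221 - \varepsilon 0.741 & 0.599 + \varepsilon 0.456 \\
0.239 + \varepsilon 1.315 & 0.770 - \varepsilon 0.456 & -0.592 - \varepsilon 0.063 \\
-0.592 + \varepsilon 0.347 & 0.598 + \varepsilon 0.861 & 0.540 - \varepsilon 0.574
\end{bmatrix},
\]
\[
\hat{R} =
\begin{bmatrix}
1.013 + \varepsilon 0.207 & -0.022 + \varepsilon 0.053 & 0.051 + \varepsilon 0.056 \\
0.000 + \varepsilon 0.000 & 0.981 - \varepsilon 0.031 & -0.028 + \varepsilon 0.133 \\
0.000 + \varepsilon 0.000 & 0.000 + \varepsilon 0.000 & 1.013 - \varepsilon 0.175
\end{bmatrix}.
\]
Assuming $\hat{A} = \hat{Q}$ as transform matrix, one obtains the finite screw motion parameters
\[ \hat{\theta} = 1.001 + \varepsilon 0.696, \qquad \hat{E} = \{0.707 + \varepsilon 0.233 \;\;\; 0.707 - \varepsilon 0.251 \;\;\; 0.010 + \varepsilon 1.216\}^T. \]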
III. Screw Parameters for a Rigid Body Infinitesimal Motion

Let
\[ \{p_1\} = \{1 \;\; 0 \;\; 0\}^T, \quad \{p_2\} = \{0 \;\; 1 \;\; 0\}^T, \quad \{p_3\} = \{0 \;\; 0 \;\; 1\}^T \]
be the position vectors of three points of a rigid body whose velocities are respectively expressed by the following cartesian vectors
\[ \{\dot{p}_1\} = \{\pi \;\; 0 \;\; \sqrt{2}\}^T, \quad \{\dot{p}_2\} = \{0 \;\; {-\pi} \;\; \sqrt{2}\}^T, \quad \{\dot{p}_3\} = \{\pi \;\; {-\pi} \;\; \sqrt{2}\}^T. \]
From the application of (50) one obtains
\[ \{\hat{\omega}\} = \begin{Bmatrix} \varepsilon\pi \\ -\varepsilon\pi \\ \pi + \varepsilon\sqrt{2} \end{Bmatrix}. \]
Hence, the following conclusions are drawn:
– $\{u\} = \{0 \;\; 0 \;\; 1\}^T$ is the versor of the infinitesimal screw axis.
– $\omega = \pi$ is the magnitude of the angular velocity.
– $\{1 \;\; 1 \;\; 0\}^T$ are the coordinates of a point on the screw axis.
– $V = \sqrt{2}$ is the velocity of a point along the screw axis.
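The numbers of Example III can be cross-checked with ordinary real arithmetic. The sketch below is not the dual-algebra solution of Eq. (50); it is a real-valued least-squares fit of the rigid-body velocity field v_i = v_O + ω × p_i, under the assumption that the dual part of the dual angular velocity is the velocity v_O of the body point located at the origin. The function names `skew` and `fit_velocity_field` are illustrative only.

```python
import numpy as np


def skew(p):
    """Return the matrix S with S @ w == np.cross(p, w)."""
    x, y, z = p
    return np.array([[0.0, -z, y],
                     [z, 0.0, -x],
                     [-y, x, 0.0]])


def fit_velocity_field(points, velocities):
    """Least-squares fit of v_i = v_O + omega x p_i over all given points.

    Returns (omega, v_O), where v_O is the velocity of the body point
    that coincides with the origin of the reference frame.
    """
    # omega x p_i = -skew(p_i) @ omega, so each point contributes the
    # 3x6 block [-skew(p_i), I] acting on the unknowns [omega, v_O].
    A = np.vstack([np.hstack((-skew(p), np.eye(3))) for p in points])
    b = np.concatenate(velocities)
    x, *_ = np.linalg.lstsq(A, b, rcond=None)
    return x[:3], x[3:]


# Data of Example III
s2 = np.sqrt(2.0)
points = [np.array([1.0, 0.0, 0.0]),
          np.array([0.0, 1.0, 0.0]),
          np.array([0.0, 0.0, 1.0])]
velocities = [np.array([np.pi, 0.0, s2]),
              np.array([0.0, -np.pi, s2]),
              np.array([np.pi, -np.pi, s2])]

omega, v_O = fit_velocity_field(points, velocities)
w = np.linalg.norm(omega)                  # magnitude of the angular velocity
u = omega / w                              # direction of the screw axis
V = u @ v_O                                # sliding velocity along the axis
axis_point = np.cross(omega, v_O) / w**2   # axis point closest to the origin
print(omega, v_O, u, V, axis_point)
```

For the data above the fit returns an angular velocity of magnitude π about the z axis, a sliding velocity of √2 and the axis point (1, 1, 0), in agreement with the conclusions listed for Example III. With noisy velocities the same call gives the least-squares estimate.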
11 Conclusions This paper presented several basic algorithms regarding vectors and matrices of dual numbers. The algorithms, arranged in a form suitable for a ready implementation into a code, should provide useful numerical tools for the development of analyses based on the use of dual numbers. In most of the cases the algorithms are accompanied by simple numerical examples to demonstrate their effectiveness. The paper also included some applications of these algorithms to the solution of common kinematic problems in the field of robotics and biomechanics. In particular two new methods are proposed for the computation of finite and infinitesimal screw motion parameters.
A Memory Based Communication in the Co-simulation of Multibody and Finite Element Codes for Pantograph-Catenary Interaction Simulation

Jorge Ambrósio, João Pombo, Frederico Rauter, and Manuel Pereira

IDMEC – Instituto Superior Técnico, Technical University of Lisbon, Av Rovisco Pais 1, 1049-001 Lisbon, Portugal
E-mails: {jorge,jpombo,frauter,mpereira}@dem.ist.utl.pt

Summary. Many complex systems require that computational models of different nature are used for their sub-systems. The evaluation of the dynamics of each one of these models requires the use of different codes, which in turn use different time integration algorithms. The work presented here proposes a co-simulation environment that uses an integrated memory shared communication methodology between the multibody and finite element codes. The methodology is general, being applicable to the dynamic co-simulation of models running in different codes. The benefits and drawbacks of the proposed methodology, together with its accuracy and suitability, are assessed through its application to a real operation scenario of a high-speed catenary-pantograph system for which experimental test data is available.
1 Introduction

The limitation on the top velocity of high-speed trains concerns the ability to supply the proper amount of energy required to run the engines through the catenary-pantograph interface. Due to the loss of contact, not only is the energy supply interrupted but arcing between the collector bow of the pantograph and the contact wire of the catenary also occurs, leading to the deterioration of the functional conditions of the two systems. An alternative would be to increase the contact force between the two systems. But such a force increase would lead to rapid wear of the registration strip of the pantograph and of the contact wire, with negative consequences on the durability of the systems. Even in normal operating conditions, a control on the catenary-pantograph contact force is required to ensure longer maintenance cycles and a better reliability of the systems. The foreseeable developments of active pantographs suggest new forms of controlling the energy supply for high-speed trains. But all these situations require that the dynamics of the pantograph-catenary
are properly modeled and that software used for analysis, design or to support maintenance decisions is not only accurate and efficient but also allows for modeling all details relevant to the train overhead energy collector operation. A large number of works dedicated to the study of the catenary-pantograph interaction are being presented to different communities, emphasizing not only the mechanical aspects of construction, operation and maintenance but also the challenges for simulation due to the multi-physics characteristic of this problem. Gardou [10] presents a rather simple model for the catenary, using two-dimensional finite elements, where all nonlinear effects are neglected. The single model of catenary analyzed is excited by a lumped mass model of a pantograph. Jensen [16] presents a very detailed study on the wave propagation problem on the catenary and a two-dimensional model for the catenary-pantograph dynamics. In a similar line of work Dahlberg [8] describes the contact wire as an axially loaded beam and uses modal analysis to represent its deflection when subjected to transversal and axial loads, showing in the process its relation to the critical velocity of the pantograph. In both references [10, 16] the representation of the contact forces is not discussed, and no reference is made to how the integration algorithms are able to handle the contact loss and impact between registration strip and contact wire. Labergri [19] presents a very thorough description of the pantograph-catenary system that includes a two-dimensional model for the catenary based on the finite element method, and a pantograph model based on a multibody approach, with the contact treated by unilateral constraints.
In all works mentioned it is claimed that the catenary structural deformations are basically linear and, consequently, the catenaries are modeled using linear finite elements, except for the droppers' slacking, which is handled as a nonlinear effect but not by nonlinear finite elements. Seo et al. [28, 29] state the need to treat the catenaries as being nonlinear due to their large deformations. They treat the catenary contact wire with finite elements based on the absolute nodal coordinate formulation while the pantograph is a full three-dimensional multibody model. The contact is represented by a kinematic constraint between contact wire and registration strip and no loss of contact is represented. None of the models used has been validated and no comparative studies are provided to support the claims regarding the need to handle nonlinear catenary deformations or the suitability of using linear deformations only. Most of the works focusing on the pantograph-catenary interaction elect the finite element method to develop and analyze linear catenary models and use lumped mass pantograph models due to the need to maintain the linearity of the analysis. However, it is recognized by a large number of researchers that the nonlinearities of the pantograph system play a very important role in the energy collection and, therefore, either nonlinear finite element or multibody models can deliver superior analysis capabilities [1, 17, 21, 25–27, 32, 33]. Due to the multiphysics problem involved in modeling the catenary-pantograph system and the need for its simulation, Arnold and co-workers [1, 31] suggest the co-simulation between the finite difference discretization of the catenary
and the multibody representation of the pantograph. Mei et al. [21] suggest a coupling procedure between a finite element discretization of the catenary and a physical prototype of a pantograph. This work shows the possibility of coupling numerical and experimental techniques. Rauter et al. [25] show how the coupling between finite element software, to solve the dynamics of the catenary, and multibody software, to obtain the dynamic response of the pantograph, can be efficiently achieved. In these references it is observed that the finite element code ANSYS [22] is the most popular choice of software for the catenary [17, 21, 32, 33] while no major preferences for a particular multibody code are stated. There are, currently, no accepted general numerical tools designed to simulate the pantograph-catenary system in nominal, operational, and deteriorated conditions. Here it is understood that operating conditions must take into account the wear effects and the deteriorated conditions include extreme climatic conditions, material defects or mechanical problems. Several important efforts have been reported to understand the mechanisms of wear in catenaries and the effect of defect conditions on the dynamics of the complete system by Collina and Bruni [6] and by Collina et al. [7] or to describe the aerodynamic effects on the quality of the catenary-pantograph contact [5, 8]. The dynamic analysis procedures and the models developed for catenaries and pantographs are also used for designing pantograph control paradigms [12,27] or even wire-actuator control and contact force observers [2]. The different computational procedures and methods developed for representing the pantograph-catenary interaction led to the development of several computer programs used by designers and analysts. The code CATMOS [4], developed in the early 1990s, allows for the vibration analysis of the system. 
The pantograph is represented by a lumped-mass model and the catenary by Euler-Bernoulli and Timoshenko beam elements. No nonlinearities are considered in the system. Using finite element models in the framework of the nonlinear finite element code ABAQUS, the program FAMOS [26] enables the development of linear finite element models for the catenary and nonlinear finite element models for the pantograph. This program enables the analysis of fully three-dimensional pantograph models. Veitl and Arnold [1] proposed a co-simulation strategy between the code PROSA, where a catenary is described by the finite difference method, and the SIMPACK commercial multibody code, used to simulate the pantograph. All models involved in this work are three-dimensional but the catenaries are hard coded and, therefore, the models and programs can hardly be used for different catenary systems. The program DINACAT/WINCAT [9] uses the finite element method to represent both catenary and pantograph. This is a two-dimensional program in which lumped-mass pantograph models are used. The methodology presented here, developed in the framework of the EUROPAC project, uses co-simulation between a finite element code to describe catenaries, EUROPACAS-FE, and a multibody code to handle the dynamics of pantographs, EUROPACAS-MB. The co-simulation strategy is achieved by using memory based communication only and efficiently coordinating the integration of the two dynamic subsystems, the catenary and pantograph models. The application to a high-speed train with multiple pantographs running on a suitable catenary is used to demonstrate the methods described here.
2 Multibody Dynamics: Direct Solution A typical multibody model is defined as a collection of rigid or flexible bodies that have their relative motion constrained by kinematic joints and is acted upon by external forces. The forces applied over the system components may be the result of springs, dampers, actuators or external applied forces describing gravitational, contact/impact or other forces. Let the configuration of the multibody system be described by n Cartesian coordinates q, and a set of m algebraic kinematic independent holonomic constraints Φ. A system of differential algebraic equations, representing the constrained equations of motion is defined as [24]
\[
\begin{bmatrix} \mathbf{M} & \mathbf{\Phi}_{\mathbf{q}}^{T} \\ \mathbf{\Phi}_{\mathbf{q}} & \mathbf{0} \end{bmatrix}
\begin{Bmatrix} \ddot{\mathbf{q}} \\ \boldsymbol{\lambda} \end{Bmatrix}
=
\begin{Bmatrix} \mathbf{g} \\ \boldsymbol{\gamma} \end{Bmatrix}
\tag{1}
\]
where $\mathbf{M}$ is the mass matrix, $\mathbf{\Phi}_{\mathbf{q}}$ the Jacobian matrix, $\ddot{\mathbf{q}}$ the acceleration vector, $\boldsymbol{\lambda}$ the vector of the Lagrange multipliers, $\mathbf{g}$ a vector with the forces applied on the system bodies and the terms dependent on the system velocities, and $\boldsymbol{\gamma}$ a vector with the right-hand side of the constraint acceleration equations. Equation (1) has to be solved for $\ddot{\mathbf{q}}$ and $\boldsymbol{\lambda}$. A unique solution is obtained when the constraint equations are considered simultaneously with the differential equations of motion and a proper set of initial conditions [11, 30]. In each integration time step, the acceleration vector, $\ddot{\mathbf{q}}$, together with the velocity vector, $\dot{\mathbf{q}}$, are integrated in order to obtain the system velocities and positions at the next time step. This procedure is repeated until the final time is reached. The multibody system for the pantograph, considered here, has to be integrated for long analysis periods, requiring efficient and stable numerical procedures. The set of differential algebraic equations of motion (1) does not use explicitly the position and velocity equations associated to the kinematic constraints. Consequently, for moderate or long time simulations, the original constraint equations are rapidly violated due to the integration process. Thus, in order to stabilize or keep under control the constraints violation, Eq. (1) is solved by using the Baumgarte Stabilization Method [3] or the Augmented Lagrangean formulation [15], and the integration process is performed using a predictor-corrector algorithm with variable step and order [11, 30]. Furthermore, due to the long time simulations typically required for pantograph-catenary interaction analysis, it is also necessary to implement constraint violation correction methods. The Coordinate Partition method is used for this purpose [24].
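As an illustration of how Eq. (1) is assembled and solved with Baumgarte stabilization, the following minimal sketch treats a planar point-mass pendulum with the single constraint Φ = x² + y² − L² = 0. Everything specific here is an assumption made for the example: the pendulum itself, the feedback gains `alpha` and `beta`, and the fixed-step Runge-Kutta integrator, which stands in for the variable step and order predictor-corrector scheme mentioned in the text.

```python
import numpy as np


def pendulum_accel(q, v, m=1.0, L=1.0, grav=9.81, alpha=5.0, beta=5.0):
    """Solve the augmented system of Eq. (1) for accelerations and multiplier."""
    M = m * np.eye(2)
    Phi = q @ q - L**2                    # position constraint violation
    Phi_q = 2.0 * q.reshape(1, 2)         # constraint Jacobian
    Phi_dot = (Phi_q @ v)[0]              # velocity constraint violation
    gamma = -2.0 * (v @ v)                # right-hand side of Phi_q q'' = gamma
    g = np.array([0.0, -m * grav])        # applied forces (gravity)
    # Baumgarte: replace gamma by gamma - 2*alpha*Phi_dot - beta^2*Phi
    rhs = np.concatenate((g, [gamma - 2.0 * alpha * Phi_dot - beta**2 * Phi]))
    A = np.block([[M, Phi_q.T], [Phi_q, np.zeros((1, 1))]])
    sol = np.linalg.solve(A, rhs)
    return sol[:2], sol[2]


def simulate(t_end=2.0, dt=1e-3):
    """Integrate the pendulum from a horizontal rest position with RK4."""
    state = np.array([1.0, 0.0, 0.0, 0.0])   # [x, y, vx, vy]

    def f(s):
        a, _ = pendulum_accel(s[:2], s[2:])
        return np.concatenate((s[2:], a))

    for _ in range(int(round(t_end / dt))):
        k1 = f(state)
        k2 = f(state + 0.5 * dt * k1)
        k3 = f(state + 0.5 * dt * k2)
        k4 = f(state + dt * k3)
        state = state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
    return state


final_state = simulate()
violation = abs(final_state[:2] @ final_state[:2] - 1.0)
print("constraint violation at t_end:", violation)
```

Setting `alpha = beta = 0` recovers the unstabilized form of Eq. (1), which makes it easy to compare the drift of the constraint with and without the feedback terms.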
3 Finite Element Dynamic Analysis The second part of the electric collecting system is the catenary. These systems do not exhibit large displacements, i.e., large rotations, and therefore, they are typically modeled by using linear finite elements. The main catenary elements, the contact and messenger wires are modeled by using pre-tensioned Euler-Bernoulli beams. The motion of the catenary is characterized by small rotations and small deformations in which the only nonlinear effect is the slacking of the droppers. Using the finite element method to represent the structure, the equilibrium equations for the structural system are [34] Ma + Cv + Kx = f ,
(2)
where $\mathbf{M}$, $\mathbf{C}$ and $\mathbf{K}$ are the finite element global mass, damping and stiffness matrices, respectively, $\mathbf{x}$ is the vector with the nodal displacements, $\mathbf{v}$ is the vector of nodal velocities, $\mathbf{a}$ is the vector of nodal accelerations and $\mathbf{f}$ is the vector with the applied forces. Equation (2) needs to be solved for $\mathbf{x}$ or for $\mathbf{a}$ depending on the integration method used to obtain the structural system dynamic response. In this work the integration of the nodal accelerations is achieved using a Newmark family integration algorithm [23]. For what follows it is important to review the features of the integration algorithm used. At any given time step the algorithm proceeds by first predicting the displacements and velocities for the new time step by using the information of the last completed time step as
\[ \tilde{\mathbf{d}}_{t+\Delta t} = \mathbf{d}_t + \Delta t\, \mathbf{v}_t + \frac{\Delta t^2}{2}(1 - 2\beta)\, \mathbf{a}_t, \tag{3a} \]
\[ \tilde{\mathbf{v}}_{t+\Delta t} = \mathbf{v}_t + \Delta t\,(1 - \gamma)\, \mathbf{a}_t, \tag{3b} \]
where $\mathbf{d}$ denotes the nodal displacements. Based on the position and velocity predictions for the FE mesh and on the pantograph predicted position and velocity, the contact forces are evaluated for $t + \Delta t$ and the FE mesh accelerations are calculated from the equilibrium equation
\[ \left(\mathbf{M} + \gamma \Delta t\, \mathbf{C} + \beta \Delta t^2\, \mathbf{K}\right) \mathbf{a}_{t+\Delta t} = \mathbf{f}_{t+\Delta t} - \mathbf{C}\tilde{\mathbf{v}}_{t+\Delta t} - \mathbf{K}\tilde{\mathbf{d}}_{t+\Delta t}. \tag{4} \]
Then, with the acceleration $\mathbf{a}_{t+\Delta t}$, the positions and velocities of the finite elements at time $t + \Delta t$ are corrected by
\[ \mathbf{d}_{t+\Delta t} = \tilde{\mathbf{d}}_{t+\Delta t} + \beta \Delta t^2\, \mathbf{a}_{t+\Delta t}, \tag{5a} \]
\[ \mathbf{v}_{t+\Delta t} = \tilde{\mathbf{v}}_{t+\Delta t} + \gamma \Delta t\, \mathbf{a}_{t+\Delta t}. \tag{5b} \]
This procedure is repeated until a stable value is reached for a given time step.
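The predictor and corrector stages of Eqs. (3) to (5) translate almost line by line into code. The sketch below is a minimal version that assumes the force vector at the new time is already known, so the inner iteration on the contact force is left out; the function name `newmark_step`, the default parameters β = 1/4 and γ = 1/2, and the single degree of freedom test oscillator are choices made for this example.

```python
import numpy as np


def newmark_step(M, C, K, d, v, a, f_next, dt, beta=0.25, gamma=0.5):
    """Advance displacements, velocities and accelerations by one time step."""
    # Predictors, Eqs. (3a) and (3b)
    d_pred = d + dt * v + 0.5 * dt**2 * (1.0 - 2.0 * beta) * a
    v_pred = v + dt * (1.0 - gamma) * a
    # Equilibrium at the new time, Eq. (4)
    lhs = M + gamma * dt * C + beta * dt**2 * K
    a_next = np.linalg.solve(lhs, f_next - C @ v_pred - K @ d_pred)
    # Correctors, Eqs. (5a) and (5b)
    d_next = d_pred + beta * dt**2 * a_next
    v_next = v_pred + gamma * dt * a_next
    return d_next, v_next, a_next


# Undamped oscillator with a period of 1 s, released from d = 1 at rest
M = np.array([[1.0]])
C = np.zeros((1, 1))
K = np.array([[(2.0 * np.pi) ** 2]])
d = np.array([1.0])
v = np.zeros(1)
a = np.linalg.solve(M, -C @ v - K @ d)   # consistent initial acceleration
dt = 1e-3
for _ in range(1000):                     # integrate over one full period
    d, v, a = newmark_step(M, C, K, d, v, a, np.zeros(1), dt)
print(d, v)
```

After one period the oscillator returns very close to its initial state, which is a quick sanity check on the signs and factors of the reconstructed equations. In the co-simulation, `f_next` would hold the contact forces evaluated from the predicted catenary and pantograph states.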
4 Contact Model for Catenary-Pantograph

The contact force due to pantograph-catenary interaction, for present operating conditions and pantograph and catenary technology, is characterized by a high-frequency oscillating force with high relative amplitude. Railway industry measurement data shows that reasonable values for the contact force are, for a train running at approximately 80 m/s, a mean value of 200 N oscillating between 100 N and 400 N. Loss of contact at particular points of the catenary may also occur. Therefore, impact effects must be included in the model. Although different continuous contact force models may differ in some aspects, they all have similar features, i.e., they evaluate the contact force as a function of a pseudo-penetration between two elements and a proportionality factor often designated as the stiffness of the contact elements. The contact model, exemplified here as one of the typical contact force models that is both accurate and computationally efficient, is based on the work of Hunt and Crossley [14] and has been proposed by Lankarani and Nikravesh [20]. This contact force model also includes hysteresis damping for impact between components of a system or with components external to the system. In this work, the Hertzian type contact force including internal damping can be written as [20]
\[ F_N = K \delta^{n} \left[ 1 + \frac{3(1 - e^2)}{4}\, \frac{\dot{\delta}}{\dot{\delta}^{(-)}} \right], \tag{6} \]
where $K$ is the generalized contact stiffness, $e$ is the restitution coefficient, $\dot{\delta}$ is the relative penetration velocity and $\dot{\delta}^{(-)}$ is the relative impact velocity. The proportionality factor $K$ is obtained from the Hertz contact theory for the external contact between two cylinders with perpendicular axes. In the contact law, expressed by Eq. (6), it is implied that the pseudo-penetration $\delta$ is known, or calculated based on the position of the elements in contact.
Then the force is evaluated based on the material and geometric properties, K and e, and on the kinematic variables, and is applied, in turn, to the finite element mesh, in the EUROPACAS-FE code, and to the multibody model, in the EUROPACAS-MB software. Note that the calculation and application of the contact forces in each subsystem has different implications on the integration algorithms that are further explored in this work.
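A direct transcription of Eq. (6) is sketched below. The exponent n is not given a value in the text, so the default n = 1.5 used here is an assumption, as are the function name and the numerical values in the usage lines. The force is set to zero when the pseudo-penetration is not positive, which represents loss of contact.

```python
import math


def contact_force(delta, delta_dot, delta_dot_impact, K, e, n=1.5):
    """Normal contact force of Eq. (6) with hysteresis damping.

    delta            -- pseudo-penetration (positive when in contact)
    delta_dot        -- relative penetration velocity
    delta_dot_impact -- relative velocity at the start of the impact
    K, e             -- generalized contact stiffness, restitution coefficient
    n                -- exponent of the penetration (assumed value)
    """
    if delta <= 0.0:
        return 0.0                                   # no contact, no force
    damping = 0.75 * (1.0 - e**2) * delta_dot / delta_dot_impact
    return K * delta**n * (1.0 + damping)


# Illustrative values only
f_elastic = contact_force(1.0, 1.0, 1.0, K=100.0, e=1.0)   # e = 1: no damping
f_damped = contact_force(1.0, 1.0, 1.0, K=100.0, e=0.5)
f_separated = contact_force(-1.0e-3, 0.2, 1.0, K=100.0, e=0.5)
print(f_elastic, f_damped, f_separated)
```

With e = 1 the bracket in Eq. (6) reduces to one and the purely elastic law Kδⁿ is recovered, while lower restitution coefficients increase the force during compression and reduce it during restitution.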
5 Structure of Co-simulation Procedure

The analysis of the pantograph-catenary interaction is done by two independent codes: the pantograph code, EUROPACAS-MB, which uses a multibody formulation, and the catenary code, EUROPACAS-FE, which is a finite element software. Both programs can work as stand-alone codes. The structure of the communication between the codes is shown in Fig. 1.

Fig. 1. Structure of the communication scheme between the MB and the FE codes (blocks: MB - Pantograph; Contact catenary - pantograph; FEM - Catenary; exchanged data: position and velocity of the registration strip, contact force and its application point)

The EUROPACAS-MB
code provides the EUROPACAS-FE code with the positions and velocities of the pantographs registration strips. EUROPACAS-FE calculates the contact force, using the contact model represented by Eq. (6) or by any equivalent contact law, and the location of the application points in the pantographs and catenary, using geometric interference. These forces are applied to the catenary, in the finite element code, and to the pantograph model, in the MB code. Each code handles separately the equations of motion of each sub-system based on the shared force information. The typical numerical integration algorithms used by FEM codes are Newmark family algorithms, as defined by Newmark [23]. Therefore the FEM code needs a prediction of the positions and velocities not only of the catenary but also of the pantograph in a forthcoming time before advancing to a new time step. A predicted contact force is calculated and, using the finite element method equations of equilibrium, the catenary accelerations are computed for the new time-step. The calculated acceleration values are used to correct the initially predicted positions and velocities of the catenary. The MB code uses a Gear multi-step multi-order integration algorithm [11, 30]. To proceed with the dynamic analysis, the MB code needs information about the positions and velocities of the pantograph components and also the contact force and its application point coordinates at different time instants during the integration time period, and not only at its start and end. The compatibility between the two integration algorithms imposes that the state variables of the two subsystems are readily available during the integration time but also that a reliable prediction of the contact forces is also available at any given time step. Several strategies can be envisaged to tackle this co-simulation problem such as the gluing algorithms proposed by Hulbert et al. 
[13] or the co-simulation procedures suggested by Kubler and Schiehlen [18]. The key to the synchronization procedure between the MB and FE codes is the time integration, which must ensure the correct dynamic analysis of the pantograph-catenary system, including the loss and regain of contact.

Fig. 2. Communication stages between the MB code and the FE codes (stages: initialize communication, then information exchange, between EUROPACAS-MB and EUROPACAS-FEM, each with its own input data)

Let it be assumed that the FE integration algorithm is of the Newmark family and has a constant time step. Moreover, let it be assumed that the time step of the FE code is small enough not only to assure the stability
of the integration of the catenary but also to be able to capture the initiation of the contact between the pantograph registration strip and the contact wire of the catenary. The only restriction that is imposed on the integration algorithm of the multibody code is that its time step cannot exceed the time step of the FE code. Finally, let it be assumed that both codes can start independently from each other, i.e., the catenary FE model and the pantograph MB model include the initial conditions for the start of the analysis expressed in terms of the initial positions and velocities of all components of the systems. A fully integrated communication interface is implemented according to the two stages represented in Fig. 2. In the first stage, where both codes exchange input data information, it is necessary to perform initialization procedures, while in the second stage data is shared during the dynamic analysis. In the first stage, the EUROPACAS-MB code provides the EUROPACAS-FE code with the information about the number of registration strips used in the model and the initial position and velocity of each registration strip. Subsequently, the EUROPACAS-FE code provides the MB code with information about the initial and final analysis time and the time step to be used in the FE analysis. The code also provides the catenary height for the location of each registration strip, in order to ensure that no pantograph penetration violation occurs at the initial steps and that the registration strips are below the catenary contact wire at the start of the simulation.

5.1 Memory Mapped Files

Memory mapped files are the centerpiece of a methodology that allows high speed transfer of large amounts of data between several applications. The use of memory mapped files enables simultaneous access to data from independent applications.
The interface may be independent of the type of application code or environment, and the time used for the exchange of data must be negligible when compared with the computational time of the analysis. A memory mapped file is a file that the operating system recognizes in the memory, interacting with it and performing all allowed operations. The memory mapped file behaves like a memory array and, therefore, it allows the use of memory copy commands or pointer manipulation. The use of memory mapped files requires two main components: a helper object that defines structures and handles the data array in memory; and a C++ function that performs the operations with the data array, as outlined by Fig. 3. This routine is compiled by the applications. If one of the applications runs in MATLAB, it will assume the form of a MEX-file being compiled into a DLL that exports the standard entry point that MATLAB understands.

Fig. 3. Representation of the code components for using memory mapped files

5.2 Multibody Communication Interface

The communication procedure implemented during the initialization stage for the EUROPACAS-MB module of the common software, and outlined in Fig. 4, consists of the following steps:
1. Enter the process after reading the MB input files.
2. Communicate to the EUROPACAS-FE module the number of registration strips and the velocity of the train sets. For each registration strip communicate the positions of the extremity points P and Q of the registration strip.
3. Inquire if the data generated in the communication procedure by the EUROPACAS-FE module is available. When the data exists proceed to the next step.
4. Read the initial and final time of analysis and the time step used.
5. For each registration strip check if its initial vertical position exceeds the maximum allowed position for contact purposes. If any of the registration strips is in contact, the user is warned of a penetration violation and the program is ended. If not, proceed.
6. Inquire if the data generated in the communication by EUROPACAS-MB has been used by EUROPACAS-FE. When the data ceases to exist the MB communication can proceed to the next step.
7. Communicate to EUROPACAS-FE the initial conditions for each registration strip at the starting time. This information enables the EUROPACAS-FE module to perform its first integration step.
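The pattern behind these steps, in which one code writes data and raises a flag while the other polls for the flag, reads the data and erases it, can be sketched with a file-backed memory map. The example below uses Python's standard `mmap` and `struct` modules rather than the C++ helper object described in the text, and the buffer layout (one integer flag followed by a fixed number of doubles), the flag encoding and the function names are all hypothetical.

```python
import mmap
import os
import struct
import tempfile

FLAG = struct.Struct("<i")   # 0: slot empty, 1: data waiting (assumed encoding)


def open_channel(path, n_values):
    """Open a file-backed shared buffer holding one flag and n doubles."""
    size = FLAG.size + 8 * n_values
    if not os.path.exists(path) or os.path.getsize(path) != size:
        with open(path, "wb") as fh:
            fh.write(bytes(size))             # zero-filled, flag = 0
    fh = open(path, "r+b")
    return fh, mmap.mmap(fh.fileno(), size)


def post(buf, values):
    """Writer side: store the payload first, then raise the flag."""
    struct.pack_into(f"<{len(values)}d", buf, FLAG.size, *values)
    FLAG.pack_into(buf, 0, 1)


def poll(buf, n_values):
    """Reader side: return the payload and clear the flag, or None if empty."""
    (flag,) = FLAG.unpack_from(buf, 0)
    if flag != 1:
        return None
    values = struct.unpack_from(f"<{n_values}d", buf, FLAG.size)
    FLAG.pack_into(buf, 0, 0)                 # "erase" the info for the writer
    return list(values)


# Both ends are opened in one process here purely for demonstration
path = os.path.join(tempfile.mkdtemp(), "mb_to_fe.bin")
fh_mb, mb_side = open_channel(path, 6)
fh_fe, fe_side = open_channel(path, 6)

post(mb_side, [0.0, -0.5, 5.0, 0.0, 0.5, 5.0])   # e.g. strip end points P and Q
received = poll(fe_side, 6)                       # FE side consumes the data
second = poll(fe_side, 6)                         # nothing left to read
print(received, second)

for handle in (mb_side, fe_side, fh_mb, fh_fe):
    handle.close()
```

In an actual co-simulation each code would open its own end in a separate process and call `poll` in a waiting loop, with one such channel for each direction of communication.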
Fig. 4. Flowchart of the communication procedure during the initialization (two parallel flowcharts, the MB communication flowchart and the FE communication flowchart, which exchange the "info MB to FE" and "info FE to MB" data through flags before both codes enter the dynamic analysis)
Let it be assumed that at some instant, in the middle of the time step or even past its end, the contact forces and their application points are known. Then, a table with the values of the forces and application points can be constructed. By interpolation of this table the equilibrium equations of the MB system can be formulated at any time and the integration algorithm can estimate the state variables. Besides internal control of the time step size, the integration algorithms need to predict the state variables at different instants of the integration time step before calculating the state variables at the end of the time step [11, 30]. Therefore, the existence of a table of forces that can be interpolated is fundamental to avoid the communication interface having to deal with the details of the integration algorithms implemented in the MB and in the FE software.

Fig. 5. Communication procedure during the dynamic analysis

The communication procedure, represented in Figs. 4 and 5, consists of updating the referred table, which is assumed to be available at every instant for the construction of the MB equilibrium equations. The steps involved in the communication between the EUROPACAS-FE and EUROPACAS-MB modules during the dynamic analysis, as seen by the multibody module, are the following:
8. If the MB integrator is predicting the state variables, terminate this communication and return to EUROPACAS-MB, because the table with the forces cannot be updated. Otherwise, the integrator is correcting the state variables and the communication procedure continues to the next step.
9. Inquire whether the data generated for the communication by EUROPACAS-MB has already been used by EUROPACAS-FE. If the data still exists, halt EUROPACAS-MB until EUROPACAS-FE uses it and catches up with the current time of the MB analysis. When the data is erased, the MB communication proceeds to the next step.
10. Enter the process when the forces on the MB system are being evaluated.
11. If the current time in the MB integration of the pantograph is larger than the final time for which data is available in the table of forces, proceed to the table update. Otherwise, the table can still be interpolated, the communication finishes and control is transferred back to EUROPACAS-MB.
12. Inquire whether EUROPACAS-FE has ended the time integration cycle. If so, proceed to the finalizing procedures of EUROPACAS-MB and terminate the dynamic analysis. If not, proceed to the next step.
13. Generate data for EUROPACAS-FE with the position and velocity of the pantograph registration strip, as described in Fig. 1.
14. When the data generated by EUROPACAS-FE becomes available to EUROPACAS-MB, proceed to the next step of the communication. If such data does not exist, wait until it is available.
15. Update the table with the new forces and application point locations on the pantographs and with the time at which they have been calculated, denoted t_catenary.
16. Erase the data sent by EUROPACAS-FE to EUROPACAS-MB from the memory file, thereby signaling that the analysis can proceed in both codes.
17. Return control of the pantograph analysis to EUROPACAS-MB.
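The role of the force table in steps 11 and 15 can be sketched as follows. This is an illustration under stated assumptions, not the authors' code: the class name `ForceTable` and its methods are hypothetical, a single scalar force per entry replaces the forces and application points, and linear interpolation is assumed because the text does not state which interpolation scheme is used.

```python
from bisect import bisect_right


class ForceTable:
    """Time-stamped contact forces for one registration strip."""

    def __init__(self):
        self.times = []
        self.forces = []

    def append(self, t_catenary, force):
        # Step 15: entries arrive in increasing time as the FE code advances.
        if self.times and t_catenary <= self.times[-1]:
            raise ValueError("table entries must have increasing time")
        self.times.append(t_catenary)
        self.forces.append(force)

    def needs_update(self, t_mb):
        # Step 11: update only when the MB time has passed the last entry.
        return not self.times or t_mb > self.times[-1]

    def interpolate(self, t_mb):
        # Assumed scheme: piecewise linear between neighbouring entries.
        if self.needs_update(t_mb) or t_mb < self.times[0]:
            raise ValueError("time outside the tabulated range")
        i = bisect_right(self.times, t_mb)
        if i == len(self.times):      # t_mb coincides with the last entry
            return self.forces[-1]
        t0, t1 = self.times[i - 1], self.times[i]
        f0, f1 = self.forces[i - 1], self.forces[i]
        return f0 + (f1 - f0) * (t_mb - t0) / (t1 - t0)


# Illustrative values only.
table = ForceTable()
table.append(0.000, 150.0)
table.append(0.002, 170.0)
mid_force = table.interpolate(0.001)      # inside the table: interpolate
must_update = table.needs_update(0.003)   # past the table: ask FE for data
```

Because the MB integrator only ever queries the table, it can evaluate forces at any intermediate instant without the interface knowing how the integrator chooses those instants.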
Nothing in the communication procedure outlined above depends on the kind of integration algorithm used for the FE catenary analysis, provided that it is a fixed time step integrator. Even this condition can be relaxed, but doing so would have little practical value, as FE dynamic analysis is not usually performed with variable time step algorithms.

5.3 Programming Structure for the Communication

Memory mapped files are data files handled directly by the operating system of the computer that are defined in memory rather than on the hard drive. They can be accessed directly by multiple programs, as described in Fig. 6. In terms of speed of access, the memory data file behaves as any other data array in memory. For the implementation considered in this work, stand-alone program 1, shown in Fig. 6, is EUROPACAS-FE, which works in the MATLAB environment, while stand-alone program 2 is the Fortran based program EUROPACAS-MB. The Mex file is a DLL that exports a standard entry point defined in the MATLAB code. The helper object is a C++ coded function that accesses the memory mapped file and transfers data to and from it. The helper object is interpreted in EUROPACAS-FE via the Mex file and compiled directly in EUROPACAS-MB. The helper object, which is the central piece of the communication, is described in Fig. 6, where its actual coding is presented. The Mex file that allows EUROPACAS-FE to use the helper object as a MATLAB function is presented in Fig. 7 with its actual coding. This Mex file, written in C++, is compiled internally in EUROPACAS-MB and made available to EUROPACAS-FE. The reading and writing operations on the memory mapped files in EUROPACAS-MB are described in Fig. 8 with their actual coding. Controlling the number of accesses to the memory files makes it possible to create a simple and efficient interface without sacrificing the generality of the FE and MB codes, while maintaining the accuracy of all methods used.

Fig. 6. (a) Coding of the helper object that defines and allows access to memory by EUROPACAS MB and FE; (b) flowchart of the communication procedure

Fig. 7. Mex file to perform the memory operations, compiled internally in the multibody code
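How a data array held in a memory mapping is written and read can be sketched with the standard `mmap` and `struct` modules. The sketch does not reproduce the helper object of Fig. 6: the record layout (one integer flag followed by four floating point values) and the function names are hypothetical, and an anonymous mapping inside a single process is used so that the example is self-contained. Two separate programs would instead open the same named or file-backed mapping.

```python
import mmap
import struct

# Hypothetical record layout: one int32 flag followed by N float64 values,
# little-endian and without padding.
N_VALUES = 4
RECORD = struct.Struct(f"<i{N_VALUES}d")


def write_record(buf, flag, values):
    """Store the flag and the data array at the start of the mapping."""
    RECORD.pack_into(buf, 0, flag, *values)


def read_record(buf):
    """Return (flag, values) as stored by write_record."""
    flag, *values = RECORD.unpack_from(buf, 0)
    return flag, values


# Anonymous mapping: it lives in memory only and is never written to disk.
shared = mmap.mmap(-1, RECORD.size)

write_record(shared, 1, [0.0, 1.0e-3, 170.0, 5.12])   # writer side
flag, values = read_record(shared)                     # reader side
write_record(shared, 0, [0.0] * N_VALUES)              # reader clears the record
cleared_flag, _ = read_record(shared)
shared.close()
```

Reading and writing go through ordinary buffer operations, which is why access to the mapping is as fast as access to any other array in memory.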
Fig. 8. Writing and reading data in the multibody code using memory operations
6 Demonstration Cases

The methodology presented here for the co-simulation of two pieces of software is demonstrated with several pantographs of the same train running on a high-speed catenary. For this purpose, the CX pantograph and the SNCF high-speed 25 kV catenary are used.

6.1 Multibody Pantograph

Consider the pantograph depicted in Fig. 9, where the origin of the reference frame (X, Y, Z) coincides with its insertion point on the carbody. A multibody representation of the rigid bodies of the pantograph system is shown in Fig. 10.

Fig. 9. Pantograph system to be modeled

Fig. 10. Multibody representation of pantograph constitutive bodies (body-fixed axes ξ, η, ζ shown for the head, the upper arm and the lower arm)

6.2 Model Characteristics and Simulation Description

The properties and initial conditions of the CX pantograph multibody model are shown in Table 1. The characteristics of the kinematic joints used in the multibody model are shown in Table 2. In the initial condition of the simulation scenario the train runs at a speed of 83.3 m/s. Spring-damper-actuator elements are used to model the forces transmitted between the rigid bodies that compose the pantograph system. The characteristics of the force elements are shown in Table 3. The velocity of the pantograph and the static force at the bow-head are defined in Table 4.

Table 1. Mass, inertia properties and initial conditions of pantograph rigid bodies

ID  Rigid body          Mass (kg)  Inertia Iξξ/Iηη/Iζζ (kg·m²)  Initial position x0/y0/z0 (m)  Initial orientation e1/e2/e3
1   Pantograph base     32.00      2.31/2.76/4.87               0.00/0.00/0.00                 0.00/0.00/0.00
2   Lower arm           32.18      0.31/10.43/10.65             0.57/0.00/0.41                 0.00/0.17/0.00
3   Top arm             15.60      0.15/7.76/7.86               0.39/0.00/1.06                 0.00/−0.18/0.00
4   Lower link          3.10       0.05/0.46/0.46               0.36/0.00/1.00                 0.00/−0.16/0.00
5   Top link            1.15       0.05/0.48/0.48               −0.20/0.00/1.32                0.00/0.26/0.00
6   Stabilization arm   1.51       0.07/0.05/0.07               −0.55/0.00/1.42                0.00/0.00/0.00
7   Registration strip  9.50       1.59/0.21/1.78               −0.55/0.00/1.51                0.00/0.00/0.00

6.3 Catenary Model

The catenary used in this simulation scenario includes the contact wire, which hangs from the droppers and the steady arms, and a messenger wire, which in turn hangs from the stitch wires supported by the bracket, as shown in Fig. 11. A bracket and a mast are hinged to each support; the supports are spaced 54 m apart, the registration arm is connected to the mast, and the steady arm is hinged to the registration arm. The catenary geometry is fully three-dimensional, with the stagger imposed by the location of the steady arms.
Table 2. Kinematic joints used in the multibody model of the pantograph

ID  Kinematic joint  Body i  Body j  End stroke stop α  End stroke stop β  Point  Attachment on body i, ξi/ηi/ζi (m)  Attachment on body j, ξj/ηj/ζj (m)
1   Revolute joint   1       2       (0)1 (0)2          (0)1 (0)2          P      0.02/0.00/0.13                      0.82/0.00/0.00
                                                                           Q      0.02/1.00/0.13                      0.82/1.00/0.00
2   Revolute joint   2       3       (0)1 (0)2          (0)1 (0)2          P      −0.82/0.00/0.00                     −1.01/0.00/0.00
                                                                           Q      −0.82/1.00/0.00                     −1.01/1.00/0.00
3   Revolute joint   3       6       (0)1 (0)2          (0)1 (0)2          P      1.01/0.00/0.00                      0.00/0.00/0.00
                                                                           Q      1.01/1.00/0.00                      0.00/1.00/0.00
5   Spherical joint  1       4       (0)1 (0)2          (0)1 (0)2          P      −0.26/0.00/0.00                     0.69/0.00/0.00
                                                                           Q      −/−/−                               −/−/−
5   Spherical joint  3       4       (0)1 (0)2          (0)1 (0)2          P      −1.19/0.00/−0.13                    −0.62/0.00/−0.03
                                                                           Q      −/−/−                               −/−/−
6   Spherical joint  2       5       (0)1 (0)2          (0)1 (0)2          P      −0.78/0.00/0.00                     −1.00/0.00/0.00
                                                                           Q      −/−/−                               −/−/−
8   Spherical joint  5       6       (0)1 (0)2          (0)1 (0)2          P      0.96/0.00/0.00                      0.00/0.00/−0.10
                                                                           Q      −/−/−                               −/−/−
Table 3. Characteristics of the linear force elements used in the multibody model of the pantograph

ID  Force element  Spring stiffness (N/m)  Undef. length (m)  Damping coeff. (N.s/m)  Actuator force (N)  Body i  Body j  Attachment on body i, ξi/ηi/ζi (m)  Attachment on body j, ξj/ηj/ζj (m)
1   Spr-Dpr        1,000                   0.41               3,000                   0                   1       2       −0.57/0.00/0.00                     0.00/0.00/0.00
2   Spr-Dpr        3,600                   0.10               13                      0                   6       7       0.00/0.34/0.00                      0.00/0.34/0.00
3   Spr-Dpr        3,600                   0.10               13                      0                   6       7       0.00/−0.34/0.00                     0.00/−0.34/0.00
4   Spr-Dpr        10,000                  0.10               300                     0                   6       7       −0.1/0.00/0.09                      0.00/0.00/0.00
Note: Force element number 4 acts as a translational joint between bodies 6 and 7.

Table 4. Velocity, static force and distance traveled by the pantograph

Velocity (m/s)  Static force (N)  Catenary covered distance [start-end] (m)
80              170               110–1,150
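The linear force elements of Table 3 can be illustrated with a short sketch of a spring-damper-actuator force evaluation. The relation used, f = k (l − l0) + c dl/dt + f_a, is the usual textbook form for such elements and is assumed here rather than taken from the source; the class name, the sign convention and the state used in the example are likewise hypothetical.

```python
import math
from dataclasses import dataclass


@dataclass
class SpringDamperActuator:
    stiffness: float        # N/m
    free_length: float      # m (undeformed length)
    damping: float          # N.s/m
    actuator_force: float   # N

    def force(self, p_i, p_j, v_i, v_j):
        """Scalar force along the line joining the attachment points.

        Assumed sign convention: positive values pull the points together.
        """
        d = [b - a for a, b in zip(p_i, p_j)]
        length = math.sqrt(sum(c * c for c in d))
        if length == 0.0:
            raise ValueError("attachment points coincide; direction undefined")
        # Rate of change of length: relative velocity projected on the axis.
        rel_v = [b - a for a, b in zip(v_i, v_j)]
        length_rate = sum(c * w for c, w in zip(d, rel_v)) / length
        return (self.stiffness * (length - self.free_length)
                + self.damping * length_rate
                + self.actuator_force)


# Force element 4 of Table 3: k = 10,000 N/m, l0 = 0.10 m, c = 300 N.s/m.
element_4 = SpringDamperActuator(10_000.0, 0.10, 300.0, 0.0)
# Made-up state: attachment points 0.09 m apart, closing at 0.01 m/s.
f = element_4.force((0.0, 0.0, 0.0), (0.0, 0.0, 0.09),
                    (0.0, 0.0, 0.0), (0.0, 0.0, -0.01))
```

With this state the spring is compressed by 0.01 m and shortening at 0.01 m/s, so both the stiffness and damping terms are negative under the assumed sign convention, that is, the element pushes the two bodies apart.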
Fig. 11. Representation of the SNCF high-speed 25 kV catenary (labeled components: messenger wire, bracket, hinge, dropper, mast, stitch wire, registration arm, steady arm, contact wire)
Fig. 12. Contact force results (filtered at 20 Hz) for the CX pantograph model versus experimental measurements, for the scenario of a train equipped with a single pantograph (axes: force (N) versus track length (m); series: Measured, CX-Reference)
The catenary is modeled using a three-dimensional finite element model. In particular, Euler-Bernoulli beam elements are used to model the contact and messenger wires, and bar elements are used for the steady and registration arms. The droppers and suspension elements are the only nonlinear structural components of the catenary.

6.4 Contact Force Results for a Single Pantograph Scenario

The numerical results of the model presented in this paper, obtained with the communication based on the memory mapped files interface, are shown together with experimentally measured values for the pantograph-catenary system in Fig. 12. A correlation with the experimental measurements, according to the EN50318 standard (filtered at 20 Hz), is observed. The numerical results include the vibration modes of the real system even if, due to the rigid body models, the stiffness of the model amplifies the contact force maxima and minima. The results are satisfactory, especially considering that aerodynamic effects, cross-winds and railway dynamics effects are not included in the current model. The filtered results include the effects of the structural elements of the catenary due to the passage of the pantograph, but may not include, for example, the effects of singular defects. Regardless of these differences, the results show that the model developed and the methodology applied are suitable to describe the pantograph-catenary contact.

6.5 Train with Multiple Pantographs

The same simulation scenario is used to analyze the dynamic behavior of the pantograph-catenary interaction on a high-speed train equipped with multiple pantographs. The train, depicted in Fig. 13, is equipped with two pantographs spaced 200 m apart and runs at a speed of 83.3 m/s.

Fig. 13. Scenario made of a high-speed train equipped with two pantographs running on a tangent track

The contact forces of the pantographs on the catenary are presented in Fig. 14. The contact force over the length run by the pantograph between two consecutive registration arms of the catenary indicates that the front pantograph contact force is similar to that of the train equipped with a single pantograph. The rear pantograph contact force exhibits much larger oscillations than that of the front pantograph: the rear pantograph has not only the highest contact forces but also the lowest. The most serious problem caused by the interference of the front pantograph on the rear one is the high number of contact losses apparent in the results presented in Fig. 14.

Fig. 14. Contact force results (filtered at 20 Hz) of the CX pantographs, for the scenario of a train equipped with two pantographs (axes: force (N) versus track length (m); series: CX-Front, CX-Rear)

The variation of the contact point on the pantograph registration strips along the local lateral direction is displayed in Fig. 15. There, the stagger of the catenary and the small oscillations of the contact points are clearly visible. More interesting is the global vertical location of the contact point, presented in Fig. 16, which also shows that the rear pantograph oscillates much more than the front pantograph.

Fig. 15. Variation of the position of the catenary contact point on the train pantographs registration strips (axes: contact point Y (m) versus track length (m); series: CX-Front, CX-Rear)

Fig. 16. Vertical position of the pantographs for a train operating at 300 km/h (axes: contact point Z (m) versus track length (m); series: CX-Front, CX-Rear)
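The comparison between front and rear pantographs rests on the extreme values of the contact force and on the number of contact losses. A minimal sketch of how such quantities could be extracted from a sampled force history follows. The function name, the threshold rule for a contact loss and the sample values are assumptions made for illustration; the samples are not data from Figs. 12 or 14.

```python
def contact_statistics(forces, loss_threshold=0.0):
    """Summarize a sampled contact-force history.

    Assumption: a sample counts as a contact loss when the force is at or
    below loss_threshold; consecutive lost samples form a single event.
    """
    if not forces:
        raise ValueError("empty force history")
    mean = sum(forces) / len(forces)
    variance = sum((f - mean) ** 2 for f in forces) / len(forces)
    loss_events = 0
    in_loss = False
    for f in forces:
        lost = f <= loss_threshold
        if lost and not in_loss:
            loss_events += 1   # a new contact-loss event starts here
        in_loss = lost
    return {"mean": mean, "std": variance ** 0.5,
            "max": max(forces), "min": min(forces),
            "loss_events": loss_events}


# Made-up samples, for illustration only.
front = [160.0, 180.0, 150.0, 190.0, 170.0]
rear = [220.0, 0.0, 0.0, 310.0, 40.0, 0.0, 250.0]
front_stats = contact_statistics(front)
rear_stats = contact_statistics(rear)
```

Counting events rather than lost samples keeps the measure independent of the sampling rate, although the choice of threshold remains a modeling decision.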
7 Conclusions

This work presents a communication procedure between two programs that use independent dynamic integration algorithms, providing a transparent co-simulation environment. The communication strategy has been developed in such a way that the integrators of the finite element and multibody codes can be chosen independently. Fully three-dimensional methodologies for the modeling and analysis of catenaries and pantographs, each using the most suitable method, have been presented to demonstrate the proposed co-simulation environment. The finite element method is used for the catenary, while a multibody methodology is used for the pantograph, in completely independent programs that are coded in different programming languages. The overall procedure has been demonstrated through its application to the study of the dynamics of a high-speed train equipped with two pantographs. The analysis of the results clearly shows that the operation of the front pantograph interferes significantly with the quality of the rear pantograph contact, eventually leading to frequent contact losses.
Acknowledgements

The work presented has been developed in the framework of the European funded project EUROPAC (European Optimized Pantograph Catenary Interface, contract no STP4-CT-2005-012440). The collaboration of SNCF, Faiveley Transport and Politecnico di Milano is especially acknowledged. The support of Fundação para a Ciência e Tecnologia (FCT) through the grant SFRH/BD/18848/2004 is also gratefully acknowledged.
References

1. Arnold M, Simeon B (2000) Pantograph and catenary dynamics: a benchmark problem and its numerical solution. Appl Numer Math 34(4):345–362
2. Balestrino A, Bruno O, Landi A, Sani L (2000) Innovative solutions for overhead catenary-pantograph system: wire actuated control and observed contact force. Vehicle Syst Dyn 33(2):69–89
3. Baumgarte J (1972) Stabilization of constraints and integrals of motion in dynamical systems. Comp Meth Appl Mech Eng 1:1–16
4. Becker K, Konig A, Resch U, Zweig B-W (1995) Hochgeschwindigkeitsfahrleitung – ein thema fur die forschung (the high-speed catenary – a subject for research). ETR - Eisenbahntechnische Rundschau 44(1-2):64–72
5. Bocciolone M, Resta F, Rocchi D, Tosi A, Collina A (2006) Pantograph aerodynamic effects on the pantograph-catenary interaction. Vehicle Syst Dyn 44(S1):560–570
6. Collina A, Bruni S (2002) Numerical simulation of pantograph-overhead equipment interaction. Vehicle Syst Dyn 38(4):261–291
7. Collina A, Melzi S, Facchinetti A (2002) On the prediction of wear of contact wire in OHE lines: a proposed model. Vehicle Syst Dyn 37(S1):579–592
8. Dahlberg T (2006) Moving force on an axially loaded beam – with applications to a railway overhead contact wire. Vehicle Syst Dyn 44(8):631–644
9. Fernandez J-A, Pastor M (1998) Analisis mediante elementos finites del acoplamiento dinamico catenaria-pantógrafo (Finite element analysis of the dynamic coupling catenary-pantograph). Ministério de Fomento, Centro de Estudos y Experimentacion de Obras Públicas, Madrid, Spain
10. Gardou M (1984) Etude du comportement dynamique de l'ensemble pantographe-caténaire (Study of the dynamic behavior of the pantograph-catenary) (in French). Diploma thesis, Conservatoire National des Arts et Metiers, Paris, France
11. Gear CW (1971) Simultaneous numerical solution of differential-algebraic equations. IEEE Trans Circuit Th 18(1):89–95
12. Huang YJ (2004) Discrete fuzzy variable structure control for pantograph position control. Elect Eng 86:171–177
13. Hulbert G, Ma Z-D, Wang J (2005) Gluing for dynamic simulation of distributed mechanical systems. In: Ambrósio J (ed) Advances on computational multibody systems. Springer, Dordrecht, The Netherlands
14. Hunt KH, Crossley FR (1975) Coefficient of restitution interpreted as damping in vibroimpact. J Appl Mech 7:440–445
15. Jalon J, Bayo E (1994) Kinematic and dynamic simulation of multibody systems: the real-time challenges in mechanical systems simulation. Springer, New York
16. Jensen C (1997) Nonlinear systems with discrete and discontinuous elements. Ph.D. thesis, Technical University of Denmark, Lyngby, Denmark
17. Jerrelind J, Stensson A (2003) Nonlinear dynamic behaviour of coupled suspension systems. Meccanica 38:43–59
18. Kubler R, Schiehlen W (2000) Modular simulation in multibody system dynamics. Mult Syst Dyn 4:107–127
19. Labergri F (2000) Modélisation du comportement dynamique du système pantographe-caténaire (Model for the dynamic behavior of the system pantograph-catenary) (in French). Ph.D. thesis, Ecole Doctorale de Mechanique de Lyon, Lyon, France
20. Lankarani HM, Nikravesh PE (1990) A contact force model with hysteresis damping for impact analysis of multibody systems. ASME J Mech Design 112:369–376
21. Mei G, Zhang W, Zhao H, Zhang L (2006) A hybrid method to simulate the interaction of pantograph and catenary on overlap span. Vehicle Syst Dyn 44(S1):571–580
22. Moaveni S (2007) Finite element analysis theory and application with ANSYS. Prentice-Hall, Englewood-Cliffs, NJ
23. Newmark NM (1959) A method of computation for structural dynamics. ASCE J Eng Mech Div 85(EM 3):67–94
24. Nikravesh P (1988) Computer-aided analysis of mechanical systems. Prentice-Hall, Englewood-Cliffs, NJ
25. Rauter F, Pombo J, Ambrósio J, Chalansonnet J, Bobillot A, Pereira M (2006) Contact model for the pantograph-catenary interaction. In: Proceedings of the Third Asian Conference of Multibody Dynamics (on CD), Tokyo, Japan, August 2–4
26. Reinbold M, Deckart U (1996) FAMOS – Ein programm zur simulation von oberleitungen und stromabnehmer (FAMOS – a program for the simulation of catenaries and pantographs). ZEV+DET Glasers Annalen 120(6):239–243
27. Resta F, Collina A, Fossati F (2001) Actively controlled pantograph: an application. In: Proceedings of the IEEE/ASME International Conference on Advanced Intelligent Mechatronics, Como, Italy, July 8–12
28. Seo J-H, Sugiyama H, Shabana A (2005) Modeling pantograph/catenary interactions for multibody railroad vehicle systems. In: Goicolea J, Orden J-C, Cuadrado J (eds) Proceedings of the ECCOMAS Thematic Conference Multibody Dynamics 2005, Madrid, Spain, June 21–24
29. Seo J-H, Kim S-W, Jung I-H, Park T-W, Mok J-Y, Kim Y-G, Chai J-B (2006) Dynamic analysis of a pantograph-catenary system using absolute nodal coordinates. Vehicle Syst Dyn 44(8):615–630
30. Shampine L, Gordon M (1975) Computer solution of ordinary differential equations: the initial value problem. Freeman, San Francisco, CA
31. Veitl A, Arnold M (1999) Coupled simulations of multibody systems and elastic structures. In: Ambrósio J, Schiehlen W (eds) Proceedings of EUROMECH Colloquium 404 Advances in Computational Multibody Dynamics, Lisbon, Portugal, September 20–23
32. Vera C, Suarez B, Paulin J, Rodríguez P (2006) Simulation model for the study of overhead rail current collector systems dynamics, focused on the design of a new conductor rail. Vehicle Syst Dyn 44(8):595–614
33. Zhang W, Mei G, Wu X, Shen Z (2002) Hybrid simulation of dynamics for the pantograph-catenary system. Vehicle Syst Dyn 38(6):393–414
34. Zienkiewicz OC, Taylor RL (2000) The finite element method. Butterworth-Heinemann, Woburn, MA