Geometry of nonholonomically constrained systems


Author: Richard H. Cushman | Jedrzej Sniatycki



ADVANCED SERIES IN NONLINEAR DYNAMICS*
Editor-in-Chief: R. S. MacKay (Univ. Warwick)

Published

Vol. 9   Transport, Chaos and Plasma Physics 2
         S. Benkadda, F. Doveil & Y. Elskens
Vol. 10  Renormalization and Geometry in One-Dimensional and Complex Dynamics
         Y.-P. Jiang
Vol. 11  Rayleigh–Bénard Convection
         A. V. Getling
Vol. 12  Localization and Solitary Waves in Solid Mechanics
         A. R. Champneys, G. W. Hunt & J. M. T. Thompson
Vol. 13  Time Reversibility, Computer Simulation, and Chaos
         W. G. Hoover
Vol. 14  Topics in Nonlinear Time Series Analysis – With Implications for EEG Analysis
         A. Galka
Vol. 15  Methods in Equivariant Bifurcations and Dynamical Systems
         P. Chossat & R. Lauterbach
Vol. 16  Positive Transfer Operators and Decay of Correlations
         V. Baladi
Vol. 17  Smooth Dynamical Systems
         M. C. Irwin
Vol. 18  Symplectic Twist Maps
         C. Gole
Vol. 19  Integrability and Nonintegrability of Dynamical Systems
         A. Goriely
Vol. 20  The Mathematical Theory of Permanent Progressive Water-Waves
         H. Okamoto & M. Shoji
Vol. 21  Spatio-Temporal Chaos & Vacuum Fluctuations of Quantized Fields
         C. Beck
Vol. 22  Energy Localisation and Transfer
         eds. T. Dauxois, A. Litvak-Hinenzon, R. MacKay & A. Spanoudaki
Vol. 23  Geometrical Theory of Dynamical Systems and Fluid Flows
         T. Kambe
Vol. 24  Microscopic Chaos, Fractals and Transport in Nonequilibrium Statistical Mechanics
         R. Klages
Vol. 25  Smooth Particle Applied Mechanics – The State of the Art
         W. G. Hoover
Vol. 26  Geometry of Nonholonomically Constrained Systems
         by R. H. Cushman, J. Śniatycki & H. Duistermaat

*For the complete list of titles in this series, please visit http://www.worldscibooks.com/series/asnd_series.shtml

Advanced Series in Nonlinear Dynamics
VOLUME 26

Geometry of Nonholonomically Constrained Systems

Richard Cushman
University of Calgary, Canada

Hans Duistermaat
University of Utrecht, The Netherlands

Jędrzej Śniatycki
University of Calgary, Canada

World Scientific
NEW JERSEY • LONDON • SINGAPORE • BEIJING • SHANGHAI • HONG KONG • TAIPEI • CHENNAI

Published by World Scientific Publishing Co. Pte. Ltd. 5 Toh Tuck Link, Singapore 596224 USA office 27 Warren Street, Suite 401-402, Hackensack, NJ 07601 UK office 57 Shelton Street, Covent Garden, London WC2H 9HE

Library of Congress Cataloging-in-Publication Data
Cushman, Richard H., 1942–
Geometry of nonholonomically constrained systems / by Richard H. Cushman, Jędrzej Śniatycki & Hans Duistermaat.
p. cm. -- (Advanced series in nonlinear dynamics ; v. 26)
Includes bibliographical references and index.
ISBN-13: 978-981-4289-48-1 (hardcover : alk. paper)
ISBN-10: 981-4289-48-5 (hardcover : alk. paper)
1. Nonholonomic dynamical systems. 2. Geometry, Differential. I. Śniatycki, Jędrzej. II. Duistermaat, Hans. III. Title.
QA614.833.C87 2009
516.3'6--dc22 2009024384

British Library Cataloguing-in-Publication Data A catalogue record for this book is available from the British Library.

Copyright © 2010 by World Scientific Publishing Co. Pte. Ltd. All rights reserved. This book, or parts thereof, may not be reproduced in any form or by any means, electronic or mechanical, including photocopying, recording or any information storage and retrieval system now known or to be invented, without written permission from the Publisher.

For photocopying of material in this volume, please pay a copying fee through the Copyright Clearance Center, Inc., 222 Rosewood Drive, Danvers, MA 01923, USA. In this case permission to photocopy is not required from the publisher.

Printed in Singapore.

Dedicated to our wives


Acknowledgments

This book started many years ago, about 1994, from a conversation I (RC) had with Hans (JJD) at the University of Utrecht about a book on nonholonomically constrained systems, a topic in which we were then (and still are) interested. As I usually spent the summers at the University of Calgary, I mentioned my interest to Jedrzej (JS) and he said he would like to participate.

Many years passed as our mathematical ideas clarified. The general theory contains a discussion of all known approaches to derive the equations of motion for a nonholonomically constrained system. The description as a distributional Hamiltonian system had to be invented. The theory of symmetry reduction for such systems underwent a major evolution when it was realized that even in the regular case generalized distributions were needed. To handle the singular case, the concept of a differential space proved very useful. The notion of a vector field on a differential space had to be invented. The realization that all the geometry of the reduced system is contained in the almost Poisson structure of the algebra of smooth functions on the reduced space is quite recent. All of the above new mathematics required many rewrites until the text became stable.

As this book took a long time to write, many thanks need to be given. First, to the Department of Mathematics at the University of Utrecht, where I worked until I retired. Second, to the Department of Mathematics and Statistics at the University of Calgary under the headship of Prof. Ted Bisztriczky, where I am currently working as an adjunct professor. Without the support of the departments and of my coauthor JS this project would never have been completed. JS's work was partially supported by a Killam and an NSERC grant. JJD used some of his time as an Academy Professor to work on this book. His manuscripts on most of the chapters were of great importance in preparing the final text. Finally, I would like to thank Dr. David Kemppainen of Mount Royal College in Calgary for creating the figures in chapters 5 and 6 and figures 7.10 and 7.11 in chapter 7. All the remaining figures were produced by JJD.

In spite of our best efforts, there may be (hopefully few and inessential) mistakes. If you find any, please contact me at rcushman(at)ucalgary.ca.

Richard Cushman
University of Calgary
July 12, 2009

Foreword

This book is aimed mainly at researchers and graduate students in the field of geometric mechanics, especially the theory of nonholonomic constraints. A wider audience may consist of pure mathematicians who are interested in applications of modern differential geometry to mechanics.

The first part of the book is devoted to the general theory of systems with linear nonholonomic constraints.

In chapter 1 we review the different approaches to the description of the dynamics of nonholonomically constrained systems that appear in the literature. Our only addition to this literature is the systematic use of the Levi-Civita connection of the kinetic energy metric. We discuss restrictions on motions of nonholonomically constrained systems provided by conserved quantities and by accessible sets of the constraint distribution.

In chapter 2 we treat the basic properties of the action of a Lie group on a smooth manifold, concentrating especially on the case when the action is proper. We also discuss the differential geometry of the space of orbits of a proper action using the concept of a differential space.

In chapter 3 we use the results presented in chapter 2 to discuss reduction of a proper action of the symmetry group of a nonholonomically constrained system. We begin with a discussion of symmetry and reduction for a general dynamical system. Next, we define the notion of symmetry of a nonholonomically constrained system and then show how to reduce it when the action of the symmetry group is proper. We specialize our results to the case when the action of the symmetry group is free and proper. We then compare the above results to different approaches to reduction of systems with nonholonomic constraints.

In chapter 4 we discuss reconstruction of solutions of the original equations of motion from solutions of the reduced equations. We also study relative equilibria and relative periodic orbits of nonholonomically constrained systems.

The second part of the book contains a comprehensive analysis of concrete systems with nonholonomic constraints.

In chapter 5 we discuss the classical nonholonomically constrained system known as Carathéodory's sleigh. In order to illustrate the theory given in the preceding chapters, we derive its equations of motion in five different ways, construct the reduced system in three different ways, and carry out reconstruction explicitly.

In chapter 6 we treat the example of a smooth strongly convex rigid body rolling without slipping on a horizontal plane under the influence of a constant vertical gravitational force. We use traditional notation, which sometimes clashes with that of the preceding chapters.

In chapter 7 we give a comprehensive analysis of the motion of a rolling disk, which is a body of revolution whose edge rolls on a horizontal plane under the influence of a constant vertical gravitational field. The rim of the disk is a planar circle with its center at the center of mass. We assume that during the motion the lowest point of the rim remains in contact with the horizontal plane, which prevents the disk from taking off into space. The rolling disk has a symmetry group $E(2) \times S^1$, where $E(2)$ is the Euclidean motion group of the plane, and $S^1$ is the group of internal symmetries of the disk. We reduce these symmetries, solve the reduced equations of motion and then reconstruct the reduced motion. We give a complete qualitative analysis of the motion of the disk. We obtain a global gyroscopic stabilization principle, namely, relative equilibria are stable (= elliptic) if their energy is larger than a fixed number. For exceptional values of the parameters, the disk falls flat in finite time. We give an asymptotic analysis of the situation when the disk nearly falls flat and then rises up again. A surprising result of this analysis is the existence of a universal constant change in the angle of the point of contact.

Contents

Acknowledgments
Foreword

1. Nonholonomically constrained motions
   1.1 Newton's equations
   1.2 Constraints
   1.3 Lagrange-d'Alembert equations
   1.4 Lagrange derivative
   1.5 Hamilton-d'Alembert equations
   1.6 Distributional Hamiltonian formulation
       1.6.1 The symplectic distribution $(H, \varpi)$
       1.6.2 $H$ and $\varpi$
       1.6.3 Distributional Hamiltonian vector field
   1.7 Almost Poisson brackets
       1.7.1 Hamilton's equations
       1.7.2 Nonholonomic Dirac brackets
   1.8 Momenta and momentum equation
       1.8.1 Momentum functions
       1.8.2 Momentum equations
       1.8.3 Homogeneous functions
       1.8.4 Momenta as coordinates
   1.9 Projection principle
   1.10 Accessible sets
   1.11 Constants of motion
   1.12 Notes

2. Group actions and orbit spaces
   2.1 Group actions
   2.2 Orbit spaces
   2.3 Isotropy and orbit types
       2.3.1 Isotropy types
       2.3.2 Orbit types
       2.3.3 When the action is proper
       2.3.4 Stratification by orbit types
   2.4 Smooth structure on an orbit space
       2.4.1 Differential structure
       2.4.2 The orbit space as a differential space
   2.5 Subcartesian spaces
   2.6 Stratification of the orbit space by orbit types
       2.6.1 Orbit types in an orbit space
       2.6.2 Stratification of an orbit space
       2.6.3 Minimality of S
   2.7 Derivations and vector fields on a differential space
   2.8 Vector fields on a stratified differential space
   2.9 Vector fields on an orbit space
   2.10 Tangent objects to an orbit space
       2.10.1 Stratified tangent bundle
       2.10.2 Zariski tangent bundle
       2.10.3 Tangent cone
       2.10.4 Tangent wedge
   2.11 Notes

3. Symmetry and reduction
   3.1 Dynamical systems with symmetry
       3.1.1 Invariant vector fields
       3.1.2 Reduction of symmetry
       3.1.3 Reduction for a free and proper G-action
       3.1.4 Reduction nonfree proper action
   3.2 Nonholonomic singular reduction
   3.3 Nonholonomic regular reduction
   3.4 Chaplygin systems
   3.5 Orbit types and reduction
   3.6 Conservation laws
       3.6.1 Momentum map
       3.6.2 Gauge momenta
   3.7 Lifted actions and the momentum equation
       3.7.1 Lifted actions
       3.7.2 Momentum equation
   3.8 Notes

4. Reconstruction, relative equilibria and relative periodic orbits
   4.1 Reconstruction
       4.1.1 Reconstruction for proper free actions
       4.1.2 Reconstruction for nonfree proper actions
       4.1.3 Application to nonholonomic systems
   4.2 Relative equilibria
       4.2.1 Basic properties
       4.2.2 Quasiperiodic relative equilibria
       4.2.3 Runaway relative equilibria
       4.2.4 Relative equilibria for nonfree actions
       4.2.5 Other relative equilibria
       4.2.6 Families of quasiperiodic relative equilibria
   4.3 Relative periodic orbits
       4.3.1 Basic properties
       4.3.2 Quasiperiodic relative periodic orbits
       4.3.3 Runaway relative periodic orbits
       4.3.4 G-action is nonfree
       4.3.5 Other relative periodic orbits
       4.3.6 Families of quasiperiodic relative periodic orbits
   4.4 Notes

5. Carathéodory's sleigh
   5.1 Basic set up
       5.1.1 Configuration space
       5.1.2 Kinetic energy
       5.1.3 Nonholonomic constraint
   5.2 Equations of motion
       5.2.1 Lagrange-d'Alembert equations
       5.2.2 Nonholonomic Dirac brackets
       5.2.3 Lagrange-d'Alembert in a trivialization
       5.2.4 Almost Poisson bracket form
       5.2.5 Distributional Hamiltonian system
   5.3 Reduction of the $E(2)$ symmetry
       5.3.1 The $E(2)$ symmetry
       5.3.2 The momentum equation
       5.3.3 $E(2)$-reduced equations of motion
   5.4 Motion on the $E(2)$ reduced phase space
   5.5 Reconstruction
       5.5.1 Relative equilibria
       5.5.2 General motions
       5.5.3 Motion of a material point on the sleigh
   5.6 Notes

6. Convex rolling rigid body
   6.1 Basic set up
   6.2 Unconstrained motion
   6.3 Constraint distribution
   6.4 Constrained equations of motion
       6.4.1 Vector field on $D$
       6.4.2 Computation of $H$ and $\varpi$ in a trivialization
       6.4.3 Distributional vector field in a trivialization
   6.5 Reduction of the translational $\mathbb{R}^2$ symmetry
       6.5.1 The $\mathbb{R}^2$-reduced equations of motion
       6.5.2 Comparison with the Euler-Lagrange equations
       6.5.3 The $\mathbb{R}^2$-reduced distribution $H_{DN}$ and the 2-form $\varpi_{DN}$
   6.6 Reduction of $E(2)$ symmetry
       6.6.1 $E(2)$ symmetry
       6.6.2 $E(2)$-orbit space
       6.6.3 $E(2)$-reduced distribution and 2-form
       6.6.4 Reduced distributional system
   6.7 Body of revolution
       6.7.1 Geometric and dynamic symmetry
       6.7.2 Reduction of the induced axial symmetry
       6.7.3 Axially reduced equations of motion
   6.8 Notes

7. The rolling disk
   7.1 General set up
   7.2 Reduction of the $E(2) \times S^1$ symmetry
       7.2.1 First $E(2)$, then $S^1$
       7.2.2 First $S^1$, then $E(2)$
   7.3 Reconstruction
       7.3.1 The $E(2)$-reduced flow
       7.3.2 The full motion
       7.3.3 The $S^1$-reduced flow
       7.3.4 Geometry of the $E(2) \times S^1$ reduction map
   7.4 Relative equilibria
       7.4.1 The manifold of relative equilibria
       7.4.2 One parameter groups
       7.4.3 Angular speeds in terms of invariants
       7.4.4 Motion of the relative equilibria
       7.4.5 Nearly flat relative equilibria
   7.5 A potential function on an interval
       7.5.1 Chaplygin's equations
       7.5.2 A conservative Newtonian system
       7.5.3 Qualitative behavior
       7.5.4 A special case of falling flat
   7.6 Scaling
   7.7 Solutions of the rescaled Chaplygin equations
       7.7.1 The recessive solution
       7.7.2 Asymptotics
       7.7.3 The normalized even and odd solutions
       7.7.4 Computation of $r(0)$ and $r'(0)$
   7.8 Bifurcations of a vertical disk
       7.8.1 Degenerate equilibria
       7.8.2 Vertical degenerate relative equilibria
       7.8.3 Normal form of the potential
       7.8.4 Cusps of the degeneracy locus
   7.9 The global geometry of the degeneracy locus
       7.9.1 The circle of degenerate critical points
       7.9.2 A global description of the degeneracy locus
   7.10 Falling flat
       7.10.1 When the disk does not fall flat
       7.10.2 When the disk falls flat
       7.10.3 Limiting behavior when falling flat
   7.11 Near falling flat
       7.11.1 Elastic reflection
       7.11.2 The increase of the angles $\psi$ and $\chi$
       7.11.3 Motions near falling flat
   7.12 The bifurcation diagram
       7.12.1 The bifurcation set $B$
       7.12.2 Off the bifurcation set $B$
       7.12.3 On a coordinate axis or in an open quadrant
       7.12.4 Near $\ell_\pm$
       7.12.5 Global qualitative description of $V_{\sigma_3,\sigma_4}$
       7.12.6 Global description of the orbits of $X_{\sigma_3,\sigma_4}$
   7.13 The integral map
       7.13.1 Regular values of $I$
       7.13.2 The global geometry of the critical value surface
   7.14 Constant energy slices
       7.14.1 Numerical pictures of the constant energy slices
       7.14.2 Geometric features of the constant energy slices
       7.14.3 Outward radial growth
       7.14.4 The swallow tail sections
       7.14.5 Behavior of cusp points
       7.14.6 Over the coordinate axes in the $(\sigma_3, \sigma_4)$-plane
       7.14.7 $\Sigma$ over $\ell_\pm$
   7.15 The spatial rotational shift
       7.15.1 The shift
       7.15.2 Quasiperiodic motion
       7.15.3 The spatial rotational shift
       7.15.4 Near elliptic relative equilibria
       7.15.5 Nearly flat solutions
   7.16 Notes

Bibliography

Index

Chapter 1

Nonholonomically constrained motions

1.1 Newton's equations

An English translation¹ of Newton's formulation of his second law reads: "A change in motion is proportional to the motive force impressed and takes place along a straight line in which that force is impressed." The current formulation reads: mass times acceleration is equal to the force. It uses the term acceleration for change of motion. We will use a suitably generalized version of Newton's second law as the starting point of our presentation of the dynamics of nonholonomically constrained systems on manifolds. The main point of our generalization of Newton's equations is to give an appropriate definition of the notion of acceleration.

When configuration space is $\mathbb{R}^3$, the motion of a particle is described by giving its position vector $q$ as a function of time, namely, $t \mapsto q(t)$. In this case, its velocity $v(t) = \frac{dq(t)}{dt}$ and acceleration $a(t) = \frac{dv(t)}{dt}$ are well defined vectors in $\mathbb{R}^3$. If one interprets force $F$ as a vector in $\mathbb{R}^3$, then the usual formulation of the second law reads

\[ m\, a(t) = F, \tag{1} \]

where $m$ is the mass of the particle. The force $F$ acting on the particle may depend on $q(t)$, $v(t)$, and $t$. The kinetic energy of a particle with mass $m$ and velocity $v$ is $k = \tfrac{1}{2}\, m \langle v, v \rangle$, where $\langle\ ,\ \rangle$ is the standard Euclidean metric on $\mathbb{R}^3$. Thus $k$ is one half of the square of the length of $v$ with respect to the kinetic energy metric $k = m \langle\ ,\ \rangle$.

¹ See page 417 of [83].

For a system of $n$ interacting particles in $\mathbb{R}^3$, the configuration space is $(\mathbb{R}^3)^n = \times_{\alpha=1}^{n} \mathbb{R}^3_\alpha$. The position $q$ of the system, its velocity $v$ and acceleration $a$ are vectors in $(\mathbb{R}^3)^n$. In this case the second law reads

\[ m_\alpha\, a_\alpha(t) = F_\alpha, \quad \text{for } \alpha = 1, \ldots, n. \tag{2} \]

Here the subscript $\alpha$ labels the individual particles having mass $m_\alpha$, acceleration $a_\alpha$, and subjected to a force $F_\alpha$. In this case the kinetic energy of the system is one half the square of the length of the velocity with respect to the kinetic energy metric

\[ k = \sum_{\alpha=1}^{n} m_\alpha \langle\ ,\ \rangle_\alpha, \tag{3} \]

where $\langle\ ,\ \rangle_\alpha$ is the standard Euclidean metric on $\mathbb{R}^3_\alpha$. If we interpret the force $F$ imposed on the system as a covector on $(\mathbb{R}^3)^n$, then the left hand side of (2) can be interpreted as the covector given by evaluating the kinetic energy metric $k$, given by (3), on the vector $a(t)$. Since the coefficients of $k$ are constant, the derivative of the velocity $v(t)$ coincides with the covariant derivative of $v(t)$ with respect to the Levi-Civita connection $\nabla$ associated to the kinetic energy metric $k$. This interpretation enables us to extend Newton's equations to manifolds.

Consider now a mechanical system with configuration space $Q$, which is a smooth manifold. The kinetic energy of the system defines a Riemannian metric $k$ on $Q$ such that for each $v \in T_qQ$, the kinetic energy of the motion of the system with velocity $v$ is $k(v) = \tfrac{1}{2}\, k(v, v)$. We shall refer to $k$ as the kinetic energy metric on the manifold $Q$ of the system. The metric $k$ gives rise to the vector bundle isomorphisms $k^\flat : TQ \to T^*Q$ and $k^\sharp : T^*Q \to TQ$, where

\[ \langle k^\flat(v) \mid u \rangle = k(v, u) \quad \text{for every } u, v \in T_qQ, \tag{4} \]

and $k^\sharp = (k^\flat)^{-1}$.
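For the kinetic energy metric (3) the isomorphisms in (4) can be written down explicitly, because the matrix of the metric is diagonal. The following Python sketch is only an illustration: the function names `flat`, `sharp`, `pairing` and `kinetic_energy`, as well as the sample masses and velocities, are assumptions made for this example and do not come from the text.

```python
def flat(masses, v):
    """k-flat for the metric (3): velocity blocks v_a -> covector blocks m_a * v_a."""
    return [[m * c for c in va] for m, va in zip(masses, v)]


def sharp(masses, p):
    """k-sharp, the inverse of k-flat: covector blocks p_a -> velocity blocks p_a / m_a."""
    return [[c / m for c in pa] for m, pa in zip(masses, p)]


def pairing(p, u):
    """Natural pairing <p | u> of a covector with a vector, block by block."""
    return sum(pc * uc for pa, ua in zip(p, u) for pc, uc in zip(pa, ua))


def kinetic_energy(masses, v):
    """One half of k(v, v), computed as (1/2) <k-flat(v) | v>, see (4)."""
    return 0.5 * pairing(flat(masses, v), v)


# Hypothetical sample data: two particles in R^3.
masses = [2.0, 3.0]
velocity = [[1.0, 0.0, 2.0], [0.0, -1.0, 1.0]]
momentum = flat(masses, velocity)           # the covector k-flat(v)
energy = kinetic_energy(masses, velocity)   # (1/2) k(v, v)
```

Applying `sharp` to `momentum` recovers `velocity`, which is the statement $k^\sharp = (k^\flat)^{-1}$ in this special case.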

Let $t \mapsto q(t)$ be a smooth curve on $Q$ which describes a motion of our system. Let $t \mapsto v(t) = \dot{q}(t)$ be its tangent lift to $TQ$. In other words, $v(t)$ is the velocity of the system at time $t$. Define the acceleration of the system as the covariant derivative of its velocity with respect to the Levi-Civita connection $\nabla$ associated to the kinetic energy metric $k$. In other words, the acceleration $a(t)$ of the system at time $t$ is

\[ a(t) = \nabla_v v(t) = \ddot{q}(t). \tag{5} \]

A connection is needed to interpret acceleration as a tangent vector. Throughout the rest of this chapter we will use the Levi-Civita connection associated to the kinetic energy metric.
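The role of the connection in (5) can be seen in a simple case that is not taken from the text: a free particle in the plane described in polar coordinates $(r, \theta)$, with kinetic energy metric $m\,(dr^2 + r^2\, d\theta^2)$. The nonzero Christoffel symbols of this metric are $\Gamma^r_{\theta\theta} = -r$ and $\Gamma^\theta_{r\theta} = \Gamma^\theta_{\theta r} = 1/r$. The Python sketch below, whose function names and sample curve are assumptions made for illustration, checks that a straight line traversed at constant speed has vanishing covariant acceleration even though the second derivative of $r$ does not vanish.

```python
import math


def covariant_acceleration_polar(r, r_dot, th_dot, r_ddot, th_ddot):
    """Components of the covariant acceleration in polar coordinates.

    Uses the Christoffel symbols Gamma^r_{theta theta} = -r and
    Gamma^theta_{r theta} = 1/r of the metric m (dr^2 + r^2 dtheta^2).
    The constant mass m does not affect the Christoffel symbols.
    """
    a_r = r_ddot - r * th_dot ** 2
    a_th = th_ddot + 2.0 * r_dot * th_dot / r
    return a_r, a_th


def straight_line_in_polar(t):
    """The line x = 1, y = t, written in polar coordinates with exact derivatives."""
    r = math.hypot(1.0, t)
    r_dot = t / r
    r_ddot = 1.0 / r ** 3
    th_dot = 1.0 / r ** 2
    th_ddot = -2.0 * t / r ** 4
    return r, r_dot, th_dot, r_ddot, th_ddot


# The coordinate second derivatives are nonzero, the covariant acceleration vanishes
# up to rounding error.
a_r, a_th = covariant_acceleration_polar(*straight_line_in_polar(0.8))
```

This is the sense in which the ordinary second derivative of the coordinates is not a tangent vector, while the covariant derivative of the velocity is.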


Let $\pi_Q : T^*Q \to Q$ and $\tau_Q : TQ \to Q$ be the cotangent and tangent bundle projection maps, respectively, and observe that $\tau_Q = \pi_Q \circ k^\flat$. A force acting on a mechanical system, which is in the configuration $q \in Q$, is a covector in $T_q^*Q$. Consider a map $\varphi : \mathbb{R} \times TQ \to T^*Q$ such that $\pi_Q \circ \varphi = \tau_Q \circ \mathrm{pr}_2$, where $\mathrm{pr}_2 : \mathbb{R} \times TQ \to TQ : (t, v_q) \mapsto v_q$. Then $\varphi$ describes the dependence of the force on time, position and velocity of the system. In other words, if $t \in \mathbb{R}$ and $v_q \in T_qQ$, then $\varphi(t, v_q)$ is the force acting on the system at time $t$, position $q$, and velocity $v_q$. Newton's equations for the motion $t \mapsto q(t)$ of the system are

\[ k^\flat(\ddot{q}(t)) = \varphi(t, \dot{q}(t)). \tag{6} \]

In what follows we shall show that our formulation of Newton's equations of motion for a mechanical system subject to a linear nonholonomic constraint is equivalent to all other standard formulations of its equations of motion.

1.2 Constraints

Constraints in dynamics are restrictions on positions and velocities of the system. Phenomenological constraints are introduced instead of unknown forces to describe observed motions. For example, a rigid body is a system of material points with fixed distances between each pair of points. Another example is the no slip condition in the motion of a rolling body. This constraint requires that the relative velocity of the point of contact of the rolling body vanishes. In the first example, the constraint depends only on the positions of the material points. Such constraints are called holonomic. In the second example, the no slip condition is a linear relation on the velocity of the motion. Such conditions are called linear nonholonomic constraints. Nonlinear nonholonomic constraints appear in control theory, but they will not be considered here, see [74]. Neither shall we consider inhomogeneous linear nonholonomic constraints, which appear in the problem of a sphere rolling on a turntable when regarded in rotating coordinates [89].

There are two ways of dealing with holonomic constraints. First, we can modify the configuration space by imposing the holonomic constraints. Alternatively, we can present the holonomic constraints as linear nonholonomic constraints by extending the tangent bundle of the constraint manifold to a distribution on configuration space. Recall that a distribution on a smooth manifold $M$ is a smooth vector subbundle of the tangent bundle $TM$ of $M$. The original (holonomic) constraint manifold is then an integral manifold of this distribution.

Let $D$ be a distribution on $Q$ describing the linear nonholonomic constraints under consideration. A motion $t \mapsto q(t)$ of the system is dynamically allowed if its velocity $\dot{q}(t)$ at time $t$ lies in $D_{q(t)}$. If $D$ is involutive,² then we are dealing with a family of holonomically constrained systems. Here each integral manifold $M$ of $D$ describes a holonomic constraint.

Our nonholonomically constrained mechanical system is acted on by an external force $\varphi_{\mathrm{ext}}$ and a reaction force of the constraints $\varphi_{\mathrm{constr}}$. We have hypothesized that the constraints do not depend on time. In particular, the reaction force of the constraint depends only on the velocity and has no explicit dependence on time. We assume that the external force is conservative, that is, there is a potential function $V \in C^\infty(Q)$ such that $\varphi_{\mathrm{ext}} = -dV$. Then Newton's equation for the nonholonomically constrained system is

\[ k^\flat(\ddot{q}(t)) = -\pi_Q^*\, dV(q(t)) + \varphi_{\mathrm{constr}}(\dot{q}(t)). \tag{7} \]

In order to be able to solve equation (7) we have to make an assumption on the nature of the reaction force of the constraints. Let $D^0 \subseteq T^*Q$ be the set of all covectors which annihilate the constraint distribution $D$. In other words, for each $q \in Q$,

\[ D^0_q = D^0 \cap T_q^*Q = \{\, p \in T_q^*Q \;\big|\; \langle p \mid v \rangle = 0 \text{ for every } v \in D_q \,\}. \tag{8} \]

Interpreting covectors in $T^*Q$ as forces, we see that $D^0$ is the set of all forces which do no work on virtual displacements in $D$. We now make the following

Hypothesis 1.2.1. The work of the reaction force of the constraints on virtual displacements in $D$ vanishes. In other words, $\varphi_{\mathrm{constr}}(v) \in D^0_{\tau_Q(v)}$ for every $v \in D$.
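To make the annihilator (8) and Hypothesis 1.2.1 concrete, the Python sketch below uses a knife-edge type constraint on $\mathbb{R}^2 \times S^1$ with coordinates $(x, y, \theta)$, namely $-\sin\theta\, \dot{x} + \cos\theta\, \dot{y} = 0$. This constraint, the basis chosen for the distribution, and all function names are assumptions made for illustration; they are not taken from the text.

```python
import math


def distribution_basis(q):
    """Basis of D_q for the assumed constraint -sin(theta) dx + cos(theta) dy = 0."""
    _x, _y, theta = q
    return [(math.cos(theta), math.sin(theta), 0.0), (0.0, 0.0, 1.0)]


def annihilator_basis(q):
    """Basis of the annihilator of D_q: the constraint 1-form evaluated at q."""
    _x, _y, theta = q
    return [(-math.sin(theta), math.cos(theta), 0.0)]


def pairing(p, v):
    """Natural pairing <p | v> of a covector with a tangent vector."""
    return sum(pc * vc for pc, vc in zip(p, v))


def in_annihilator(p, q, tol=1e-12):
    """True if the covector p pairs to zero with every basis vector of D_q."""
    return all(abs(pairing(p, v)) <= tol for v in distribution_basis(q))


def is_allowed(v, q, tol=1e-12):
    """True if the velocity v lies in D_q, that is, every constraint form kills it."""
    return all(abs(pairing(p, v)) <= tol for p in annihilator_basis(q))


# Hypothetical sample configuration and a reaction force proportional to the constraint form.
q = (0.0, 0.0, math.pi / 6)
reaction = tuple(5.0 * c for c in annihilator_basis(q)[0])
```

In this sketch a reaction force that is a multiple of the constraint form does no work on any velocity satisfying the constraint, which is what the hypothesis requires.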

Constraints satisfying this hypothesis are called perfect. A motion given by a curve $t \mapsto q(t)$ in $Q$ with $\dot{q}(t) \in D_{q(t)}$, which satisfies Newton's equation (7) with constraint force $\varphi_{\mathrm{constr}}$, is called dynamically admissible. We summarize the above discussion as

Theorem 1.2.2. (d'Alembert principle) A smooth curve $q : (t_0, t_1) \subseteq \mathbb{R} \to Q : t \mapsto q(t)$ is a dynamically admissible motion of a mechanical system with kinetic energy $k(v) = \tfrac{1}{2}\, k(v, v)$ and potential energy $V$ subject to the linear nonholonomic constraint $D \subseteq TQ$ if and only if

\[ k^\flat(\ddot{q}(t)) + dV(q(t)) \in D^0_{q(t)} \quad \text{and} \quad \dot{q}(t) \in D_{q(t)}, \tag{9} \]

for all $t \in (t_0, t_1)$.

² Frobenius' theorem states that if $D$ is involutive, then $D$ is integrable.

1.3 Lagrange-d'Alembert equations

The formulation of the dynamics of constrained motion given in (9) can be expressed in terms of the Lagrangian $\ell = k - \tau_Q^* V : TQ \to \mathbb{R}$ for the unconstrained system. Here $\tau_Q^* V = V \circ \tau_Q$ is the pull back of $V$ by the tangent bundle projection map $\tau_Q$. Let $c : \mathbb{R} \to Q : t \mapsto c(t)$ be a smooth curve on $Q$ and let $\dot c : \mathbb{R} \to TQ : t \mapsto Tc(t)$ be its tangent lift to $TQ$. The Lagrange derivative of $\ell$ at $\dot c(t)$ is a 1-form on $T_{c(t)}Q$, which is defined as follows. Let $\{q^i\}$ be local coordinates on $Q$ and let $\{q^i, v^i\}$ be the corresponding local coordinates on $TQ$. Then the curves $c$ and $\dot c$ are given by $t \mapsto q(t) = (q^i(t))$ and $t \mapsto (q(t), \dot q(t)) = (q^i(t), v^i(t))$, respectively. The Lagrangian $\ell$ along $\dot c$ is a function of $(q(t), \dot q(t))$. Its Lagrange derivative $\delta\ell$ at $(q(t), \dot q(t))$ is the 1-form on $T_{q(t)}Q$
\[ \delta\ell(q(t), \dot q(t)) = \Bigl( \frac{d}{dt}\frac{\partial \ell}{\partial \dot q^i} - \frac{\partial \ell}{\partial q^i} \Bigr)\, dq^i, \tag{10} \]

where summation is performed over repeated indices. The Lagrange derivative is a natural operator which takes the same form in all coordinate systems, see [65], [60].³

Lemma 1.3.3. For every smooth curve $c : \mathbb{R} \to Q : t \mapsto q(t)$ on $Q$, the Lagrange derivative of the kinetic energy $k(v) = \tfrac12 k(v, v)$ along $\dot c$ is
\[ \delta k(q(t), \dot q(t)) = k^\flat(q(t))\bigl(\nabla_{\dot q}\,\dot q(t)\bigr), \tag{11} \]

for all $t$.

³ What we mean by this is the following. Let $Q$ and $\widetilde Q$ be configuration spaces which are diffeomorphic by the diffeomorphism $\varphi$. The map $\varphi$ induces a diffeomorphism $T^*\varphi$ of the cotangent bundle $T^*\widetilde Q$ onto the cotangent bundle $T^*Q$, which is defined as follows. For every $\tilde q \in \widetilde Q$ and every $\tilde\alpha_{\tilde q} \in T^*_{\tilde q}\widetilde Q$, we have $T^*\varphi(\tilde\alpha_{\tilde q})$ is the cotangent vector $(T_q\varphi^{-1})^T(\tilde\alpha_{\tilde q})$ to $Q$ at $q$. Here the superscript $T$ denotes transpose. If $\ell : TQ \to \mathbb{R}$ and $\tilde\ell : T\widetilde Q \to \mathbb{R}$ are Lagrangians, where $\tilde\ell = \ell \circ T\varphi$, then $\delta\tilde\ell = \delta\ell \circ T^*\varphi$. As Lagrange [65] observed, the preceding formula follows from the interpretation of the Lagrange derivative $\delta\ell$ as the variational derivative of $\int \ell\, dt$.


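Proof. We compute
\begin{align*}
\delta k(q(t), \dot q(t))
 &= \Bigl( \frac{d}{dt}\frac{\partial k}{\partial \dot q^i} - \frac{\partial k}{\partial q^i} \Bigr) dq^i
  = \Bigl( \frac{d}{dt}\bigl(k_{ij}\dot q^j\bigr) - \tfrac12 \frac{\partial k_{j\ell}}{\partial q^i}\,\dot q^j \dot q^\ell \Bigr) dq^i \\
 &= \Bigl( k_{ij}\frac{d}{dt}\dot q^j + \frac{\partial k_{ij}}{\partial q^\ell}\,\dot q^j \dot q^\ell - \tfrac12 \frac{\partial k_{j\ell}}{\partial q^i}\,\dot q^j \dot q^\ell \Bigr) dq^i \\
 &= \Bigl( k_{ij}\frac{d}{dt}\dot q^j + \tfrac12 \Bigl( \frac{\partial k_{in}}{\partial q^\ell} + \frac{\partial k_{i\ell}}{\partial q^n} - \frac{\partial k_{n\ell}}{\partial q^i} \Bigr) \dot q^n \dot q^\ell \Bigr) dq^i \\
 &= k_{ij}\Bigl( \frac{d}{dt}\dot q^j + \Gamma^j_{n\ell}\,\dot q^n \dot q^\ell \Bigr) dq^i = k_{ij}\,(\nabla_{\dot q}\,\dot q)^j\, dq^i .
\end{align*}
Here
\[ k_{ij}\,\Gamma^j_{n\ell} = \tfrac12 \Bigl( \frac{\partial k_{in}}{\partial q^\ell} + \frac{\partial k_{i\ell}}{\partial q^n} - \frac{\partial k_{n\ell}}{\partial q^i} \Bigr) \tag{12} \]
are the Christoffel symbols of the Levi-Civita connection $\nabla$ of the kinetic energy metric $k$ on $Q$.

From (9) it follows that if $\ell = k - \tau_Q^* V$, then
\[ \delta\ell(q(t), \dot q(t)) = \delta(k - \tau_Q^* V)(q(t), \dot q(t)) = \delta k(q(t), \dot q(t)) + dV(q(t)) = k^\flat(\ddot q(t)) + dV(q(t)). \]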
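A quick consistency check of (11) and (12), under the assumption (ours, for illustration only) that $Q$ is the punctured plane with polar coordinates $(r, \vartheta)$ and that $k$ is the Euclidean metric scaled by a mass $m > 0$. The sketch computes $\delta k$ directly from (10) and compares it with the Christoffel symbols obtained from (12).

```latex
% Assumed metric: k = m (dr^2 + r^2 d\vartheta^2),
% so k(v) = (m/2)(\dot r^2 + r^2 \dot\vartheta^2).
\begin{align*}
  \delta k
    &= \Bigl( \tfrac{d}{dt}(m\dot r) - m r \dot\vartheta^2 \Bigr)\, dr
     + \tfrac{d}{dt}\bigl( m r^2 \dot\vartheta \bigr)\, d\vartheta \\
    &= m\bigl( \ddot r - r\dot\vartheta^2 \bigr)\, dr
     + m r^2 \bigl( \ddot\vartheta + \tfrac{2}{r}\,\dot r \dot\vartheta \bigr)\, d\vartheta .
\end{align*}
% From (12), the only nonzero Christoffel symbols are
\[
  \Gamma^{r}_{\vartheta\vartheta} = -r, \qquad
  \Gamma^{\vartheta}_{r\vartheta} = \Gamma^{\vartheta}_{\vartheta r} = \tfrac{1}{r} .
\]
% Hence (\nabla_{\dot q}\dot q)^r = \ddot r - r\dot\vartheta^2 and
% (\nabla_{\dot q}\dot q)^\vartheta = \ddot\vartheta + (2/r)\,\dot r\dot\vartheta.
% Multiplying by k_{rr} = m and k_{\vartheta\vartheta} = m r^2 reproduces \delta k above, as (11) predicts.
```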

Hence, we are led to the Lagrangian form of the equations of motion for a nonholonomically constrained system.

Theorem 1.3.4. (Lagrange-d’Alembert principle) A smooth curve $c : (t_0, t_1) \to Q : t \mapsto q(t)$ is a dynamically admissible motion of a nonholonomically constrained mechanical system with constraint distribution $D$ and Lagrangian $\ell = k - \tau_Q^* V$ if and only if for every $t \in (t_0, t_1)$
\[ \delta\ell(q(t), \dot q(t)) \in D^0_{q(t)} \quad \text{and} \quad \dot q(t) \in D_{q(t)}. \tag{13} \]

Corollary 1.3.5. The total energy function $h = k + \tau_Q^* V$ is constant on all dynamically admissible motions.

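Proof. Since $\langle \frac{\partial \ell}{\partial \dot q} \mid \dot q \rangle = k(q)(\dot q, \dot q) = 2k(q)(\dot q)$, we may write
\[ h(q, \dot q) = \Bigl\langle \frac{\partial \ell}{\partial \dot q} \Bigm| \dot q \Bigr\rangle - \ell(q, \dot q). \tag{14} \]
Let $t \mapsto q(t)$ be an admissible motion. Then differentiating (14) gives
\begin{align*}
\frac{dh(q(t), \dot q(t))}{dt}
 &= \Bigl\langle \frac{d}{dt}\frac{\partial \ell}{\partial \dot q} \Bigm| \dot q \Bigr\rangle
  + \Bigl\langle \frac{\partial \ell}{\partial \dot q} \Bigm| \ddot q \Bigr\rangle
  - \Bigl\langle \frac{\partial \ell}{\partial q} \Bigm| \dot q \Bigr\rangle
  - \Bigl\langle \frac{\partial \ell}{\partial \dot q} \Bigm| \ddot q \Bigr\rangle \\
 &= \langle \delta\ell(q(t), \dot q(t)) \mid \dot q(t) \rangle = 0,
\end{align*}
using (13).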
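To see (13) and Corollary 1.3.5 at work, here is a sketch for an assumed example that is not part of the text: a knife edge on the plane with $Q = \mathbb{R}^2 \times S^1$, coordinates $(x, y, \psi)$, mass $m$, moment of inertia $J$, potential $V = 0$, and constraint distribution $D = \ker\alpha$ with $\alpha = \sin\psi\, dx - \cos\psi\, dy$. The multiplier $\lambda$ is notation introduced here.

```latex
% Assumed data: \ell = k = (1/2)(m\dot x^2 + m\dot y^2 + J\dot\psi^2),  D = \ker\alpha.
% (13) says \delta\ell = \lambda\,\alpha for some multiplier \lambda, together with the constraint:
\begin{align*}
  m\ddot x &= \lambda \sin\psi, &
  m\ddot y &= -\lambda \cos\psi, &
  J\ddot\psi &= 0, &
  \dot x \sin\psi - \dot y \cos\psi &= 0 .
\end{align*}
% Differentiating the constraint in t and substituting the first two equations determines \lambda:
\[
  \lambda = -m\,\dot\psi\,(\dot x \cos\psi + \dot y \sin\psi) .
\]
% Energy: h = k, and along admissible motions
\[
  \frac{dh}{dt} = m(\dot x \ddot x + \dot y \ddot y) + J \dot\psi \ddot\psi
                = \lambda\,(\dot x \sin\psi - \dot y \cos\psi) = 0 .
\]
```

The reaction force does no work precisely because the velocity satisfies the constraint, which is the content of the pairing $\langle \delta\ell \mid \dot q \rangle = 0$ in the proof.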


1.4 Lagrange derivative in a trivialization

We now show how to compute the Lagrange derivative in a trivialization of $TQ$. Suppose that $Q$ is an $n$-dimensional smooth manifold. We say that the tangent bundle $\tau_Q : TQ \to Q$ of $Q$ is trivial if there is a smooth mapping $\kappa : TQ \to Q \times \mathbb{R}^n$ such that $\kappa|T_qQ$ is a linear isomorphism with $\mathbb{R}^n$. The map $\kappa$ is called a trivialization. Recall that the tangent bundle of $Q$ is locally trivial, that is, about every point of $Q$ there is an open neighborhood $U$ such that the bundle $\tau_Q|\tau_Q^{-1}(U) : \tau_Q^{-1}(U) \to U$ is trivial. Let $\lambda : Q \times \mathbb{R}^n \to TQ$ be the inverse of the trivialization $\kappa$. Using the standard basis $\{e_i\}_{i=1}^n$ of $\mathbb{R}^n$, we see that
\[ X_i : Q \to TQ : q \mapsto \lambda(q, e_i) \]
is a smooth vector field on $Q$ such that $\{X_i(q)\}_{i=1}^n$ is a basis of $T_qQ$ for every $q \in Q$. In other words, $\{X_i\}_{i=1}^n$ is a moving frame on $Q$. Conversely, suppose that $\{X_i\}_{i=1}^n$ is a moving frame on $Q$. For each $c \in \mathbb{R}^n$ define a vector field $c^Q$ on $Q$ by
\[ c^Q(q) = \sum_{i=1}^n c_i X_i(q). \tag{15} \]
Then $X_i = e_i^Q$ and $c \mapsto c^Q$ is a linear mapping.⁴ Therefore the map
\[ \lambda : Q \times \mathbb{R}^n \to TQ : (q, c) \mapsto c^Q(q) = c^Q_q \tag{16} \]
is the inverse of a trivialization of the tangent bundle. In this way we see that moving frames on $Q$ and trivializations of $TQ$ are equivalent objects.

Let $\ell : TQ \to \mathbb{R}$ be a smooth function, called a Lagrangian, and let $\lambda : Q \times \mathbb{R}^n \to TQ$ be the inverse of a trivialization of $TQ$. Define the Lagrangian in the trivialization $\lambda^{-1}$ to be $L = \ell \circ \lambda : Q \times \mathbb{R}^n \to \mathbb{R}$. The following proposition gives a formula for the Lagrange derivative of $L$.⁵

⁴ In §30 of chapter II Whittaker [118] calls the $c_i$ quasi-coordinates.

⁵ Formula (17) is due to Poincaré [92], see next page.

Proposition 1.4.6. Let $\gamma$ be a smooth curve on $Q$. Define a smooth curve $c : \mathbb{R} \to \mathbb{R}^n : t \mapsto c(t)$ by requiring that $\lambda(\gamma(t), c(t)) = \dot\gamma(t)$, which lies in $T_{\gamma(t)}Q$. For any $a \in \mathbb{R}^n$ we have


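\begin{align*}
\langle \delta L(\gamma(t), c(t)) \mid a \rangle
 ={}& \Bigl\langle \frac{d}{dt}\frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c=c(t)} \Bigm| a \Bigr\rangle
  - \Bigl\langle \frac{\partial L}{\partial q}(q, c(t))\Big|_{q=\gamma(t)} \Bigm| a^Q_{\gamma(t)} \Bigr\rangle \\
 & - \Bigl\langle \frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c=c(t)} \Bigm| \lambda^{-1}_{\gamma(t)}\,[c(t)^Q, a^Q]_{\gamma(t)} \Bigr\rangle. \tag{17}
\end{align*}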

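Here $[c(t)^Q, a^Q]$ denotes the Lie bracket of the vector fields $c(t)^Q$ and $a^Q$ on $Q$.

Proof. We compute
\[ \Bigl\langle \frac{\partial L}{\partial c}(q, c) \Bigm| a \Bigr\rangle
 = \Bigl\langle \frac{\partial}{\partial c}\,\ell(q, c^Q_q) \Bigm| a \Bigr\rangle
 = \Bigl\langle \frac{\partial \ell}{\partial \dot q}(q, \dot q)\Big|_{\dot q = c^Q_q} \Bigm| a^Q_q \Bigr\rangle. \tag{18} \]
We now use local coordinates on $Q$. Substituting $q = \gamma(t)$, $c = c(t)$, $\dot q = \dot\gamma(t) = c(t)^Q_{\gamma(t)}$ into (18), and then differentiating the result with respect to $t$ gives
\begin{align*}
\Bigl\langle \frac{d}{dt}\frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c=c(t)} \Bigm| a \Bigr\rangle
 ={}& \Bigl\langle \frac{d}{dt}\frac{\partial \ell}{\partial \dot q}(\gamma(t), \dot q)\Big|_{\dot q=\dot\gamma(t)} \Bigm| a^Q_{\gamma(t)} \Bigr\rangle \\
 & + \Bigl\langle \frac{\partial \ell}{\partial \dot q}(\gamma(t), \dot q)\Big|_{\dot q=\dot\gamma(t)} \Bigm| Da^Q_{\gamma(t)}\, c(t)^Q_{\gamma(t)} \Bigr\rangle.
\end{align*}
But
\[ \frac{\partial L}{\partial q}(q, c) = \frac{\partial}{\partial q}\,\ell(q, c^Q_q)
 = \frac{\partial \ell}{\partial q}(q, \dot q)\Big|_{\dot q = c^Q_q}
 + \frac{\partial \ell}{\partial \dot q}(q, \dot q)\Big|_{\dot q = c^Q_q} \circ Dc^Q_q . \]
Consequently,
\begin{align*}
\Bigl\langle \frac{d}{dt}\frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c=c(t)} \Bigm| a \Bigr\rangle
 & - \Bigl\langle \frac{\partial L}{\partial q}(q, c(t))\Big|_{q=\gamma(t)} \Bigm| a^Q_{\gamma(t)} \Bigr\rangle
 = \langle \delta\ell(\gamma(t), \dot\gamma(t)) \mid a^Q_{\gamma(t)} \rangle \\
 & + \Bigl\langle \frac{\partial \ell}{\partial \dot q}(\gamma(t), \dot q)\Big|_{\dot q=\dot\gamma(t)} \Bigm| Da^Q_{\gamma(t)}\, c(t)^Q_{\gamma(t)} - Dc(t)^Q_{\gamma(t)}\, a^Q_{\gamma(t)} \Bigr\rangle.
\end{align*}
This implies (17) because $\delta L(\gamma(t), c(t)) = \delta\ell(\gamma(t), \dot\gamma(t))$ and the Lie bracket of two vector fields $X$ and $Y$ in local coordinates is given by $[X, Y](q) = DY(q)X(q) - DX(q)Y(q)$.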
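As a sanity check on (17), consider the special case, assumed here for illustration, in which the moving frame is a coordinate frame $X_i = \partial/\partial q^i$ on a chart of $Q$. Then $c(t)$ is just $\dot\gamma(t)$ written in coordinates, and the bracket term drops out, so (17) should collapse to the coordinate formula (10).

```latex
% Assumed frame: X_i = \partial/\partial q^i, so \lambda(q, c) = \sum_i c_i\,\partial/\partial q^i
% and L(q, c) = \ell(q, c) in these coordinates.
% For constant a, c \in R^n the vector fields a^Q, c^Q have constant components, hence
\[
  [c^Q, a^Q] = Da^Q\, c^Q - Dc^Q\, a^Q = 0 .
\]
% With c(t) = \dot\gamma(t), formula (17) therefore reduces to
\[
  \langle \delta L(\gamma(t), c(t)) \mid a \rangle
  = \sum_{i=1}^n \Bigl( \frac{d}{dt}\frac{\partial \ell}{\partial \dot q^i}
      - \frac{\partial \ell}{\partial q^i} \Bigr) a_i ,
\]
% which is the coordinate expression (10) of \delta\ell paired with \sum_i a_i\,\partial/\partial q^i.
```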


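We now look at the special case of proposition 1.4.6 when $Q$ is a Lie group $G$ with Lie algebra $\mathfrak{g}$. The standard left trivialization of $TG$ is
\[ \kappa : TG \to G \times \mathfrak{g} : (g, \dot g) \mapsto (g, \xi = T_g L_{g^{-1}}\, \dot g). \tag{19} \]
Here $L_g : G \to G : h \mapsto gh$ is multiplication on the left by $g$. Another way to define $\xi$ is to take the derivative of the curve $t \mapsto g(0)^{-1} g(t)$ at $t = 0$, where $t \mapsto g(t)$ is a smooth curve in $G$ with $g(0) = g$ and $\dot g(0) = \dot g$. The inverse of $\kappa$ is given by
\[ \lambda : G \times \mathfrak{g} \to TG : (g, \xi) \mapsto T_e L_g\, \xi. \tag{20} \]

Corollary 1.4.7. Using the standard left trivialization $\lambda^{-1}$ of $TG$, the Lagrange derivative of $L = \ell \circ \lambda$ is
\[ \delta L(\gamma(t), c(t)) = \frac{d}{dt}\frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c=c(t)}
 - \frac{\partial L}{\partial q}(q, c(t))\Big|_{q=\gamma(t)} \circ \lambda_{\gamma(t)}
 - \frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c=c(t)} \circ \mathrm{ad}_{c(t)} . \tag{21} \]
Here $t \mapsto \gamma(t)$ is a smooth curve on $G$ with the curve $c : \mathbb{R} \to \mathfrak{g} : t \mapsto c(t)$ defined by requiring that $\dot\gamma(t) = \lambda(\gamma(t), c(t))$. Moreover, $\mathrm{ad}_\xi : \mathfrak{g} \to \mathfrak{g} : \eta \mapsto [\xi, \eta]$ for every $\xi \in \mathfrak{g}$.

1.5 Hamilton-d’Alembert equations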

In this section we derive the Hamilton-d’Alembert formulation of dynamics of a nonholonomically constrained system. Because the constraints are given by a distribution $D \subseteq TQ$, it is more convenient to work in velocity space $TQ$ than in momentum space $T^*Q$. The canonical 1-form $\theta_Q$ on $T^*Q$ is defined by
\[ \langle \theta_Q(p) \mid u_p \rangle = \langle p \mid T\pi_Q(u_p) \rangle \quad \text{for all } u_p \in T_p(T^*Q). \tag{22} \]
The canonical symplectic form on $T^*Q$ is $\omega_Q = -d\theta_Q$. We use the notation $\theta = (k^\flat)^*\theta_Q$ and $\omega = (k^\flat)^*\omega_Q$. Since $k^\flat$ is a diffeomorphism, it follows that $\omega$ is a symplectic form on $TQ$. Moreover, $\omega = -d\theta$.

Lemma 1.5.8. For every $w \in T_u(TQ)$ with $u \in TQ$, we have $\langle \theta(u) \mid w \rangle = k(u, T\tau_Q(w))$.


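Proof. Because $\pi_Q \circ k^\flat = \tau_Q$ we get
\[ \langle \theta(u) \mid w \rangle = \langle (k^\flat)^*\theta_Q(u) \mid w \rangle = \langle \theta_Q(k^\flat(u)) \mid Tk^\flat(w) \rangle = \langle k^\flat(u) \mid T\pi_Q(Tk^\flat(w)) \rangle = k(u, T\tau_Q(w)). \]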

Since the fiber of $\tau_Q : TQ \to Q$ is a vector space, for each $q \in Q$ and each $u \in T_qQ$, we have a vector space isomorphism $\iota_u : T_qQ \to \ker T_u\tau_Q$ such that for every $v \in T_qQ$ and every $f \in C^\infty(TQ)$, the fiber derivative of $f$ in the direction $\iota_u(v)$ is
\[ \iota_u(v) f = \frac{d}{dt}\Big|_{t=0} f(u + tv). \tag{23} \]

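Lemma 1.5.9. For every $u, v \in T_qQ$, $w \in T_u(TQ)$ we have $\omega(w, \iota_u(v)) = k(T\tau_Q(w), v)$.

Proof. For each $q \in Q$, $u \in T_qQ$ and $w \in T_u(TQ)$,
\[ \langle \theta(u) \mid w \rangle = \langle k^\flat(u) \mid T\tau_Q(w) \rangle. \tag{24} \]
Every vector $w \in T_u(TQ)$ can be extended to a vector field on $TQ$ which projects to a vector field on $Q$ under $\tau_Q$. Therefore,
\begin{align*}
\omega(w, \iota_u(v)) &= \langle \iota_u(v) \,\lrcorner\, d\theta(u) \mid w \rangle = \iota_u(v)\,\langle \theta(u) \mid w \rangle \\
 &= \frac{d}{dt}\Big|_{t=0} k(u + tv, T\tau_Q(w)) = k(v, T\tau_Q(w)).
\end{align*}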
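Lemmas 1.5.8 and 1.5.9 can be checked by hand in the simplest case, assumed here for illustration: $Q = \mathbb{R}$ with the constant metric $k = m\, dq \otimes dq$, $m > 0$. The coordinates $(q, v)$ on $TQ$ and the components $(\dot q, \dot v)$ of $w$ are notation introduced for this sketch.

```latex
% Assumed: Q = R and k(u, v) = m u v, so k^\flat(v) = m v\, dq and \theta_Q = p\, dq.
\[
  \theta = (k^\flat)^*\theta_Q = m v\, dq, \qquad
  \omega = -d\theta = m\, dq \wedge dv .
\]
% Take w = \dot q\,\partial_q + \dot v\,\partial_v at the point u, and \iota_u(v') = v'\,\partial_v.
\[
  \langle \theta(u) \mid w \rangle = m\, u\, \dot q = k(u, T\tau_Q(w)), \qquad
  \omega(w, \iota_u(v')) = m\, \dot q\, v' = k(T\tau_Q(w), v') .
\]
```

Only the horizontal component $\dot q = T\tau_Q(w)$ of $w$ enters either pairing, which is why both lemmas are stated in terms of $T\tau_Q(w)$.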

On the right hand side of the first equality, $\lrcorner$ denotes the left interior product (contraction). In other words, for every pair of vector fields $X$ and $X'$ on $TQ$, $\langle X \,\lrcorner\, \omega \mid X' \rangle = \omega(X, X')$. The 2-form $\omega$ on $TQ$ is symplectic. In other words, $\omega$ is nondegenerate and closed. For every function $h \in C^\infty(TQ)$, the Hamiltonian vector field corresponding to $h$ is the vector field $X_h$ on $(TQ, \omega)$ such that
\[ X_h \,\lrcorner\, \omega = dh. \tag{25} \]

Lemma 1.5.10. The projection from $(TQ, \omega)$ to $Q$ under $\tau_Q$ of an integral curve of the Hamiltonian vector field $X_k$ of the kinetic energy $k(v) = \tfrac12 k(v, v)$ is a geodesic of the Levi-Civita connection $\nabla$ associated to $k$.


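Proof. In terms of local coordinates $\{q^i\}$ on $Q$ and induced local coordinates $\{q^i, v^j = \dot q^j\}$ on $TQ$, we have
\[ \omega = -d(k_{ij} v^i) \wedge dq^j = -k_{ij}\, dv^i \wedge dq^j - \frac{\partial k_{ij}}{\partial q^\ell}\, v^i\, dq^\ell \wedge dq^j \]
and
\[ dk = k_{ij} v^i\, dv^j + \tfrac12 \frac{\partial k_{ij}}{\partial q^\ell}\, v^i v^j\, dq^\ell . \]
If $X_k = \dot q^i \frac{\partial}{\partial q^i} + \dot v^i \frac{\partial}{\partial v^i}$, then
\begin{align*}
X_k \,\lrcorner\, \omega
 &= -k_{ij}\dot v^i\, dq^j + k_{ij}\dot q^j\, dv^i - \frac{\partial k_{ij}}{\partial q^\ell}\, v^i \dot q^\ell\, dq^j + \frac{\partial k_{ij}}{\partial q^\ell}\, v^i \dot q^j\, dq^\ell \\
 &= k_{ij}\dot q^j\, dv^i - \Bigl( k_{ij}\dot v^i + \frac{\partial k_{ij}}{\partial q^\ell}\, v^i \dot q^\ell - \frac{\partial k_{i\ell}}{\partial q^j}\, v^i \dot q^\ell \Bigr) dq^j .
\end{align*}
Hence, $X_k \,\lrcorner\, \omega = dk$ implies that $\dot q^j = v^j$ and
\[ k_{ij}\dot v^i + \frac{\partial k_{ij}}{\partial q^\ell}\, v^i \dot q^\ell - \frac{\partial k_{i\ell}}{\partial q^j}\, v^i \dot q^\ell = -\tfrac12 \frac{\partial k_{i\ell}}{\partial q^j}\, v^i v^\ell . \]
Therefore,
\begin{align*}
\dot v^i &= k^{im}\Bigl( -\tfrac12 \frac{\partial k_{n\ell}}{\partial q^m}\, v^n v^\ell + \frac{\partial k_{n\ell}}{\partial q^m}\, v^n v^\ell - \frac{\partial k_{nm}}{\partial q^\ell}\, v^n v^\ell \Bigr) \\
 &= k^{im}\Bigl( \tfrac12 \frac{\partial k_{n\ell}}{\partial q^m}\, v^n v^\ell - \frac{\partial k_{nm}}{\partial q^\ell}\, v^n v^\ell \Bigr) = -\Gamma^i_{n\ell}\, v^n v^\ell ,
\end{align*}
where $k^{im} k_{mj} = \delta^i_j$ and $\Gamma^i_{n\ell}$ is the Christoffel symbol (12). Consequently,
\[ X_k(v) = v^i \frac{\partial}{\partial q^i} - \Gamma^i_{n\ell}\, v^n v^\ell\, \frac{\partial}{\partial v^i} . \tag{26} \]
If $t \mapsto v(t)$ is an integral curve of $X_k$ and $t \mapsto q(t) = \tau_Q(v(t))$ is its projection to $Q$, then
\[ \frac{d}{dt}\dot q^i(t) + \Gamma^i_{n\ell}(q(t))\, \dot q^n(t)\, \dot q^\ell(t) = 0 . \]
Thus $t \mapsto q(t)$ is a geodesic of the Levi-Civita connection $\nabla$ associated to the kinetic energy metric $k$.

Lemma 1.5.11. Let $(t_0, t_1) \to Q : t \mapsto q(t)$ be a smooth curve on $Q$ and let $t \mapsto u(t) = \dot q(t)$ be its tangent lift to $TQ$, that is, $\tau_Q(u(t)) = q(t)$. Then
\[ \dot u(t) - X_k(\dot q(t)) = \iota_{\dot q(t)}\bigl( \nabla_{\dot q}\,\dot q(t) \bigr) \tag{27} \]
for all $t \in (t_0, t_1)$.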


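Proof. Since $u(t) = \dot q(t)$ it follows that $\dot u(t) = v^i(t) \frac{\partial}{\partial q^i} + \dot v^i(t) \frac{\partial}{\partial v^i}$. Equation (26) implies that
\begin{align*}
\dot u(t) - X_k(\dot q(t)) &= \bigl( \dot v^i + \Gamma^i_{nk}\, v^n v^k \bigr) \frac{\partial}{\partial v^i}
 = \Bigl( \frac{d}{dt}\dot q^i(t) + \Gamma^i_{nk}(q(t))\, \dot q^n(t)\, \dot q^k(t) \Bigr) \frac{\partial}{\partial v^i} \\
 &= \bigl( \nabla_{\dot q}\,\dot q(t) \bigr)^i \frac{\partial}{\partial v^i} = \iota_{\dot q(t)}\bigl( \nabla_{\dot q}\,\dot q(t) \bigr).
\end{align*}

The set
\[ F = \{\, w \in T(TQ) \mid T\tau_Q(w) \in D \,\} \tag{28} \]
is a distribution on $TQ$. To see this let $u \in D$. Since the map $T_u\tau_Q : T_u(TQ) \to T_{\tau_Q(u)}Q$ is surjective, $(T_u\tau_Q)^{-1} D_{\tau_Q(u)}$ is a linear subspace of $T_u(TQ)$, which has constant dimension and depends smoothly on $u$. Therefore $\bigcup_{u \in D} (T_u\tau_Q)^{-1} D_{\tau_Q(u)}$ is a smooth vector subbundle of $T_D(TQ)$, the bundle of tangent vectors to $TQ$ with base point in $D$. Denote by $F^0$ the annihilator of $F$, that is, for every $u \in TQ$,
\[ F^0_u = \{\, p \in T^*_u(TQ) \mid \langle p \mid w \rangle = 0 \text{ for all } w \in F_u \,\}. \tag{29} \]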

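Theorem 1.5.12. (Hamilton-d’Alembert principle) A curve $(t_0, t_1) \to Q : t \mapsto q(t)$ with lift $t \mapsto u(t) = \dot q(t) \in D$ is a dynamically admissible motion corresponding to the Hamiltonian $h = k + \tau_Q^* V$ subject to the nonholonomic constraint $D$ if and only if
\[ \dot u(t) \,\lrcorner\, \omega - dh(\dot q(t)) \in F^0_{\dot q(t)} \tag{30} \]
for every $t \in (t_0, t_1)$.

Proof. For each $w \in F_{\dot q(t)}$, from lemma 1.5.11 we obtain
\begin{align*}
\langle \dot u(t) \,\lrcorner\, \omega \mid w \rangle = \omega(\dot u(t), w)
 &= \omega\bigl( X_k(\dot q(t)) + \iota_{\dot q(t)}(\nabla_{\dot q}\,\dot q), w \bigr) \\
 &= \omega(X_k(\dot q(t)), w) + \omega\bigl( \iota_{\dot q(t)}(\nabla_{\dot q}\,\dot q), w \bigr) \\
 &= \langle dk \mid w \rangle - k(\nabla_{\dot q}\,\dot q, T\tau_Q(w)) \\
 &= \langle dh \mid w \rangle - \langle dV \mid T\tau_Q(w) \rangle - k(\nabla_{\dot q}\,\dot q, T\tau_Q(w)) \\
 &= \langle dh \mid w \rangle - \langle k^\flat(\nabla_{\dot q}\,\dot q) + dV \mid T\tau_Q(w) \rangle.
\end{align*}
Hence, $\dot u(t) \,\lrcorner\, \omega - dh(\dot q(t)) \in F^0_{\dot q(t)}$ if and only if $k^\flat(\nabla_{\dot q}\,\dot q(t)) + dV(q(t)) \in D^0_{q(t)}$. By hypothesis $\dot q(t) \in D$. The theorem follows from d’Alembert’s principle 1.2.2.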
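To make (30) concrete, note that $F_u = (T_u\tau_Q)^{-1}(D_{\tau_Q(u)})$ implies $F^0_u = \{\, p \circ T_u\tau_Q \mid p \in D^0_{\tau_Q(u)} \,\}$. The sketch below applies this to an assumed example that is not part of the text: a knife edge on $Q = \mathbb{R}^2 \times S^1$ with coordinates $(x, y, \psi)$, constant metric $k = \mathrm{diag}(m, m, J)$, $V = 0$, and constraint form $\alpha = \sin\psi\, dx - \cos\psi\, dy$. The multipliers $\mu$ and $\lambda$ are notation introduced here.

```latex
% Assumed data as in the lead-in; induced coordinates (x, y, \psi, v_x, v_y, v_\psi) on TQ.
\[
  \omega = m\,(dx \wedge dv_x + dy \wedge dv_y) + J\, d\psi \wedge dv_\psi, \qquad
  h = \tfrac12 \bigl( m v_x^2 + m v_y^2 + J v_\psi^2 \bigr).
\]
% (30): \dot u \lrcorner \omega - dh = \mu\, \tau_Q^*\alpha for some function \mu.
% Comparing coefficients of the basis 1-forms:
\begin{align*}
  dv_x,\ dv_y,\ dv_\psi &: \quad \dot x = v_x, \quad \dot y = v_y, \quad \dot\psi = v_\psi, \\
  dx,\ dy,\ d\psi &: \quad -m\dot v_x = \mu \sin\psi, \quad -m\dot v_y = -\mu \cos\psi, \quad -J\dot v_\psi = 0 .
\end{align*}
% With \lambda = -\mu these read m\ddot x = \lambda\sin\psi, m\ddot y = -\lambda\cos\psi, J\ddot\psi = 0,
% that is, k^\flat(\ddot q) = \lambda\,\alpha \in D^0, in agreement with d'Alembert's principle (9).
```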


Corollary 1.5.13. The Hamiltonian function $h$ is constant on every admissible motion.

Proof. Because $\dot u(t) \in F_{\dot q(t)}$ we get
\[ 0 = \dot u(t) \,\lrcorner\, \bigl( \dot u(t) \,\lrcorner\, \omega - dh \bigr) = \omega(\dot u, \dot u) - dh(\dot u(t)) = -dh(\dot u(t)). \]

1.6 Distributional Hamiltonian formulation

In this section we give a distributional Hamiltonian formulation of the equations of motion of a nonholonomically constrained system.

1.6.1 The symplectic distribution $(H, \varpi)$

Since $D$ is a distribution on $Q$, it is a submanifold of $TQ$. We denote by $T^\omega D$ the symplectic annihilator of $TD$. In other words, for every $u \in D$,
\[ T^\omega_u D = \{\, w \in T_u(TQ) \mid \omega(w, v) = 0 \text{ for all } v \in T_uD \,\}. \tag{31} \]
$T^\omega D$ is a distribution on $D$ being a vector subbundle of $T_D(TQ)$, which is the bundle of tangent vectors to $TQ$ whose base point lies in $D$. Similarly, we denote the symplectic annihilator of the distribution $F$ (28) by $F^\omega$, that is, for every $u \in D$,
\[ F^\omega_u = \{\, w \in T_u(TQ) \mid \omega(w, v) = 0 \text{ for all } v \in F_u \,\}. \tag{32} \]

Theorem 1.6.1.14. The vector bundle $T_D(TQ)$ has two direct sum decompositions
\[ T_D(TQ) = F^\omega \oplus TD = F \oplus T^\omega D. \tag{33} \]

Proof. First we show that $F^\omega \cap TD = \{0\}$. Because $\ker T\tau_Q$ is a Lagrangian distribution on $(TQ, \omega)$, that is, $\omega|\ker T\tau_Q = 0$ and $\dim \ker T\tau_Q = \tfrac12 \dim T(TQ)$, and $\ker T\tau_Q \subseteq F$, it follows that $F^\omega \subseteq \ker T\tau_Q$. Hence,
\[ F^\omega \cap TD = F^\omega \cap (TD \cap \ker T\tau_Q). \]
Let $u \in D$ and $w \in T_uD \cap F^\omega_u$. Then $w \in TD \cap \ker T\tau_Q$ and is of the form $w = \iota_u(v)$ for some $v \in D_{\tau_Q(u)}$. Moreover, $w \in F^\omega_u$ which implies that $\omega(\iota_u(v), w') = \omega(w, w') = 0$ for every $w' \in F_u$. Using lemma 1.5.9 we get $k(v, T\tau_Q(w')) = 0$ for all $w' \in F_u$. Since $T\tau_Q(F) = D$, we can choose $w'$ so that $T\tau_Q(w') = v$. Thus $k(v, v) = 0$, which implies that $v = 0$ and $w = \iota_u(v) = 0$. Therefore, $F^\omega \cap TD = \{0\}$.


If $d$ is the rank of the distribution $D$ and $n = \dim Q$, we have $\dim F_u = n + d$, $\dim F^\omega_u = 2n - \dim F_u = n - d$, and $\dim T_uD = n + d$. Hence, $\dim F^\omega_u + \dim T_uD = 2n = \dim T_u(TQ)$. Since $F^\omega_u \cap T_uD = \{0\}$, it follows that $T_u(TQ) = F^\omega_u \oplus T_uD$. Taking symplectic annihilators we get $T_u(TQ) = F_u \oplus T^\omega_u D$.

Let $H$ be the intersection of $F$ (28) and the tangent bundle of $D$, that is,
\[ H = F \cap TD. \tag{34} \]
$H$ is a distribution on $D$.

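Proposition 1.6.1.15. For each $u \in D$,
\[ H_u \cap \ker T\tau_Q = \{\, \iota_u(v) \mid v \in D_{\tau_Q(u)} \,\}. \]

Proof. For every $u, v \in D_q$, $u + sv \in D_q$ for all $s \in \mathbb{R}$. Hence, $\iota_u(v) \in TD$. Since $\iota_u(v) \in \ker T\tau_Q \subseteq F$, it follows that $\iota_u(v) \in H_u$. Consequently,
\[ \{\, \iota_u(v) \mid v \in D_{\tau_Q(u)} \,\} \subseteq H_u \cap \ker T\tau_Q . \]
Conversely, suppose that $w \in H_u \cap \ker T\tau_Q$. Then $T\tau_Q(w) = 0$ which implies that there exists $v \in T_{\tau_Q(u)}Q$ such that $w = \iota_u(v)$. If $v \notin D_{\tau_Q(u)}$, then there exists a 1-form $\eta$ on $Q$ with values in $D^0$ such that $\langle \eta \mid v \rangle \neq 0$. The hypothesis that $\eta$ has values in $D^0$ implies that the function $f : TQ \to \mathbb{R} : v' \mapsto \langle \eta \mid v' \rangle$ vanishes on $D$. But
\[ \langle df \mid w \rangle = \frac{d}{ds}\Big|_{s=0} \langle \eta \mid u + sv \rangle = \langle \eta \mid v \rangle \neq 0 , \]
which contradicts the assumption that $w = \iota_u(v) \in H \subseteq TD$.

For each $u \in D$ write
\[ \varpi_u = \omega_u \big|_{H_u \times H_u} . \tag{35} \]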

Corollary 1.6.1.16. $(H, \varpi)$ is a symplectic distribution on $D$, that is, for every $u \in D$ the 2-form $\varpi_u$ is nondegenerate.

Proof. Let $\{v_1, \ldots, v_d\}$ be a $k$-orthonormal basis of $D_q$. For each $u \in D_q$, let $w_i \in T_uD$ be a lift of $v_i$, that is $T\tau_Q(w_i) = v_i$. Then $\{w_1, \ldots, w_d, \iota_u(v_1), \ldots, \iota_u(v_d)\}$ is a basis of $H_u$. Suppose that $w = \sum_i \bigl( a^i w_i + b^i \iota_u(v_i) \bigr)$ is in the kernel of the restriction of $\omega$ to $H_u$, that is, $\omega(w, w') = 0$ for all $w' \in H_u$. Taking $w' = \iota_u(v_i)$, from lemma 1.5.9 we obtain
\[ 0 = \omega(w, \iota_u(v_i)) = k\Bigl( v_i, T\tau_Q\Bigl( \sum_j \bigl( a^j w_j + b^j \iota_u(v_j) \bigr) \Bigr) \Bigr) = k\Bigl( v_i, \sum_j a^j v_j \Bigr) = a^i . \]
Hence, $w = \sum_i b^i \iota_u(v_i)$. Similarly, taking $w' = w_i$, we get
\begin{align*}
0 = \omega(w, w_i) &= \omega\Bigl( \sum_j b^j \iota_u(v_j), w_i \Bigr) = \sum_j b^j\, \omega(\iota_u(v_j), w_i) \\
 &= -\sum_j b^j\, k(v_j, T\tau_Q(w_i)) = -\sum_j b^j\, k(v_j, v_i) = -b^i .
\end{align*}
Therefore $w = 0$, that is, the restriction of $\omega$ to $H_u$ is nondegenerate. In other words, $(H, \varpi)$ is a symplectic distribution on $D$.

In order to get a clearer idea of the distributions $D$, $F$, and $H$ defined above, we express them in local coordinates. Let $q = \{q^i\}$ be local coordinates on $Q$. Then $(q, v = \dot q) = \{q^i, v^i = \dot q^i\}$ and $(q, v, \dot q, \dot v) = \{q^i, v^i, \dot q^i, \dot v^i\}$ are induced local coordinates on $TQ$ and $T(TQ)$, respectively. The tangent bundle projection $\tau_Q : TQ \to Q$ is $\tau_Q(q, v) = q$ and its tangent $T\tau_Q : T(TQ) \to TQ$ is $T\tau_Q(q, v, \dot q, \dot v) = (q, \dot q)$. We have $(q, v) \in D_q$ if and only if locally on $Q$ there are $\ell$ linearly independent 1-forms $\{\alpha_j\}_{j=1}^{\ell}$ such that $\langle \alpha_j(q) \mid v \rangle = 0$ for every $1 \le j \le \ell$. From the definition (28) of the distribution $F$ we see that
\[ F_{(q,v)} = \{\, (\dot q, \dot v) \in T_{(q,v)}(TQ) \mid (q, \dot q) \in D_q \subseteq T_qQ \,\}. \]

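Consequently, the distribution $H$ (34) is the set of $(\dot q, \dot v) \in T_{(q,v)}(TQ)$ such that
\begin{align*}
0 &= \langle \alpha_j(q) \mid \dot q \rangle = \langle (\alpha_j(q), 0) \mid (\dot q, \dot v) \rangle \\
0 &= \langle \alpha_j(q) \mid \dot v \rangle + \langle D\alpha_j(q)\dot q \mid v \rangle = \langle ((D\alpha_j(q))^t v, \alpha_j(q)) \mid (\dot q, \dot v) \rangle \tag{36}
\end{align*}
for $j = 1, \ldots, \ell$, where $D\alpha_j(q)$ denotes the derivative of $\alpha_j$ at $q$. Clearly, $H_{(q,v)}$ is a linear subspace of $T_{(q,v)}(TQ)$ of codimension $2\ell$ and $\coprod_{(q,v) \in D} H_{(q,v)}$ is a smooth vector subbundle of $T_D(TQ)$. In local coordinates the symplectic form $\omega$ on $TQ$ is
\[ \omega_{(q,v)}\bigl( (\dot q, \dot v), (\dot q', \dot v') \bigr) = \langle k^\flat(\dot v') \mid \dot q \rangle - \langle k^\flat(\dot v) \mid \dot q' \rangle = \langle (-k^\flat(\dot v), k^\flat(\dot q)) \mid (\dot q', \dot v') \rangle , \]
where $k$ is a Riemannian metric on $Q$.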
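For a concrete instance of (36), take an assumed example that is not part of the text: the knife edge on $Q = \mathbb{R}^2 \times S^1$ with coordinates $q = (x, y, \psi)$ and a single constraint form ($\ell = 1$), $\alpha = \sin\psi\, dx - \cos\psi\, dy$. The sketch writes out the two linear equations that cut $H_{(q,v)}$ out of $T_{(q,v)}(TQ)$.

```latex
% Assumed: \alpha(q) = (\sin\psi, -\cos\psi, 0) as a row vector, v = (v_x, v_y, v_\psi) \in D_q.
% Then D\alpha(q)\dot q = \dot\psi\,(\cos\psi, \sin\psi, 0).
\begin{align*}
  0 &= \langle \alpha(q) \mid \dot q \rangle = \dot x \sin\psi - \dot y \cos\psi, \\
  0 &= \langle \alpha(q) \mid \dot v \rangle + \langle D\alpha(q)\dot q \mid v \rangle
     = \dot v_x \sin\psi - \dot v_y \cos\psi + \dot\psi\,(v_x \cos\psi + v_y \sin\psi).
\end{align*}
% These are 2\ell = 2 independent linear conditions on (\dot q, \dot v) \in R^6,
% so \dim H_{(q,v)} = 6 - 2 = 4 = 2d with d = \operatorname{rank} D = 2.
```

The second equation is the time derivative of the constraint, which is why tangent lifts of admissible curves automatically take values in $H$.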

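We now give an alternative proof of corollary 1.6.1.16.

Proof of corollary 1.6.1.16. Suppose that $(\dot q, \dot v) \in H_{(q,v)}$ and that for every $(\dot q', \dot v') \in H_{(q,v)}$ we have $0 = \omega_{(q,v)}\bigl( (\dot q, \dot v), (\dot q', \dot v') \bigr)$. From (36) and the local expression for the symplectic form $\omega$, it follows that for $1 \le j \le \ell$ there are $(\lambda_j, \mu_j) \in \mathbb{R}^2$ such that
\[ (-k^\flat(\dot v), k^\flat(\dot q)) = \sum_{j=1}^{\ell} \Bigl( \lambda_j\, (\alpha_j(q), 0) + \mu_j\, \bigl( (D\alpha_j(q))^t v, \alpha_j(q) \bigr) \Bigr), \]
that is,
\[ k^\flat(\dot q) = \sum_{j=1}^{\ell} \mu_j\, \alpha_j(q) \quad \text{and} \quad -k^\flat(\dot v) = \sum_{j=1}^{\ell} \bigl( \lambda_j\, \alpha_j(q) + \mu_j\, (D\alpha_j(q))^t v \bigr). \tag{37} \]
Since $(\dot q, \dot v) \in H_{(q,v)}$, the first equation in (36) reads
\[ 0 = \Bigl\langle \alpha_i(q) \Bigm| k^\sharp\Bigl( \sum_{j=1}^{\ell} \mu_j\, \alpha_j(q) \Bigr) \Bigr\rangle = \sum_{j=1}^{\ell} \mu_j\, \langle \alpha_i(q) \mid k^\sharp(\alpha_j(q)) \rangle \tag{38} \]
for every $1 \le i \le \ell$, using the first equation in (37). But $k$ is a nondegenerate symmetric bilinear form. Thus its $\ell \times \ell$ matrix $\bigl( \langle \alpha_i(q) \mid k^\sharp(\alpha_j(q)) \rangle \bigr)$ is invertible. Hence (38) implies that $\mu_j = 0$ for every $1 \le j \le \ell$. Since $(\dot q, \dot v) \in H_{(q,v)}$, using the second equation in (37) and the fact that $\mu_j = 0$ for every $1 \le j \le \ell$, the second equation in (36) reads
\[ 0 = -\sum_{j=1}^{\ell} \lambda_j\, \langle \alpha_i(q) \mid k^\sharp(\alpha_j(q)) \rangle, \qquad 1 \le i \le \ell . \]
Since $k$ is nondegenerate, we obtain $\lambda_j = 0$ for every $1 \le j \le \ell$. Thus $(\dot q, \dot v) = (0, 0)$, using (37).

1.6.2 $H$ and $\varpi$ in a trivialization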

In this subsection we determine the distribution $H$ and the symplectic form $\varpi$ in a trivialization. First we define a trivialization of the constraint distribution $D$ on $Q$. Suppose that $\{X_i\}_{i=1}^d$ is a moving frame on the distribution $D$. Then
\[ \lambda_D : Q \times \mathbb{R}^d \to D \subseteq TQ : (q, c = (c_1, \ldots, c_d)) \mapsto \Bigl( q, \sum_{i=1}^d c_i X_i(q) \Bigr) \tag{39} \]
is the inverse of a trivialization of the distribution $D$, thought of as a vector subbundle of $TQ$ whose fibers have dimension $d$.

Second we determine what the distribution $H$ on $D$ is in the trivialization $\lambda_D^{-1}$. From (34) we see that the vector subbundle $H$ of $TD$ is defined by $H_u = T_uD \cap (T_u\tau_Q)^{-1}(D_q)$ for every $u \in D$. Here $q = \tau_Q(u)$, where $\tau_Q : TQ \to Q$ is the tangent bundle projection map. Let
\[ \pi_1 = \tau_Q \circ \lambda_D : Q \times \mathbb{R}^d \to Q : (q, c) \mapsto q . \]
Then
\[ T_{(q,c)}\pi_1 = T_u\tau_Q \circ T_{(q,c)}\lambda_D : T_qQ \times T_c\mathbb{R}^d \to T_qQ : (v_q, \xi) \mapsto v_q , \tag{40} \]
where $u = \lambda_D(q, c)$. From the fact that $\lambda_D$ is a diffeomorphism, we obtain
\begin{align*}
(T_{(q,c)}\lambda_D)^{-1} H_u &= (T_{(q,c)}\lambda_D)^{-1}\bigl( T_uD \cap (T_u\tau_Q)^{-1}(D_q) \bigr) \\
 &= T_{(q,c)}(Q \times \mathbb{R}^d) \cap (T_{(q,c)}\pi_1)^{-1}(D_q) \\
 &= D_q \times T_c\mathbb{R}^d = D_q \times \mathbb{R}^d . \tag{41}
\end{align*}
Thus the distribution $H$ in the trivialization $\lambda_D^{-1}$ is the distribution $H_{Q \times \mathbb{R}^d}$ defined by
\[ (H_{Q \times \mathbb{R}^d})_{(q,c)} = D_q \times \mathbb{R}^d . \tag{42} \]

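Finally we determine what the symplectic form $\varpi$ on $D$ restricted to the distribution $H$ is in the trivialization $\lambda_D^{-1}$. Let $k$ be a Riemannian metric on $Q$. Then on $TQ$ we have the symplectic form $\omega = (k^\flat)^*\omega_Q$, where $\omega_Q$ is the canonical 2-form $\omega_Q = -d\theta_Q$ on $T^*Q$ and $\theta_Q$ is the canonical 1-form, see (22). Pulling $\omega$ back by $\lambda_D$ gives the 2-form $\mu^*\omega_Q$ on $Q \times \mathbb{R}^d$ where
\[ \mu = k^\flat \circ \lambda_D : Q \times \mathbb{R}^d \to T^*Q : (q, c = (c_1, \ldots, c_d)) \mapsto \Bigl( q, \sum_{i=1}^d c_i\, \alpha_i(q) \Bigr) \tag{43} \]
and $\alpha_i(q) = k^\flat(q)(X_i(q))$. In other words, the 1-forms $\{\alpha_i\}_{i=1}^d$ define a moving coframe on the vector subbundle $D^* = k^\flat(D)$ of $T^*Q$. To determine the 2-form $\omega_{Q \times \mathbb{R}^d} = \mu^*\omega_Q$ on $Q \times \mathbb{R}^d$ we compute
\begin{align*}
\mu^*\theta_Q(q, c) &= \theta_Q(\mu(q, c)) \circ T_{(q,c)}\mu = \sum_{i=1}^d c_i\, \alpha_i(q) \circ T_{\mu(q,c)}\pi_Q \circ T_{(q,c)}\mu, \quad \text{using (43)} \\
 &= \sum_{i=1}^d c_i\, \alpha_i(q) \circ T_{(q,c)}\pi_1 = \sum_{i=1}^d c_i\, \pi_1^*\alpha_i(q),
\end{align*}
where we view $c_i$ as a function on $Q \times \mathbb{R}^d$. Therefore we get
\[ \omega_{Q \times \mathbb{R}^d} = \mu^*(-d\theta_Q) = -d(\mu^*\theta_Q) = -\sum_{i=1}^d \bigl( dc_i \wedge \pi_1^*\alpha_i + c_i\, \pi_1^*(d\alpha_i) \bigr). \tag{44} \]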

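To give an explicit expression for $\omega_{Q \times \mathbb{R}^d}$ restricted to $H_{Q \times \mathbb{R}^d}$, we argue as follows. For $a = (a_1, \ldots, a_d) \in \mathbb{R}^d$ let $a^Q(q) = \sum_{i=1}^d a_i X_i(q)$, where $X_i(q) = \lambda_D(q, e_i)$. Then $a^Q$ is a vector field on $Q$ with values in $D$. For every $a \in \mathbb{R}^d$ define two vector fields $a^Q_\to$ and $a^Q_\uparrow$ on $Q$ with values in $H_{Q \times \mathbb{R}^d} \subseteq D \times \mathbb{R}^d$ by
\[ a^Q_\to(q) = (a^Q(q), 0) \quad \text{and} \quad a^Q_\uparrow(q) = (0, a). \tag{45} \]
From the definition of $a^Q_\to$, $a^Q_\uparrow$, and the mapping $\pi_1$ it follows that
\[ T_{(q,c)}\pi_1\, a^Q_\to(q) = a^Q(q) \quad \text{and} \quad T_{(q,c)}\pi_1\, a^Q_\uparrow(q) = 0. \tag{46} \]
Moreover,
\[ \langle dc_i(q, c) \mid a^Q_\to(q) \rangle = 0 \quad \text{and} \quad \langle dc_i(q, c) \mid a^Q_\uparrow(q) \rangle = a_i . \tag{47} \]
Also
\begin{align*}
\langle \pi_1^*\alpha_i(q) \mid a^Q_\to(q) \rangle &= \langle \alpha_i(q) \mid T_{(q,c)}\pi_1\, a^Q_\to(q) \rangle = \langle \alpha_i(q) \mid a^Q(q) \rangle, \quad \text{using (46)} \\
 &= k(q)(X_i(q), a^Q(q)), \quad \text{by definition of } \alpha_i .
\end{align*}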

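We are now in position to calculate $\varpi_{Q \times \mathbb{R}^d}$, which is the restriction of $\omega_{Q \times \mathbb{R}^d}$ to $H_{Q \times \mathbb{R}^d} \times H_{Q \times \mathbb{R}^d}$. Using (44) we find that for every $a, b, \tilde a, \tilde b \in \mathbb{R}^d$ we have
\begin{align*}
-\varpi_{Q \times \mathbb{R}^d}(q, c)&\bigl( a^Q_\to(q) + b^Q_\uparrow(q),\, \tilde a^Q_\to(q) + \tilde b^Q_\uparrow(q) \bigr) \\
 ={}& \sum_{i=1}^d (dc_i \wedge \pi_1^*\alpha_i)(q, c)\bigl( a^Q_\to(q) + b^Q_\uparrow(q),\, \tilde a^Q_\to(q) + \tilde b^Q_\uparrow(q) \bigr) \\
 & + \sum_{i=1}^d c_i\, \pi_1^*(d\alpha_i)(q, c)\bigl( a^Q_\to(q) + b^Q_\uparrow(q),\, \tilde a^Q_\to(q) + \tilde b^Q_\uparrow(q) \bigr) \\
 ={}& -\sum_{i=1}^d \langle dc_i(q, c) \mid \tilde b^Q_\uparrow(q) \rangle\, \langle \pi_1^*\alpha_i(q) \mid a^Q_\to(q) \rangle
   + \sum_{i=1}^d \langle dc_i(q, c) \mid b^Q_\uparrow(q) \rangle\, \langle \pi_1^*\alpha_i(q) \mid \tilde a^Q_\to(q) \rangle \\
 & + \sum_{i=1}^d c_i\, \pi_1^*(d\alpha_i)(q)\bigl( a^Q_\to(q), \tilde a^Q_\to(q) \bigr) \\
 ={}& -\sum_{i=1}^d \tilde b_i\, k(q)(X_i(q), a^Q(q)) + \sum_{i=1}^d b_i\, k(q)(X_i(q), \tilde a^Q(q)) \\
 & + \sum_{i=1}^d c_i\, d\alpha_i(q)\bigl( a^Q(q), \tilde a^Q(q) \bigr), \quad \text{using (47)}.
\end{align*}
So the symplectic form $\varpi$ on $D$ restricted to the distribution $H$ on $D$ in the trivialization $\lambda_D^{-1}$ is
\begin{align*}
\varpi_{Q \times \mathbb{R}^d}(q, c)\bigl( a^Q_\to(q) + b^Q_\uparrow(q),\, \tilde a^Q_\to(q) + \tilde b^Q_\uparrow(q) \bigr)
 ={}& k(q)\bigl( a^Q(q), \tilde b^Q(q) \bigr) - k(q)\bigl( \tilde a^Q(q), b^Q(q) \bigr) \\
 & - \sum_{i=1}^d c_i\, d\alpha_i(q)\bigl( a^Q(q), \tilde a^Q(q) \bigr), \tag{48}
\end{align*}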

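where $\alpha_i(q) = k^\flat(q)(X_i(q))$. Therefore $d\alpha_i(q) = d(k^\flat(X_i))(q)$, which implies
\[ \sum_{i=1}^d c_i\, d\alpha_i(q) = \sum_{i=1}^d c_i\, d(k^\flat(X_i))(q) = d\,k^\flat\Bigl( \sum_{i=1}^d c_i X_i \Bigr)(q) = d(k^\flat(c^Q))(q). \tag{49} \]
In other words, with respect to the basis $\{ ((X_i(q), 0), (0, e_i)) \}_{1 \le i \le d}$ of the space $(H_{Q \times \mathbb{R}^d})_{(q,c)}$, the matrix of $\varpi_{Q \times \mathbb{R}^d}(q, c)$ is $\begin{pmatrix} B & A^t \\ -A & 0 \end{pmatrix}$. Here $A = A(q)$ is the $d \times d$ positive definite symmetric matrix⁶ $(A_{jk}) = \bigl( k(q)(X_j(q), X_k(q)) \bigr)$ and $B = B(q)$ is the $d \times d$ antisymmetric matrix $(B_{\ell m})$, where
\begin{align*}
B_{\ell m} = \sum_{i=1}^d c_i\, d\alpha_i(q)(X_\ell(q), X_m(q)) ={}& L_{X_\ell}\bigl( k(c^Q, X_m) \bigr)(q) - L_{X_m}\bigl( k(c^Q, X_\ell) \bigr)(q) \\
 & - k(q)\bigl( c^Q(q), [X_\ell, X_m](q) \bigr). \tag{50}
\end{align*}
To obtain the last equality in (50), we have used the fact that for a 1-form $\beta$ its exterior derivative evaluated on the vector fields $X$ and $Y$ is
\[ d\beta(X, Y) = L_X(Y \,\lrcorner\, \beta) - L_Y(X \,\lrcorner\, \beta) - \beta([X, Y]). \]
Using (49) we can rewrite (50) as
\[ d(k^\flat(c^Q))(q)(X_\ell, X_m) = L_{X_\ell}\bigl( X_m \,\lrcorner\, k^\flat(c^Q) \bigr)(q) - L_{X_m}\bigl( X_\ell \,\lrcorner\, k^\flat(c^Q) \bigr)(q) - \bigl( [X_\ell, X_m] \,\lrcorner\, k^\flat(c^Q) \bigr)(q), \]
which is bilinear in $X_\ell(q)$ and $X_m(q)$. Thus we obtain
\begin{align*}
\sum_{i=1}^d c_i\, d\alpha_i\bigl( a^Q(q), \tilde a^Q(q) \bigr) &= d(k^\flat(c^Q))\bigl( a^Q(q), \tilde a^Q(q) \bigr) \\
 &= L_{a^Q}\bigl( \tilde a^Q \,\lrcorner\, k^\flat(c^Q) \bigr)(q) - L_{\tilde a^Q}\bigl( a^Q \,\lrcorner\, k^\flat(c^Q) \bigr)(q) - \bigl( [a^Q, \tilde a^Q] \,\lrcorner\, k^\flat(c^Q) \bigr)(q). \tag{51}
\end{align*}
Consequently
\begin{align*}
\varpi_{Q \times \mathbb{R}^d}(q, c)&\bigl( a^Q_\to(q) + b^Q_\uparrow(q),\, \tilde a^Q_\to(q) + \tilde b^Q_\uparrow(q) \bigr) \\
 ={}& k(q)\bigl( a^Q(q), \tilde b^Q(q) \bigr) - k(q)\bigl( \tilde a^Q(q), b^Q(q) \bigr) - d(k^\flat(c^Q))\bigl( a^Q(q), \tilde a^Q(q) \bigr) \tag{52} \\
 ={}& k(q)\bigl( a^Q(q), \tilde b^Q(q) \bigr) - k(q)\bigl( \tilde a^Q(q), b^Q(q) \bigr) \\
 & - L_{a^Q}\bigl( \tilde a^Q \,\lrcorner\, k^\flat(c^Q) \bigr)(q) + L_{\tilde a^Q}\bigl( a^Q \,\lrcorner\, k^\flat(c^Q) \bigr)(q) + \bigl( [a^Q, \tilde a^Q] \,\lrcorner\, k^\flat(c^Q) \bigr)(q). \tag{53}
\end{align*}

⁶ Note that $A = I$ if and only if $\{X_i(q)\}_{i=1}^d$ is a $k$-orthonormal basis of $D_q$.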
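As an illustration of the matrices $A$ and $B$, consider an assumed example that is not part of the text: the knife edge on $Q = \mathbb{R}^2 \times S^1$ with coordinates $(x, y, \psi)$, metric $k = m(dx^2 + dy^2) + J\, d\psi^2$, and frame $X_1 = \cos\psi\, \partial_x + \sin\psi\, \partial_y$, $X_2 = \partial_\psi$ of the constraint distribution $D$. The computation below is a sketch under these assumptions. The closing determinant identity is elementary linear algebra.

```latex
% Assumed frame and metric as in the lead-in (d = 2).
\[
  A = \bigl( k(X_j, X_k) \bigr) = \begin{pmatrix} m & 0 \\ 0 & J \end{pmatrix}, \qquad
  \alpha_1 = k^\flat(X_1) = m(\cos\psi\, dx + \sin\psi\, dy), \qquad
  \alpha_2 = k^\flat(X_2) = J\, d\psi .
\]
% d\alpha_2 = 0 and d\alpha_1 = m(-\sin\psi\, d\psi \wedge dx + \cos\psi\, d\psi \wedge dy), so
\[
  d\alpha_1(X_1, X_2) = m(\sin\psi \cos\psi - \cos\psi \sin\psi) = 0, \qquad B = 0 .
\]
% The same follows from (50): k(c^Q, X_1) = m c_1 and k(c^Q, X_2) = J c_2 are constant on Q,
% and [X_1, X_2] = \sin\psi\,\partial_x - \cos\psi\,\partial_y is k-orthogonal to D.
% For any d x d matrices A, B one has
\[
  \det \begin{pmatrix} B & A^t \\ -A & 0 \end{pmatrix} = (\det A)^2 ,
\]
% which here equals (mJ)^2 > 0.
```

The determinant identity shows that nondegeneracy of the block matrix depends only on $A$, whatever the antisymmetric block $B$ is.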


Lemma 1.6.2.17. For any $u \in D$ the form $\varpi_u$ is a symplectic form on $H_u$.

Proof. We have to show that $\varpi_{Q \times \mathbb{R}^d}(q, c)$ is nondegenerate for every $(q, c) \in Q \times \mathbb{R}^d$. This holds if and only if the matrix $\begin{pmatrix} B & A^t \\ -A & 0 \end{pmatrix}$ is invertible, that is, if and only if $A$ is invertible. But $A$ is invertible because it is positive definite.

1.6.3 Distributional Hamiltonian vector field

We return to developing the general theory of distributional Hamiltonian systems. Let $H^\omega$ be the symplectic annihilator of $H$ in $(T(TQ), \omega)$, that is, for each $u \in D$,
\[ H^\omega_u = \{\, w \in T_u(TQ) \mid \omega(w, v) = 0 \text{ for all } v \in H_u \,\}. \tag{54} \]
Since $H$ is symplectic, it follows that $H^\omega$ is also symplectic. The bundle $T_D(TQ)$ has a direct sum decomposition
\[ T_D(TQ) = H \oplus H^\omega . \tag{55} \]
Let $\varpi$ be the restriction of the symplectic form $\omega$ on $TQ$ to $H \times H$ and $\varpi^\omega$ the restriction of $\omega$ to $H^\omega \times H^\omega$. Both of these 2-forms are nondegenerate. Moreover,
\[ \omega|T_D(TQ) = \varpi \oplus \varpi^\omega . \tag{56} \]
Similarly, for every $f \in C^\infty(TQ)$, we denote by $\partial_H f$ and $\partial_{H^\omega} f$ the restrictions of $df$ to $H$ and $H^\omega$, respectively. Thus we obtain the following decomposition
\[ df|T_D(TQ) = \partial_H f \oplus \partial_{H^\omega} f . \tag{57} \]

Lemma 1.6.3.18. $F \cap H^\omega = F^\omega$. Moreover, for every $u \in D$ each $w \in F^\omega_u$ is of the form $w = \iota_u(v)$, where $v$ is $k$-orthogonal to $D$, and $\langle dh \mid w \rangle = 0$ for the Hamiltonian $h = k + \tau_Q^* V$.

Proof. Since $H = F \cap TD$, it follows that $H^\omega = F^\omega + T^\omega D$. From theorem 1.6.1.14, it follows that $F \cap T^\omega D = \{0\}$. Hence,
\[ F \cap H^\omega = F \cap (F^\omega + T^\omega D) = F \cap F^\omega + F \cap T^\omega D = F \cap F^\omega = F^\omega \subseteq \ker T\tau_Q . \]
Therefore, each $w \in F_u \cap H^\omega_u$ is of the form $w = \iota_u(v)$ for some $v \in T_{\tau_Q(u)}Q$. Since $w \in H^\omega_u$, taking lemma 1.5.9 into account we get
\[ 0 = \omega(w', w) = \omega(w', \iota_u(v)) = k(v, T\tau_Q(w')) \]
for every $w' \in H_u$. However, $T\tau_Q(H_u) = D_{\tau_Q(u)}$. Therefore $v$ is $k$-orthogonal to $D$. This implies that
\[ \langle dh \mid w \rangle = \langle dh \mid \iota_u(v) \rangle = \frac{d}{ds}\Big|_{s=0} \Bigl( \tfrac12 k(u + sv, u + sv) + V(\tau_Q(u)) \Bigr) = k(u, v) = 0, \tag{58} \]
because $u \in D$.

We now define the distributional Hamiltonian vector field of $h \in C^\infty(D)$ with respect to the symplectic distribution $(H, \varpi)$ on $D$ to be the unique vector field $Y_h$ on $D$ with values in $H$ such that
\[ Y_h \,\lrcorner\, \varpi = \partial_H h. \tag{59} \]
Since the skew symmetric bilinear form $\varpi$ on $H \subseteq TD$ is nondegenerate, it follows that $Y_h$ is well defined for every $h \in C^\infty(D)$. Moreover, equation (33) and lemma 1.6.3.18 imply that for every extension $\tilde h \in C^\infty(TQ)$ of $h \in C^\infty(D)$ the $H$-component of the restriction of the Hamiltonian vector field $X_{\tilde h}$ on $(TQ, \omega)$ coincides with $Y_h$. If the symplectic distribution $(H, \varpi)$ on $D$ is understood, we shall refer to $Y_h$ as the distributional Hamiltonian vector field of $h$.

Lemma 1.6.3.19. Let $t \mapsto u(t)$ be an integral curve of the distributional Hamiltonian vector field $Y_h$ of $h$ and $t \mapsto q(t) = \tau_Q(u(t))$. Then $u(t) = \dot q(t)$ for all $t$.

Proof. By proposition 1.6.1.15, $H_u \cap \ker T\tau_Q = \{\, \iota_u(v) \mid v \in D_{\tau_Q(u)} \,\}$. Hence lemma 1.5.9 ensures that $\omega(w, \iota_u(v)) = k(v, T\tau_Q(w))$ for each $u \in D$ and each $v \in D_{\tau_Q(u)}$. Now
\begin{align*}
0 &= \langle Y_h \,\lrcorner\, \varpi - \partial_H h \mid \iota_u(v) \rangle = \omega(Y_h(u), \iota_u(v)) - \langle dh \mid \iota_u(v) \rangle \\
 &= k(v, T\tau_Q(Y_h(u))) - k(v, u) = k(v, T\tau_Q(Y_h(u)) - u).
\end{align*}
Thus $T\tau_Q(Y_h(u)) - u$ is $k$-orthogonal to every vector $v \in D_{\tau_Q(u)}$. But, $T\tau_Q(Y_h(u)) - u \in D_{\tau_Q(u)}$, which gives $T\tau_Q(Y_h(u)) - u = 0$. In other words, $T\tau_Q(Y_h(u)) = u$ for all $u \in D$. This implies that every integral curve $t \mapsto u(t)$ of $Y_h$ is the lift of its projection $t \mapsto q(t)$ to $Q$ by $\tau_Q$, that is, $u(t) = \dot q(t)$ for all $t$.

Theorem 1.6.3.20. A curve $t \mapsto q(t)$ in $Q$ is a dynamically admissible motion of a system on $(TQ, \omega)$ with Hamiltonian $h = k + \tau_Q^* V$ subject to the linear nonholonomic constraint $D$ if and only if it is the projection to $Q$ under $\tau_Q$ of an integral curve $t \mapsto u(t)$ of the distributional Hamiltonian vector field $Y_h$ of $h$.


Proof. Suppose that $t \mapsto q(t)$ in $Q$ is a dynamically admissible motion of a system with Hamiltonian $h = k + \tau_Q^* V$ and nonholonomic constraint $D$. Let $u(t) = \dot q(t)$. Since $\dot q(t) \in D_{q(t)}$ it follows that $\dot u(t) \in TD$. Moreover, the motion $t \mapsto q(t) = \tau_Q(u(t))$ satisfies the constraint $\dot q \in D$. But $\dot q(t) = T\tau_Q(\dot u(t))$ which implies that $\dot u(t) \in F$. Hence $\dot u(t) \in H$. Each $w \in F_{\dot q(t)}$ can be decomposed as $w = w_H + w_{H^\omega}$ with $w_H \in H$ and $w_{H^\omega} \in H^\omega$. Now
\begin{align*}
\langle \dot u(t) \,\lrcorner\, \omega - dh \mid w \rangle &= \langle \dot u(t) \,\lrcorner\, \omega - dh \mid w_H + w_{H^\omega} \rangle \\
 &= \langle \dot u(t) \,\lrcorner\, \varpi - \partial_H h \mid w_H \rangle - \langle \partial_{H^\omega} h \mid w_{H^\omega} \rangle, \tag{60}
\end{align*}
since $\dot u(t) \in H_{\dot q(t)}$. By the Hamilton-d’Alembert principle 1.5.12, $\dot u(t) \,\lrcorner\, \omega - dh(\dot q(t)) \in F^0_{\dot q(t)}$. Because $H \subseteq F$, we obtain
\[ \dot u(t) \,\lrcorner\, \varpi = \partial_H h, \tag{61} \]
and
\[ \langle dh \mid u \rangle = 0 \quad \text{for all } u \in F \cap H^\omega , \tag{62} \]
using (60). Equation (61) just says that $t \mapsto u(t) = \dot q(t)$ is an integral curve of the distributional Hamiltonian vector field $Y_h$ of $h$.

Conversely, if $t \mapsto u(t)$ is an integral curve of the distributional Hamiltonian vector field $Y_h$ of $h$, then equation (61) is satisfied. Moreover, by lemma 1.6.3.18, equation (62) is also satisfied. Hence, equation (60) implies that $\dot u(t) \,\lrcorner\, \omega - dh \in F^0_{u(t)}$. In addition, if $t \mapsto q(t) = \tau_Q(u(t))$ is the projection of $t \mapsto u(t)$ to $Q$, then $\dot q(t) = u(t)$ for all $t$ by lemma 1.6.3.19. Hence, the Hamilton-d’Alembert principle 1.5.12 ensures that the curve $t \mapsto q(t)$ is a dynamically admissible motion.

An equation for integral curves $t \mapsto u(t)$ of the distributional Hamiltonian vector field $Y_h$ of $h$ can be written in the form
\[ \dot u(t) \,\lrcorner\, \varpi = \partial_H h(u(t)). \tag{63} \]
We shall refer to (63) as the distributional form of Hamilton’s equations of motion. We shall refer to the quadruple $(D, H, \varpi, h)$, where $D$ is a manifold, $(H, \varpi)$ is a symplectic distribution on $D$ and $h \in C^\infty(D)$ as a distributional Hamiltonian system. The vector field $Y_h$ is determined by the restriction of $dh$ to the distribution $H$, which implies that it is determined by the restriction of $h$ to the constraint distribution $D$. However, this last statement uses the symplectic form $\varpi$ on $H$, which is not determined by the restriction of the Lagrangian $\ell$, the kinetic energy, or the total energy to $D$.
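A sketch of (63) for an assumed example that is not part of the text: the knife edge on $Q = \mathbb{R}^2 \times S^1$ with coordinates $(x, y, \psi)$, metric $k = m(dx^2 + dy^2) + J\, d\psi^2$, $V = 0$, and frame $X_1 = \cos\psi\, \partial_x + \sin\psi\, \partial_y$, $X_2 = \partial_\psi$ of $D$. For this frame a direct computation gives $\sum_i c_i\, d\alpha_i(X_1, X_2) = 0$, so the last term of (48) drops out; we take that as given here. The components $a, b$ of $Y_h$ are notation introduced for the sketch.

```latex
% Assumed: trivialization (q, c) \mapsto c_1 X_1(q) + c_2 X_2(q) of D,
% and h(q, c) = (1/2)(m c_1^2 + J c_2^2).
% Write Y_h = a^Q_\to + b^Q_\uparrow. By (48) with vanishing d\alpha-term,
\[
  \varpi\bigl( Y_h,\ \tilde a^Q_\to + \tilde b^Q_\uparrow \bigr)
  = m a_1 \tilde b_1 + J a_2 \tilde b_2 - m \tilde a_1 b_1 - J \tilde a_2 b_2 ,
  \qquad
  \langle \partial_H h \mid \tilde a^Q_\to + \tilde b^Q_\uparrow \rangle
  = m c_1 \tilde b_1 + J c_2 \tilde b_2 .
\]
% Equating the two for all \tilde a, \tilde b gives a = c and b = 0, that is
\[
  \dot x = c_1 \cos\psi, \qquad \dot y = c_1 \sin\psi, \qquad \dot\psi = c_2, \qquad
  \dot c_1 = 0, \qquad \dot c_2 = 0 .
\]
% The blade moves along itself with constant speed c_1 while turning at constant rate c_2.
```

For $c_2 \neq 0$ the point of contact traces a circle of radius $|c_1 / c_2|$. The equations need no Lagrange multiplier because $Y_h$ already takes values in $H$.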


We now give an alternative proof of theorem 1.6.3.20 using the local coordinates that we introduced in proving corollary 1.6.1.16.

Proof of theorem 1.6.3.20. Suppose that the curve t ↦ q(t) is an admissible motion of the nonholonomic system on (TQ, ω) with Hamiltonian h and constraint distribution D. Recall that in our local coordinates

1. (q, v) lies in D if and only if locally on Q there are linearly independent 1-forms {α_j}_{j=1}^ℓ such that ⟨α_j(q) | v⟩ = 0 for 1 ≤ j ≤ ℓ.
2. (q̇, v̇) lies in F_{(q,v)} if and only if Tτ_Q(q, v, q̇, v̇) = (q, q̇) ∈ D_q.
3. (q̇, v̇) lies in H_{(q,v)} if and only if ⟨α_j(q) | v⟩ = 0 and ⟨α_j(q) | v̇⟩ + ⟨Dα_j(q)q̇ | v⟩ = 0 for 1 ≤ j ≤ ℓ.

Because t ↦ q(t) is an admissible motion, the curve t ↦ u(t) = (q(t), q̇(t)) lies in D. Since Tτ_Q(u(t), u̇(t)) = Tτ_Q(q(t), q̇(t), q̇(t), q̈(t)) = (q(t), q̇(t)) lies in D, it follows that u̇(t) ∈ F_{(q(t),q̇(t))}. Now u̇(t) = (q̇(t), q̈(t)) ∈ H_{(q(t),q̇(t))}. To see this note that

\[ 0 = \langle \alpha_j(q(t)) \mid \dot q(t) \rangle, \tag{64} \]

since (q(t), q̇(t)) ∈ D. Differentiating (64) with respect to t gives

\[ 0 = \langle \alpha_j(q(t)) \mid \ddot q(t) \rangle + \langle D\alpha_j(q(t))\dot q(t) \mid \dot q(t) \rangle. \tag{65} \]

By item 3 above, equations (64) and (65) imply that u̇(t) = (q̇(t), q̈(t)) ∈ H_{(q(t),q̇(t))}. Since t ↦ q(t) is an admissible motion, the Hamilton-d'Alembert principle 1.5.12 holds, that is, for every (q̇′, v̇′) ∈ F_{(q,q̇)} we have

\[ \big((\dot q, \ddot q) \,\lrcorner\, \omega\big)(\dot q', \dot v') - dh(q, \dot q)(\dot q', \dot v') = 0. \tag{66} \]

But H_{(q,q̇)} ⊆ F_{(q,q̇)}. So (66) holds for all (q̇′, v̇′) ∈ H_{(q,q̇)}. Thus (66) reads

\[ \dot u \,\lrcorner\, \varpi = \partial_H h. \tag{67} \]

If (q̇′, v̇′) ∈ H^ω_{(q,q̇)} ∩ F_{(q,q̇)}, then (66) reads dh(q, q̇)(q̇′, v̇′) = 0 because (q̇, q̈) ∈ H_{(q,q̇)}. In other words,

\[ \langle dh \mid u \rangle = 0 \quad \text{for all } u \in H^\omega \cap F. \tag{68} \]

This verifies the ⟹ implication of theorem 1.6.3.20. We now prove the converse. Suppose that (67) and (68) hold. Note that

\[ F = (F \cap H) \oplus (F \cap H^\omega) = H \oplus (F \cap H^\omega). \tag{69} \]


From (67) we see that (66) holds for every (q̇′, v̇′) ∈ H_{(q,q̇)}; while from (68) it follows that (66) holds for every (q̇′, v̇′) ∈ H^ω_{(q,q̇)} ∩ F_{(q,q̇)}. Thus from (69) we find that equation (66) holds for every (q̇′, v̇′) ∈ F_{(q,q̇)}. In other words, the Hamilton-d'Alembert principle holds for the curve t ↦ q(t). Let t ↦ u(t) = (q(t), r(t)). Then by (66) the curve t ↦ u̇(t) = (q̇(t), ṙ(t)) satisfies

\[ \omega(q, r)\big((\dot q, \dot r), (\dot q', \dot v')\big) = dh(q, r)(\dot q', \dot v') \tag{70} \]

for every (q̇′, v̇′) ∈ F_{(q,r)}. Using the local expression for the symplectic form ω and differentiating the Hamiltonian h = k + τ_Q^* V, we see that (70) is equivalent to

\[ k(q)(\dot q, \dot v') - k(q)(\dot r, \dot q') = \tfrac{1}{2}\,(Dk(q)\dot q')(r, r) + k(q)(r, \dot v') + DV(q)\dot q' \tag{71} \]

for every (q̇′, v̇′) ∈ F_{(q,r)}. Set q̇′ = 0. Then (71) becomes

\[ k(q)(\dot q, \dot v') = k(q)(r, \dot v') \quad \text{for every } \dot v'. \]

This implies that r = q̇, since k(q) is nondegenerate. Hence u(t) = (q(t), q̇(t)). But (u(t), u̇(t)) ∈ F_{(q(t),q̇(t))}. So Tτ_Q(u(t), u̇(t)) = (q(t), q̇(t)) ∈ D. In other words, the curve t ↦ (q(t), q̇(t)) lies in the constraint distribution D. Hence t ↦ q(t) is an admissible motion.

1.7

Almost Poisson brackets

1.7.1

Hamilton’s equations

In this section we define an almost Poisson bracket and give a formulation of Hamilton's equations of the distributional Hamiltonian system (D, H, ϖ, h) in terms of this bracket. Define a bracket { , } on C^∞(D) by

\[ \{f, g\}(u) = \varpi(u)(Y_g(u), Y_f(u)) = \langle dg(u) \mid Y_f(u) \rangle, \tag{72} \]

for every f, g ∈ C^∞(D) and every u ∈ D. Here Y_f and Y_g are the distributional Hamiltonian vector fields (59) corresponding to f and g, respectively. Because {f, g} = L_{Y_f} g, the bracket { , } satisfies Leibniz' rule

\[ \{f, g \cdot h\} = \{f, g\} \cdot h + g \cdot \{f, h\}, \tag{73} \]

for every f, g, h ∈ C^∞(D). Here (f · g)(u) = f(u)g(u) for every u ∈ D. Therefore the bracket { , } defines an almost Poisson structure on C^∞(D), that is, an antisymmetric bilinear map { , } : C^∞(D) × C^∞(D) → C^∞(D).


Because the distribution D need not be integrable, the bracket { , } need not satisfy the Jacobi identity. Thus an almost Poisson bracket on C^∞(D) need not be a Poisson bracket on C^∞(D). We can use the almost Poisson structure { , } on C^∞(D) to rewrite the distributional form of Hamilton's equations (63) as

\[ \frac{d}{dt} f(u(t)) = \{h, f\}(u(t)) \tag{74} \]

for every f ∈ C^∞(D).

Proposition 1.7.1.21. A curve u : R → D : t ↦ u(t) satisfies equation (74) for every f ∈ C^∞(D) and all t if and only if it is an integral curve of the distributional Hamiltonian vector field Y_h, that is, du(t)/dt = Y_h(u(t)).

Proof. Suppose that t ↦ u(t) is an integral curve of Y_h and that f ∈ C^∞(D). Then

\[ \frac{d}{dt} f(u(t)) = (L_{Y_h} f)(u(t)) = \langle df(u(t)) \mid Y_h(u(t)) \rangle = \{h, f\}(u(t)). \]

Conversely, suppose that (74) holds for every f ∈ C^∞(D). Recall that every tangent vector w_u to D at u can be identified with the derivation f ↦ ⟨df(u) | w_u⟩. From the definition of the bracket { , }, the derivation f ↦ {h, f} corresponds to L_{Y_h}. Hence the tangent vector to the curve t ↦ u(t) at u(t) is Y_h(u(t)), that is, du(t)/dt = Y_h(u(t)). Therefore t ↦ u(t) is an integral curve of Y_h.

Corollary 1.7.1.22. The smooth function h : D → R is constant on the integral curves of Y_h.

Proof. We have

\[ L_{Y_h} h = Y_h \,\lrcorner\, dh = \{h, h\} = 0, \]

since { , } is skew symmetric.

For each u ∈ D we have the inclusion map i_u : H_u → T_uD. Because i_u is injective, its transpose i_u^t : T_u^*D → H_u^* is surjective. Since the skew symmetric form ϖ_u : H_u × H_u → R is bilinear, there is a linear mapping ϖ_u^♭ : H_u → H_u^* defined by ⟨ϖ_u^♭(v_u) | w_u⟩ = ϖ_u(v_u, w_u) for every v_u, w_u ∈ H_u. Because ϖ_u is nondegenerate, the map ϖ_u^♭ is invertible. We denote its inverse by ϖ_u^♯. For each u ∈ D define the map Π_u : T_u^*D × T_u^*D → R by

\[ \Pi_u(\alpha_u, \beta_u) = \langle \beta_u \mid i_u \circ \varpi_u^\sharp \circ i_u^t(\alpha_u) \rangle, \]


for every α_u, β_u ∈ T_u^*D. Then Π_u : T_u^*D × T_u^*D → R is a skew symmetric bilinear map. Moreover the map u ↦ Π_u is smooth and is called the almost Poisson structure tensor field Π associated to the almost Poisson bracket ({ , }, C^∞(D)). Because D is a finite dimensional smooth manifold, for every u ∈ D we have T_u^*D = span{df(u) for all f ∈ C^∞(D)}. Therefore for every f, g ∈ C^∞(D) and every u ∈ D, we have

\[ \Pi_u(df(u), dg(u)) = \langle dg(u) \mid i_u(\varpi_u^\sharp(i_u^t(df(u)))) \rangle = \langle dg(u) \mid Y_f(u) \rangle = \{f, g\}(u). \tag{75} \]

Since Π_u is bilinear, it induces a linear map Π_u^♯ : T_u^*D → T_uD defined by ⟨β_u | Π_u^♯(α_u)⟩ = Π_u(α_u, β_u) for every α_u, β_u ∈ T_u^*D. The map u ↦ Π_u^♯ is smooth.

Lemma 1.7.1.23. For every u ∈ D, we have

\[ H_u = \operatorname{im} \Pi_u^\sharp = \operatorname{span}\{Y_f(u) \text{ for every } f \in C^\infty(D)\}. \tag{76} \]

Proof. We show that the first equality in (76) holds. By definition Π_u^♯(df(u)) = i_u(ϖ_u^♯(i_u^t(df(u)))) = Y_f(u) for every f ∈ C^∞(D). But i_u^t : T_u^*D → H_u^* is surjective and ϖ_u^♯ : H_u^* → H_u is bijective. Therefore im Π_u^♯ = H_u. Next we prove the second equality in (76). Suppose that f ∈ C^∞(D). Since Y_f(u) ∈ H_u ⊆ T_uD, it follows that span{Y_f(u) for every f ∈ C^∞(D)} ⊆ H_u. Conversely, suppose that v_u ∈ H_u. Then ϖ_u^♭(v_u) ∈ H_u^*. Since i_u^t : T_u^*D → H_u^* is surjective and T_u^*D = span{dg(u) for all g ∈ C^∞(D)}, there is a smooth function f on D such that i_u^t(df(u)) = ϖ_u^♭(v_u). In other words, v_u = ϖ_u^♯(i_u^t(df(u))) = Y_f(u). Therefore H_u ⊆ span{Y_f(u) for every f ∈ C^∞(D)}. This proves the lemma.

Now we show how to recover a symplectic generalized distribution starting from an almost Poisson structure tensor field. Suppose that M is a smooth manifold and that { , } is an almost Poisson structure on C^∞(M), that is, a map { , } : C^∞(M) × C^∞(M) → C^∞(M), which is bilinear, skew symmetric, and satisfies Leibniz' rule

\[ \{f, g \cdot h\} = \{f, g\} \cdot h + g \cdot \{f, h\} \]

for every f, g, h ∈ C^∞(M). For every m ∈ M define the map

\[ \Pi_m : T_m^*M \times T_m^*M \to \mathbb{R} : (df(m), dg(m)) \mapsto \{f, g\}(m). \]

Recall that the cotangent space T_m^*M to M at m is equal to span{df(m) | f ∈ C^∞(M)}. Then Π_m is bilinear and skew symmetric.


Moreover, Π_m depends smoothly on m. We call the map m ↦ Π_m the almost Poisson tensor field Π on M associated to the almost Poisson structure { , } on C^∞(M). Because Π_m is bilinear, there is an associated linear map Π_m^♯ : T_m^*M → T_mM defined by ⟨dg(m) | Π_m^♯(df(m))⟩ = Π_m(df(m), dg(m)), which depends smoothly on m. For f ∈ C^∞(M) let P_f(m) = Π_m^♯(df(m)). Then m ↦ P_f(m) is a smooth vector field on M called the almost Poisson vector field associated to f.

Lemma 1.7.1.24. For each m ∈ M let H_m = im Π_m^♯. Then the map m ↦ H_m is a generalized distribution H on M, in the sense that it is locally spanned by smooth vector fields on M.

Proof. By definition H_m = span{P_f(m) ∈ T_mM | f ∈ C^∞(M)}. For each m ∈ M choose f_1, ..., f_{n_m} ∈ C^∞(M) such that {P_{f_i}(m)}_{i=1}^{n_m} spans H_m. Since the function M → Z_{>0} : m ↦ n_m is lower semi-continuous, there is an open neighborhood U of m in M such that it attains a maximum N. Therefore for every m ∈ U we have H_m = span{P_{f_i}(m)}_{i=1}^N. Consequently, m ↦ H_m is a generalized distribution on M.

For each m ∈ M by definition of H_m the linear map Π_m^♯ : T_m^*M → H_m is surjective. Therefore the induced linear map Π̃_m^♯ : T_m^*M / ker Π_m^♯ → H_m is a bijective linear map.

Lemma 1.7.1.25. The linear map

\[ j_m : T_m^*M / \ker \Pi_m^\sharp \to H_m^* : \alpha_m + \ker \Pi_m^\sharp \mapsto \alpha_m|_{H_m} \]

is bijective.

Proof. To see that the map j_m is well defined it suffices to show that ker Π_m^♯|H_m = 0. Let β_m ∈ ker Π_m^♯ and let v_m ∈ H_m. Since Π_m^♯ : T_m^*M → H_m is surjective, there is a γ_m ∈ T_m^*M such that Π_m^♯(γ_m) = v_m. Therefore

\[ \begin{aligned} \beta_m(v_m) &= \beta_m(\Pi_m^\sharp(\gamma_m)) = \Pi_m(\gamma_m, \beta_m) \\ &= -\Pi_m(\beta_m, \gamma_m), \quad \text{since } \Pi_m \text{ is skew symmetric} \\ &= -\gamma_m(\Pi_m^\sharp(\beta_m)) = 0, \quad \text{since } \beta_m \in \ker \Pi_m^\sharp. \end{aligned} \tag{77} \]

Therefore j_m is well defined. Clearly it is linear. Since the inclusion map i_m : H_m → T_mM is injective, its transpose i_m^t : T_m^*M → H_m^* is surjective. Let β_m ∈ H_m^*. Then there is γ_m ∈ T_m^*M such that i_m^t(γ_m) = β_m. For v_m ∈ H_m, we have

\[ \beta_m(v_m) = (i_m^t(\gamma_m))(v_m) = \gamma_m(i_m(v_m)) = \gamma_m(v_m), \]


that is, γ_m|H_m = β_m. So j_m(γ_m + ker Π_m^♯) = γ_m|H_m = β_m. Therefore j_m is surjective. Since Π̃_m^♯ is a bijective linear map, it follows that dim T_m^*M / ker Π_m^♯ = dim H_m = dim H_m^*. Therefore j_m is bijective.

Let ϖ_m^♭ = j_m ∘ (Π̃_m^♯)^{-1} : H_m → H_m^*. Then ϖ_m^♭ is a bijective linear mapping associated to the skew symmetric bilinear map

\[ \varpi_m : H_m \times H_m \to \mathbb{R} : (v_m, w_m) \mapsto \langle \varpi_m^\flat(v_m) \mid w_m \rangle. \]

Thus we have proved

Proposition 1.7.1.26. The generalized distribution (H, ϖ) on M is symplectic.

Corollary 1.7.1.27. For every f ∈ C^∞(M) the Hamiltonian vector field Y_f with respect to the generalized symplectic distribution (H, ϖ) is equal to the almost Poisson vector field P_f.

Proof. For every m ∈ M we have

\[ P_f(m) = \Pi_m^\sharp(df(m)) = \widetilde{\Pi}_m^\sharp(df(m) + \ker \Pi_m^\sharp) \in H_m, \]

by definition of H_m. So

\[ \varpi_m^\flat(P_f(m)) = \varpi_m^\flat\big(\widetilde{\Pi}_m^\sharp(df(m) + \ker \Pi_m^\sharp)\big) = j_m(df(m) + \ker \Pi_m^\sharp) = df(m)|_{H_m} = \varpi_m^\flat(Y_f(m)). \]

Therefore P_f(m) = Y_f(m) because ϖ_m^♭ is invertible.
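The pointwise linear algebra behind lemma 1.7.1.24 through corollary 1.7.1.27 can be checked numerically. The sketch below is only an illustration at a single point m: it assumes T_mM = R⁴, represents Π_m by a hypothetical rank 2 skew matrix P with Π_m(α, β) = αᵀPβ, and uses the pseudoinverse to pick a preimage under Π_m^♯ (lemma 1.7.1.25 is what guarantees that the choice of preimage does not matter). The variable names are illustrative.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
a, b = rng.normal(size=n), rng.normal(size=n)
# Hypothetical almost Poisson tensor at one point: Pi(alpha, beta) = alpha^T P beta
P = np.outer(a, b) - np.outer(b, a)          # skew symmetric, rank 2

def sharp(alpha):
    # <beta | Pi^sharp(alpha)> = Pi(alpha, beta)  =>  Pi^sharp(alpha) = P^T alpha
    return P.T @ alpha

r = np.linalg.matrix_rank(P)
U, _, _ = np.linalg.svd(P)
B = U[:, :r]                                  # orthonormal basis of H = im Pi^sharp

# Matrix of varpi in the basis B: varpi(v, w) = <alpha | w> whenever Pi^sharp(alpha) = v
W = B.T @ np.linalg.pinv(P) @ B

df = rng.normal(size=n)                       # a covector standing in for df(m)
P_f = sharp(df)                               # almost Poisson vector field at m

# Hamiltonian vector field: Y_f in H with varpi(Y_f, w) = <df | w> for all w in H
y = np.linalg.solve(W.T, B.T @ df)
Y_f = B @ y

print("rank of H:", r)
print("P_f == Y_f:", np.allclose(P_f, Y_f))
```

Under these assumptions the recovered form W is skew and nondegenerate on H, and the two vector fields agree, as corollary 1.7.1.27 asserts.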

1.7.2

Nonholonomic Dirac brackets

Next we give a procedure for finding an almost Poisson bracket on D in terms of the Poisson bracket on (TQ, ω), defined by (72). Suppose that the nonholonomic constraint D is a distribution on Q given by D_q = ∩_{i=1}^ℓ ker ϕ_i(q), where ϕ_i, i = 1, ..., ℓ, are 1-forms on Q such that ϕ_i(q) are linearly independent on T_q^*Q for every q ∈ Q. Since D is a distribution, it is a submanifold of TQ, defined by the common zeroes of the constraint functions

\[ c_i : TQ \to \mathbb{R} : u \mapsto \langle \phi_i(\tau_Q(u)) \mid u \rangle, \quad 1 \le i \le \ell. \tag{78} \]

Here τ_Q : TQ → Q is the tangent bundle projection. Note that for every v_u ∈ T_u(TQ)

\[ c_i(u) = \langle \tau_Q^* \phi_i(u) \mid v_u \rangle, \tag{79} \]


because

\[ \langle \tau_Q^* \phi_i(u) \mid v_u \rangle = \langle \phi_i(\tau_Q(u)) \mid T_u\tau_Q v_u \rangle = \langle \phi_i(\tau_Q(u)) \mid u \rangle. \]

Now consider the constraint map c : TQ → R^ℓ : u ↦ (c_1(u), ..., c_ℓ(u)). Then

Fact 1.7.2.28. 0 is a regular value of the map c.

Proof. First we show that for every u ∈ TQ the 1-forms {dc_i(u)}_{i=1}^ℓ are linearly independent. Suppose that 0 = Σ_{i=1}^ℓ α_i dc_i(u). Then 0 = Σ_{i=1}^ℓ α_i dτ_Q^*ϕ_i(u), using (79), which implies 0 = τ_Q^*(Σ_{i=1}^ℓ α_i dϕ_i(q)), where q = τ_Q(u). Hence 0 = Σ_{i=1}^ℓ α_i dϕ_i(q). But then α_i = 0 for all 1 ≤ i ≤ ℓ because {dϕ_i(q)}_{i=1}^ℓ are linearly independent. Since {dc_i(u)}_{i=1}^ℓ are linearly independent, the derivative of the map c is surjective.

Next we show that

Fact 1.7.2.29. c^{-1}(0) = D.

Proof. To see this note that

\[ \begin{aligned} c_i(u) = 0 \text{ for all } i &\iff \langle \phi_i(\tau_Q(u)) \mid u \rangle = 0 \text{ for all } i \\ &\iff u \in \ker \phi_i(q) \text{ for all } i, \text{ where } q = \tau(u) \\ &\iff u \in D_q. \end{aligned} \]

We now prove

Fact 1.7.2.30. For every u ∈ D we have

\[ T_u^\omega D = \operatorname{span}\{X_{c_1}(u), \ldots, X_{c_\ell}(u)\}. \tag{80} \]

Proof. For every v_u ∈ T_uD,

\[ \omega_u(X_{c_i}(u), v_u) = \langle dc_i(u) \mid v_u \rangle = 0. \]

So X_{c_i}(u) ∈ T_u^ωD. Therefore {X_{c_i}(u)}_{i=1}^ℓ ⊆ T_u^ωD. Note that {X_{c_i}(u)}_{i=1}^ℓ are linearly independent because if

\[ 0 = \sum_{i=1}^{\ell} \alpha_i X_{c_i}(u) = \omega_u^\sharp\Big(\sum_{i=1}^{\ell} \alpha_i\, dc_i(u)\Big), \]

then 0 = Σ_{i=1}^ℓ α_i dc_i(u). But this implies α_i = 0 for 1 ≤ i ≤ ℓ. From fact 1.7.2.28 it follows that (T_uD)° = span{dc_i(u)}_{i=1}^ℓ. Moreover, ω_u^♯ is a bijective linear map from (T_uD)° onto T_u^ωD. Therefore dim T_u^ωD = ℓ, which completes the proof.


By definition, F (28) is the distribution on TQ whose image under Tτ_Q is the distribution D. In our situation we have

Fact 1.7.2.31. For every u ∈ TQ

\[ F_u^\omega = \operatorname{span}\{\omega_u^\sharp(\tau_Q^*\phi_i(u)) \mid i = 1, \ldots, \ell\}. \tag{81} \]

Proof. To see this we argue as follows. By definition, ω_u^♯(τ_Q^*ϕ_i(u)) is the unique vector in T_u(TQ) such that for every w ∈ T_u(TQ)

\[ \langle \phi_i(q) \mid T\tau_Q w \rangle = \langle \tau_Q^*\phi_i(u) \mid w \rangle = \omega\big(\omega_u^\sharp(\tau_Q^*\phi_i(u)), w\big), \tag{82} \]

where q = τ_Q(u). Now

\[ w \in F_u \iff T\tau_Q w \in D_q \iff \langle \tau_Q^*\phi_i(u) \mid w \rangle = \langle \phi_i(\tau_Q(u)) \mid T\tau_Q w \rangle = 0 \quad \forall\, i. \]

From (82) it follows that ω_u^♯(τ_Q^*ϕ_i(u)) ∈ F_u^ω.

Conversely, suppose that w′ ∈ F_u^ω. Let q = τ_Q(u). Since w′ ∈ T_D(TQ), lemma 1.6.3.18 implies that there is a v′ ∈ T_qQ k-orthogonal to D_q such that w′ = ι_u(v′). Consider the 1-form on Q defined by ϕ(q) = k(q)^♭(v′). Then for every u′ ∈ D_q, we have ⟨ϕ(q) | u′⟩ = k(v′, u′) = 0 because v′ is k-orthogonal to D_q. Thus ϕ(q) is a linear combination of the 1-forms ϕ_i(q). For every w ∈ T_u(TQ) we have

\[ \omega(u)(w', w) = k(q)(v', T\tau_Q w) = \langle \phi(\tau_Q(u)) \mid T\tau_Q w \rangle = \langle \tau_Q^*\phi(u) \mid w \rangle, \]

which implies

\[ w' \in \operatorname{span}\{\omega_u^\sharp(\tau_Q^*\phi_i)(u) \mid i = 1, \ldots, \ell\}. \]

This proves (81).

By definition of the symplectic distribution H = F ∩ TD (34) on D, using theorem 1.6.3.20 we have F_u^ω ∩ T_u^ωD = {0} for every u ∈ D. Consequently,

\[ H_u^\omega = T_u^\omega D \oplus F_u^\omega = \operatorname{span}\{X_{c_1}(u), \ldots, X_{c_\ell}(u), \omega_u^\sharp(\tau_Q^*\phi_1(u)), \ldots, \omega_u^\sharp(\tau_Q^*\phi_\ell(u))\} \tag{83} \]

is an ω_u-symplectic subspace of (T_u(TQ), ω_u) of dimension 2ℓ. The Poisson bracket [f_1, f_2] of f_1, f_2 ∈ C^∞(TQ), with respect to the symplectic form ω, can be expressed in terms of the differentials df_1 and df_2 as

\[ [f_1, f_2] = \omega(\omega^\sharp(df_1), \omega^\sharp(df_2)). \]


In the above formula, we can replace differentials by arbitrary 1-forms and extend the notion of the Poisson bracket to 1-forms which need not be exact. In particular, we can write

\[ \begin{aligned} [\tau_Q^*\phi_i, \tau_Q^*\phi_j] &= \omega(\omega^\sharp(\tau_Q^*\phi_i), \omega^\sharp(\tau_Q^*\phi_j)), \\ [c_i, \tau_Q^*\phi_j] &= \omega(\omega^\sharp(dc_i), \omega^\sharp(\tau_Q^*\phi_j)). \end{aligned} \tag{84} \]

Since T_D(TQ) = H ⊕ H^ω, for every f ∈ C^∞(TQ), we can express the restriction of X_f to points in D in the form X_f|D = Y_f + Z_f, where Y_f has values in H and Z_f has values in H^ω. Similarly, for g ∈ C^∞(TQ), the restriction dg|T_D(TQ) of dg to points in D can be written as dg|T_D(TQ) = ∂_H g ⊕ ∂_{H^ω} g, where ∂_H g annihilates H^ω and ∂_{H^ω} g annihilates H. Hence, the restriction of [f, g] to D can be expressed as

\[ \begin{aligned} [f, g]\big|_D &= \langle dg \mid X_f \rangle\big|_D = \langle dg \mid Y_f + Z_f \rangle\big|_D \\ &= \langle \partial_H g \oplus \partial_{H^\omega} g \mid Y_f \rangle + \langle dg \mid Z_f \rangle \\ &= \langle \partial_H g \mid Y_f \rangle + \langle dg \mid Z_f \rangle = \{f|_D, g|_D\} + \langle dg \mid Z_f \rangle, \end{aligned} \]

where {f|_D, g|_D} is the nonholonomic almost Poisson bracket of f|_D and g|_D in C^∞(D). From equation (83) it follows that

\[ Z_f(u) = a_1(u) X_{c_1}(u) + \cdots + a_\ell(u) X_{c_\ell}(u) + a_{\ell+1}(u)\, \omega_u^\sharp(\tau_Q^*\phi_1(u)) + \cdots + a_{2\ell}(u)\, \omega_u^\sharp(\tau_Q^*\phi_\ell(u)) \]

for every u ∈ D. We can express the coefficients a_1(u), ..., a_{2ℓ}(u) in terms of Poisson brackets of c_1, ..., c_ℓ and τ_Q^*ϕ_1, ..., τ_Q^*ϕ_ℓ with f as follows. Since ⟨dc_j | Y_f⟩ = 0 and ⟨τ_Q^*ϕ_j | Y_f⟩ = 0 for all j = 1, ..., ℓ, we get

\[ \begin{aligned} [f, c_j] &= \langle dc_j \mid X_f \rangle = \langle dc_j \mid Z_f \rangle = \sum_{i=1}^{\ell} a_i \langle dc_j \mid X_{c_i} \rangle + \sum_{i=1}^{\ell} a_{\ell+i} \langle dc_j \mid \omega^\sharp(\tau_Q^*\phi_i) \rangle, \\ [f, \tau_Q^*\phi_j] &= \langle \tau_Q^*\phi_j \mid X_f \rangle = \langle \tau_Q^*\phi_j \mid Z_f \rangle = \sum_{i=1}^{\ell} a_i \langle \tau_Q^*\phi_j \mid X_{c_i} \rangle + \sum_{i=1}^{\ell} a_{\ell+i} \langle \tau_Q^*\phi_j \mid \omega^\sharp(\tau_Q^*\phi_i) \rangle. \end{aligned} \]

Using the definition of the extended Poisson brackets (84) we are led to a system of 2ℓ linear equations

\[ \begin{aligned} \sum_{i=1}^{\ell} a_i [c_i, c_j] + \sum_{i=1}^{\ell} a_{\ell+i} [\tau_Q^*\phi_i, c_j] &= [f, c_j] \\ \sum_{i=1}^{\ell} a_i [c_i, \tau_Q^*\phi_j] + \sum_{i=1}^{\ell} a_{\ell+i} [\tau_Q^*\phi_i, \tau_Q^*\phi_j] &= [f, \tau_Q^*\phi_j] \end{aligned} \tag{85} \]


for a_1, ..., a_{2ℓ}. This system can be written in terms of matrices as Aa = b, where

\[ A = \begin{pmatrix} ([c_j, c_i]) & ([c_j, \tau_Q^*\phi_i]) \\ ([\tau_Q^*\phi_j, c_i]) & ([\tau_Q^*\phi_j, \tau_Q^*\phi_i]) \end{pmatrix}. \tag{86} \]

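Claim 1.7.2.32. The matrix A is invertible.

Proof. We compute each block of A. First,

\[ \omega_u(X_{c_j}(u), X_{c_i}(u)) = \langle dc_j(u) \mid X_{c_i}(u) \rangle = [c_j, c_i]_u. \]

Next

\[ \omega_u(\omega^\sharp(\tau_Q^*\phi_i(u)), X_{c_j}(u)) = \langle \tau_Q^*\phi_i(u) \mid X_{c_j}(u) \rangle = [c_j, \tau_Q^*\phi_i]_u. \]

Finally

\[ \omega_u(\omega^\sharp(\tau_Q^*\phi_j(u)), \omega^\sharp(\tau_Q^*\phi_i(u))) = \langle \tau_Q^*\phi_j(u) \mid \omega^\sharp(\tau_Q^*\phi_i(u)) \rangle = [\tau_Q^*\phi_j, \tau_Q^*\phi_i]_u. \]

Therefore the matrix A = (A_{ij}) is the matrix of the symplectic form ω with respect to the basis (83) of H^ω. Hence A is invertible because (H_u^ω, ω_u) is a symplectic subspace of (T_u(TQ), ω_u).

Let

\[ A^{-1} = (A^{ij}) = \begin{pmatrix} (b_{ij}) & (c_{ij}) \\ (-c_{ji}) & (d_{ij}) \end{pmatrix} \]

be the inverse of the matrix A. Solving (85) gives

\[ \begin{aligned} a_i &= \sum_{j=1}^{\ell} \big( b_{ij} [f, c_j] + c_{ij} [f, \tau_Q^*\phi_j] \big), \\ a_{\ell+i} &= \sum_{j=1}^{\ell} \big( -c_{ji} [f, c_j] + d_{ij} [f, \tau_Q^*\phi_j] \big), \end{aligned} \]

when written out in components. Now

\[ \begin{aligned} \langle dg \mid Z_f \rangle &= \sum_{i=1}^{\ell} \big( a_i \langle dg \mid X_{c_i} \rangle + a_{\ell+i} \langle dg \mid \omega^\sharp(\tau_Q^*\phi_i) \rangle \big) \\ &= \sum_{i=1}^{\ell} \big( a_i [g, c_i] + a_{\ell+i} [g, \tau_Q^*\phi_i] \big) \\ &= \sum_{i=1}^{\ell} \sum_{j=1}^{\ell} \Big( \big( b_{ij} [f, c_j] + c_{ij} [f, \tau_Q^*\phi_j] \big) [g, c_i] + \big( -c_{ji} [f, c_j] + d_{ij} [f, \tau_Q^*\phi_j] \big) [g, \tau_Q^*\phi_i] \Big) \\ &= \sum_{i=1}^{\ell} \sum_{j=1}^{\ell} \Big( [g, c_i]\, b_{ij}\, [f, c_j] + [g, c_i]\, c_{ij}\, [f, \tau_Q^*\phi_j] - [g, \tau_Q^*\phi_i]\, c_{ji}\, [f, c_j] + [g, \tau_Q^*\phi_i]\, d_{ij}\, [f, \tau_Q^*\phi_j] \Big). \end{aligned} \]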


This implies that

\[ \{f|_D, g|_D\} = [f, g]\big|_D - \sum_{1 \le i, j \le \ell} \big([g, c_i],\ [g, \tau_Q^*\phi_i]\big)\, A^{ij} \begin{pmatrix} [f, c_j] \\ [f, \tau_Q^*\phi_j] \end{pmatrix}. \tag{87} \]
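The structure of (85) through (87) is that of a symplectic projection: the coefficients a solve a linear system whose matrix is the Gram matrix of ω on a basis of H^ω, and subtracting Z_f from X_f leaves the component Y_f. The following Python sketch shows this in a purely linear model. Everything in it is an assumption made for illustration: the ambient space is R⁴ with the standard symplectic matrix J, and the two columns of E are a hypothetical basis standing in for (83).

```python
import numpy as np

n = 2
Z0, I = np.zeros((n, n)), np.eye(n)
J = np.block([[Z0, I], [-I, Z0]])        # omega(u, v) = u^T J v on R^4

# Hypothetical basis e_1, e_2 (columns) of a symplectic subspace playing the role of H^omega
E = np.array([[1.0, 0.0],
              [0.0, 1.0],
              [0.5, 1.0],
              [0.0, 2.0]])
A = E.T @ J @ E                          # A[i, j] = omega(e_i, e_j), the analogue of (86)

def ham_vec(df):
    # X_f with omega(X_f, w) = <df | w> for all w; for this J, X_f = J df
    return J @ df

def split(X):
    # Solve sum_i a_i omega(e_i, e_j) = omega(X, e_j), the analogue of (85)
    b = X @ J @ E
    a = np.linalg.solve(A.T, b)
    Z_part = E @ a
    return X - Z_part, Z_part            # (component in H, component in H^omega)

def bracket_H(df, dg):
    # Analogue of (87): <dg | Y_f> = <dg | X_f> - <dg | Z_f>
    Y_f, _ = split(ham_vec(df))
    return float(dg @ Y_f)

df = np.array([1.0, 2.0, 3.0, 4.0])
dg = np.array([-1.0, 0.5, 2.0, 0.0])
Y_f, Z_f = split(ham_vec(df))
print("omega(Y_f, e_j):", Y_f @ J @ E)
print("bracket:", bracket_H(df, dg), "reversed:", bracket_H(dg, df))
```

In this model Y_f is ω-orthogonal to every e_j, and the projected bracket stays antisymmetric, which is what the invertibility of A in claim 1.7.2.32 makes possible.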

In chapter 5, in order to find the equations of motion for Carathéodory's sleigh, we use the special case of (87) when ℓ = 1. Written out explicitly, we get

\[ \{f|_D, g|_D\} = [f, g]\big|_D + \frac{1}{[c_1, \tau_Q^*\phi_1]}\, \big([g, c_1],\ [g, \tau_Q^*\phi_1]\big) \begin{pmatrix} 0 & [c_1, \tau_Q^*\phi_1] \\ -[c_1, \tau_Q^*\phi_1] & 0 \end{pmatrix} \begin{pmatrix} [f, c_1] \\ [f, \tau_Q^*\phi_1] \end{pmatrix}. \tag{88} \]

1.8

Momenta and the momentum equation

In this section Q is a smooth manifold with a Riemannian metric k.

1.8.1

Momentum functions

Let Z be a smooth vector field on Q. The momentum function associated to Z is µ_Z : T^*Q → R : α_q ↦ ⟨α_q | Z(q)⟩. Let ϕ_s : Q → Q be the flow of Z. Then the flow of the cotangent lift Z_{T^*Q} of Z to T^*Q is ϕ̃_s : T^*Q → T^*Q : α_q ↦ (T_qϕ_{−s})^t α_q.

Note that Tπ_Q ∘ Z_{T^*Q} = Z ∘ π_Q, where π_Q : T^*Q → Q is the cotangent bundle projection map.

Lemma 1.8.1.33. Z_{T^*Q} is the Hamiltonian vector field X_{µ_Z} on (T^*Q, ω_Q) corresponding to the momentum function µ_Z. Here ω_Q = −dθ_Q, where θ_Q is the canonical 1-form on T^*Q.

Proof. First we show that θ_Q is invariant under the flow of Z_{T^*Q}. For u ∈ T_qQ, α ∈ T_q^*Q, and v_α ∈ T_α(T^*Q) we have

\[ \begin{aligned} \langle \widetilde{\phi}_s^* \theta_Q(\alpha) \mid v_\alpha \rangle &= \langle \theta_Q(\widetilde{\phi}_s(\alpha)) \mid T_u \widetilde{\phi}_s v_\alpha \rangle = \langle \widetilde{\phi}_s(\alpha) \mid T_u(\tau_Q \circ \widetilde{\phi}_s) v_\alpha \rangle, \quad \text{where } \tau_Q \text{ is the tangent bundle projection map} \\ &= \langle \widetilde{\phi}_s(\alpha) \mid T\phi_s(T_u \tau_Q v_\alpha) \rangle = \langle \alpha \mid u \rangle = \langle \theta_Q(\alpha) \mid v_\alpha \rangle. \end{aligned} \]

Therefore

\[ 0 = (L_{Z_{T^*Q}} \theta_Q)(\alpha) = (Z_{T^*Q} \,\lrcorner\, d\theta_Q)(\alpha) + d\langle \theta_Q(\alpha) \mid Z_{T^*Q}(\alpha) \rangle, \]


that is,

\[ (Z_{T^*Q} \,\lrcorner\, \omega_Q)(\alpha) = d\langle \theta_Q(\alpha) \mid Z_{T^*Q}(\alpha) \rangle. \]

Since ⟨θ_Q(α) | Z_{T^*Q}(α)⟩ = ⟨α | Tπ_Q Z_{T^*Q}(α)⟩ = ⟨α | Z(π_Q(α))⟩ = µ_Z(α), the right hand side is dµ_Z(α), which proves the lemma.

Using the vector bundle isomorphism k^♭ : TQ → T^*Q, we pull back the momentum function µ_Z to a smooth function P_Z on TQ. In other words, P_Z(v_q) = k(q)(v_q, Z(q)) for every q ∈ Q and every v_q ∈ T_qQ. P_Z is the momentum function on TQ associated to the vector field Z with respect to the Riemannian metric k. Let X_{P_Z} be the Hamiltonian vector field corresponding to P_Z on the symplectic manifold (TQ, ω), where ω = (k^♭)^*ω_Q. The tangent lift Z_{TQ} of the vector field Z is the vector field on TQ whose flow ϕ_s : TQ → TQ is ϕ_s(v_q) = T_qϕ_s v_q for every q ∈ Q. Note that Tτ_Q ∘ Z_{TQ} = Z ∘ τ_Q.

Lemma 1.8.1.34. We have

\[ T\tau_Q \circ X_{P_Z} = Z \circ \tau_Q. \tag{89} \]

Also X_{P_Z} = Z_{TQ} if and only if the flow of Z preserves the Riemannian metric k.

Proof. Pulling back both sides of X_{µ_Z} ⌟ ω_Q = dµ_Z by k^♭ gives

\[ (k^\flat)^* X_{\mu_Z} \,\lrcorner\, \omega = (k^\flat)^* X_{\mu_Z} \,\lrcorner\, (k^\flat)^* \omega_Q = (k^\flat)^* (X_{\mu_Z} \,\lrcorner\, \omega_Q) = (k^\flat)^* d\mu_Z = d((k^\flat)^* \mu_Z) = dP_Z. \]

Therefore X_{P_Z} = (k^♭)^*X_{µ_Z}. From Tπ_Q ∘ k^♭ = Tτ_Q it follows that

\[ T\tau_Q \circ X_{P_Z} = T\pi_Q \circ k^\flat \circ (k^\flat)^* X_{\mu_Z} = T\pi_Q \circ X_{\mu_Z} \circ k^\flat = Z \circ \pi_Q \circ k^\flat = Z \circ \tau_Q. \]

This proves (89). Since X_{P_Z} = (k^♭)^*X_{µ_Z} and X_{µ_Z} = Z_{T^*Q}, it follows that X_{P_Z} = Z_{TQ} if and only if Z_{TQ} = (k^♭)^*Z_{T^*Q}. This equality holds if and only if Tϕ_s = (k^♭)^{-1} ∘ (Tϕ_{−s})^t ∘ k^♭, that is, for every q ∈ Q and every v_q ∈ T_qQ we have

\[ k^\flat(\phi_s(q))(T_q\phi_s v_q) = k^\flat(q)(v_q) \circ T_{\phi_s(q)}\phi_{-s}. \tag{90} \]

Evaluating both sides of (90) on T_qϕ_s w_q, where w_q ∈ T_qQ, we get k(ϕ_s(q))(T_qϕ_s v_q, T_qϕ_s w_q) = k(q)(v_q, w_q), for every q ∈ Q and every v_q, w_q ∈ T_qQ. In other words, the flow of Z is an isometry.
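The isometry criterion of lemma 1.8.1.34 is easy to probe on a small example. The sketch below assumes Q = R² with the Euclidean metric and compares two illustrative vector fields that are not taken from the text: the rotation field Z(x, y) = (−y, x), whose flow is an isometry, and the stretching field Z(x, y) = (x, 0), whose flow is not. Since P_Z is constant along the flow of X_{P_Z}, invariance of P_Z under the tangent lift is a necessary condition for X_{P_Z} = Z_{TQ}.

```python
import numpy as np

def P_Z(Z, q, v):
    # Momentum function on TQ for the Euclidean metric: P_Z(v_q) = k(q)(v_q, Z(q))
    return float(np.dot(v, Z(q)))

def rotation_field(q):
    return np.array([-q[1], q[0]])

def stretch_field(q):
    return np.array([q[0], 0.0])

def rotation_flow_tangent(s):
    # The rotation flow is linear, so T phi_s is the rotation matrix itself
    c, si = np.cos(s), np.sin(s)
    return np.array([[c, -si], [si, c]])

def stretch_flow_tangent(s):
    # phi_s(x, y) = (e^s x, y) is linear, so T phi_s = diag(e^s, 1)
    return np.diag([np.exp(s), 1.0])

def preserves_metric(T):
    # For the Euclidean metric, T is an isometry iff T^T T = I
    return bool(np.allclose(T.T @ T, np.eye(2)))

q = np.array([0.3, -1.1])
v = np.array([0.8, 0.5])
s = 0.7
R, S = rotation_flow_tangent(s), stretch_flow_tangent(s)

# Both flows are linear, so the tangent lift sends (q, v) to (T q, T v)
rot_invariant = bool(np.isclose(P_Z(rotation_field, R @ q, R @ v), P_Z(rotation_field, q, v)))
stretch_invariant = bool(np.isclose(P_Z(stretch_field, S @ q, S @ v), P_Z(stretch_field, q, v)))

print("rotation:", preserves_metric(R), rot_invariant)
print("stretch: ", preserves_metric(S), stretch_invariant)
```

For the rotation field P_Z is the angular momentum x v_y − y v_x, and it is invariant under the tangent lift; for the stretching field it is not, consistent with the lemma.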

1.8.2

Momentum equations

Let (D, H, ϖ, h) be a distributional Hamiltonian system with constraint distribution D on Q and symplectic distribution (H, ϖ) on D. A smooth curve γ : I ⊆ R → TQ satisfies the second order equation condition for f ∈ C^∞(Q) if and only if for every t ∈ I

\[ \frac{d}{dt}\,(\tau_Q^* f)(\gamma(t)) = \langle df(q(t)) \mid v(t) \rangle, \tag{91} \]

where q(t) = τ_Q(γ(t)) and v(t) = γ(t) ∈ T_{q(t)}Q. Note that (91) is equivalent to

\[ \langle df(q(t)) \mid T_{\gamma(t)}\tau_Q(\dot\gamma(t)) \rangle = \langle df(q(t)) \mid v(t) \rangle \tag{92} \]

for every t ∈ I.

Lemma 1.8.2.35. For every f ∈ C^∞(Q) the following statements are equivalent.

1. Y_f defines a second order differential equation. In other words, if I ⊆ R → D ⊆ TQ : t ↦ (q(t), v(t)) is an integral curve of Y_f, then v(t) = dq(t)/dt for every t ∈ I.
2. There is a function V ∈ C^∞(Q) such that

\[ f(q, v_q) = \tfrac{1}{2}\, k(q)(v_q, v_q) + V(q), \tag{93} \]

for every (q, v_q) ∈ D.

Proof. Using a local trivialization of D and the notation of §1.6.3, we may write Y_f = a^→ + b^↑, see (45). Writing u = (q, v) ∈ D_q we have b^↑(u) = ι_u(a_Q(u)). Therefore

\[ \langle df(u) \mid \iota_u(z) \rangle = k(q)(a_Q(u), z), \tag{94} \]

for every z = b_Q(u) ∈ D_q. Because a_Q(u) = T_uτ_Q Y_f(u), statement 1 is equivalent to a_Q(u) = v. Therefore (94) is equivalent to

\[ \frac{d}{dt}\Big|_{t=0} f(u + tz) = k(q)(v, z), \tag{95} \]

for every z ∈ D_q. But (95) is equivalent to (93).

A smooth curve γ : I ⊆ R → TQ satisfies the momentum equation for the smooth vector field Z on Q if and only if

\[ \frac{d}{dt} P_Z(\gamma(t)) = -L_{P_Z} h(\gamma(t)), \tag{96} \]


for every t ∈ I.

Theorem 1.8.2.36. Every integral curve I ⊆ R → D : t ↦ u(t) of the distributional Hamiltonian vector field Y_h associated to the distributional Hamiltonian system (D, H, ϖ, h) satisfies the second order equation condition for every f ∈ C^∞(Q) and the momentum equation for every smooth vector field Z on Q, which takes values in the constraint distribution D. Conversely, let γ : I ⊆ R → D ⊆ TQ : t ↦ u(t) = (q(t), v(t)) be a smooth curve. Let ℱ be a collection of smooth functions f on Q such that span{df(q(t))|D_{q(t)}} = D^*_{q(t)} for every t ∈ I and let 𝒵 be a collection of smooth vector fields on Q with values in D such that span{Z(q(t))} = D_{q(t)} for every t ∈ I. If γ satisfies the second order equation condition for every f ∈ ℱ and the momentum equation for every Z ∈ 𝒵, then γ is an integral curve of Y_h.

Proof. Suppose that γ : I ⊆ R → D ⊆ TQ : t ↦ u(t) = (q(t), v(t)) is an integral curve of Y_h. From lemma 1.8.2.35 it follows that T_{γ(t)}τ_Q(γ̇(t)) = v(t). Therefore (92) holds for all t ∈ I, that is, the second order equation condition holds for every f ∈ C^∞(Q). Let Z be a smooth vector field on Q with values in D. Since Tτ_Q ∘ X_{P_Z} = Z ∘ τ_Q and Z(Q) ⊆ D, it follows that X_{P_Z} has values in the distribution F (28). Therefore

\[ L_{Y_h} P_Z = Y_h \,\lrcorner\, dP_Z = \omega(Y_h, X_{P_Z}) = -\omega(X_{P_Z}, \dot\gamma) = -X_{P_Z} \,\lrcorner\, dh = -L_{P_Z} h. \]

Consequently, γ satisfies the momentum equation for Z.

To prove the converse we start by observing that because γ is a smooth curve in D, it follows that v(t) ∈ D_{q(t)} and T_{γ(t)}τ_Q γ̇(t) ∈ D_{q(t)} for every t ∈ I. Since (92) holds, we obtain T_{γ(t)}τ_Q γ̇(t) = v(t). From T_{γ(t)}τ_Q(Y_h(γ(t))) = v(t), it follows that T_{γ(t)}τ_Q λ(t) = 0 for every t ∈ I, where λ(t) = γ̇(t) − Y_h(γ(t)). In addition, we have λ(t) ∈ T_{γ(t)}D ∩ ker T_{γ(t)}τ_Q, since γ̇(t) and Y_h(γ(t)) both lie in T_{γ(t)}D. Therefore for every t ∈ I we see that λ(t) ∈ D_{γ(t)}, if we identify ker T_{γ(t)}τ_Q with T_{γ(t)}Q. On the other hand, the assumption that the momentum equation holds for γ implies Y_h ⌟ dP_Z = −L_{X_{P_Z}}h = ⟨dP_Z | γ̇⟩. Consequently, ⟨dP_Z | λ(t)⟩ = 0 for every Z ∈ 𝒵. Because λ(t) belongs to the tangent space of each fiber of the tangent bundle projection map τ_Q and the fiber derivative of P_Z is k^♭(Z), see (95), it follows that k(γ(t))(Z(γ(t)), λ(t)) = 0 for every Z ∈ 𝒵. Because span_{Z∈𝒵}{Z(γ(t))} = D_{γ(t)}, we obtain λ(t) = 0 for every t ∈ I. In other words, γ̇(t) = Y_h(γ(t)) for every t ∈ I, that is, t ↦ γ(t) is an integral curve of Y_h.

1.8.3

Homogeneous functions

Because we have assumed that the nonholonomic constraints are linear, the constraint manifold D is a smooth vector subbundle of the tangent bundle TQ of configuration space Q. For every nonnegative integer k let C_k^∞(D) be the set of all smooth functions f on D such that for each q ∈ Q the restriction f|D_q of f to the vector space D_q is a homogeneous function of degree k, that is, f(r u_q) = r^k f(u_q) for every r ∈ R \ {0} and every u_q ∈ D_q. For fixed q ∈ Q the function f|D_q is smooth at the origin of D_q. Therefore f|D_q is a homogeneous polynomial on D_q of degree k.

Let C_0^∞(D) be the space of smooth functions on D which are constant on the fibers of the vector bundle τ = τ_Q|D : D ⊆ TQ → Q. Then C_0^∞(D) = {τ^*f ∈ C^∞(D) | f ∈ C^∞(Q)}. This proves

Lemma 1.8.3.37. The map τ^* : C^∞(Q) → C_0^∞(D) is an isomorphism of algebras.

Define the space C_1^∞(D) = {f ∈ C^∞(D) | f|D_q : D_q ⊆ T_qQ → R is linear for every q ∈ Q}. Because k^♭(q)|D_q : D_q ⊆ T_qQ → D_q^* ⊆ T_q^*Q is a linear isomorphism, there is a unique Z(q) ∈ D_q such that k(q)(Z(q), u_q) = (f|D_q)(u_q) for every u_q ∈ D_q. Moreover, the map q ↦ Z(q) is smooth. Therefore f = P_Z|D, where P_Z is the momentum function corresponding to the vector field Z on Q.

Let X^∞(Q, D) be the space of smooth vector fields on Q with values in the distribution D. The discussion above proves

Lemma 1.8.3.38. The mapping X^∞(Q, D) → C_1^∞(D) : Z ↦ P_Z|D is an isomorphism of vector spaces.

Let C_hom^∞(D) be the vector space of smooth homogeneous functions on D.

Lemma 1.8.3.39. C_hom^∞(D) is a graded almost Poisson algebra. In particular, if f ∈ C_k^∞(D) and g ∈ C_ℓ^∞(D), then {f, g} ∈ C_{k+ℓ−1}^∞(D).

Proof. For any r ∈ R \ {0} let m_r : T_qQ → T_qQ : v_q ↦ r v_q. Then m_r^t : T_q^*Q → T_q^*Q : α_q ↦ r α_q. Let θ_Q be the canonical 1-form on T^*Q. Then for every α_q ∈ T^*Q and w_α ∈ T_α(T^*Q) we get

\[ \begin{aligned} \langle (m_r^t)^* \theta_Q(\alpha_q) \mid w_\alpha \rangle &= \langle \theta_Q(r\alpha_q) \mid T_{\alpha_q} m_r w_\alpha \rangle = \langle r\alpha_q \mid T(\tau_Q \circ m_r) w_\alpha \rangle \\ &= \langle r\alpha_q \mid T\tau_Q w_\alpha \rangle, \quad \text{since } \tau_Q \circ m_r = \tau_Q \\ &= r\, \langle \theta_Q(\alpha_q) \mid w_\alpha \rangle. \end{aligned} \]


So (m_r^t)^*θ_Q = r θ_Q. Therefore

\[ (m_r^t)^* \omega_Q = (m_r^t)^* (-d\theta_Q) = -d\big((m_r^t)^* \theta_Q\big) = -d(r\theta_Q) = -r\, d\theta_Q = r\, \omega_Q. \]

Note that the map m_r leaves the distribution D invariant and Tm_r leaves the distribution H invariant. Since f ∈ C_k^∞(D) if and only if m_r^*f = r^k f, we obtain (m_r^t)^*df = d(m_r^*f) = d(r^k f) = r^k df. Therefore for each u, u′ ∈ D we have

\[ \begin{aligned} r^k \omega(Y_f(u), u') &= \langle df(m_r(u)) \mid T_u m_r(u') \rangle \\ &= \omega(m_r(u))\big(Y_f(m_r(u)), T_u m_r(u')\big) \\ &= \big((m_r^t)^* \omega\big)(u)(m_r^* Y_f(u), u'), \end{aligned} \]

that is,

\[ m_r^* Y_f = r^{k-1} Y_f. \tag{97} \]

Consequently,

\[ \begin{aligned} m_r^*(\{f, g\})(u) &= \{f, g\}(m_r(u)) = \langle dg(m_r(u)) \mid Y_f(m_r(u)) \rangle \\ &= r^{k-1} \langle dg(m_r(u)) \mid T_u m_r Y_f(u) \rangle, \quad \text{using (97) and } f \in C_k^\infty(D) \\ &= r^{k-1} \langle (m_r^t)^*(dg)(u) \mid Y_f(u) \rangle \\ &= r^{k+\ell-1} \langle dg(u) \mid Y_f(u) \rangle, \quad \text{because } g \in C_\ell^\infty(D) \\ &= r^{k+\ell-1} \{f, g\}(u). \end{aligned} \]
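The grading in lemma 1.8.3.39 can be checked numerically in the simplest situation. The sketch below assumes the unconstrained flat case D = TQ with Q = R² and the Euclidean metric, where the bracket reduces to the canonical one, {f, g} = ⟨dg | X_f⟩ with X_f = (∂f/∂v, −∂f/∂q). This special case, the sample functions, and the finite-difference gradient are illustrative assumptions, not part of the text.

```python
import numpy as np

def grad(fun, x, h=1e-5):
    # Central-difference gradient of fun at x
    g = np.zeros_like(x)
    for i in range(x.size):
        e = np.zeros_like(x)
        e[i] = h
        g[i] = (fun(x + e) - fun(x - e)) / (2 * h)
    return g

def bracket(f, g, q, v):
    # Canonical bracket on TQ = R^n x R^n (assumed flat, unconstrained case):
    # {f, g} = <dg | X_f> with X_f = (df/dv, -df/dq)
    n = q.size
    x = np.concatenate([q, v])
    df = grad(lambda z: f(z[:n], z[n:]), x)
    dg = grad(lambda z: g(z[:n], z[n:]), x)
    return float(dg[:n] @ df[n:] - dg[n:] @ df[:n])

# f is homogeneous of degree k = 2 in v, g of degree l = 1 in v
f = lambda q, v: (1 + q[0] ** 2) * v[0] ** 2 + q[1] * v[0] * v[1]
g = lambda q, v: np.sin(q[0]) * v[1] + q[1] * v[0]

q = np.array([0.3, -0.7])
v = np.array([1.2, 0.4])
r = 2.5

lhs = bracket(f, g, q, r * v)            # {f, g} evaluated at m_r(u)
rhs = r ** 2 * bracket(f, g, q, v)       # r^(k + l - 1) {f, g}(u) with k + l - 1 = 2
print(lhs, rhs)
```

The two printed numbers agree up to finite-difference error, so in this example {f, g} is homogeneous of degree k + ℓ − 1 = 2 in the velocities.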

1.8.4

Momenta as coordinates

Let U be an open subset of Q. With ϕ_i ∈ C^∞(D) let {ϕ_i}_{i=1}^n be a system of coordinates on U. Let {Z_j}_{j=1}^d be smooth vector fields on U such that Z_j(U) ⊆ D.

The functions {τ^*ϕ_i}_{i=1}^n, where τ = τ_Q|D, together with the momenta {P_j = P_{Z_j}}_{j=1}^d form a system of coordinates on V = τ^{-1}(U) in D if and only if for each q ∈ U the vectors {Z_j(q)}_{j=1}^d form a basis of D_q. The map

\[ \lambda : U \times \mathbb{R}^d \to V \subseteq D : (q, c) \mapsto \sum_{j=1}^{d} c_j Z_j(q) \]


is the inverse of a trivialization of the vector bundle τ : V ⊆ D → U ⊆ Q. Substituting v = Σ_{j=1}^d c_j Z_j(q) into the definition of the momentum P_j = P_{Z_j} gives the relation

\[ P_j(q, v) = P_{Z_j}(q, v) = k(q)(Z_j(q), v) = \sum_{\ell=1}^{d} k(q)(Z_j(q), Z_\ell(q))\, c_\ell \tag{98} \]

between the coordinates P_j and c_ℓ on U. Note that c_j = P_j for 1 ≤ j ≤ d if and only if {Z_j(q)}_{j=1}^d is a k-orthonormal basis of D_q for every q ∈ U. In terms of the coordinates {τ^*ϕ_1, ..., τ^*ϕ_n, P_1, ..., P_d} on V ⊆ D the equations of motion are

\[ \dot\phi_i = \sum_{j=1}^{d} (Z_j)_i\, c_j, \quad \text{for } 1 \le i \le n \tag{99} \]

\[ \dot P_j = -(L_{P_j} h)|_V, \quad \text{for } 1 \le j \le d. \tag{100} \]

Here equation (99) is the second order equation condition for ϕ_i, where (Z_j)_i = ⟨dϕ_i | Z_j⟩ are the components of the vector field Z_j with respect to the coordinates {ϕ_i}_{i=1}^n on U ⊆ Q. Solving (98), the c_j's can be expressed in terms of the P_j's. Equation (100) is the momentum equation for Z_j, where the right hand side has to be expressed in terms of ϕ_i and P_j. Using the fact that h = ½ k(q)(v, v) + V(q), where T = ½ k(q)(v, v) ∈ C_2^∞(D) and V ∈ C_0^∞(D), and lemma 1.8.3.39, we obtain

\[ \dot\phi_i = \{h, \phi_i\} = \{T, \phi_i\} \in C_1^\infty(D), \]

which is a {ϕ_i}_{i=1}^n-dependent linear form on D, and

\[ \dot P_j = \{h, P_j\} = \{T, P_j\} + \{V, P_j\}, \]

where {T, P_j} ∈ C_2^∞(D) is a {ϕ_i}_{i=1}^n-dependent quadratic form in {P_j}_{j=1}^d and {V, P_j} ∈ C_0^∞(D) is a function of {ϕ_i}_{i=1}^n.

1.9

A projection principle

In this section we describe a projection method to obtain the distributional Hamiltonian equations of motion (63). Equation (55) enables us to decompose Xf |D, the restriction to D of the Hamiltonian vector field Xf of f , into its components Xf,H and Xf,H ω in H and H ω , respectively. In other words, Xf |D = Xf,H + Xf,H ω .

(101)


Taking equations (56), (57) and (59) into account, we see that the distributional Hamiltonian vector field Y_f of f is equal to the H-component of the Hamiltonian vector field X_f of f with respect to the decomposition (101). In other words, Y_f = X_{f,H}. On the other hand, the first equality in (33) allows us to decompose X_f|D into its components X_{f,F^ω} and X_{f,TD} in F^ω and TD, respectively.

Lemma 1.9.40. X_{f,TD} = Y_f.

Proof. Since H ⊆ TD and F^ω ⊆ H^ω, we can intersect the decompositions T_D(TQ) = F^ω ⊕ TD and T_D(TQ) = H^ω ⊕ H to obtain

\[ T_D(TQ) = F^\omega \oplus (H^\omega \cap TD) \oplus H. \tag{102} \]

We know that H is a symplectic distribution on D, and F^ω ⊆ ker Tτ_Q is isotropic. Since F^ω ⊕ (H^ω ∩ TD) = H^ω and dim F^ω = dim(H^ω ∩ TD), it follows that F^ω and H^ω ∩ TD are Lagrangian in H^ω. Using the decomposition (102) write X_f = X_f^1 + X_f^2 + X_f^3 where X_f^1 has values in F^ω, X_f^2 has values in H^ω ∩ TD, and X_f^3 has values in H. In a similar fashion, using (102) we can write df = ∂_{F^ω} f ⊕ ∂_{H^ω∩TD} f ⊕ ∂_H f. Taking into account the decomposition (56) we get

\[ \begin{aligned} X_f \,\lrcorner\, \omega &= (X_f^1 + X_f^2 + X_f^3) \,\lrcorner\, (\varpi_{H^\omega} \oplus \varpi_H) \\ &= (X_f^1 \,\lrcorner\, \varpi_{H^\omega}) \oplus (X_f^2 \,\lrcorner\, \varpi_{H^\omega}) \oplus (X_f^3 \,\lrcorner\, \varpi_H) \\ &= \partial_{F^\omega} f \oplus \partial_{H^\omega \cap TD} f \oplus \partial_H f. \end{aligned} \]

Since X_f^1 ⌟ ϖ_{H^ω} annihilates F^ω, and X_f^2 ⌟ ϖ_{H^ω} annihilates H^ω ∩ TD, it follows that

\[ X_f^1 \,\lrcorner\, \varpi_{H^\omega} = \partial_{H^\omega \cap TD} f, \qquad X_f^2 \,\lrcorner\, \varpi_{H^\omega} = \partial_{F^\omega} f, \qquad \text{and} \qquad X_f^3 \,\lrcorner\, \varpi_H = \partial_H f. \]

Lemma 1.6.3.18 implies that ∂_{F^ω} f = 0. Hence, X_f^2 = 0. Therefore, X_f^1 = X_{f,H^ω} = X_{f,F^ω}, and X_f^3 = X_{f,TD} = X_{f,H} = Y_f.

Let P : T_D(TQ) → TD be the projection along the fibers of F^ω corresponding to the direct sum decomposition T_D(TQ) = F^ω ⊕ TD. The statement of theorem 1.6.3.20 can be reformulated as the following projection principle.

Proposition 1.9.41. A dynamically admissible motion of a distributional Hamiltonian system (D, H, ϖ, h) is the image under the tangent bundle projection τ_Q to Q of an integral curve of P ∘ X_h|D.
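The projection principle can be carried out by hand in a simple case. The sketch below assumes Q = R³ with the Euclidean metric, V = 0, and the illustrative constraint ż = y ẋ with α(q) = (−y, 0, 1); none of this is taken from the text. Under these assumptions X_h(q, v) = (v, 0), the fibre of F^ω consists of vertical lifts of vectors k-orthogonal to D_q (multiples of α(q)), and TD is cut out by the linearized constraint. The function names are hypothetical.

```python
import numpy as np

def alpha(q):
    # Constraint 1-form for z' = y x'
    return np.array([-q[1], 0.0, 1.0])

def D_alpha(q, w):
    # Derivative of alpha at q in the direction w
    return np.array([-w[1], 0.0, 0.0])

def split_X_h(q, v):
    # Write X_h(q, v) = (v, 0) as (v, vdot) + (0, -vdot), where vdot = lam * alpha(q)
    # is chosen so that (v, vdot) is tangent to D
    a = alpha(q)
    lam = -np.dot(D_alpha(q, v), v) / np.dot(a, a)
    vdot = lam * a
    tangent_part = np.concatenate([v, vdot])            # P(X_h), lies in T D
    vertical_part = np.concatenate([np.zeros(3), -vdot])  # lies in F^omega
    return tangent_part, vertical_part

def in_TD(q, v, qdot, vdot, tol=1e-12):
    # Tangency to D = {<alpha(q) | v> = 0}: <alpha | vdot> + <D alpha(q) qdot | v> = 0
    return bool(abs(np.dot(alpha(q), vdot) + np.dot(D_alpha(q, qdot), v)) < tol)

q = np.array([0.2, 1.5, -0.4])
v = np.array([2.0, 0.7, 3.0])          # v_z = y * v_x, so (q, v) lies in D
Y, N = split_X_h(q, v)

print("P(X_h) tangent to D:", in_TD(q, v, Y[:3], Y[3:]))
print("acceleration:", Y[3:])
```

The acceleration obtained this way is λ α(q) with λ = ẋ ẏ / (1 + y²), which is what the Lagrange multiplier form of the constrained equations gives for this example.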

1.10

Accessible sets

An abstract structure of a Hamiltonian system with linear nonholonomic constraints is given by a quadruple (D, H, $, h), where D is a manifold, (H, $) is a symplectic distribution on D, and h is a smooth function on D. For each f ∈ C ∞ (D), the distributional Hamiltonian vector field of f is defined as the unique vector field Yf on D with values in H satisfying the equation Yf $ = ∂H f. By theorem 1.6.3.20 the evolution of our nonholonomically constrained system is given by the distributional Hamiltonian vector Yh of the Hamiltonian h. This abstract structure was obtained in §6.3 as the result of an analysis of dynamics of a Hamiltonian system with configuration space Q, Hamiltonian h ∈ C ∞ (T Q), and linear nonholonomic constraints given by a distribution D on Q. However, once we have obtained the symplectic distribution (H, $) on D we can forget about the linear structure of D and its embedding into T Q, and retain only the manifold structure of D. In the following we shall need the notion of a generalized distribution on D, that is a linear subset H of the tangent bundle T D of D locally spanned by smooth vector fields. For u ∈ D, the number of linearly independent vector fields spanning Hu ⊆ Tu D is called the rank of H at u. A distribution is a generalized distribution of constant rank. An accessible set of a generalized distribution H on D is the set of points of D that can be connected by piecewise integral curves of vector fields with values in H. Accessible sets of H are also called reachable sets of H or orbits of the family of local vector fields with values in H. Theorem 1.10.42. (Sussmann’s theorem) Every accessible set of a generalized distribution on D is an immersed submanifold of D. Proof. See [112]. Let L ⊆ D be an accessible set of H. According to theorem 10.1 it is an immersed submanifold of D. The restriction HL of H to points in L is contained in T L. Hence, HL is a distribution on L. Let $L be the restriction of $ to HL . 
By construction ϖ_L is a nondegenerate 2-form on the distribution H_L. For each f ∈ C^∞(D), let f_L be the restriction of f to L. Clearly f_L is a smooth function on L. Let Y_{f_L} be the distributional Hamiltonian vector field of f_L on L relative to the symplectic distribution (H_L, ϖ_L). In other words, Y_{f_L} is the unique vector field on L with values in H_L such that

    Y_{f_L} ⌟ ϖ_L = ∂_{H_L} f_L.    (103)

Since Y_h has values in H, it follows that Y_h(u) = Y_{h_L}(u) for all u ∈ L. Hence, L is preserved by the evolution of our nonholonomic Hamiltonian system. This proves

Lemma 1.10.43. For every accessible set L of H, the nonholonomic Hamiltonian system (D, H, ϖ, h) induces on L the structure of a nonholonomic Hamiltonian system (L, H_L, ϖ_L, h_L).

Since accessible sets of H form a partition of D, we can write

    (D, H, ϖ, h) = ⋃_{L a.s. H} (L, H_L, ϖ_L, h_L).    (104)

Here the union is taken over accessible sets L of H. We say that a nonholonomic Hamiltonian system (D, H, ϖ, h) is simple if D is the unique accessible set of H. Since each accessible set L of H is the unique accessible set of H_L, it follows that (104) is the decomposition of (D, H, ϖ, h) into its simple components. Accessible sets of our symplectic distribution (H, ϖ) on D are important because they provide restrictions on the evolution of the system given by integral curves of the vector field Y_h.

Theorem 1.10.44. (Stefan’s theorem) Accessible sets of a generalized distribution on D define a structure of a smooth foliation with singularities on D.

Proof. See [111].

Taking into account the fact that D is a distribution on Q, the following proposition relates accessible sets of H to accessible sets of D.

Proposition 1.10.45. Let M ⊆ Q be the unique accessible set of D through q_0 ∈ Q, and let u_0 ∈ D_{q_0}. Then the restriction D_M = τ_Q^{-1}(M) ∩ D of D to points of M is the accessible set of H through u_0.

Proof. Let X be a vector field on Q with values in D. Consider the function f ∈ C^∞(D) defined by f(u) = k(X(τ_Q(u)), u) for all u ∈ D. Let t ↦ u(t) be an integral curve of Y_f with t ↦ q(t) = τ_Q(u(t)) its projection to Q. Evaluating both sides of the equation u̇(t) ⌟ ϖ = df(u(t)) on vectors in T_{u(t)}D of the form ι_{u(t)} v, where v ∈ D_{q(t)}, and taking into account lemma 1.5.9, we get

    k(v, Tτ_Q(u̇(t))) = ⟨u̇(t) ⌟ ϖ | ι_{u(t)} v⟩ = ⟨df(u(t)) | ι_{u(t)} v⟩
                      = d/ds|_{s=0} f(u(t) + sv) = d/ds|_{s=0} k(X(q(t)), u(t) + sv)
                      = k(X(q(t)), v),

for all v ∈ D_{q(t)}. Since Tτ_Q(u̇(t)) and X(q(t)) lie in D, it follows that Tτ_Q(u̇(t)) = X(q(t)). But Tτ_Q(u̇(t)) = q̇(t). Therefore, t ↦ q(t) is an integral curve of X. In theorem 1.6.3.20 we have shown that an integral curve t ↦ q(t) of a vector field on Q with values in D lifts to a curve t ↦ u(t), which is an integral curve of a vector field on D with values in H. This shows that if L is the accessible set of H through u_0 ∈ D and u ∈ L, then the whole fiber D_{τ_Q(u)} is contained in L. Consequently, if M is the accessible set of D containing q_0 = τ_Q(u_0), then D_M = τ_Q^{-1}(M) ∩ D ⊆ L. On the other hand, the restriction H_{D_M} of H to points in D_M is contained in TD_M. Hence H_{D_M} is a distribution on D_M. Thus the accessible set L of H through u_0 ∈ D_M is contained in D_M. Therefore, L = D_M.

Corollary 1.10.46. A nonholonomic system is simple if and only if Q is the unique accessible set of D. The upshot of the above discussion is that accessible sets provide restrictions on the evolution of a nonholonomically constrained system which are independent of external forces and depend only on the constraint distribution.
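Corollary 1.10.46 can be illustrated by a small worked example. It is not taken from the text: the configuration space Q = R^3, the vector fields X_1 and X_2, and the appeal to the Chow-Rashevskii theorem are assumptions made only for this sketch.

```latex
% Assumed example (not from the text): Q = \mathbb{R}^3 with coordinates (x, y, z).
\begin{align*}
X_1 &= \partial_x, \qquad X_2 = \partial_y + x\,\partial_z, \qquad
D = \operatorname{span}\{X_1, X_2\},\\
[X_1, X_2] &= \partial_z \notin D .
\end{align*}
```

Since X_1, X_2 and [X_1, X_2] span T_q Q at every point q, the Chow-Rashevskii theorem implies that any two points of Q can be joined by a piecewise integral curve of vector fields with values in D. Thus Q is the unique accessible set of D, and by Corollary 1.10.46 every nonholonomic system with this constraint distribution is simple. By contrast, for the involutive distribution D′ = span{∂_x, ∂_y} the accessible sets are the planes z = const, so the decomposition into simple components has one component for each value of z.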

1.11 Constants of motion

Additional restrictions on the evolution of a nonholonomically constrained system are provided by constants of motion.

Theorem 1.11.47. (Nonholonomic Noether theorem) A function f ∈ C^∞(D) is a constant of motion of the distributional Hamiltonian system (D, H, ϖ, h) if and only if its distributional Hamiltonian vector field Y_f preserves the Hamiltonian h.

Proof. The dynamics on D is given by the distributional Hamiltonian vector field Y_h of the Hamiltonian h. Hence,

    ḟ = ⟨df | Y_h⟩ = ϖ(Y_f, Y_h) = −ϖ(Y_h, Y_f) = −⟨dh | Y_f⟩.


Therefore, ḟ = 0 if and only if Y_f preserves h.

Theorem 1.11.47 is a nonholonomic counterpart of Noether’s theorem relating symmetries and conservation laws in unconstrained Hamiltonian systems. In the unconstrained case, a function f ∈ C^∞(TQ) is a constant of motion if and only if its Hamiltonian vector field X_f preserves the Hamiltonian. However, X_f also preserves the symplectic form ω. So it is an infinitesimal symmetry of the Hamiltonian system (T*Q, ω, h). In the presence of constraints, the relationship between symmetries and constants of motion is more involved. The condition that the distributional Hamiltonian vector field Y_f preserves the Hamiltonian h does not imply that it is an infinitesimal symmetry of the nonholonomic Hamiltonian system, because it need not preserve either H or ϖ. Conservation laws corresponding to symmetries of a nonholonomically constrained system will be discussed in chapter 3.

Lemma 1.11.48. Suppose that the nonholonomic system (D, H, ϖ, h) is simple. Then the only Casimir functions of its almost Poisson algebra (C^∞(D), { , }) are the constant functions.

We now consider a special case of the conservation law given in theorem 1.11.47. Let Z be a vector field on Q and let Φ_t : Q → Q be the local 1-parameter group of local diffeomorphisms of Q generated by Z. The tangent maps TΦ_t : TQ → TQ form a local group of local diffeomorphisms of TQ generated by the tangent lift Z_{TQ} of Z to TQ. Let P_Z be the function on TQ defined by

    P_Z(u) = k(u, Z(τ_Q(u))), for each u ∈ TQ.    (105)

Proposition 1.11.49. If Z is a vector field on Q with values in D such that its tangent lift Z_{TQ} preserves the Lagrangian ℓ(u) = ½ k(u, u) − V(τ_Q(u)), then P_Z is a constant of motion of the nonholonomically constrained system with Lagrangian ℓ and constraint distribution D.

Proof. Since Z_{TQ} preserves the Lagrangian ℓ, it follows that it preserves the kinetic energy k(u) = ½ k(u, u) and the potential V(τ_Q(u)) separately. Hence, Z_{TQ} preserves the Hamiltonian h(u) = ½ k(u, u) + V(τ_Q(u)). Moreover, Z preserves the kinetic energy metric k, that is, L_Z k = 0. By lemma 1.8.1.33, Z_{TQ} is the Hamiltonian vector field of P_Z defined by equation (105). Using decomposition (55) we can express the restriction of Z_{TQ} to D as the sum of its components in H and H^ω. In other words, Z_{TQ}|_D = Z̃_H + Z̃_{H^ω}. Moreover, Z̃_H = Y_{P_Z} is the distributional Hamiltonian vector field of P_Z. From the fact that Z has values in D, it follows that Z_{TQ} has values in F, and Z̃_{H^ω} has values in F ∩ H^ω. Therefore by lemma 1.6.3.18 ⟨dh | Z̃_{H^ω}⟩ = 0, which implies that Z̃_{H^ω} preserves h. Since Z_{TQ} preserves h, it follows that Y_{P_Z} = Z̃_H = Z_{TQ} − Z̃_{H^ω} preserves h. Then theorem 1.11.47 ensures that P_Z is a constant of motion.

Suppose that we have d conserved functions f_1, ..., f_d. For each accessible set L of H and each c = (c_a) ∈ R^d, consider the level set

    L_c = {u ∈ L | f_a(u) = c_a, a = 1, ..., d}.

Suppose that locally L_c is a submanifold of L. Since the functions f_a are constants of motion, the restriction of Y_h to L_c is a vector field Y_{h,L_c} on L_c with values in H_{L_c} = H ∩ TL_c. Let ϖ_{L_c} be the restriction of ϖ to H_{L_c}, and h_{L_c} the restriction of h to L_c. We want to describe the vector field Y_{h,L_c} in terms of the data (L_c, H_{L_c}, ϖ_{L_c}, h_{L_c}). Equation (59) restricted to H_{L_c} implies that

    Y_{h,L_c} ⌟ ϖ_{L_c} = dh_{L_c}.    (106)

If ϖ_{L_c} is nondegenerate, then Y_{h,L_c} is the distributional Hamiltonian vector field of h_{L_c} relative to the symplectic distribution (H_{L_c}, ϖ_{L_c}) on L_c, which we will denote by Y_{h_{L_c}}. Hence, we have proved

Theorem 1.11.50. Suppose that locally L_c is a submanifold of L and that (H_{L_c}, ϖ_{L_c}) is a symplectic distribution on L_c. Then the evolution of Y_h in L_c is given by the distributional Hamiltonian vector field Y_{h_{L_c}} of h_{L_c} relative to (H_{L_c}, ϖ_{L_c}).

If the hypotheses of theorem 1.11.50 are satisfied for every accessible set L of H and for every value c ∈ R^d of the constants of motion f_1, ..., f_d, then the decomposition (104) of the nonholonomic Hamiltonian system (D, H, ϖ, h) admits a refinement, namely,

    (D, H, ϖ, h) = ⋃_{L a.s. H} ⋃_{c ∈ R^d} (L_c, H_{L_c}, ϖ_{L_c}, h_{L_c}).    (107)

We can refine the above decomposition further by considering accessible sets of H_{L_c}. Let M be an accessible set of H_{L_c}. By Sussmann’s theorem 1.10.42, M is an immersed submanifold of L_c. The restriction H_M of H to points in M coincides with H_{L_c} ∩ TM. Let ϖ_M be the restriction of ϖ_{L_c} to H_M and h_M the restriction of h_{L_c} to M. Applying the decomposition (104) to (L_c, H_{L_c}, ϖ_{L_c}, h_{L_c}) we obtain

    (D, H, ϖ, h) = ⋃_{L a.s. H} ⋃_{c ∈ R^d} ⋃_{M a.s. H_{L_c}} (M, H_M, ϖ_M, h_M).    (108)

In some examples, every distribution H_{L_c} is involutive and on every integral manifold M of H_{L_c} the 2-form ϖ_M on H_M = TM gives rise to a symplectic form ω_M on M. When this happens, the distributional Hamiltonian systems (M, H_M, ϖ_M, h_M) are Hamiltonian systems (M, ω_M, h_M) and the original distributional Hamiltonian system (D, H, ϖ, h) defines a foliation of D by Hamiltonian systems, namely,

    (D, H, ϖ, h) = ⋃_{L a.s. H} ⋃_{c ∈ R^d} ⋃_{M a.s. H_{L_c}} (M, ω_M, h_M).    (109)

1.12 Notes

We review the standard approach to the dynamics of systems with linear nonholonomic constraints. Our only addition is the systematic use of the Levi-Civita connection of the kinetic energy metric. We note that hypothesis 2.1, which leads to the Lagrange-d’Alembert principle, has been verified experimentally by Lewis and Murray [67].

We use the name Lagrange-d’Alembert principle on the basis of an analogy with the formalisms of d’Alembert and Lagrange. But they themselves never applied their formalisms to nonholonomically constrained systems. These applications are of a later date, namely the second half of the 19th century, and were accompanied by considerable confusion, which was cleared up only at the end of that century. Here is a brief account of the history. Whittaker [118] refers to Ferrers [43] as the first one who wrote down the equations of motion for a nonholonomically constrained system. But we do not understand what Ferrers meant. In §240 Routh [96] uses the Lagrange-d’Alembert principle in our form. Then Lindelöf [70], Appell in the first edition of [4], and others used, in the case of a translation invariant nonholonomically constrained system, the wrong principle: they take the Lagrangian on the tangent bundle of the orbit space of the translation symmetry and then apply the Euler-Lagrange equation. Both Chaplygin [23], who refers to Lindelöf [70], and Korteweg [62], who refers to the first edition of Appell [4], correct this error. Appell corrected his mistake in the second edition of [4]. For a more detailed discussion of this point see §4 of chapter 3. From this time on the Lagrange-d’Alembert principle has been accepted as giving the “right” equations of motion.

More recently Arnol’d et al. [7] introduced vakonomic mechanics, in which the constraints are nonholonomic but the equations of motion are based on a variational principle. Vakonomic systems are of interest, but rolling without slipping is not vakonomic. For an up to date treatment of nonlinearly nonholonomically constrained systems see Marle [74].

Our presentation of the distributional Hamiltonian formulation of nonholonomically constrained systems follows the theory developed in [13] and [32]. It is a special case of the partially symplectic formulation proposed by Bocharov and Vinogradov [17].

Dalsmo and van der Schaft [36] placed nonholonomically constrained systems in the context of Dirac structures. We recall the definition of a Dirac structure. The Pontryagin bundle of a manifold M is the direct sum of its tangent and cotangent bundles, that is, TM ⊕_M T*M. We endow the total space of the Pontryagin bundle with a nondegenerate symmetric bilinear form of signature (dim M, dim M) given by

    ⟨(u, p), (v, q)⟩ = ⟨q | u⟩ + ⟨p | v⟩

for every (u, p), (v, q) ∈ T_m M ⊕ T*_m M. A Dirac structure on a manifold M is a subbundle B of TM ⊕_M T*M which is maximal isotropic with respect to the above bilinear form, see Courant [29].

We now show that the distributional Hamiltonian system (D, H, ϖ, h) with nonholonomic constraint distribution D ⊂ TQ, Hamiltonian h, and symplectic distribution (H, ϖ) on D defines a Dirac structure. Here the symplectic form ϖ is the restriction to H of the symplectic form ω on TQ, which is obtained by pulling back the canonical symplectic form on T*Q by the Legendre transformation k♭ : TQ → T*Q given by the kinetic energy metric k of the system. Following Yoshimura and Marsden [119] the Dirac structure corresponding to (D, H, ϖ, h) is given by

    {(u, p) ∈ TD ⊕_D T*D | u ∈ H and p − u ⌟ ω ∈ H^0}.

Here H^0 is the annihilator of H in T*D. The pair (Y_h, dh), where Y_h ⌟ ϖ = ∂_H h, is a section of this Dirac structure.

The projection principle in the text is due to Marle [73]. However, projection principles were first used by Gibbs [47] and by Appell in the second edition of [4] to derive what Pars [89] calls the Gibbs-Appell equations of motion for a nonholonomically constrained system.


Van der Schaft and Maschke [113] introduced an almost Poisson bracket to study nonholonomically constrained Hamiltonian systems. Equation (87) is the nonholonomic almost Poisson bracket analogue of the formula for the Dirac bracket given in [39]. The intrinsic relation between the almost Poisson bracket and the symplectic distribution (H, ϖ) was given by Koon and Marsden [61]. The term almost Poisson bracket was coined by Cantrijn, de León, and de Diego [19].

Results on accessible sets of distributions used here are taken from papers of Stefan [111] and Sussmann [112]. In [103] there is a proof of a nonholonomic version of Noether’s theorem of §11. A foliation of the constraint manifold by Hamiltonian systems given by constants of motion was discovered by Kemppainen in the dynamics of a rolling disk [57]. Bloch, Krishnaprasad, Marsden and Murray establish the momentum equation in [16], but only for G-invariant vector fields which are tangent to G-orbits on configuration space Q, under the hypothesis that the action of G on Q is free and proper. The version in the text is based on [105].

Chapter 2

Group actions and orbit spaces

In this chapter we treat the basic properties of the action of a Lie group on a smooth manifold, concentrating especially on the case when the action is proper. We also discuss the differential geometry of the space of orbits of a proper action using the concept of a differential space.

2.1 Group actions

A smooth action of a Lie group G on a smooth manifold M is a smooth mapping Φ : G × M → M : (g, m) ↦ Φ(g, m) = g·m such that Φ(g, Φ(h, m)) = Φ(gh, m) for every g, h ∈ G and every m ∈ M. For fixed g ∈ G let Φ_g : M → M : m ↦ Φ(g, m); while for fixed m ∈ M let Φ_m : G → M : g ↦ Φ(g, m).

Example 2.1.1. Suppose that V is a smooth vector field on M. Then for each m ∈ M there is a unique solution γ_m : I_m ⊆ R → M of the differential equation

    dγ_m/dt (t) = V(γ_m(t)), t ∈ I_m,

with initial condition γ_m(0) = m, which is defined on a maximal open interval I_m containing 0. The set D = {(t, m) ∈ R × M | t ∈ I_m} is an open subset of R × M and the mapping ϕ : D ⊆ R × M → M : (t, m) ↦ γ_m(t) is the flow of V. The set D_t = {m ∈ M | (t, m) ∈ D} = {m ∈ M | t ∈ I_m} is an open subset of M. The map ϕ_t : D_t ⊆ M → M : m ↦ γ_m(t), which is smooth because ϕ is, is the flow of V at time t. This flow has the group property, namely, if s, t ∈ R, m ∈ D_s, and ϕ_s(m) ∈ D_t, then m ∈ D_{s+t} and ϕ_t(ϕ_s(m)) = ϕ_{t+s}(m). Therefore ϕ_t : D_t → D_{−t} is a diffeomorphism with inverse ϕ_{−t}. If for every m ∈ M we have I_m = R, then the flow ϕ (or the vector field V) is complete and ϕ : R × M → M defines an action of R on M.

A point m ∈ M is a fixed point for the G-action Φ if and only if g·m = m for every g ∈ G. The action Φ is free if and only if, for every m ∈ M, g·m = m implies g = e. Here e is the identity element of G. An example of a nonfree action is given by G acting on itself by conjugation, that is,

    Φ : G × G → G : (g, h) ↦ ghg^{-1}.

Indeed, if G ≠ {e}, then for h ≠ e we have h ∈ G_h = {g ∈ G | ghg^{-1} = h} ≠ {e}.

Let 𝔤 = T_e G be the Lie algebra of G. For every ξ ∈ 𝔤 and every m ∈ M let

    ξ·m = X_ξ(m) = d/dt|_{t=0} Φ_m(exp tξ)

be the infinitesimal action of 𝔤 on M. Then X_ξ : M → TM : m ↦ X_ξ(m) is a smooth vector field on M, whose flow is ϕ^ξ_t = Φ_{exp tξ}, which is complete. The infinitesimal action gives rise to a map 𝔤 → X(M) : ξ ↦ X_ξ, which is a homomorphism from the Lie algebra 𝔤 to the Lie algebra X(M) of smooth vector fields on M. Note that the map T_e Φ_m : 𝔤 → T_m M : ξ ↦ X_ξ(m) is a linear mapping.
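The group property of a flow and the role of completeness in Example 2.1.1 can be checked numerically. The sketch below is not from the text: the vector fields V(x) = x^2 on R and V(x, y) = (−y, x) on R^2, together with all function names, are assumptions chosen only for illustration. The first flow is not complete, so it does not define an action of R; the second one is complete.

```python
import math

def flow_quadratic(t, m):
    """Flow of the assumed field V(x) = x**2 on R.

    The solution with gamma_m(0) = m is gamma_m(t) = m / (1 - m*t),
    defined on the maximal interval I_m = {t : 1 - m*t > 0}.
    """
    denom = 1.0 - m * t
    if denom <= 0.0:
        raise ValueError("t lies outside the maximal interval I_m")
    return m / denom

def flow_rotation(t, m):
    """Flow of the assumed field V(x, y) = (-y, x) on R^2: rotation by angle t."""
    x, y = m
    c, s = math.cos(t), math.sin(t)
    return (c * x - s * y, s * x + c * y)

# Group property phi_t(phi_s(m)) = phi_{t+s}(m), valid while all terms are defined.
s, t, m = 0.3, 0.2, 1.0
lhs = flow_quadratic(t, flow_quadratic(s, m))
rhs = flow_quadratic(t + s, m)
print("group property holds:", abs(lhs - rhs) < 1e-12)

# Incompleteness: for m = 1 the interval I_m is (-inf, 1), so t = 1 is not allowed.
try:
    flow_quadratic(1.0, 1.0)
except ValueError as err:
    print("incomplete flow:", err)
```

The rotation flow is defined for every t, so it gives an action of R on R^2, in line with the closing remark of Example 2.1.1.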

2.2 Orbit spaces

The set G·m = {g·m ∈ M | g ∈ G} is called the orbit of the G-action Φ through m. The map Φ_m : G → M induces the mapping

    Φ̃_m : G/G_m → M : gG_m ↦ Φ(g, m),

where G_m = {g ∈ G | Φ_g(m) = m} is the isotropy group at m. The mapping Φ̃_m, which is bijective onto the orbit G·m, exhibits this orbit as an immersed submanifold of M. Note that the tangent space T_m(G·m) to the G-orbit G·m at m is T_e Φ_m 𝔤.
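For a finite group the bijection between G/G_m and the orbit G·m can be verified by direct enumeration. The following sketch assumes a hypothetical example that is not in the text: G is the symmetric group S3 acting on M = {0, 1, 2} by permutations, and all names are illustrative.

```python
from itertools import permutations

# Assumed example: G = S3, stored as tuples p with p[i] the image of i.
G = list(permutations(range(3)))

def compose(p, q):
    """Group product pq, meaning 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(3))

def act(p, m):
    """Action Phi(p, m) = p . m on a point m of M = {0, 1, 2}."""
    return p[m]

def orbit(m):
    """The orbit G . m."""
    return {act(g, m) for g in G}

def isotropy(m):
    """The isotropy group G_m = {g in G | g . m = m}."""
    return {g for g in G if act(g, m) == m}

def left_cosets(subgroup):
    """All left cosets g * subgroup, each stored as a frozenset."""
    return {frozenset(compose(g, h) for h in subgroup) for g in G}

m = 0
G_m = isotropy(m)

# Induced map g G_m -> g . m: collect the images of all elements of each coset.
induced = {coset: {act(g, m) for g in coset} for coset in left_cosets(G_m)}

well_defined = all(len(images) == 1 for images in induced.values())
image_points = {next(iter(images)) for images in induced.values()}
bijective = well_defined and image_points == orbit(m) and len(induced) == len(orbit(m))
print("induced map is a bijection onto the orbit:", bijective)
```

Every element of a coset g G_m sends m to the same point because (gh)·m = g·(h·m) = g·m for h ∈ G_m, which is why the induced map is well defined.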


Two G-orbits are either equal or disjoint, which implies that they partition M. The set M̄ = M/G = {G·m | m ∈ M} of all G-orbits on M is called the orbit space of the action Φ. The map π : M → M/G = M̄ : m ↦ G·m is called the orbit map or projection map. A fiber of π is a G-orbit in M. If we want to talk about orbits near a given one, we need a topology on M̄. We define one by saying that a subset U in M̄ is open if and only if the G-invariant subset π^{-1}(U) is an open subset of M. In general, the topological space M̄ can be quite complicated. For instance, in the case of an action of R defined by the flow of a complete vector field on M, orbits need not be embedded submanifolds. Moreover, the topology on M̄ need not be Hausdorff.

In order to proceed further we make the additional assumption that the G-action Φ is proper, namely, the map Ξ : G × M → M × M : (g, m) ↦ (m, Φ(g, m)) is proper. In other words, if K is a compact subset of M × M then Ξ^{-1}(K) is compact. Equivalently, if {m_j} and {g_j} are infinite sequences in M and G, respectively, such that as j → ∞ the sequences {m_j} and {g_j·m_j} converge in M, then there is a subsequence {g_{j_k}} which converges in G as k → ∞. If the G-action Φ is proper, then for every m ∈ M the isotropy group G_m is compact and hence is a Lie subgroup of G. Moreover, every G-orbit is a properly embedded submanifold of M and thus is a closed subset. Finally, the topology on the orbit space M/G is Hausdorff.

Proposition 2.2.2. If the G-action Φ on the smooth manifold M is free and proper, then the orbit space M/G has a unique smooth manifold structure such that the projection map π : M → M/G = M̄ is a principal fiber bundle with structure group G.

Proof. See lemma 1.11.3 in [42].

Let us explain this last statement a bit more. The fact that the G-orbit map π : M → M̄ is a smooth fibration implies that there exist local sections. In other words, for every m̄ ∈ M̄ there is an open neighborhood U of m̄ and a smooth mapping σ : U ⊆ M̄ → π^{-1}(U) such that π ∘ σ = id_U. The mapping

    τ_σ : U × G → π^{-1}(U) : (u, g) ↦ g·σ(u)


is a diffeomorphism such that π ∘ τ_σ : U × G → U : (u, g) ↦ u. In other words, τ_σ^{-1} : π^{-1}(U) → U × G is a local trivialization of the fibration π restricted to π^{-1}(U). If σ′ : U′ ⊆ M̄ → M is another such local section of π defined on the open set U′, then for each u ∈ U ∩ U′ there is a unique ρ(u) ∈ G, depending smoothly on u, such that σ′(u) = ρ(u)·σ(u). Therefore τ_{σ′}(u, g) = g·ρ(u)·σ(u) = τ_σ(u, g·ρ(u)), or equivalently, (τ_σ^{-1} ∘ τ_{σ′})(u, g) = (u, g·ρ(u)).

A principal fiber bundle with structure group G is a smooth fibration ν : M → B with an open covering {U_i}_{i∈I} of B and local trivializations τ_i^{-1} : ν^{-1}(U_i) → U_i × G such that

    (τ_i^{-1} ∘ τ_j)(u, g) = (u, g·ν_{ij}(u))

for smooth mappings ν_{ij} : U_i ∩ U_j → G, which satisfy the cocycle conditions: (ν_{ij} ∘ ν_{ji})|(U_i ∩ U_j) = id_{U_i ∩ U_j} and (ν_{ik} ∘ ν_{kj} ∘ ν_{ji})|(U_i ∩ U_j ∩ U_k) = id_{U_i ∩ U_j ∩ U_k}. The maps τ_i : U_i × G → M intertwine the G-action

    G × (U_i × G) → U_i × G : (g′, (u, g)) ↦ (u, g′g)

with a unique G-action on M which is free and proper. Therefore the concepts of free and proper G-action and principal fiber bundle with fiber G are equivalent.

2.3 Isotropy and orbit types

2.3.1 Isotropy types

For each m ∈ M the isotropy group G_m at m is a closed subgroup of G and hence is a Lie subgroup of G. Its Lie algebra is 𝔤_m = {ξ ∈ 𝔤 | X_ξ(m) = 0}. We have

Lemma 2.3.1.3. The isotropy group G_{g·m} is conjugate to the isotropy group G_m by the element g ∈ G.

Proof. We have h ∈ G_{g·m} if and only if h·(g·m) = g·m if and only if (g^{-1}hg)·m = m if and only if g^{-1}hg ∈ G_m. Therefore g^{-1}G_{g·m}g = G_m.
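Lemma 2.3.1.3 can be tested by brute force on a finite group. The sketch below uses an assumed example, the conjugation action of S3 on itself, which is the nonfree action mentioned in section 2.1; the choice of S3 and all function names are illustrative and not from the text.

```python
from itertools import permutations

# Assumed example: G = S3 acting on itself by conjugation.
G = list(permutations(range(3)))
E = tuple(range(3))  # identity element

def compose(p, q):
    """Group product pq, meaning 'apply q first, then p'."""
    return tuple(p[q[i]] for i in range(3))

def inverse(p):
    """Inverse permutation of p."""
    inv = [0] * 3
    for i, image in enumerate(p):
        inv[image] = i
    return tuple(inv)

def conj(g, h):
    """Conjugation action Phi(g, h) = g h g^{-1}."""
    return compose(compose(g, h), inverse(g))

def isotropy(h):
    """G_h = {g in G | g h g^{-1} = h}, the centralizer of h."""
    return {g for g in G if conj(g, h) == h}

def conjugate_subgroup(g, subgroup):
    """The subgroup g * subgroup * g^{-1}."""
    return {conj(g, k) for k in subgroup}

# Lemma 2.3.1.3 in the form G_{g.h} = g G_h g^{-1}, checked for all g and h.
lemma_holds = all(
    isotropy(conj(g, h)) == conjugate_subgroup(g, isotropy(h))
    for g in G
    for h in G
)

# The action is not free: every h != e lies in its own isotropy group.
not_free = all(h in isotropy(h) and len(isotropy(h)) > 1 for h in G if h != E)
print("lemma holds:", lemma_holds, "| action is not free:", not_free)
```

Writing the lemma as G_{g·m} = g G_m g^{-1} is equivalent to the statement g^{-1} G_{g·m} g = G_m obtained in the proof.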


Let H be a closed subgroup of G. Then M_H = {m ∈ M | G_m = H} is the set of points in M with isotropy type H.¹ As H ranges over all closed subgroups of G for which M_H is non-empty, the sets M_H partition M. The normalizer N(H) of H in G is {g ∈ G | gHg^{-1} = H}. In fact N(H) is the largest subgroup of G which contains H as a normal subgroup. Now N(H) is a closed subgroup of G and hence is a Lie group. Consequently, N(H)/H is a Lie group. The following lemma reduces the N(H)-action on an H-isotropy type M_H to a free action of N(H)/H on M_H.

Lemma 2.3.1.4. Let H be a closed subgroup of G such that M_H is nonempty. For every g ∈ G we have

    g ∈ N(H) ⟺ g·M_H = M_H ⟺ (g·M_H) ∩ M_H ≠ ∅.

Proof. If m ∈ M_H, then G_m = H. From lemma 2.3.1.3 it follows that g·m ∈ M_H, that is, G_{g·m} = H = G_m, if and only if g ∈ N(H).

2.3.2 Orbit types

We say that the subgroup H′ is conjugate to H in G if there is an element g of G such that H′ = gHg^{-1}. If H is a closed subgroup of G then so is H′. From lemma 2.3.1.3 it follows that the isotropy type M_{H′} is nonempty if and only if the isotropy type M_H is. The set of all subgroups of G which are conjugate in G to H is called the conjugacy class of H in G and is denoted by (H). The set M_(H) = {m ∈ M | G_m = gHg^{-1} for some g ∈ G} is the orbit type of (H) in M.² Since G·M_H = M_(H), the orbit type M_(H) is the smallest G-invariant subset of M which contains M_H. Being in the same orbit type defines an equivalence relation on M. This relation is coarser than the one defined by being in the same isotropy type. The orbit type M_(H) is partitioned into isotropy types M_{H′} where H′ ∈ (H), because g·M_H = M_{gHg^{-1}} for every g ∈ G. Therefore M is partitioned into orbit types, which in turn are partitioned into isotropy types for conjugate subgroups. The set M̄_(H) = M_(H)/G = π(M_(H)) is the orbit type in the orbit space of the conjugacy class (H). Here π : M → M̄ is the G-orbit map.

¹ Also called H-symmetry type.
² The name orbit type comes from the fact that m and m′ belong to M_(H) if and only if there is a G-equivariant bijective mapping from G·m onto G·m′.
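The difference between isotropy types and orbit types is easy to see on a finite example. The sketch below is an assumption made for illustration and is not taken from the text: G is the dihedral group D4 of symmetries of the square, realised by 2x2 integer matrices, acting on the nine grid points with coordinates in {−1, 0, 1}.

```python
# Assumed example: D4 acting linearly on integer grid points of the plane.
def matmul(p, q):
    """Product of two 2x2 matrices stored as nested tuples."""
    return tuple(
        tuple(sum(p[i][k] * q[k][j] for k in range(2)) for j in range(2))
        for i in range(2)
    )

IDENTITY = ((1, 0), (0, 1))
ROT90 = ((0, -1), (1, 0))    # rotation by 90 degrees
FLIP_X = ((1, 0), (0, -1))   # reflection in the x-axis

def generate(generators):
    """Finite group generated by the given matrices (closure under products)."""
    group, frontier = {IDENTITY}, [IDENTITY]
    while frontier:
        g = frontier.pop()
        for s in generators:
            h = matmul(s, g)
            if h not in group:
                group.add(h)
                frontier.append(h)
    return group

G = generate([ROT90, FLIP_X])

def act(g, m):
    """Linear action g . m on a point m = (x, y)."""
    return (g[0][0] * m[0] + g[0][1] * m[1], g[1][0] * m[0] + g[1][1] * m[1])

def inverse(g):
    """Inverse of g, found by search inside the finite group G."""
    return next(h for h in G if matmul(g, h) == IDENTITY)

def isotropy(m):
    """Isotropy group G_m as a frozenset, so that it can be compared and hashed."""
    return frozenset(g for g in G if act(g, m) == m)

def conjugacy_class(H):
    """The conjugacy class (H) = {g H g^{-1} | g in G}."""
    return frozenset(
        frozenset(matmul(matmul(g, h), inverse(g)) for h in H) for g in G
    )

points = [(x, y) for x in (-1, 0, 1) for y in (-1, 0, 1)]
isotropy_type = {m: isotropy(m) for m in points}
orbit_type = {m: conjugacy_class(isotropy(m)) for m in points}

print("number of isotropy types:", len(set(isotropy_type.values())))
print("number of orbit types:", len(set(orbit_type.values())))
```

In this example the points (1, 0) and (0, 1) have different isotropy groups, generated by the reflections in the x-axis and in the y-axis respectively, but these subgroups are conjugate by the rotation through 90 degrees, so both points lie in the same orbit type. The corner point (1, 1) is fixed by a diagonal reflection, which is not conjugate to an axis reflection, and so lies in a different orbit type.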


Lemma 2.3.2.5. Let H be a closed subgroup of G for which the isotropy type M_H is nonempty. Then
1. The mapping g ↦ M_{gHg^{-1}} induces a G-equivariant bijection from G/N(H) onto the collection of all isotropy types in M_(H). Here we use the G-action on G/N(H) induced by left multiplication.
2. If π : M → M̄ is the G-orbit map, then a fiber of π|M_H is an N(H)/H-orbit in M_H.

2.3.3 When the action is proper

Throughout this subsection we assume that the G-action Φ is proper. Up until now we have only made set theoretic statements about the action of G on M.

Let m ∈ M. Then T_e Φ_m 𝔤 = T_m(G·m). For every g ∈ G_m the linear transformation T_m Φ_g : T_m M → T_m M leaves the subspace T_m(G·m) invariant. Hence we obtain an induced linear action ∘ of the compact group H = G_m on the vector space E = T_m M/T_m(G·m). Let B be an H-invariant open subset of E. On G × B we have an action of H defined by

    µ : H × (G × B) → G × B : (h, (g, b)) ↦ (gh^{-1}, h ∘ b).    (1)

This action is free and proper. Therefore the orbit space G ×_H B of the H-action µ is a smooth manifold. Because the G-action on G × B defined by

    ν : G × (G × B) → G × B : (g′, (g, b)) ↦ (g′g, b)    (2)

commutes with the H-action (1), it induces a G-action on G ×_H B.

Theorem 2.3.3.6. (tube theorem) Let m ∈ M and set H = G_m. There is a G-invariant open neighborhood U of m in M, an open H-invariant neighborhood B of the origin in E, and a diffeomorphism ϕ : G ×_H B → U, which intertwines the G-action on G ×_H B with the G-action on U.

Proof. See theorem 2.4.1 in [42]. B is identified with a submanifold S of M containing m, called a slice, by a diffeomorphism ψ. The diffeomorphism ϕ is induced by the mapping G × B → M : (g, b) ↦ g·ψ(b).

The tube theorem is the basic result used to study local properties of the orbit space M/G.

Theorem 2.3.3.7. Suppose that H = G_m is the isotropy subgroup at m for a proper G-action on M. Then
1. The isotropy type M_H is a locally closed submanifold of M.³
2. The Lie group N(H)/H acts smoothly, freely, and properly on M_H. There is a unique smooth manifold structure on M̄_(H) = π(M_(H)) such that π|M_H : M_H → M̄_(H) is a principal fiber bundle with structure group N(H)/H. Here π : M → M̄ is the G-orbit map.
3. The orbit type M_(H) is a G-invariant, locally closed, smooth submanifold of M.⁴ The G-action on M_(H) induces a G-equivariant diffeomorphism from the space (G/H) ×_{N(H)/H} M_H onto M_(H). The map π|M_(H) : M_(H) → M̄_(H) defines a smooth fibration whose fibers are G-equivariantly diffeomorphic to G/H.

Proof. Let M^H = {m ∈ M | h·m = m for every h ∈ H} = {m ∈ M | H ⊆ G_m} be the set of all points of M which are fixed by every element of H. Clearly M_H ⊆ M_(H) ∩ M^H. Suppose that m ∈ M_(H) ∩ M^H. Then G_m is conjugate to H in G and H ⊆ G_m. Because the G-action is proper, the isotropy group G_m is compact. As H is conjugate to G_m, it follows that H is also compact. Hence H has the same dimension and the same finite number of connected components as G_m. Since H ⊆ G_m, we obtain H = G_m, that is, m ∈ M_H. Consequently, M_H = M_(H) ∩ M^H. The theorem now follows from theorem 2.6.7 of [42], where the tube theorem is used in an essential way.

2.3.4 Stratification by orbit types

A stratification of a smooth manifold M is a collection 𝒮 of locally closed smooth submanifolds of M, called strata, having the following properties.
1. 𝒮 is a locally finite partition of M, that is,
   a. if S, S′ ∈ 𝒮 and S ≠ S′, then S ∩ S′ = ∅;
   b. M = ⋃_{S∈𝒮} S;
   c. for each m ∈ M, there is an open neighborhood U of m in M such that {S ∈ 𝒮 | S ∩ U ≠ ∅} is finite.
2. For each S ∈ 𝒮, the closure cl(S) of S in M is the union of S and strata S′ ∈ 𝒮 with dim S′ < dim S.

³ Different connected components of M_H may have different dimensions.
⁴ Idem.


A geometrically well behaved stratification 𝒮 of M satisfies Whitney’s axioms A and B.⁵ Such a stratification is called a Whitney stratification.

Theorem 2.3.4.8. Suppose that G acts smoothly and properly on M. Then connected components of orbit types in M form a (Whitney) stratification of M.

Proof. See theorem 2.7.4 in [42] or theorem 4.3.7 in Pflaum [91].

Each orbit type M_(H) is fibered by isotropy types M_H with H ∈ (H). For each H ∈ (H) the codimension c of M_H in M_(H) is dim G − dim N(H)/H. If c > 0 then isotropy types do not form a stratification of M.

In order to be able to state that orbit types in the orbit space form a Whitney stratification, we need to know that, at least locally, we can embed the G-orbit space M/G into some smooth manifold. This will be discussed in the next section.

2.4 Smooth structure on an orbit space

2.4.1 Differential structure

A differential space is a pair (Q, C^∞(Q)), where Q is a topological space and C^∞(Q) is a set of continuous real valued functions on Q having the following properties.
1. The sets f^{-1}(I), with f ∈ C^∞(Q) and I an open interval in R, form a subbasis for the topology of Q.
2. For every positive integer n, every F ∈ C^∞(R^n), and every f_1, ..., f_n ∈ C^∞(Q), we have F ∘ f ∈ C^∞(Q), where f(q) = (f_1(q), ..., f_n(q)) ∈ R^n for every q ∈ Q.
3. If f : Q → R has the property that for every q ∈ Q there is an open neighborhood U_q of q in Q and f_q ∈ C^∞(Q) such that f|U_q = f_q|U_q, then f ∈ C^∞(Q).

⁵ A) If i) S, S′ ∈ 𝒮 with S′ ⊆ cl(S) and S′ ≠ S, ii) the sequence {s_j} ⊆ S converges to s′ ∈ S′, and iii) T_{s_j}S converges to a linear subspace L of T_{s′}M, then T_{s′}S′ ⊆ L. B) If {s_j} is a sequence such as in condition A) above and {s′_j} ⊆ S′ also converges to s′ ∈ S′, then each limit of the 1-dimensional subspaces R λ(s_j, s′_j) is contained in L. Here λ : Δ_M ⊆ M × M → TM is a diffeomorphism of an open neighborhood of the diagonal Δ_M in M × M to an open neighborhood of the zero section Z_{TM} of TM such that λ(Δ_M) = Z_{TM}.


The set C^∞(Q) is called the differential structure of the differential space (Q, C^∞(Q)).

Example 2.4.1.9. Let M be a smooth manifold with C^∞(M) its collection of smooth functions. Then (M, C^∞(M)) is a differential space.

If (P, C^∞(P)) and (Q, C^∞(Q)) are differential spaces, then a smooth mapping ϕ from (P, C^∞(P)) to (Q, C^∞(Q)) is a continuous mapping ϕ : P → Q such that ϕ*(C^∞(Q)) ⊆ C^∞(P). The map ϕ is a diffeomorphism from (P, C^∞(P)) to (Q, C^∞(Q)) if ϕ is a homeomorphism from P onto Q and both ϕ and ϕ^{-1} are smooth. This is equivalent to the condition that ϕ is a homeomorphism from P onto Q such that ϕ*(C^∞(Q)) = C^∞(P), because

    C^∞(Q) ⊇ (ϕ^{-1})*(C^∞(P)) ⊇ (ϕ^{-1})*(ϕ*(C^∞(Q))) = (ϕ ∘ ϕ^{-1})*(C^∞(Q)) = C^∞(Q).

Differential spaces and smooth mappings form a category. Let (Q, C^∞(Q)) be a differential space and let N be a subset of Q. Define C_i^∞(N) as the set of all functions f : N → R with the property that for every n ∈ N there is an open neighborhood U_n of n in Q and an f_n ∈ C^∞(Q) such that f|(U_n ∩ N) = f_n|(U_n ∩ N). If we provide N with the topology induced from that on Q, then (N, C_i^∞(N)) is a differential space, called a differential subspace of (Q, C^∞(Q)). If U is an open subset of Q, then

    C_i^∞(U) = {f|U | f ∈ C^∞(Q)} = C^∞(U).
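The definition of a differential subspace can be made concrete with a small example. It is an assumption chosen for illustration and is not taken from the text: Q = R with its usual smooth functions and N = [0, ∞).

```latex
% Assumed example: Q = \mathbb{R}, C^\infty(Q) the usual smooth functions, N = [0, \infty).
\begin{align*}
f(x) &= x^2 \quad (x \in N) \quad\Longrightarrow\quad f \in C_i^\infty(N),\\
g(x) &= \sqrt{x} \quad (x \in N) \quad\Longrightarrow\quad g \notin C_i^\infty(N),
\quad\text{since}\quad
\lim_{x \to 0^+} \frac{g(x) - g(0)}{x} = \lim_{x \to 0^+} \frac{1}{\sqrt{x}} = \infty .
\end{align*}
```

The function f is the restriction to N of a smooth function on R, so it lies in C_i^∞(N). If g agreed on a neighborhood of 0 in N with the restriction of some g_0 ∈ C^∞(R), then the difference quotient above would converge to g_0′(0), which is finite. Hence g is continuous on N but does not belong to C_i^∞(N).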

A differential space (Q, C^∞(Q)) is a smooth manifold if for every q ∈ Q there is a nonnegative integer n, an open subset U of Q containing q, and f_1, ..., f_n ∈ C^∞(Q) such that the map

    f : U ⊆ Q → V = f(U) ⊆ R^n : q ↦ (f_1(q), ..., f_n(q))

is a diffeomorphism from the differential space (U, C^∞(U)) onto the differential space (V, C^∞(V)), seen as a subspace of the differential space (R^n, C^∞(R^n)). In other words, (Q, C^∞(Q)) is locally diffeomorphic to an open subset of R^n with its differential structure being given by restricting smooth functions to this open subset.

Lemma 2.4.1.10. If the topological space Q of the differential space (Q, C^∞(Q)) is Hausdorff, locally compact and paracompact, then for every open covering 𝒰 of Q there is a partition of unity in C^∞(Q) which is subordinate to 𝒰.


Proof. If q ∈ Q and U is an open neighborhood of q in Q, then it follows from property 1 of a differential structure that there is a positive integer n, an open subset W of R^n, and f_1, ..., f_n ∈ C^∞(Q) such that f^{-1}(W) ⊂ U, where f = (f_1, ..., f_n). There is a cutoff function F ∈ C^∞(R^n) such that F = 1 on an open neighborhood of f(q) in W and the support of F is a compact subset K of W. From property 2 of a differential structure it follows that F ∘ f ∈ C^∞(Q). Also the support of F ∘ f is f^{-1}(K), which is a closed subset of U. The set f^{-1}(K) is compact if the closure of U in Q is compact. Therefore, if the topological space Q is locally compact, then there are cutoff functions in C^∞(Q).

If X is a topological space with F(X) a space of real valued functions on X and 𝒰 = {U_j}_{j∈J} is an open covering of X, then a partition of unity in F(X) subordinate to 𝒰 is a collection {χ_j}_{j∈J} of functions in F(X) having the following properties.
1. For each j ∈ J, the support of χ_j is a compact subset of some U_j ∈ 𝒰.
2. The supports of the {χ_j}_{j∈J} form a locally finite family of compact subsets of X, whose union is X.
3. Σ_{j∈J} χ_j = 1 on X.

A Hausdorff topological space X is called paracompact if every open covering of X has a locally finite refinement. If X is Hausdorff and locally compact, then it is paracompact if and only if every connected component of X is equal to the union of a countable collection of compact subsets. Cutoff functions in F(X) can be used to obtain partitions of unity.
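To illustrate how cutoff functions lead to a partition of unity, the following Python sketch treats the simplest case X = R with its usual smooth functions. It is only an illustration under our own assumptions: the helper names (smooth_step, cutoff, bump, partition_of_unity), the radii 1 and 2, and the finite list of centers are hypothetical choices and do not come from the text.

```python
import math

def smooth_step(r):
    """exp(-1/r) for r > 0 and 0 otherwise; this function is smooth on all of R."""
    return math.exp(-1.0 / r) if r > 0 else 0.0

def cutoff(r, inner=1.0, outer=2.0):
    """Smooth function of r >= 0 that is 1 for r <= inner and 0 for r >= outer."""
    a = smooth_step(outer - r)   # positive exactly when r < outer
    b = smooth_step(r - inner)   # positive exactly when r > inner
    return a / (a + b)           # a + b > 0 because inner < outer

def bump(x, center, inner=1.0, outer=2.0):
    """Cutoff function on R: 1 near center, support in [center - outer, center + outer]."""
    return cutoff(abs(x - center), inner, outer)

def partition_of_unity(x, centers):
    """Values at x of the normalised bumps; assumes x lies within distance 2 of some center."""
    values = [bump(x, c) for c in centers]
    total = sum(values)
    return [v / total for v in values]

if __name__ == "__main__":
    print(partition_of_unity(0.5, [-2.0, 0.0, 2.0]))
```

Dividing each bump by the locally finite sum of all bumps is the standard normalisation step; here the sum is finite because only finitely many centers are used.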

We would like to think of C^∞(Q) as being the space of smooth functions on Q. For this to be the case, we should have a sheaf of locally defined “smooth functions”. Towards this goal, let U be an open subset of Q. Then we have already defined C^∞(U). Moreover, the mapping U ↦ C^∞(U), where U ranges over all open subsets of Q, defines a sheaf of functions on Q. We now check that this sheaf behaves like the sheaf of smooth functions on a manifold.

Lemma 2.4.1.11. Let N be a subset of a smooth paracompact manifold M. N is an embedded submanifold of M if and only if the identity map from the differential space (N, C_i^∞(N)) into the differential space (N, C^∞(N)) is a diffeomorphism of differential spaces, that is, if and only if C_i^∞(N) = C^∞(N).

Proof. ⇐=. The inclusion map ι : N → M is a smooth map of differential spaces (N, C_i^∞(N)) to (M, C^∞(M)). Since C_i^∞(N) = C^∞(N) by hypothesis, it follows that ι is a smooth mapping from (N, C^∞(N)) to (M, C^∞(M)), where M and N are smooth manifolds. The topology on N is induced from that on M. If n ∈ N and dim N = p, then there are {f_j}_{j=1}^p ⊆ C^∞(N) such that {f_j}_{j=1}^p form a system of coordinates for N on an open neighborhood of n in N. Therefore the map ι has an injective tangent at n and consequently is an embedding. Hence N is an embedded submanifold of M.

=⇒. Suppose that N is an embedded submanifold of M. If f ∈ C^∞(N), then for every n ∈ N and any open neighborhood U_n of n in N, there is a cutoff function χ : M → R in C^∞(M) such that χ = 1 on an open neighborhood V_n of n in N and whose support is a compact subset of U_n. Such a cutoff function χ exists because M is paracompact. Let f_n = χ · f. Then f|V_n = f_n|V_n. Thus f ∈ C_i^∞(N). Consequently, C^∞(N) ⊆ C_i^∞(N).

Conversely, suppose that f ∈ C_i^∞(N). Then for every j ∈ N there is an open neighborhood U_j of j in M and f_j ∈ C^∞(M) such that f|(U_j ∩ N) = f_j|(U_j ∩ N). Now U = {U_j}_{j∈J=N} is an open covering of N. We extend U to an open covering Ũ of M by adding the open set M \ N. (Since N is an embedded submanifold of M, it is a closed subset of M.) Let {χ_j}_{j∈J̃} be a C^∞(M) partition of unity on M subordinate to Ũ. Then {χ_j|N}_{j∈J} is a partition of unity on N subordinate to U. So the function f = Σ_{j∈J} χ_j|N · f_j lies in C^∞(N). Therefore C_i^∞(N) ⊆ C^∞(N) and thus C_i^∞(N) = C^∞(N).

Because of lemma 2.4.1.11 we may call C^∞(Q) the space of smooth functions on the differential space (Q, C^∞(Q)).

2.4.2 The orbit space as a differential space

Let M̄ = M/G be the orbit space of a proper G-action on a smooth manifold M with G-orbit map π : M → M̄. For every open subset Ū of M̄ a function f̄ : Ū ⊆ M̄ → R is smooth if π^∗ f̄ : U = π^{-1}(Ū) ⊆ M → R is smooth. Let C^∞(Ū) be the space of smooth functions on Ū.

Lemma 2.4.2.12. For every m̄ ∈ M̄ and every open neighborhood Ū of m̄ in M̄ there is a cutoff function χ̄ ∈ C^∞(M̄) such that χ̄ = 1 on an open neighborhood of m̄ and is 0 on the complement of a compact neighborhood of m̄ in Ū.

Proof. From the tube theorem 2.3.3.6 with U = π^{-1}(Ū) and H = G_m with m ∈ m̄ it follows that the mapping ρ : B/H → U/G : b ◦ H ↦ ϕ(b) · G is a homeomorphism. Because H is a compact Lie group which acts linearly on E = T_m M/T_m(G · m), we may average an arbitrary inner product on E over H to obtain an H-invariant inner product β on E with norm ‖ ‖. Because B is an H-invariant open subset of E containing the origin, there is an ε > 0 such that {‖x‖ ≤ ε} ⊆ B. There is a smooth function ψ : R → R : r ↦ ψ(r), which is 1 in an open neighborhood of 0 in R and is 0 when r ≥ ε. Then f : H × B → R : (h, b) ↦ ψ(‖b‖) is a smooth H-invariant function. Therefore f induces a smooth function f' : B/H → R, which corresponds to a function g = f' ◦ ρ^{-1} ∈ C^∞(Ū) with support in Ū. Extending g by 0 outside Ū gives the cutoff function χ̄ ∈ C^∞(M̄).

Corollary 2.4.2.13. Paracompactness of M implies that the Hausdorff space M̄ is paracompact.

Proof. To see this let C̄ be a connected component of M̄. Then C̄ = π(C) for every connected component C of M that meets π^{-1}(C̄). Because M is paracompact, C is equal to the union of a countable number of compact sets K_i. Since the G-orbit map π is continuous, π(K_i) is compact and their union is C̄. Cutoff functions can now be used to obtain a partition of unity in C^∞(M̄) as in lemma 2.4.1.10.

Proposition 2.4.2.14. (M/G, C^∞(M/G)) is a differential space and the G-orbit map π is a smooth mapping from (M, C^∞(M)) to the differential space (M/G, C^∞(M/G)).

Proof. We verify that properties 1–3 defining a differential structure hold for C^∞(M̄).

1. If χ̄ is a cutoff function on M̄ as constructed in lemma 2.4.2.12, then χ̄^{-1}((1/2, 3/2)) ⊆ Ū. This proves property 1.
2. Let f̄_1, …, f̄_n ∈ C^∞(M̄) and let F ∈ C^∞(R^n). Then f̄_j ◦ π ∈ C^∞(M)^G, the space of smooth G-invariant functions on M. Hence F ◦ f̄ ◦ π ∈ C^∞(M)^G, where f̄(m̄) = (f̄_1(m̄), …, f̄_n(m̄)) for every m̄ ∈ M̄. Therefore F ◦ f̄ ∈ C^∞(M̄). This proves property 2.
3. Let f̄ : M̄ → R. Suppose that for every m̄ ∈ M̄ there is an open neighborhood Ū_m̄ of m̄ in M̄ and an f̄_m̄ ∈ C^∞(M̄) such that f̄|Ū_m̄ = f̄_m̄|Ū_m̄. Then f̄ ◦ π is G-invariant and is equal to the smooth function f̄_m̄ ◦ π on U_m = π^{-1}(Ū_m̄). Here m ∈ m̄. Because the {U_m}_{m∈M} form an open covering of M, it follows that f̄ ◦ π ∈ C^∞(M). Therefore f̄ ◦ π ∈ C^∞(M)^G, which implies that f̄ ∈ C^∞(M̄). This proves property 3.

Thus C^∞(M̄) is a differential structure. The G-orbit map π : M → M̄ is smooth because π^∗(C^∞(M̄)) = C^∞(M)^G ⊆ C^∞(M).

Corollary 2.4.2.15. C_i^∞(Ū) = C^∞(Ū), for any open subset Ū of M̄.

Proof. Let f̄ : Ū → R be an element of C_i^∞(Ū). Then locally f̄ agrees with an element of C^∞(M̄). Using the argument which proved property 3 in the proof of the proposition, it follows that f̄ ◦ π ∈ C^∞(U)^G, where U = π^{-1}(Ū). Hence f̄ ∈ C^∞(Ū). Conversely, suppose that f̄ : Ū → R is such that f̄ ◦ π ∈ C^∞(U)^G. Using lemma 2.4.2.12 we know that for every m̄ ∈ Ū there is a cutoff function χ̄_m̄ ∈ C^∞(M̄) whose support is a compact subset K̄ of Ū and which is 1 in an open neighborhood of m̄ in M̄. Define f̄_m̄ : M̄ → R by f̄_m̄ = χ̄_m̄ · f̄ on Ū and 0 on M̄ \ Ū. Then f̄_m̄ = 0 on M̄ \ K̄ = W̄. Therefore U = π^{-1}(Ū) and W = π^{-1}(W̄) are open subsets of M = U ∪ W. On U we have f̄_m̄ ◦ π = (χ̄_m̄ ◦ π) · (f̄ ◦ π), which is smooth and G-invariant; while on W we have f̄_m̄ ◦ π = 0, which is also smooth and G-invariant. Therefore f̄_m̄ ∈ C^∞(M̄). But f̄ = f̄_m̄ in an open neighborhood of m̄ ∈ M̄. Therefore f̄ ∈ C_i^∞(Ū). Consequently, C^∞(Ū) = C_i^∞(Ū).

Using the notation of the tube theorem 2.3.3.6 we have

Lemma 2.4.2.16. The differential spaces (B/H, C^∞(B/H)) and (U/G, C^∞(U/G)) are diffeomorphic.

Proof. On G × B we have the G-action ν (2) with G-orbit map π_ν. Since every G-orbit on G × B intersects {e} × B exactly once, the G-orbit space (G × B)/G is diffeomorphic to B. In particular, the map

i : B → (G × B)/G : b ↦ π_ν(e, b)    (3)

is a diffeomorphism, whose inverse is given by the smooth map π̃ : (G × B)/G → B, induced from the G-invariant map π : G × B → B : (g, b) ↦ b. On G × B we have an H-action µ (1) whose orbit space is G ×_H B. In addition, we have an action of G × H given by

λ : (G × H) × (G × B) → G × B : ((g′, h), (g, b)) ↦ (g′ g h^{-1}, h ◦ b).

Because the actions µ and ν commute, we have an induced H-action on the orbit space (G × B)/G and an induced G-action on G ×_H B, whose orbit spaces ((G × B)/G)/H and (G ×_H B)/G are equal to the orbit space (G × B)/(G × H) of the action λ. Let ϕ : G ×_H B → U be the G-equivariant diffeomorphism given by the tube theorem 2.3.3.6. Then ϕ induces the homeomorphism ϕ̃ : (G ×_H B)/G = (G × B)/(G × H) → U/G and the isomorphism ϕ^∗ : C^∞(U) → C^∞(G × B)^H, which restricts to the isomorphism ϕ̃^∗ : C^∞(U)^G → C^∞(G × B)^{G×H}. The diffeomorphism i (3) induces the homeomorphism

ĩ : B/H → ((G × B)/G)/H = (G × B)/(G × H)

and the isomorphism i^∗ : C^∞(G × B)^G → C^∞(B), which restricts to the isomorphism

ĩ^∗ : C^∞(G × B)^{G×H} → C^∞(B)^H = C^∞(B/H).

Therefore ϕ̃ ◦ ĩ : B/H → U/G is a homeomorphism and ĩ^∗ ◦ ϕ̃^∗ : C^∞(U/G) → C^∞(B/H) is an isomorphism. In other words, the differential spaces (B/H, C^∞(B/H)) and (U/G, C^∞(U/G)) are diffeomorphic.

2.5 Subcartesian spaces

In this section we will show that the orbit space of a proper group action is a subcartesian space. A differential space (Q, C^∞(Q)) is subcartesian if Q is a Hausdorff topological space and (Q, C^∞(Q)) is locally diffeomorphic to (N, C_i^∞(N)), where N is a subset of R^n. In other words, if for each q ∈ Q there is an open neighborhood U of q in Q, a nonnegative integer n, a subset N of R^n, and a diffeomorphism ψ from (U, C^∞(U)) onto (N, C_i^∞(N)). A subcartesian differential space (Q, C^∞(Q)) is locally compact if and only if for every q ∈ Q there is an open neighborhood U of q in Q and a diffeomorphism ϕ : U ⊆ Q → V ⊆ R^n, which maps U onto a locally closed subset V.

Using the notation of the tube theorem 2.3.3.6, we now investigate the orbit space of the linear action of the compact Lie group H = G_m on the vector space E = T_m M/T_m(G · m). From classical invariant theory we know that the algebra P(E)^H of H-invariant polynomial functions on E is finitely generated, that is, there is a positive integer n and polynomials p_1, …, p_n ∈ P(E)^H such that every p ∈ P(E)^H can be written as p = F(p_1, …, p_n), where F is a polynomial on R^n. Because the action of H on E is linear, we can choose p_i to be homogeneous of degree d_i > 0. We may also suppose that n is minimal. We then say that {p_i}_{i=1}^n is a Hilbert basis of P(E)^H and that

σ : E → R^n : x ↦ (p_1(x), …, p_n(x))    (4)

is the Hilbert map corresponding to the given Hilbert basis. Because there can be nontrivial relations among the generators p_i, neither the Hilbert basis of P(E)^H nor the Hilbert map is unique. Since elements of P(E)^H separate H-orbits on E, the Hilbert map σ induces a continuous bijective map

σ̃ : E/H → Σ = σ(E) ⊆ R^n.    (5)

The Tarski-Seidenberg theorem states that the image of a semialgebraic set under a polynomial mapping is semialgebraic. Therefore Σ is a semialgebraic subset of R^n. Because σ(t · x) = (t^{d_1} p_1(x), …, t^{d_n} p_n(x)), the set Σ is quasi-homogeneous in the sense that if y ∈ Σ then t · y = (t^{d_1} y_1, …, t^{d_n} y_n) ∈ Σ. In particular, Σ is contractible to the origin in R^n. Averaging an arbitrary inner product on E over the orbits of the H-action, we obtain an H-invariant inner product β on E. Since the function E → R : x ↦ β(x, x) is H-invariant, there is a polynomial P : R^n → R such that β(x, x) = P(σ(x)) for every x ∈ E. Consequently, the Hilbert mapping σ (4) is proper. This implies that Σ is a closed subset of R^n. Because the induced mapping σ̃ (5) is continuous, bijective, and proper, it is a homeomorphism from the locally compact Hausdorff space E/H onto Σ. Thus we have proved

Lemma 2.5.17. The orbit space E/H of the H-action on E is homeomorphic to the image Σ of the Hilbert map σ.

Shrinking B (and thereby also U), if necessary, we may assume that B = {x ∈ E | β(x, x) < c} for some c > 0. Then the map σ̃ (5) is a homeomorphism from B/H onto σ(B) = {y ∈ Σ = σ(E) | P(y) < c}. Here P is a polynomial which expresses the H-invariant function x ↦ β(x, x) in terms of the invariant polynomials {p_i}_{i=1}^n on E. Note that σ(B) is an open semialgebraic subset of the closed semialgebraic subset Σ of R^n.

Proposition 2.5.18. Let φ : B/H → U/G be the diffeomorphism given by lemma 2.4.2.16. Then σ̃ ◦ φ^{-1} : (U/G, C^∞(U/G)) → (σ(B), C^∞(σ(B))) is a diffeomorphism of differential spaces.

Proof. Let f ∈ C^∞(σ(B)) and b ∈ B. Then there is an open neighborhood U of σ(b) in R^n and a g ∈ C^∞(R^n) such that f = g on σ(B) ∩ U. Therefore f ◦ σ = g ◦ σ is smooth on the open neighborhood σ^{-1}(U) of b in B. Because this holds for every b ∈ B, we get f ◦ σ ∈ C^∞(B). Since f ◦ σ is H-invariant, it follows that σ^∗(C^∞(σ(B))) ⊆ C^∞(B)^H. Conversely, the theorem of Schwarz [98] states that C^∞(B)^H ⊆ σ^∗(C^∞(R^n)). Let i : σ(B) → R^n be the inclusion mapping. Then i^∗ : C^∞(R^n) → C^∞(σ(B)) is surjective, that is, i^∗(C^∞(R^n)) = C^∞(σ(B)). Since σ = i ◦ σ, we obtain C^∞(B)^H ⊆ σ^∗(C^∞(R^n)) = σ^∗ ◦ i^∗(C^∞(R^n)) = σ^∗(C^∞(σ(B))). Therefore C^∞(B)^H = σ^∗(C^∞(σ(B))). Since the mapping σ̃ : B/H → σ(B) is a homeomorphism by lemma 2.5.17, we deduce that it is a diffeomorphism from (B/H, C^∞(B/H)) onto (σ(B), C^∞(σ(B))). The proposition follows using lemma 2.4.2.16.

Because the open sets {U/G} form an open covering of the orbit space M/G we have proved

Corollary 2.5.19. For a proper action of a Lie group G on a smooth manifold M, the differential space (M/G, C^∞(M/G) = C^∞(M)^G) is a subcartesian space. More precisely, M/G has a covering by open subsets each of which is diffeomorphic as a differential space to an open subset of a closed semialgebraic set.

2.6 Stratification of the orbit space by orbit types

In this section we investigate the orbit types M̄_(H) = M_(H)/G in the orbit space M̄ = M/G of a proper action of G on a manifold M.

2.6.1 Orbit types in an orbit space

Let M_H be the H-isotropy type where H = G_m for some m ∈ M. Then M̄_H = π(M_H) = M_(H)/G = π(M_(H)) = M̄_(H). Moreover, there is a smooth manifold structure on M̄_(H) such that π|M_H : M_H → M̄_(H) is a principal N(H)/H-bundle, see theorem 2.3.3.7. Using the notation of the tube theorem 2.3.3.6, consider the G-action

G × (G ×_H E) → G ×_H E : (g′, ρ(g, x)) ↦ ρ(g′g, x),

where ρ : G × E → G ×_H E is the orbit map of the H-action

H × (G × E) → G × E : (h, (g, x)) ↦ (gh^{-1}, h ◦ x).

(See the second paragraph in §3.3 for the definition of the action ◦.) Then G_{ρ(g,x)} = g H_x g^{-1}. To see this suppose that g* ∈ G_{ρ(g,x)}. Then g* · ρ(g, x) = ρ(g, x) or ρ(g* g, x) = ρ(g, x). Therefore there is an h ∈ H such that g* g = gh^{-1} and h ◦ x = x. In other words, we have g* ∈ g H_x g^{-1}. So G_{ρ(g,x)} ⊆ g H_x g^{-1}. Conversely, suppose that g* ∈ g H_x g^{-1}. Then for some h^{-1} ∈ H_x, we have g* g = gh^{-1}. Therefore g* g = gh^{-1} and h ◦ x = x. This implies that g* · ρ(g, x) = ρ(g, x), that is, g* ∈ G_{ρ(g,x)}. So g H_x g^{-1} ⊆ G_{ρ(g,x)}. Therefore G_{ρ(g,x)} = g H_x g^{-1}.

Let E^H = {x ∈ E | h ◦ x = x for every h ∈ H} = {x ∈ E | H_x = H}.

Lemma 2.6.1.20. (G ×_H E)_(H) is isomorphic to E^H.

Proof. Suppose that x ∈ E^H. Then H = H_x. Therefore for each g ∈ G, we have G_{ρ(g,x)} = g H_x g^{-1} = g H g^{-1}. In other words, ρ(g, x) ∈ (G ×_H E)_(H). Conversely, if ρ(g, x) ∈ (G ×_H E)_(H), then G_{ρ(g,x)} ∈ (H). But G_{ρ(g,x)} = g H_x g^{-1}, so H_x ∈ (H). However, H_x ⊆ H and has the same dimension and number of connected components as H, since H_x is conjugate to H in G. Therefore H_x = H, that is, x ∈ E^H.

The fact that π(M_H) = M̄_(H) in combination with lemma 2.4.2.16 gives Ū_(H) = U_(H)/G = φ((B ∩ E^H)/H). Here φ is the diffeomorphism given by lemma 2.4.2.16 and (B ∩ E^H)/H is the image of B ∩ E^H under the H-orbit map on B.

Because H acts linearly on E, it follows that E^H is a linear subspace of E. Let F be the orthogonal complement of E^H in E with respect to the H-invariant inner product β. Then F is H-invariant and F ∩ E^H = {0}. The mapping

E^H × F → E : (x, y) ↦ x + y

is a linear isomorphism, which is H-equivariant if we let H act on E^H × F by

H × (E^H × F) → E^H × F : (h, (x, y)) ↦ (x, h ◦ y).

If x = (x_1, …, x_ℓ) is a coordinate system on E^H and y = (y_1, …, y_r) is one on F, then every polynomial p ∈ P(E^H × F) can be written uniquely as p(x, y) = Σ_α x^α q_α(y), where α = (α_1, …, α_ℓ), x^α = x_1^{α_1} ⋯ x_ℓ^{α_ℓ} and q_α(y) is a polynomial in y_1, …, y_r. Since p ∈ P(E^H × F)^H ⇐⇒ q_α ∈ P(F)^H for every α, a Hilbert basis of P(E)^H, which is isomorphic to P(E^H × F)^H, is given by x_1, …, x_ℓ, q_1, …, q_m, where {q_j}_{j=1}^m is a Hilbert basis of P(F)^H. The image of B ∩ E^H under the Hilbert map

σ : E = E^H × F → R^n = R^ℓ × R^m : (x, y) ↦ (x, q_1(y), …, q_m(y))    (6)

is an open subset of R^ℓ × {0}, which is a smooth ℓ-dimensional submanifold. In view of the fact that the differential spaces ((B ∩ E^H)/H, C^∞(B ∩ E^H)^H) and (σ(B ∩ E^H), C^∞(σ(B ∩ E^H))) are diffeomorphic by proposition 2.5.18 and (σ(B ∩ E^H), C^∞(σ(B ∩ E^H))) is diffeomorphic to (U_(H)/G, C^∞(U_(H)/G)) by lemma 2.4.2.16, we have proved

Proposition 2.6.1.21. Each connected component of each orbit type M̄_(H) in the G-orbit space M̄ is a smooth manifold, when regarded as a differential subspace of the differential space (M̄, C^∞(M̄)). In other words, (M̄_(H), C_i^∞(M̄_(H))) is a smooth manifold.

2.6.2 Stratification of an orbit space

The orbit types for the action of H on E = E^H × F are of the form E^H × R, where R is an orbit type for the linear H-action on F. Furthermore, the R_{>0}-action on E of multiplication by t > 0 commutes with the linear action of H on F. Therefore t · R = R. Let S^{r−1} be the unit sphere in F with respect to the H-invariant inner product β. Then R ↦ R ∩ S^{r−1} is a bijective map from all orbit types R ≠ {0} to orbit types of the induced H-action on S^{r−1}. Using induction on the dimension of S^{r−1}, it follows that there are only finitely many H-orbit types on S^{r−1}. Consequently, there are only finitely many H-orbit types for the action of H on E.

From the quasi-homogeneity of the Hilbert map σ (6), it follows that σ(E) is invariant under the transformation (x_1, …, x_ℓ, q_1, …, q_m) ↦ (x_1, …, x_ℓ, t^{d_1} q_1, …, t^{d_m} q_m), where d_j = deg q_j. Note that when (q_1, …, q_m) ≠ (0, …, 0) then (x_1, …, x_ℓ, t^{d_1} q_1, …, t^{d_m} q_m) ≠ (x_1, …, x_ℓ, s^{d_1} q_1, …, s^{d_m} q_m), when s > 0 and s ≠ t. Therefore we have proved

Lemma 2.6.2.22. Each orbit type in σ(E), which is different from R^ℓ × {0}, is equal to a product of R^ℓ with a submanifold of R^m of dimension greater than or equal to 1.

Therefore each connected component of an orbit type in the orbit space near a given connected component of a given orbit type has dimension greater than the dimension of the given connected component of the given orbit type. This proves

Proposition 2.6.2.23. Connected components of orbit types in the orbit space M̄, when viewed as differential subspaces of (M̄, C^∞(M̄)), are smooth manifolds. These manifolds define a stratification S of M̄, called the orbit type stratification of the orbit space.

2.6.3 Minimality of S

In this subsection we show that the orbit type stratification S of the orbit space is minimal when stratifications are partially ordered by inclusion of the strata. We begin with some observations. Let q ∈ P(F)^H be a homogeneous polynomial of degree 1. Then q ∈ F^∗, which shows that β^♯(q) ∈ F. (The inner product β on F induces a bijective linear map β^♭ : F → F^∗ given by β^♭(y) y′ = β(y, y′). The inverse of β^♭ is β^♯.) Since β is H-invariant, it follows that β^♯(q) ∈ F^H = {0}. Therefore q = 0. Consequently, d_j = deg q_j, where {q_j}_{j=1}^m form a Hilbert basis of P(F)^H, is greater than or equal to 2. Because the polynomial F → R : y ↦ β(y, y) is H-invariant and of degree 2, we may choose q_1(y) = β(y, y) for every y ∈ F. Let S = {y ∈ F | q_1(y) = 1} be the unit sphere in F with respect to the inner product β. For each j with 2 ≤ j ≤ m let C_j be the maximum of |q_j(y)| with y ∈ S. When y ∈ F \ {0}, let t = q_1(y)^{1/2}. Then t^{-1} y ∈ S. Since

q_j(y) = q_j(t(t^{-1} y)) = t^{d_j} q_j(t^{-1} y)

and |q_j(t^{-1} y)| ≤ C_j, we obtain |q_j(y)| ≤ C_j q_1(y)^{d_j/2}, which also holds when y = 0. Therefore σ(E) is contained in

{(x, q) ∈ R^ℓ × R^m | q_1 ≥ 0 and |q_j| ≤ C_j q_1^{d_j/2} for every 2 ≤ j ≤ m}.    (7)

Here σ is the Hilbert map given in (6).

Lemma 2.6.3.24. Let I be an open interval in R containing 0 and let γ : I → σ(E) ⊆ R^ℓ × R^m : t ↦ γ(t) = (x(t), q(t)). Suppose that q_1(0) = 0 and that q : I → R^m : t ↦ q(t) is differentiable at t = 0. Then q′(0) = 0.

Proof. From the fact that γ(I) ⊆ σ(E) and the inequality |q_j| ≤ C_j q_1^{d_j/2} (7), it follows that q_1(t) ≥ 0 for every t ∈ I. Because q_1(0) = 0, we see that the function t ↦ q_1(t) attains its minimum value at 0. Therefore q_1′(0) = 0, that is, q_1(t)/t → 0 as t → 0. For each j where 2 ≤ j ≤ m, the fact that γ(I) ⊆ σ(E) and inequality (7) imply

|q_j(t)|/|t| ≤ C_j q_1(t)^{d_j/2}/|t| = C_j |q_1(t)/t|^{d_j/2} |t|^{d_j/2 − 1}.    (8)

The right hand side of (8) converges to 0 as t → 0, because q_1(t)/t → 0 as t → 0 and d_j/2 − 1 ≥ 0. Therefore q_j′(0) = 0 for every 2 ≤ j ≤ m.

From lemma 2.6.3.24 we see that if N is a C^1 submanifold of R^n = R^ℓ × R^m such that N ⊆ σ(E) and 0 ∈ N, then T_0 N ⊆ R^ℓ × {0}. In particular, dim N ≤ ℓ. From lemma 2.6.2.22 we know that all the orbit type strata in σ(E) different from R^ℓ × {0} have dimension strictly greater than ℓ. Therefore no union of R^ℓ × {0} with different strata in σ(E) can be a C^1 manifold through the origin. This proves

Proposition 2.6.3.25. The orbit type stratification of the orbit space M/G of a proper G-action on M is minimal, that is, no union of different strata can be a connected smooth manifold in the differential space (M/G, C^∞(M/G)).

Corollary 2.6.3.26. If M/G is connected, then the differential space (M/G, C^∞(M/G)) is a smooth manifold if and only if there is exactly one orbit type.

Given a semialgebraic subset, it has a primary stratification given by iteratively forming the semialgebraic set of singular points of the preceding semialgebraic variety.

Proposition 2.6.3.27. In the local model σ(B) ⊆ R^n of the orbit space U/G given in lemma 2.4.2.16, the orbit type stratification of B/H coincides with the primary stratification of the semialgebraic set σ(B).

Proof. See theorem A in Bierstone [14].
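The constructions of this section can be followed by hand in a small example, which we sketch in Python. We assume H = Z_2 = {+1, −1} acting on E = R^2 by (x, y) ↦ (x, h y), so that E^H is the x-axis, F is the y-axis, and x together with q_1 = y^2 is a Hilbert basis. This example and all function names below are our own choices for illustration and are not taken from the text.

```python
def act(h, point):
    """Action of h in Z_2 = {+1, -1} on E = R^2: (x, y) -> (x, h*y)."""
    x, y = point
    return (x, h * y)

def hilbert_map(point):
    """sigma(x, y) = (x, y^2): x is a coordinate on E^H and q1 = y^2 generates P(F)^H."""
    x, y = point
    return (x, y * y)

def orbit_type(image_point):
    """Name the orbit type stratum of sigma(E) = R x [0, oo) containing image_point."""
    x, q = image_point
    if q < 0:
        raise ValueError("point does not lie in sigma(E)")
    return "R x {0}" if q == 0 else "R x (0, oo)"

def q_along_curve(t):
    """q-component of sigma along the curve t -> (t, 3*t) in E, which has q(0) = 0."""
    return hilbert_map((t, 3.0 * t))[1]

def central_difference(func, t, step=1e-6):
    """Symmetric difference quotient approximating the derivative of func at t."""
    return (func(t + step) - func(t - step)) / (2.0 * step)

if __name__ == "__main__":
    print(hilbert_map((2.0, 3.0)), hilbert_map(act(-1, (2.0, 3.0))))
    print(orbit_type(hilbert_map((1.0, 0.0))), orbit_type(hilbert_map((1.0, -2.0))))
    print(central_difference(q_along_curve, 0.0))
```

In this model σ(E) is the closed half plane, the stratum R × {0} has dimension ℓ = 1 and the other stratum has dimension 2, in line with lemma 2.6.2.22. The vanishing difference quotient at t = 0 is an instance of lemma 2.6.3.24.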

2.7 Derivations and vector fields on a differential space

Let A be an algebra over R with multiplication ·. A derivation of A is a linear mapping δ of A into itself such that Leibniz’ rule holds, namely, δ(f · g) = (δf) · g + f · (δg) for every f, g ∈ A. We denote the set of all derivations of A by Der(A). Note that Der(A) is a Lie algebra with bracket [δ, δ′] = δ ◦ δ′ − δ′ ◦ δ for every δ, δ′ ∈ Der(A).

Example 2.7.28. Let Q be a smooth manifold with C^∞(Q) its space of smooth functions. If V is a smooth vector field on Q, then for every f ∈ C^∞(Q) the Lie derivative L_V f of f with respect to V, namely, L_V f : Q → R : q ↦ ⟨df(q) | V(q)⟩, is a smooth function on Q. Moreover, the linear mapping L_V : C^∞(Q) → C^∞(Q) is a derivation.⁶

Let (Q, C^∞(Q)) be a differential space, which is not necessarily a smooth manifold, and let δ ∈ Der(C^∞(Q)). An integral curve of δ is a smooth mapping γ : I ⊆ R → Q, where I is an interval, such that

d/dt f(γ(t)) = δ(f)(γ(t)) for every f ∈ C^∞(Q) and every t ∈ I.

⁶ In many texts on manifolds, smooth vector fields are defined as derivations on C^∞(Q), after which tangent spaces are introduced. It is then shown that derivations correspond to smooth sections of the tangent bundle.

Proposition 2.7.29. Let (Q, C^∞(Q)) be a locally compact subcartesian differential space and let δ ∈ Der(C^∞(Q)). Suppose that for every q ∈ Q, there is an open interval I ⊆ R containing 0 and an integral curve γ_q : I ⊆ R → Q of δ such that γ_q(0) = q. Then

1. For each q ∈ Q there is a unique integral curve γ : I_q → Q of δ, which is defined on a maximal open interval I_q in R containing 0 such that γ(0) = q.
2. The set D = {(t, q) ∈ R × Q | t ∈ I_q} is an open subset of R × Q and the map ϕ : D ⊆ R × Q → Q : (t, q) ↦ γ_q(t) is smooth.
3. For each t ∈ R, the set D_t = {q ∈ Q | t ∈ I_q} is an open subset of Q and the mapping ϕ_t : D_t → Q : q ↦ γ_q(t) is smooth. ϕ_t is called the flow of δ after time t.
4. If s, t ∈ R, q ∈ D_s, ϕ_s(q) ∈ D_t, then q ∈ D_{t+s} and ϕ_t(ϕ_s(q)) = ϕ_{t+s}(q). Therefore ϕ_t : D_t → D_{−t} is a diffeomorphism with inverse ϕ_{−t}.

Proof. Using the fact that Q is locally compact, we may identify a suitable open neighborhood of q in Q with a locally closed subset V of R^n. Let {x_i}_{i=1}^n be coordinate functions on R^n. There is an open neighborhood U_i of q in R^n and δ_i ∈ C^∞(U_i) such that δx_i = δ_i|(V ∩ U_i). Let U = ∩_{i=1}^n U_i and let δ̃ : U → R^n be a smooth vector field on U such that L_δ̃ x_i = δ_i|U. Shrinking U if necessary, we may assume that V ∩ U is a closed subset of U.

Let γ : I → V ∩ U be an integral curve of δ. Then

dγ_i(t)/dt = d x_i(γ(t))/dt = δ(x_i)(γ(t)) = δ_i(γ(t))

shows that γ is an integral curve of the derivation L_δ̃, thought of as a smooth vector field on R^n. Therefore the local existence and uniqueness of integral curves of the smooth vector field δ̃ implies the existence and uniqueness of smooth integral curves of δ with prescribed initial value. Consequently, for each q ∈ V ∩ U there is a unique integral curve γ : I_q → V ∩ U of the derivation δ on a maximal open interval I_q in R containing 0 with γ(0) = q.

Let ϕ̃_t be the flow after time t of the vector field δ̃ on U. Let Ĩ be the maximal domain of definition of the integral curve t ↦ γ̃(t) = ϕ̃_t(q) of δ̃ with γ̃(0) = q. We will now show that I_q = Ĩ. Now γ̃|I_q = γ, so I_q ⊆ Ĩ. Suppose that s = sup I_q ∈ Ĩ and let r = γ̃(s). Then

r = lim_{t↑s} γ̃(t) = lim_{t↑s} γ(t) ∈ V ∩ U,

because γ(t) ∈ V ∩ U for every t ∈ I_q and V ∩ U is closed in U. The hypothesis that every integral curve of δ is defined on an open interval implies that there is an open interval J in R containing 0 and an integral curve γ̂ : J → V ∩ U of δ such that γ̂(0) = r. Therefore

γ̂(t − s) = ϕ_{t−s}(r) = ϕ_{t−s}(ϕ_s(q)) = γ̃(t)

for all t ∈ I_q ∩ (s + J). From uniqueness it follows that γ̃ and t ↦ γ̂(t − s) piece together to form an integral curve of δ, which is defined on I_q ∪ (s + J). This contradicts the maximality of I_q. A similar argument shows that if inf I_q ∈ Ĩ, then we obtain a contradiction. Therefore I_q = Ĩ. This proves assertion 1.

Let D̃ ⊆ R × U be the domain of definition of the flow ϕ̃ : D̃ → U of the smooth vector field δ̃ on U. Let ϕ : D → V ∩ U be as defined in assertion 2 of the proposition with Q replaced by V ∩ U. The argument of the preceding paragraph shows that D = D̃ ∩ (R × (V ∩ U)) and that ϕ = ϕ̃|D. Because D̃ is an open subset of R × U and the flow ϕ̃ is smooth, it follows that D is an open subset of R × (V ∩ U) and that the mapping ϕ : D → V ∩ U is smooth as a map of differential spaces. Because the preceding two properties are local, we have shown that the set D, given in statement 2 of the proposition, is an open subset of R × Q and that the mapping ϕ : D → Q is smooth. This proves assertion 2.

The other assertions follow in the same way as for smooth vector fields on a smooth manifold.
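The role of the hypothesis on integral curves can be checked numerically in a one-dimensional example, sketched below in Python. We assume Q = [0, ∞) ⊆ R with C^∞(Q) the restrictions of smooth functions on R, and we compare the derivation δ = x d/dx, whose integral curves t ↦ q e^t stay in Q for all t, with d/dx, whose candidate curve through q = 0 leaves Q for t < 0. The space, the two derivations and the function names are our own illustrative assumptions.

```python
import math

def in_Q(x):
    """Membership in Q = [0, oo), a closed and hence locally closed subset of R."""
    return x >= 0.0

def flow(t, q):
    """Flow of delta = x d/dx: the integral curve through q is t -> q * exp(t)."""
    return q * math.exp(t)

def delta(f_prime):
    """delta(f)(x) = x * f'(x); the derivative f' is supplied by the caller."""
    return lambda x: x * f_prime(x)

def derivative_along_curve(f, t, q, step=1e-6):
    """Symmetric difference quotient approximating d/dt f(flow(t, q))."""
    return (f(flow(t + step, q)) - f(flow(t - step, q))) / (2.0 * step)

def translation_curve(t, q):
    """Candidate integral curve of d/dx through q; it leaves Q as soon as q + t < 0."""
    return q + t

if __name__ == "__main__":
    f = lambda x: x * x
    f_prime = lambda x: 2.0 * x
    # Integral curve equation d/dt f(gamma(t)) = delta(f)(gamma(t)) at t = 0.5, q = 2.
    print(derivative_along_curve(f, 0.5, 2.0), delta(f_prime)(flow(0.5, 2.0)))
    # Group property of assertion 4.
    print(flow(-1.1, flow(0.3, 2.0)), flow(-0.8, 2.0))
    # d/dx has no integral curve through 0 defined on an open interval inside Q.
    print(in_Q(translation_curve(-0.1, 0.0)))
```

Under these assumptions x d/dx satisfies the hypothesis of proposition 2.7.29 at every point of Q, while d/dx fails it at the boundary point 0.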

Example 2.7.30. Consider the set S = {(x_1, x_2) ∈ R^2 | x_1^2 + (x_2 − 1)^2 < 1 or x_2 = 0}. The vector field X̃ = ∂/∂x_1 on R^2 restricts to a derivation L_X of C^∞(S). For every x = (x_1, x_2) ∈ R^2, the flow of X̃ is ϕ^X̃_t(x) = (x_1 + t, x_2) for all t ∈ R. Its restriction to S induces ϕ^X_t given by ϕ^X_t(x_1, x_2) = (x_1 + t, x_2) for t ∈ (−x_1 − √(1 − (x_2 − 1)^2), −x_1 + √(1 − (x_2 − 1)^2)) if x_2 > 0, and for t ∈ R if x_2 = 0. Hence, all integral curves of X have open domains. Nevertheless, ϕ^X_t fails to be a local one-parameter group of local diffeomorphisms of S. The proof that S is not a locally closed subset of R^2 is left to the reader.

Proposition 2.7.29 motivates the following definition. Let (Q, C^∞(Q)) be a locally compact subcartesian space and let δ ∈ Der(C^∞(Q)). We call δ a vector field on (Q, C^∞(Q)) if and only if for every q ∈ Q there is an open interval I in R containing 0 and an integral curve γ : I ⊆ R → Q of δ such that γ(0) = q. The collection of all vector fields on Q we denote by X(Q, C^∞(Q)). We will refer to X(Q, C^∞(Q)) as the space of all smooth vector fields on the differential space (Q, C^∞(Q)).

We conclude this subsection with the statement of a generalization of Sussmann’s theorem to subcartesian spaces (S, C^∞(S)), see theorem 1.10.42 of chapter 1. Let F be a family of vector fields on a subcartesian space S. We say that the family F is locally complete if, for every X, Y ∈ F, t ∈ R and x ∈ S, for which (ϕ^X_t)_∗ Y(x) is defined, there exists an open neighborhood U of x and Z ∈ F such that (ϕ^X_t)_∗ Y|U = Z|U. For example, a family consisting of a single vector field X is locally complete because (ϕ^X_t)_∗ X(x) = X(x) at all points x ∈ S for which ϕ^X_t(x) is defined. For x ∈ S, we define the orbit through x of a family F of vector fields on S to be the set of points in S which can be connected to x by piecewise integral curves of vector fields in F.

Theorem 2.7.31. Let F be a locally complete family of vector fields on a subcartesian space S. Each orbit O of F admits a unique manifold structure such that the inclusion map ι_O : O ↪ S is smooth.
Moreover, smooth functions on O are locally pull backs of smooth functions on S.

Outline of proof. For each x ∈ S, let DF_x = {X(x) | X ∈ F}. In other words, DF_x is the subspace of the space of derivations of C^∞(S) consisting of values at x of vector fields X ∈ F. Since the family F is locally complete, it follows that dim DF_x is constant along orbits of F.

Let O be the orbit of F through z ∈ S and let m = dim DF_z. There exist m vector fields X^1, …, X^m ∈ F which are linearly independent in a neighborhood V of z in S. Since S is subcartesian, without loss of generality, we may assume that V is a subset of R^n. Then, there exists a neighborhood U of z in R^n, and linearly independent vector fields X̃^1, …, X̃^m on U such that X^i(x) = X̃^i(x) for each i = 1, …, m and each x ∈ V ∩ U. Consider the map

ρ̃ : T = (t_1, …, t_m) ↦ (ϕ^{X̃^m}_{t_m} ◦ ⋯ ◦ ϕ^{X̃^1}_{t_1})(x) ∈ R^n

defined in a neighborhood of zero in R^m. Since the vector fields X̃^1, …, X̃^m are linearly independent on U, there is a neighborhood Ω of zero in R^m such that ρ̃ is a diffeomorphism of Ω onto its image W = ρ̃(Ω) ⊂ R^n. But x ∈ S and the vector fields X̃^1, …, X̃^m are extensions to U of vector fields X^1, …, X^m on S restricted to V ∩ U. Moreover, for each i = 1, …, m the flow ϕ^{X^i}_{t_i} of X^i is a local one-parameter group of local diffeomorphisms of S. Hence, W = ρ̃(Ω) is in S, and it is an open subset of the orbit O through x.

Since W is a submanifold of R^n, a function f : W → R is smooth if it locally coincides with restrictions to W of smooth functions on R^n. Let ι_V : V ↪ R^n, ι_W : W ↪ R^n and ι_{WV} : W ↪ V. Then ι_W = ι_V ◦ ι_{WV}. Moreover, smooth functions on V are locally restrictions to V of smooth functions on R^n. Hence, smooth functions on W are locally restrictions to W of smooth functions on V. But V is an open subset of S. Hence, smooth functions on V are locally restrictions to V of smooth functions on S. Therefore, smooth functions on W are locally restrictions of smooth functions on S.

Analogous reasoning for every x ∈ O gives a covering of O by smooth manifolds W_x. One needs to show that the differential structures are compatible on overlaps and that they are independent of the choice of vector fields X^1, …, X^m ∈ F which are linearly independent in W_x. For details see [106].
Let X(S, C^∞(S)) denote the family of all vector fields on a subcartesian space (S, C^∞(S)). It is easy to see that the family X(S, C^∞(S)) is locally complete. Hence, we obtain the following

Corollary 2.7.32. For each point x in a subcartesian space (S, C^∞(S)), the orbit O through x of the family X(S, C^∞(S)) of all vector fields on (S, C^∞(S)) is a manifold and the inclusion map O ↪ S is smooth.

Proof. See [106].
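A small computation may help to visualise orbits of a family of vector fields. In the Python sketch below we assume S = R × [0, ∞) ⊆ R^2 and the family F = {X, Y} with X = ∂/∂x and Y = q ∂/∂q. Their flows commute, so each flow pushes the other field forward to itself and the family is locally complete. The space, the family and the function names are hypothetical choices made only for this illustration.

```python
import math

def flow_X(t, point):
    """Flow of X = d/dx on S = R x [0, oo): translation in the x-direction."""
    x, q = point
    return (x + t, q)

def flow_Y(t, point):
    """Flow of Y = q d/dq on S: scaling of q, which fixes the boundary q = 0."""
    x, q = point
    return (x, q * math.exp(t))

def rank(point):
    """Dimension of the span of X and Y at a point of S; Y vanishes exactly where q = 0."""
    return 1 if point[1] == 0 else 2

def connect(start, end):
    """Flow times (t_X, t_Y) joining start to end, or None if they lie on different orbits."""
    (x0, q0), (x1, q1) = start, end
    if (q0 == 0) != (q1 == 0):
        return None               # the boundary and the open half plane are different orbits
    t_Y = 0.0 if q0 == 0 else math.log(q1 / q0)
    return (x1 - x0, t_Y)

if __name__ == "__main__":
    times = connect((0.0, 1.0), (3.0, 4.0))
    print(times, flow_Y(times[1], flow_X(times[0], (0.0, 1.0))))
    print(connect((0.0, 0.0), (1.0, 2.0)))
```

In this model the two orbits are R × {0} and R × (0, ∞), of dimensions 1 and 2, and rank is constant along each of them, as in the outline of proof of theorem 2.7.31.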

2.8 Vector fields on a stratified differential space

Let $(Q, C^\infty(Q))$ be a differential space. A stratification $\mathcal{S}$ of $(Q, C^\infty(Q))$ is a collection of differential subspaces, which are smooth submanifolds of $(Q, C^\infty(Q))$, such that

1. $\mathcal{S}$ is a locally finite partition of $Q$;
2. for each $S \in \mathcal{S}$ the closure of $S$ in $Q$ is the union of $S$ and strata $S' \in \mathcal{S}$ with $\dim S' < \dim S$.

Denote a differential space $(Q, C^\infty(Q))$ with the stratification $\mathcal{S}$ by $(Q, \mathcal{S}, C^\infty(Q))$. A stratified vector field $W$ on $(Q, \mathcal{S}, C^\infty(Q))$ is a map which assigns to each stratum $S$ of the stratification $\mathcal{S}$ a smooth vector field $W_S$ on $S$. If $f \in C^\infty(Q)$ and $S \in \mathcal{S}$, then $f|S \in C^\infty(S)$. Therefore $L_{W_S}(f|S) \in C^\infty(S)$.

Lemma 2.8.33. For every $f \in C^\infty(Q)$ the function $\partial_W f$, which assigns to each $S \in \mathcal{S}$ the smooth function $L_{W_S}(f|S)$ in $C^\infty(S)$, is a smooth function on $Q$, and the map $\partial_W : f \mapsto \partial_W f$ lies in $\mathrm{Der}(C^\infty(Q))$.

Proof. Let $q \in Q$. Because $f \in C^\infty(Q)$, for every open neighborhood $U$ of $q$ in $Q$, we have $f|U \in C^\infty(U)$. Since the stratification $\mathcal{S}$ is a locally finite partition of $Q$, there are a finite number of strata $\{S_j\}_{j \in J}$ such that $S_j \cap U \ne \emptyset$. Let

$$F = \sum_{j \in J} L_{W_{S_j}}\bigl(f|(U \cap S_j)\bigr).$$

Then $F \in C^\infty(U)$ because $L_{W_{S_j}}(f|(U \cap S_j)) \in C^\infty(U \cap S_j) \subseteq C^\infty(U)$, since $S_j$ is an embedded submanifold of $(Q, C^\infty(Q))$. From

$$(\partial_W f)|(U \cap S_j) = L_{W_{S_j}}\bigl(f|(U \cap S_j)\bigr) = F|(U \cap S_j)$$

it follows that $(\partial_W f)|U \in C^\infty_i(U) = C^\infty(U)$. Therefore $\partial_W f \in C^\infty(Q)$. Clearly $\partial_W$ is a derivation on $C^\infty(Q)$.

A stratified vector field $W$ on $(Q, \mathcal{S}, C^\infty(Q))$ is smooth if and only if $\partial_W f \in C^\infty(Q)$ for every $f \in C^\infty(Q)$. Let $\mathcal{X}^\infty(Q, \mathcal{S})$ be the set of all smooth stratified vector fields on $(Q, \mathcal{S}, C^\infty(Q))$.

Proposition 2.8.34. The map $\partial : \mathcal{X}^\infty(Q, \mathcal{S}) \to \mathrm{Der}(C^\infty(Q)) : W \mapsto \partial_W$ is an injective homomorphism of Lie algebras. Moreover,

$$\partial(\mathcal{X}^\infty(Q, \mathcal{S})) \subseteq \mathcal{X}(Q, C^\infty(Q)), \tag{9}$$

if the differential space $(Q, C^\infty(Q))$ is locally compact and is subcartesian.

Proof. First we show that $\partial$ is injective. Let $W \in \mathcal{X}^\infty(Q, \mathcal{S})$ and suppose that $\partial_W = 0$. Let $S \in \mathcal{S}$, $f \in C^\infty(S)$, and $q \in S$. Then there is an open neighborhood $U$ of $q$ in $Q$ and $g \in C^\infty(Q)$ such that $f|(S \cap U) = g|(S \cap U)$. Because $\partial_W g = 0$, we obtain $L_{W_S}(f|(S \cap U)) = 0$. Since this holds for every $q \in S$, we see that $L_{W_S} f = 0$. Because the map $W_S \mapsto L_{W_S}$ from $\mathcal{X}^\infty(S)$ to $\mathrm{Der}(C^\infty(S))$ is an isomorphism for smooth manifolds, it follows that $W_S = 0$. Since this holds for every $S \in \mathcal{S}$, we deduce that $W = 0$.

We now show that (9) holds. Let $W \in \mathcal{X}^\infty(Q, \mathcal{S})$ and $q \in Q$. Then there is an $S \in \mathcal{S}$ with $q \in S$. The smooth vector field $W_S$ on $S$ has a smooth integral curve $\gamma : I \to S$, where $I$ is an open interval in $\mathbb{R}$ containing $0$ and $\gamma(0) = q$. For any $f \in C^\infty(Q)$, we know that $f|S$ is a smooth function on the smooth manifold $S$. Therefore

$$\frac{d}{dt} f(\gamma(t)) = \langle d(f|S)(\gamma(t)) \mid \gamma'(t) \rangle = \langle d(f|S)(\gamma(t)) \mid W_S(\gamma(t)) \rangle = L_{W_S}(f|S)(\gamma(t)) = \partial_W f(\gamma(t)).$$

So $\gamma$ is an integral curve of the derivation $\partial_W$ with $\gamma(0) = q$. Because this holds for every $q \in Q$, the derivation $\partial_W$ is a smooth vector field on $(Q, C^\infty(Q))$, that is, $\partial_W \in \mathcal{X}(Q, C^\infty(Q))$.

2.9 Vector fields on an orbit space

Let $G$ be a Lie group which acts smoothly and properly on a smooth manifold $M$. Then $(M/G, C^\infty(M/G))$ is a differential space, which has a stratification $\mathcal{S}$ by orbit types. Let $V \in \mathcal{X}(M)^G$, that is, $V$ is a smooth $G$-invariant vector field on $M$. Then $V_{(H)} = V|M_{(H)}$ is a smooth vector field on the orbit type $M_{(H)}$. Let $\overline{V}_{(H)} = \pi_* V_{(H)}$, where $\pi : M \to \overline{M} = M/G$ is the $G$-orbit map. Then $\overline{V}_{(H)}$ is a smooth vector field on the orbit type $\overline{M}_{(H)} = \pi(M_{(H)})$ in the orbit space $\overline{M}$. The map which assigns to the orbit type $\overline{M}_{(H)}$ the smooth vector field $\overline{V}_{(H)}$ defines a smooth stratified vector field $\overline{V}$ on $(\overline{M}, \mathcal{S}, C^\infty(\overline{M}))$ such that $\pi_* V = \overline{V}$. Thus we have proved

Proposition 2.9.35. We have

$$\pi_*(\mathcal{X}(M)^G) \subseteq \mathcal{X}^\infty(\overline{M}, \mathcal{S}). \tag{10}$$

Schwarz [99] has shown that equality holds in (10).

Proposition 2.9.36. Using the map $\partial : \mathcal{X}^\infty(\overline{M}, \mathcal{S}) \to \mathrm{Der}(C^\infty(\overline{M}))$ defined in §8, we have

$$\partial \mathcal{X}^\infty(\overline{M}, \mathcal{S}) = \mathcal{X}(\overline{M}, C^\infty(\overline{M})). \tag{11}$$

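Proof. We begin by observing that the orbit space $\overline{M}$ is locally compact, since $M$ is and the orbit map $\pi$ is continuous, open and surjective. In addition, the differential space $(\overline{M}, C^\infty(\overline{M}))$ is subcartesian. Let $\delta$ be a smooth vector field on $\overline{M}$. Let $\gamma : I \to \overline{M} : t \mapsto \gamma(t)$ be an integral curve of $\delta$ which starts at $\overline{m}$. For any point $\overline{m} \in \overline{M}$, we use the diffeomorphism $\varphi = \widetilde{\sigma} \circ \phi^{-1}$ to identify an open neighborhood $U$ of $\overline{m}$ in $\overline{M}$ with the image $\sigma(B) \subseteq \mathbb{R}^n$ of the Hilbert map, as in the proof of proposition 2.5.18. We also use the decomposition $\mathbb{R}^n = \mathbb{R}^\ell \times \mathbb{R}^m$ so that the orbit types in $\sigma(E)$ are of the form $\mathbb{R}^\ell \times R$, where $R$ is an orbit type for the action of $H$ on $F$. We may assume that $\widetilde{\sigma}(\overline{m}) = (0, 0) \in \mathbb{R}^\ell \times \mathbb{R}^m$ and that its orbit type corresponds locally to $\mathbb{R}^\ell \times \{0\}$.

Let $V(x, q) = (\dot{x}, \dot{q})$ be a smooth vector field in an open neighborhood of $0$ in $\mathbb{R}^n$ induced by the smooth vector field $\delta$. Because $\gamma(t) \in \sigma(E)$ for all $t \in I$, from lemma 2.6.3.24 it follows that $q'(0) = 0$. Therefore $\dot{q} = 0$ when $q = 0$. In other words, $V$ is tangent to $S = \mathbb{R}^\ell \times \{0\}$ at the origin. This implies that $V|S$ near the origin is a smooth vector field on $S$. Therefore the integral curves of $\delta$ which start on $S$ remain on $S$. Let $f$ be a smooth function on an open neighborhood of the origin in $\mathbb{R}^n$. For any integral curve $\gamma$ of $\delta$ which lies on $S$, we have

$$\delta(f)(\gamma(t)) = \frac{d}{dt} f(\gamma(t)) = L_W f(\gamma(t)).$$

This implies that $\delta(f)|S = L_W(f|S)$. Thus, for every $S \in \mathcal{S}$ and every $\overline{m} \in S$, there is an open neighborhood $U$ of $\overline{m}$ in $\overline{M}$ and a smooth vector field $W_{S \cap U}$ such that

$$\delta(f)|(S \cap U) = L_{W_{S \cap U}}\bigl(f|(S \cap U)\bigr) \quad \text{for every } f \in C^\infty(U). \tag{12}$$

Because equation (12) determines $W_{S \cap U}$ uniquely, the vector fields $W_{S \cap U}$ patch together to form a smooth vector field $W_S$ on $S$, where

$$\delta(f)|S = L_{W_S}(f|S) \tag{13}$$

holds for every $f \in C^\infty(\overline{M})$. Because (13) holds for every $S \in \mathcal{S}$, we see that $S \mapsto W_S$ defines a smooth stratified vector field $\widetilde{W}$ on $\overline{M}$, that is, $\widetilde{W} \in \mathcal{X}^\infty(\overline{M}, \mathcal{S})$. Thus $\delta = \partial_{\widetilde{W}}$, which implies $\mathcal{X}(\overline{M}, C^\infty(\overline{M})) \subseteq \partial \mathcal{X}^\infty(\overline{M}, \mathcal{S})$. The inclusion $\partial \mathcal{X}^\infty(\overline{M}, \mathcal{S}) \subseteq \mathcal{X}(\overline{M}, C^\infty(\overline{M}))$ follows from proposition 2.8.34 with $(Q, C^\infty(Q)) = (\overline{M}, C^\infty(\overline{M}))$.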
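The inclusion (10) and the equality (11) can be checked by hand in the simplest case. The following sketch assumes $M = \mathbb{R}$ with $G = \mathbb{Z}_2$ acting by $x \mapsto -x$ and uses the invariant $p(x) = x^2$ to identify $\overline{M}$ with $\mathbb{R}_{\ge 0}$; the coordinate $y$ and the particular field $V = x\,\partial_x$ are illustrative choices, not taken from the text.

```latex
% Assumptions: M = \mathbb{R}, G = \mathbb{Z}_2 = \{\pm 1\}, p(x) = x^2 = y, \overline{M} \cong \mathbb{R}_{\ge 0}.
% V = x\,\partial_x is \mathbb{Z}_2-invariant: the tangent map of x \mapsto -x sends V(x) to V(-x).
\begin{align*}
  V(p^{*}\overline{f})(x)
    &= x\,\frac{d}{dx}\,\overline{f}(x^{2})
     = 2x^{2}\,\overline{f}'(x^{2})
     = p^{*}\bigl(2y\,\overline{f}'(y)\bigr)(x), \\
  \overline{V} &= \pi_{*}V = 2y\,\frac{\partial}{\partial y},
  \qquad \overline{\varphi}_t(y) = e^{2t}\,y .
\end{align*}
% Strata of \overline{M}: \{0\} and (0,\infty).  \overline{V} vanishes on \{0\} and
% restricts to a smooth vector field on (0,\infty), so it is a smooth stratified vector field.
% By contrast \partial_y is a derivation of C^\infty(\mathbb{R}_{\ge 0}) whose flow y \mapsto y + t
% leaves \mathbb{R}_{\ge 0} from y = 0 for t < 0, so it is not a vector field on \overline{M}.
```

The reduced field $2y\,\partial_y$ vanishes at the singular point $y = 0$, which is what tangency to the zero dimensional stratum requires.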


In view of propositions 2.9.35 and 2.9.36, we call $\mathcal{X}(\overline{M}, C^\infty(\overline{M}))$ the space of smooth vector fields on the $G$-orbit space $\overline{M}$. These vector fields depend only on the differential structure $C^\infty(\overline{M})$ of $\overline{M}$; whereas the space of induced vector fields $\pi_*(\mathcal{X}^\infty(M)^G)$ depends on the manifold structure of $M$, and the space $\mathcal{X}^\infty(\overline{M}, \mathcal{S})$ of stratified vector fields on $\overline{M}$ depends on the stratification $\mathcal{S}$ of $\overline{M}$ by orbit types.

Examples 2.9.37.

1. Let $M = \mathbb{R}$ and $G = \mathbb{Z}_2 = \{\pm 1\}$. Then $p : \mathbb{R} \to \mathbb{R}_{\ge 0} \subseteq \mathbb{R} : x \mapsto x^2$ is a $\mathbb{Z}_2$-invariant polynomial, which freely generates $C^\infty(\mathbb{R})^{\mathbb{Z}_2}$. In other words, every smooth even function is a smooth function of $p$. Therefore $p^* : C^\infty(\mathbb{R}_{\ge 0}) \to C^\infty(\mathbb{R})^{\mathbb{Z}_2}$ is an isomorphism, which gives rise to a diffeomorphism of the differential space $(\mathbb{R}/\mathbb{Z}_2, C^\infty(\mathbb{R}/\mathbb{Z}_2))$ onto the differential space $(\mathbb{R}_{\ge 0}, C^\infty(\mathbb{R}_{\ge 0}))$. A derivation on $C^\infty(\mathbb{R}_{\ge 0})$ is given by a smooth vector field $W$ on $\mathbb{R}$. Now $L_W$ is a smooth vector field on $(\mathbb{R}_{\ge 0}, C^\infty(\mathbb{R}_{\ge 0}))$ if and only if the flow $\varphi_t$ of $W$ maps $0$ into $\mathbb{R}_{\ge 0}$ for every $t$ in an open neighborhood of $0$ in $\mathbb{R}$. This holds if and only if $W(0) = 0$. Therefore not every derivation of $C^\infty(\mathbb{R}/\mathbb{Z}_2)$ is a smooth vector field on $\mathbb{R}/\mathbb{Z}_2$.

2. Let $M = \mathbb{C}$ and $G = S^1 = \{z \in \mathbb{C} \mid |z| = 1\}$. Suppose that $S^1$ acts on $\mathbb{C}$ by multiplication. Then $C^\infty(\mathbb{C})^{S^1}$ is generated freely by the real valued function $p : \mathbb{C} \to \mathbb{R}_{\ge 0} \subseteq \mathbb{R} : z \mapsto z\overline{z}$. Thus $p^* : C^\infty(\mathbb{R}_{\ge 0}) \to C^\infty(\mathbb{C}/S^1)$ is an isomorphism. So again not every derivation of $C^\infty(\mathbb{C}/S^1)$ is a vector field on $\mathbb{C}/S^1$. The inclusion map $i : \mathbb{R} \to \mathbb{C}$ induces a diffeomorphism of the differential space $(\mathbb{C}/S^1, C^\infty(\mathbb{C}/S^1))$ onto the differential space $(\mathbb{R}/\mathbb{Z}_2, C^\infty(\mathbb{R}/\mathbb{Z}_2))$.

2.10 Tangent objects to an orbit space

In the preceding section we have introduced the notion of a smooth vector field on the orbit space $\overline{M}$ of a smooth proper action of a Lie group $G$ on a smooth manifold $M$. We did not define what we meant by being "tangent" to $\overline{M}$ at $\overline{m} \in \overline{M}$, so we cannot view our smooth vector fields as being "tangent" to $\overline{M}$. Below we discuss various ways of dealing with this.

2.10.1 Stratified tangent bundle

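Let $\mathcal{S}$ be the orbit type stratification of the orbit space $\overline{M}$, which, due to its local model as an open subset of the semialgebraic variety $\sigma(E)$, is a Whitney stratification. Let $T^{\mathcal{S}}\overline{M}$ be $\coprod_{S \in \mathcal{S}} TS$, the disjoint union of the tangent bundles $TS$ of the strata $S$ of $\mathcal{S}$. Then $T^{\mathcal{S}}\overline{M}$ is the stratified tangent bundle of $\overline{M}$. We can give $T^{\mathcal{S}}\overline{M}$ the structure of a differential space so that $\mathcal{T} = \{TS\}_{S \in \mathcal{S}}$ forms a stratification of $T^{\mathcal{S}}\overline{M}$ and the projection map $\pi^{\mathcal{S}} : T^{\mathcal{S}}\overline{M} \to \overline{M}$ is a smooth map of stratified differential spaces. In addition, the space $\mathcal{X}^\infty(\overline{M}, \mathcal{S})$ of smooth stratified vector fields on $\overline{M}$ is equal to the space of smooth sections of $\pi^{\mathcal{S}} : T^{\mathcal{S}}\overline{M} \to \overline{M}$, that is, the space of smooth mappings $\sigma : \overline{M} \to T^{\mathcal{S}}\overline{M}$ such that $\pi^{\mathcal{S}} \circ \sigma = \mathrm{id}_{\overline{M}}$.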

2.10.2 Zariski tangent bundle

Let $(Q, C^\infty(Q))$ be a differential space. For $q \in Q$, let $\mathcal{M}_q = \{f \in C^\infty(Q) \mid f(q) = 0\}$. Then $\mathcal{M}_q$ is a maximal ideal in the algebra $C^\infty(Q)$. Following ideas in algebraic geometry, we call $\mathcal{M}_q/\mathcal{M}_q^2$ the Zariski cotangent space to $Q$ at $q$ and its topological dual $(\mathcal{M}_q/\mathcal{M}_q^2)^*$ the Zariski tangent space $T^Z_q Q$ to $Q$ at $q$. If $S$ is a smooth submanifold of $Q$, then for each $s \in S$ with $v_s \in T_s S$ the function

$$\widetilde{\partial}_{v_s} : C^\infty(Q) \to \mathbb{R} : f \mapsto \langle d(f|S)(s) \mid v_s \rangle$$

gives rise to the mapping $\partial : T_s S \to T^Z_s Q : v_s \mapsto \widetilde{\partial}_{v_s}$, which is injective and therefore can be used to identify $T_s S$ with a linear subspace of $T^Z_s Q$. In this way the stratified tangent bundle $T^{\mathcal{S}}Q$ of $Q$ is contained in the Zariski tangent bundle $T^Z Q = \coprod_{q \in Q} T^Z_q Q$ of $Q$.

When $Q = \overline{M}$, using the local model of $\overline{M}$ as an open subset of a semialgebraic set $\sigma(E)$, $T^Z_0(\sigma(E))$ is isomorphic to $T^Z_0 \mathbb{R}^n$, which in turn is isomorphic to $\mathbb{R}^n$. Therefore the dimension of the Zariski tangent space to $\overline{M}$ at $\overline{m}$ is equal to the number of generators in a Hilbert basis of the algebra of $G_m$-invariant polynomials on $E = T_m M/T_m(G \cdot m)$.
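A standard example shows how the Zariski tangent space can be larger than any stratum. The sketch below assumes $H = \mathbb{Z}_2$ acting on $E = \mathbb{R}^2$ by $(x, y) \mapsto (-x, -y)$ and takes the Hilbert basis to be minimal; the names $\sigma_1, \sigma_2, \sigma_3$ are introduced only for this illustration.

```latex
% Assumptions: H = \mathbb{Z}_2 acts on E = \mathbb{R}^2 by (x,y) \mapsto (-x,-y);
% minimal Hilbert basis \sigma_1 = x^2, \sigma_2 = xy, \sigma_3 = y^2.
\begin{align*}
  \sigma &: E \to \mathbb{R}^{3}, \qquad \sigma(x,y) = (x^{2},\, xy,\, y^{2}), \\
  \sigma(E) &= \{(\sigma_1,\sigma_2,\sigma_3) \in \mathbb{R}^{3} \mid
      \sigma_1\sigma_3 = \sigma_2^{2},\ \sigma_1 \ge 0,\ \sigma_3 \ge 0\}, \\
  \dim T^{Z}_{0}\,\sigma(E) &= 3 \quad (\text{the number of generators}), \\
  \dim \{0\} &= 0, \qquad \dim\bigl(\sigma(E) \setminus \{0\}\bigr) = 2 .
\end{align*}
% The only relation among the generators is quadratic, so \sigma_1, \sigma_2, \sigma_3
% remain linearly independent in \mathcal{M}_0 / \mathcal{M}_0^2.
```

At the vertex of the cone the Zariski tangent space is three dimensional, although the strata through and near the vertex have dimensions $0$ and $2$.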

2.10.3 Tangent cone

The tangent cone $T_0 C$ in $T^Z_0 \sigma(E)$ is the set of all limits of sequences $\{\tau_j \sigma(x_j)\}$, where $\sigma(x_j) \in \sigma(E)$, $\tau_j \in \mathbb{R}_{>0}$ and $\{\sigma(x_j)\} \to 0$. The conic structure of $T_0 C$ is given by multiplication of a vector in $\mathbb{R}^n$ by a positive real number. Because the Hilbert map $\sigma : E \to \mathbb{R}^n$ is not homogeneous, this conic structure is different from the conic structure on $\sigma(E)$, which is induced by the Hilbert map $\sigma$ from the $H$-invariant conic structure on $E$ defined by multiplication of vectors in $E$ by positive real numbers.
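The difference between the two conic structures is already visible for a reflection. The following sketch assumes $H = \mathbb{Z}_2$ acting on $E = \mathbb{R}^2$ by $(x, y) \mapsto (x, -y)$, with Hilbert map $\sigma(x, y) = (x, y^2)$; the coordinates $(a, b)$ on the target are illustrative.

```latex
% Assumptions: H = \mathbb{Z}_2 acts on E = \mathbb{R}^2 by (x,y) \mapsto (x,-y),
% Hilbert map \sigma(x,y) = (x, y^2) with components of degree 1 and 2.
\begin{align*}
  \sigma(E) &= \{(a,b) \in \mathbb{R}^{2} \mid b \ge 0\}, \\
  T_0 C &= \{\lim_{j} \tau_j (a_j, b_j) \mid b_j \ge 0,\ (a_j,b_j) \to 0,\ \tau_j > 0\}
         = \{(a,b) \mid b \ge 0\}, \\
  \lambda \cdot (a,b) &= (\lambda a,\ \lambda b)
      && \text{(conic structure of } T_0 C\text{)}, \\
  \sigma(\lambda x, \lambda y) &= (\lambda a,\ \lambda^{2} b)
      && \text{(conic structure on } \sigma(E) \text{ induced from } E\text{)} .
\end{align*}
```

Here the two sets coincide, but the two scalings act differently on every point with $b \neq 0$ when $\lambda \neq 1$.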

2.10.4 Tangent wedge

Throughout this subsection we suppose that the action of $G$ on $M$ is proper. Let $O$ be the $G$-orbit $G \cdot m$ through $m \in M$. The normal bundle $NO$ of $O$ is $\coprod_{m' \in O} T_{m'}M/T_{m'}(G \cdot m)$, namely, the disjoint union of the vector spaces $E_{m'} = T_{m'}M/T_{m'}(G \cdot m)$ for every $m' \in O$. The normal bundle $NO$ is a smooth vector bundle over $O$. On $NO$ there is an action of $G$ induced by the tangent of the $G$-action on $M$. The orbit space $NO/G$ is called the tangent wedge $W_{\overline{m}}$ at $\overline{m} = O \in \overline{M}$. Let $C^\infty(W_{\overline{m}}) = C^\infty(NO/G) = C^\infty(NO)^G$. Then $(W_{\overline{m}}, C^\infty(W_{\overline{m}}))$ is a differential space. The inclusion map $E_m \to NO$ leads to an identification of the tangent wedge $W_{\overline{m}}$ at $\overline{m}$ with $E/H$, where $E = E_m$ and $H = G_m$. The discussion in §6.1 shows that $E/H = E^H \times F/H$, where $E^H$ corresponds to the tangent space at $\overline{m}$ of the orbit type $\overline{M}_{(H)}$ in $\overline{M}$ and $F$ is a vector space isomorphic to $E/E^H$. The tangent wedge at $\overline{m}$ is a wedge over the cone $F/H$. The conic structure on $F/H$ is the one induced by the Hilbert map $\sigma$ from the $H$-invariant conic structure on $F$.

Proposition 2.10.4.38. The tangent wedge $(W_{\overline{m}}, C^\infty(W_{\overline{m}}))$ at $\overline{m} \in \overline{M}$ is diffeomorphic as a differential space to an open neighborhood of $\overline{m}$ in $(\overline{M}, C^\infty(\overline{M}))$.

Proof. This follows from proposition 2.5.18.

We say that a stratified differential space $(Q, \mathcal{S}, C^\infty(Q))$ is smoothly locally trivial if for every $q \in Q$ there is an open neighborhood $U$ of $q$ in $Q$, a stratified differential space $(P, \mathcal{T}, C^\infty(P))$ with a distinguished point $o$, and a map $\varphi : U \to (S \cap U) \times P$, where $S \in \mathcal{S}$ with $q \in S$, such that

1. $\varphi^* : C^\infty((S \cap U) \times P) \to C^\infty(U)$ is bijective.
2. $\varphi|((S \cap U) \times \{o\})$ is a diffeomorphism with $\varphi(s, o) = (s, o)$ for every $s \in S$.

From proposition 2.10.4.38 we obtain

Corollary 2.10.4.39. The stratified differential space $(\overline{M}, \mathcal{S}, C^\infty(\overline{M}))$, where $\mathcal{S}$ is the orbit type stratification, is smoothly locally trivial and thus is subcartesian.

Proof. For a proof of the last assertion see [106].

Lemma 2.10.4.40. Let $(\overline{M}, \mathcal{S}, C^\infty(\overline{M}))$ be a smoothly locally trivial stratified subcartesian space and $X_S$ a smooth vector field on a stratum $S$ of $\mathcal{S}$. For each $s \in S$ there exists a neighborhood $U$ of $s$ and a vector field $X$ on $\overline{M}$ such that the restriction of $X_S$ to $U$ coincides with the restriction of $X$ to $U$.

Proof. See [106].

Theorem 2.10.4.41. Each stratum of the stratified differential space $(\overline{M}, \mathcal{S}, C^\infty(\overline{M}))$, where $\mathcal{S}$ is the orbit type stratification, is an orbit of the family $\mathcal{X}(\overline{M}, C^\infty(\overline{M}))$ of all vector fields on $\overline{M}$.

Proof. By corollary 2.10.4.39, $(\overline{M}, \mathcal{S}, C^\infty(\overline{M}))$ is smoothly locally trivial. Lemma 2.10.4.40 implies that every stratum $S$ of $\mathcal{S}$ is contained in an orbit $O$ of the family $\mathcal{X}(\overline{M}, C^\infty(\overline{M}))$ of all vector fields on $\overline{M}$. By corollary 2.7.32 the orbit $O$ of $\mathcal{X}(\overline{M}, C^\infty(\overline{M}))$ is a smooth manifold. However, proposition 2.6.3.25 states that the stratification $\mathcal{S}$ by orbit type is minimal, that is, no union of different strata can be a connected smooth manifold in the differential space $(\overline{M}, \mathcal{S}, C^\infty(\overline{M}))$. Hence, $S = O$, which completes the proof.

2.11 Notes

The text in this chapter follows that of the survey article of Duistermaat [41]. The results about the flow of a vector field are standard and may be found in Coddington and Levinson [27]. The notion of a proper action is due to Palais [88]. The proofs of the standard facts about proper actions not given in the text, namely, proposition 2.2.2, theorems 2.3.3.6, 2.3.3.7, and 2.3.4.8, may be found in Duistermaat and Kolk [42]. The equivalence of a free and proper action and a principal G-bundle is due to H. Cartan [20]. The facts about paracompactness and partitions of unity used to prove lemma 2.4.1.11 come from Dieudonné [37], [38]. The notion of a differential space originates with Sikorski [100], [101]. Proposition 2.4.2.14, showing that the orbit space of a proper action is a differential space, comes from Cushman and Śniatycki [33], as does lemma 2.4.2.16. The fact that the sheaf of smooth functions on a locally compact and paracompact differential space is fine follows from lemma 2.4.1.10, see Gunning [50]. The notion of a subcartesian space is due to Aronszajn [9], but its application to studying the orbit type stratification of an orbit space of a proper group action is due to Śniatycki [106]. Facts about Hilbert maps and stratifications may be found in Pflaum [91]. The Tarski-Seidenberg theorem is proved in Friedman [46] on pp. 225–235. For a discussion of the inequalities appearing in the image of the Hilbert map we refer the reader to Procesi and Schwarz [93]. The approach to proving the minimality of the orbit type stratification of the orbit space is due to Bierstone [15], who in [14] proved proposition 2.6.3.27 relating the stratification by orbit types with the primary stratification of the semialgebraic variety given by the image of the Hilbert map. The result that strata of the orbit type stratification are orbits of the family of all vector fields is due to Lusala and Śniatycki [71]. The notion of smooth stratified tangent bundle and smooth stratified vector field is taken from theorems 2.1.2, 2.2.6, and 2.2.9 of Pflaum [91]. The fact that the dimension of the Zariski tangent space at a point $\overline{m}$ of the orbit space $\overline{M}$ of a proper action of $G$ on a smooth manifold $M$ is equal to the number of generators in a Hilbert basis of the algebra of $G_m$-invariant polynomials on $T_m M/T_m(G \cdot m)$ is due to Mather [78]. The definition of tangent cone is due to Whitney [117]. The definition of the tangent wedge $W_{\overline{m}}$ to $\overline{M}$ at $\overline{m}$ is due to Cushman and Śniatycki [33], as is the first proof of proposition 2.10.4.38 that the tangent wedge $(W_{\overline{m}}, C^\infty(W_{\overline{m}}))$ is diffeomorphic to the orbit space $(\overline{M}, C^\infty(\overline{M}))$ near $\overline{m}$.

Chapter 3

Symmetry and reduction

3.1 Dynamical systems with symmetry

In this section we discuss symmetry and reduction for a general dynamical system. These results will be used in the discussion of symmetry and reduction of nonholonomically constrained systems given in subsequent sections.

3.1.1 Invariant vector fields

A smooth dynamical system is a pair $(M, V)$, where $M$ is a smooth manifold and $V$ is a smooth vector field on $M$. Let $\Phi$ be a smooth action of a Lie group $G$ on a smooth manifold $M$. We say that $G$ is a symmetry of $(M, V)$ if the $G$-action $\Phi$ leaves $V$ invariant, that is, if $T_m\Phi_g(V(m)) = V(\Phi_g(m))$ for every $(g, m) \in G \times M$.

Lemma 3.1.1.1. $V$ is a smooth $G$-invariant vector field if and only if its flow $\varphi_t$ commutes with the action of $G$. More formally, $I_{g \cdot m} = I_m$, where $I_m$ is the maximal time interval in $\mathbb{R}$ containing $0$ on which the integral curve $t \mapsto \varphi_t(m)$ of $V$ starting at $m$ is defined, and

$$\Phi_g(\varphi_t(m)) = \varphi_t(\Phi_g(m)) \quad \text{for every } (g, m) \in G \times M \text{ and } t \in I_m. \tag{1}$$

Proof. Suppose that $V$ is a smooth $G$-invariant vector field on $M$. Then

$$\frac{d}{dt}\bigl(\Phi_g(\varphi_t(m))\bigr) = T_{\varphi_t(m)}\Phi_g\, V(\varphi_t(m)) = V(\Phi_g(\varphi_t(m)))$$

and $\Phi_g(\varphi_0(m)) = \Phi_g(m)$. Therefore $\gamma : I_m \to M : t \mapsto \Phi_g(\varphi_t(m))$ is an integral curve of $V$, that is, $\frac{d\gamma(t)}{dt} = V(\gamma(t))$, starting at $\Phi_g(m)$, as is $\widetilde{\gamma} : I_m \to M : t \mapsto \varphi_t(\Phi_g(m))$. Therefore by uniqueness of integral curves starting at the same point, we obtain $I_{g \cdot m} = I_m$ and that (1) holds. The converse follows by differentiating (1).
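Lemma 3.1.1.1 can be verified directly for a rotationally symmetric field in the plane. The sketch below assumes $M = \mathbb{C}$ with $G = S^1$ acting by multiplication, and a hypothetical smooth real valued function $\omega$; neither the field nor $\omega$ comes from the text.

```latex
% Assumptions: M = \mathbb{C}, G = S^1 acting by \Phi_\theta(z) = e^{i\theta} z,
% \omega : \mathbb{R}_{\ge 0} \to \mathbb{R} smooth (hypothetical), V(z) = i\,\omega(|z|^2)\, z.
\begin{align*}
  T_z\Phi_\theta\, V(z) &= e^{i\theta}\, i\,\omega(|z|^{2})\, z
      = i\,\omega(|e^{i\theta}z|^{2})\, e^{i\theta} z = V(\Phi_\theta(z)), \\
  \varphi_t(z) &= e^{i\,\omega(|z|^{2})\, t}\, z, \qquad I_z = \mathbb{R}, \\
  \Phi_\theta(\varphi_t(z)) &= e^{i\theta}\, e^{i\,\omega(|z|^{2})\, t}\, z
      = e^{i\,\omega(|e^{i\theta}z|^{2})\, t}\, e^{i\theta} z
      = \varphi_t(\Phi_\theta(z)) .
\end{align*}
% The flow formula uses that |z| is constant along integral curves of V.
```

Both the invariance of $V$ and the commutation of its flow with the action rest on the same fact, namely $|e^{i\theta}z| = |z|$.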

3.1.2 Reduction of symmetry

In this section we show how to reduce the symmetry of the $G$-invariant vector field $V$ on a smooth manifold $M$. Our first observation about a $G$-invariant vector field $V$ is that equation (1) implies that for $t \in I_m$ the flow $\varphi_t$ of $V$ maps the $G$-orbit $G \cdot m$ through $m$ onto the $G$-orbit $G \cdot \varphi_t(m)$ through $\varphi_t(m)$ in a $G$-equivariant fashion. Therefore $\varphi_t$ induces a map $\overline{\varphi}_t : \pi(D_t) \to \overline{M} = M/G$ such that

$$\overline{\varphi}_t \circ \pi = \pi \circ \varphi_t \quad \text{on } D_t. \tag{2}$$

Here $\pi : M \to \overline{M} = M/G$ is the $G$-orbit map and $D_t = \{m \in M \mid t \in I_m\}$. Since $D_t$ is an open $G$-invariant subset of $M$, the domain $\pi(D_t)$ of definition of $\overline{\varphi}_t$ is an open subset of the $G$-orbit space $\overline{M}$. In addition, the group property (see §1 of chapter 2) holds for $\overline{\varphi}_t$, namely, if $s, t \in \mathbb{R}$, $\overline{m} \in \pi(D_s)$, and $\overline{\varphi}_s(\overline{m}) \in \pi(D_t)$, then $\overline{m} \in \pi(D_{t+s})$ and $\overline{\varphi}_t(\overline{\varphi}_s(\overline{m})) = \overline{\varphi}_{t+s}(\overline{m})$. Thus $\overline{\varphi}_t$ is a continuous flow on $\overline{M}$, which we call the reduced flow or reduced dynamical system. If the flow $\varphi_t$ of $V$ is complete, then so is the reduced flow $\overline{\varphi}_t$. Then $t \mapsto \overline{\varphi}_t$ is a one parameter group of homeomorphisms of $\overline{M}$.

3.1.3 Reduction for a free and proper G-action

If $G$ acts freely and properly on $M$, then the orbit space $\overline{M}$ is a smooth manifold, and the reduced flow $\overline{\varphi}_t$ is a smooth flow of a reduced smooth vector field $\overline{V}$ on $\overline{M}$ defined by $\overline{V}(\overline{m}) = \frac{d}{dt}\big|_{t=0}\, \overline{\varphi}_t(\overline{m})$. Note that we have

$$T_m\pi(V(m)) = \overline{V}(\pi(m)) \quad \text{for every } m \in M.$$

Because the orbit space $\overline{M}$ has smaller dimension than $M$, if $\dim G > 0$, there is a chance that the reduced dynamical system is simpler than the original one.

3.1.4 Reduction of a nonfree, proper G-action

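Assume that the action of $G$ on $M$ is proper, but is not necessarily free. In this case, the orbit space $\overline{M}$ is a subcartesian differential space endowed with the orbit type stratification $\mathcal{S}$. The ring $C^\infty(\overline{M})$ of smooth functions on $\overline{M}$ is

$$\{\overline{f} : \overline{M} \to \mathbb{R} \mid \pi^*\overline{f} \in C^\infty(M)\},$$

where $\pi : M \to \overline{M}$ is the $G$-orbit map, which is isomorphic to the ring $C^\infty(M)^G$ of $G$-invariant functions in $C^\infty(M)$.

Since the vector field $V$ is $G$-invariant, it is a derivation of $C^\infty(M)$ which preserves $C^\infty(M)^G$. Hence it induces a derivation $\overline{V}$ of $C^\infty(\overline{M})$ such that $\pi^*(\overline{V}(\overline{f})) = V(\pi^*\overline{f})$ for every $\overline{f} \in C^\infty(\overline{M})$.

Proposition 3.1.4.2. The reduced flow $\overline{\varphi}_t$, defined by equation (2), is the local one parameter group of local diffeomorphisms of $\overline{M}$ generated by $\overline{V}$.

Proof. Let $\varphi_t$ be the flow of $V$ and $\overline{\varphi}_t$ the reduced flow on $\overline{M}$. We have shown above that $\overline{\varphi}_t$ is a local one parameter group of local homeomorphisms of $\overline{M}$. For each $\overline{f} \in C^\infty(\overline{M})$, the argument above implies that $\pi^*(\overline{\varphi}_t^*\overline{f}) = \varphi_t^*(\pi^*\overline{f}) \in C^\infty(M)$. Hence, $\overline{\varphi}_t$ is a local one parameter group of local diffeomorphisms of $\overline{M}$. Moreover,

$$\pi^*\frac{d}{dt}(\overline{\varphi}_t^*\overline{f}) = \frac{d}{dt}\varphi_t^*\pi^*(\overline{f}) = V(\varphi_t^*\pi^*(\overline{f})) = V(\pi^*\overline{\varphi}_t^*(\overline{f})) = \pi^*\bigl(\overline{V}(\overline{\varphi}_t^*(\overline{f}))\bigr).$$

Hence, $\frac{d}{dt}(\overline{\varphi}_t^*\overline{f}) = \overline{V}(\overline{\varphi}_t^*(\overline{f}))$ for every $\overline{f} \in C^\infty(\overline{M})$. This implies that $\overline{V}$ is a vector field on $\overline{M}$ and $\overline{\varphi}_t$ is the local one parameter group of local diffeomorphisms of $\overline{M}$ generated by $\overline{V}$.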

Since $\overline{V}$ is a vector field on $\overline{M}$, it is tangent to strata of the orbit type stratification $\mathcal{S}$. For every $S \in \mathcal{S}$, the restriction of $\overline{V}$ to $S$ is a vector field $\overline{V}_S$ on $S$. Moreover, the reduced flow $\overline{\varphi}_t$ preserves $S$ and it coincides on $S$ with the flow of $\overline{V}_S$. We summarize the above discussion with

Theorem 3.1.4.3. (singular reduction for dynamical systems) Let $V$ be a vector field on $M$ and $G$ a symmetry group of $V$ which acts properly on $M$. Then, the space $\overline{M} = M/G$ of $G$-orbits on $M$ is a subcartesian space stratified by reduced dynamical systems $(S, \overline{V}_S)$. The orbit map $\pi : M \to \overline{M}$ maps each trajectory of the original dynamical system to a trajectory of one of the reduced dynamical systems $(S, \overline{V}_S)$.

We make the following observation.

Proposition 3.1.4.4. Each path component of an isotropy type of the $G$-action on $M$ is invariant under the flow of a $G$-invariant vector field $V$.

Proof. Let $I_m$ be the maximal domain of definition of the integral curve $t \mapsto \varphi_t(m)$ of $V$ starting at $m$. Then for every $t \in I_m$ we have

$$g \in G_m \iff \Phi_g(m) = m \iff \Phi_g(\varphi_t(m)) = \varphi_t(\Phi_g(m)) = \varphi_t(m) \iff g \in G_{\varphi_t(m)}.$$

In other words, $G_m = G_{\varphi_t(m)}$. Thus the flow of $V$ leaves each isotropy type $M_{G_m}$ invariant. Because the integral curve $t \mapsto \varphi_t(m)$ is continuous, it follows that every path component of each isotropy type is left invariant by the flow of $V$.

Using lemma 2.3.1.4, we can replace the action of $G$ on the path component $N$ of the isotropy type $M_H$, where $H = G_m$ for some $m \in M$, by the free action of $N(H)/H$ on $N$. Since the $G$-action on $M$ is proper, it is a proper action on $N$. Therefore the action of $N(H)$ on $N$ is proper, since $N(H)$ is a Lie subgroup of $G$. Consequently, the action of $N(H)/H$ on $N$ is free and proper.

Remark 3.1.4.5. Proposition 3.1.4.4 can be viewed as a conservation law which is a consequence of the non-freeness of the $G$-action.

Corollary 3.1.4.6. If $\{m\}$ is an isolated path component of the isotropy type $M_H$, then $m$ is an equilibrium point of $V$, that is, $V(m) = 0$.

3.2 Nonholonomic singular reduction for a proper action

In this section we define the notion of symmetry of a nonholonomically constrained system and then show how to reduce it, when the action of the symmetry group is proper. Let $\Phi : G \times D \to D$ be a smooth action of a Lie group $G$ on a smooth manifold $D$. We say that $G$ is a symmetry of the distributional Hamiltonian system $(D, \mathcal{H}, \varpi, h)$ if and only if

1. the distribution $\mathcal{H}$ on $D$ is $G$-invariant, that is, for every $(g, u) \in G \times D$, $\mathcal{H}_{g \cdot u} = T_u\Phi_g \mathcal{H}_u$.
2. the symplectic form $\varpi$ on $\mathcal{H}$ is G-invariant, that is, $\varpi_{g \cdot u}(T_u\Phi_g v_u, T_u\Phi_g w_u) = \varpi_u(v_u, w_u)$ for every $(g, u) \in G \times D$ and every $v_u, w_u \in \mathcal{H}_u$.
3. $h$ is a $G$-invariant smooth function on $D$.

In treating reduction of a nonholonomic system with a symmetry, we will begin with the almost Poisson bracket formulation of §7 of chapter 1. Recall that on $C^\infty(D)$ we have an almost Poisson bracket $\{\ ,\ \}$ defined by $\{f, k\} = -L_{Y_f} k$ for every $f, k \in C^\infty(D)$. Here $Y_f$ is the almost Hamiltonian vector field on $D$ with values in $\mathcal{H}$ defined by $Y_f \lrcorner\, \varpi = \partial_{\mathcal{H}} f$, see §6.3 of chapter 1.

Let $C^\infty(D)^G$ be the space of smooth $G$-invariant functions on $D$.

Lemma 3.2.7. For every $f \in C^\infty(D)^G$, the almost Hamiltonian vector field $Y_f$ is $G$-invariant.

Proof. For every $(g, u) \in G \times D$ and every $v_u \in \mathcal{H}_u$ we have

$$\begin{aligned}
\varpi_{g \cdot u}(T_u\Phi_g Y_f(u), T_u\Phi_g v_u) &= (\Phi_g^*\varpi)_u(Y_f(u), v_u) \\
&= \varpi_u(Y_f(u), v_u), \quad \text{since } \varpi \text{ is } G\text{-invariant} \\
&= \langle df(u) \mid v_u \rangle = \langle d(\Phi_g^* f)(u) \mid v_u \rangle = \langle \Phi_g^*(df)(u) \mid v_u \rangle \\
&= \langle df(\Phi_g(u)) \mid T_u\Phi_g v_u \rangle = \varpi_{g \cdot u}(Y_f(\Phi_g(u)), T_u\Phi_g v_u).
\end{aligned} \tag{3}$$

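But $T_u\Phi_g : \mathcal{H}_u \to \mathcal{H}_{g \cdot u}$ is bijective, since $\mathcal{H}$ is $G$-invariant. Moreover, $\varpi_{g \cdot u}$ is nondegenerate. Therefore (3) implies $Y_f(\Phi_g(u)) = T_u\Phi_g Y_f(u)$, that is, $Y_f$ is $G$-invariant.

Corollary 3.2.8. The space $(C^\infty(D)^G, \{\ ,\ \})$ is an almost Poisson subalgebra of $(C^\infty(D), \{\ ,\ \})$.

Proof. We need only show that if $f, k \in C^\infty(D)^G$, then $\{f, k\} \in C^\infty(D)^G$. By definition of the almost Poisson bracket $\{\ ,\ \}$ on $C^\infty(D)$ we have

$$\{f, k\}(u) = -L_{Y_f} k(u) = -\langle dk(u) \mid Y_f(u) \rangle$$

for every $u \in D$. Now

$$\begin{aligned}
\langle d(\Phi_g^* k)(u) \mid \Phi_g^*(Y_f)(u) \rangle &= \langle (T_u\Phi_g)^t (dk(\Phi_g(u))) \mid (T_u\Phi_g)^{-1} Y_f(\Phi_g(u)) \rangle \\
&= \langle dk(\Phi_g(u)) \mid Y_f(\Phi_g(u)) \rangle = -\{f, k\}(\Phi_g(u)).
\end{aligned} \tag{4}$$

But for every $u \in D$ and every $v_u \in \mathcal{H}_u$

$$\begin{aligned}
\varpi_u(Y_{\Phi_g^* f}(u), v_u) &= \langle d(\Phi_g^* f)(u) \mid v_u \rangle = \langle \Phi_g^*(df)(u) \mid v_u \rangle \\
&= \langle df(\Phi_g(u)) \mid T_u\Phi_g v_u \rangle = \varpi_{g \cdot u}(Y_f(\Phi_g(u)), T_u\Phi_g v_u) \\
&= (\Phi_g^*\varpi)_u((T_u\Phi_g)^{-1} Y_f(\Phi_g(u)), v_u) = \varpi_u(\Phi_g^* Y_f(u), v_u),
\end{aligned}$$

since $\varpi$ is $G$-invariant. Therefore, $Y_{\Phi_g^* f} = \Phi_g^* Y_f$, since $\varpi_u$ is nondegenerate on $\mathcal{H}_u$. Thus we can rewrite (4) as

$$\begin{aligned}
\{f, k\}(\Phi_g(u)) &= -\langle d(\Phi_g^* k)(u) \mid Y_{\Phi_g^* f}(u) \rangle \\
&= -\langle dk(u) \mid Y_f(u) \rangle, \quad \text{since } f, k \in C^\infty(D)^G \\
&= \{f, k\}(u).
\end{aligned}$$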


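We denote by $\overline{D}$ the space of $G$-orbits on $D$ and by $\rho : D \to \overline{D}$ the orbit map. We know from the discussion in §4 of chapter 2 that $(\overline{D}, C^\infty(\overline{D}))$ is a subcartesian differential space, where $C^\infty(\overline{D}) = \{\overline{f} : \overline{D} \to \mathbb{R} \mid \rho^*\overline{f} \in C^\infty(D)\}$. The map $\rho^* : C^\infty(\overline{D}) \to C^\infty(D)$ defines an associative algebra isomorphism of $C^\infty(\overline{D})$ onto $C^\infty(D)^G$. We can use this isomorphism to define in $C^\infty(\overline{D})$ the structure of an almost Poisson algebra. For each $\overline{f}, \overline{k} \in C^\infty(\overline{D})$, the bracket $\{\overline{f}, \overline{k}\}$ is uniquely defined by the condition

$$\rho^*\{\overline{f}, \overline{k}\} = \{\rho^*\overline{f}, \rho^*\overline{k}\}.$$

Proposition 3.2.9. $(C^\infty(\overline{D}), \{\ ,\ \})$ is an almost Poisson algebra.

Proof. We only need to verify

$$\{\overline{f}, \overline{g} \cdot \overline{k}\} = \{\overline{f}, \overline{g}\} \cdot \overline{k} + \overline{g} \cdot \{\overline{f}, \overline{k}\}. \tag{5}$$

This we do as follows.

$$\begin{aligned}
\rho^*\{\overline{f}, \overline{g} \cdot \overline{k}\} &= \{\rho^*\overline{f}, \rho^*\overline{g} \cdot \rho^*\overline{k}\} \\
&= \{\rho^*\overline{f}, \rho^*\overline{g}\} \cdot \rho^*\overline{k} + \rho^*\overline{g} \cdot \{\rho^*\overline{f}, \rho^*\overline{k}\}, \quad \text{since Leibniz' rule holds on } (C^\infty(D), \{\ ,\ \}) \\
&= \rho^*\bigl(\{\overline{f}, \overline{g}\} \cdot \overline{k} + \overline{g} \cdot \{\overline{f}, \overline{k}\}\bigr).
\end{aligned}$$

Equation (5) follows because $\rho^*$ is an injective map from $C^\infty(\overline{D})$ into $C^\infty(D)$, since $\rho$ is surjective.

We call $(C^\infty(\overline{D}), \{\ ,\ \})$ the reduced almost Poisson algebra on $C^\infty(\overline{D})$ with almost Poisson structure $\{\ ,\ \}$. Information about the geometry of the orbit space $\overline{D}$ is encoded in the structure of the reduced almost Poisson algebra $(C^\infty(\overline{D}), \{\ ,\ \})$. The remainder of this section is devoted to retrieving this information and formulating it in explicitly geometric terms.

Since the action of $G$ on $D$ is proper, we know from the discussion of §6 of chapter 2 that $\overline{D} = D/G$ has the orbit type stratification $\mathcal{S}$. It follows from theorem 2.10.4.41 of chapter 2 that each stratum of $\mathcal{S}$ is an orbit of the family $\mathcal{X}(\overline{D}, C^\infty(\overline{D}))$ of all vector fields on $\overline{D}$. Thus, the stratification $\mathcal{S}$ is determined by derivations of $C^\infty(\overline{D})$ which generate local one-parameter groups of diffeomorphisms of $\overline{D}$. On the other hand, the notion of a diffeomorphism of $\overline{D}$ is defined in terms of its differential structure $C^\infty(\overline{D})$. Hence, we know how to obtain the orbit type stratification of $\overline{D}$ from its differential structure $C^\infty(\overline{D})$.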


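Since the almost Poisson bracket on $\overline{D}$ satisfies Leibniz' rule, it assigns to each $\overline{f} \in C^\infty(\overline{D})$ a derivation $P_{\overline{f}}$ of $C^\infty(\overline{D})$ such that

$$P_{\overline{f}}\,\overline{k} = \{\overline{f}, \overline{k}\}. \tag{6}$$

Lemma 3.2.10. For each $\overline{f} \in C^\infty(\overline{D})$, the derivation $P_{\overline{f}}$ of $C^\infty(\overline{D})$ is a vector field on $\overline{D}$.

Proof. The pull back $\rho^*\overline{f}$ is a function $f$ which lies in $C^\infty(D)^G$. The almost Hamiltonian vector field $Y_f$ associated to $f$ generates a local one-parameter group of local diffeomorphisms $\varphi_t$ of $D$. Since $f$ is $G$-invariant, $Y_f$ is $G$-invariant and $\varphi_t$ commutes with the action of $G$. Hence, $\varphi_t$ gives rise to a local one-parameter group $\overline{\varphi}_t$ of local diffeomorphisms of $\overline{D}$ such that $\overline{\varphi}_t \circ \rho = \rho \circ \varphi_t$. For every $\overline{k} \in C^\infty(\overline{D})$ and $\overline{u} \in \overline{D}$, let $k = \rho^*\overline{k}$ and $u$ be an element of $\rho^{-1}(\overline{u})$. Then,

$$\begin{aligned}
\frac{d}{dt}\overline{k}(\overline{\varphi}_t(\overline{u})) &= \frac{d}{dt}\overline{k}(\overline{\varphi}_t(\rho(u))) = \frac{d}{dt}\overline{k}(\rho \circ \varphi_t(u)) = \frac{d}{dt}k(\varphi_t(u)) \\
&= (Y_f k)(\varphi_t(u)) = \{f, k\}(\varphi_t(u)) = \{\overline{f}, \overline{k}\}(\rho(\varphi_t(u))) \\
&= \{\overline{f}, \overline{k}\}(\overline{\varphi}_t(\rho(u))) = (P_{\overline{f}}\,\overline{k})(\overline{\varphi}_t(\overline{u})).
\end{aligned}$$

Hence, $t \mapsto \overline{\varphi}_t(\overline{u})$ is an integral curve of $P_{\overline{f}}$ through $\overline{u}$. This implies $\overline{\varphi}_t$ is a local one parameter group of local diffeomorphisms of $\overline{D}$ generated by $P_{\overline{f}}$. Hence $P_{\overline{f}}$ is a vector field on $\overline{D}$.

The vector field $P_{\overline{f}}$ on $\overline{D}$ is called the almost Poisson vector field corresponding to $\overline{f} \in C^\infty(\overline{D})$.

Proposition 3.2.11. Let $S$ be a stratum of the orbit type stratification $\mathcal{S}$ of $\overline{D}$. The ring $C^\infty(S)$ of smooth functions on $S$ inherits an almost Poisson algebra structure from the reduced almost Poisson algebra $(C^\infty(\overline{D}), \{\ ,\ \})$.

Proof. Since a stratum $S$ of $\mathcal{S}$ is an orbit of the family $\mathcal{X}(\overline{D}, C^\infty(\overline{D}))$ of all vector fields on $\overline{D}$, it follows from lemma 3.2.10 that, for every $\overline{f} \in C^\infty(\overline{D})$, the vector field $P_{\overline{f}}$ is tangent to $S$. Hence, for every $\overline{u}$ in $S$ and $\overline{k} \in C^\infty(\overline{D})$, $\{\overline{f}, \overline{k}\}(\overline{u}) = (P_{\overline{f}}\,\overline{k})(\overline{u})$ depends on $\overline{k}$ only through the restriction of $\overline{k}$ to $S$. On the other hand, $\{\overline{f}, \overline{k}\}(\overline{u}) = -\{\overline{k}, \overline{f}\}(\overline{u})$ and by the same argument as above it depends on $\overline{f}$ only through the restriction of $\overline{f}$ to $S$. Hence, the restriction to $S$ of the almost Poisson bracket on $\overline{D}$ is a well defined bracket on the space of restrictions to $S$ of functions in $C^\infty(\overline{D})$.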


By lemma 2.4.1.11 of chapter 2 the ring $C^\infty(S)$ of smooth functions on $S$ given by the manifold structure of $S$ is generated by restrictions to $S$ of smooth functions on $\overline{D}$. That is, for each $f_S \in C^\infty(S)$ and each point $\overline{u} \in S$, there is a neighborhood $U$ of $\overline{u}$ in $S$ and a function $\overline{f} \in C^\infty(\overline{D})$ such that $f_S|U = \overline{f}|U$. This implies that the bracket defined on the space of restrictions to $S$ of smooth functions on $\overline{D}$ extends to a bracket defined on $C^\infty(S)$ which we denote also by $\{\ ,\ \}$. Clearly, the bracket $\{\ ,\ \}$ on $C^\infty(S)$ is bilinear, skew symmetric, and satisfies Leibniz' rule. Hence, $(C^\infty(S), \{\ ,\ \})$ is an almost Poisson algebra.

For each $f_S \in C^\infty(S)$, we denote by $P_{f_S}$ the almost Poisson vector field of $f_S$ given by $P_{f_S} k_S = \{f_S, k_S\}$ for every $k_S \in C^\infty(S)$. Let $\mathcal{H}_S$ be a generalized distribution spanned by the family $\mathcal{P}_S = \{P_{f_S} \mid f_S \in C^\infty(S)\}$ of almost Poisson vector fields on $S$. Since the family $\mathcal{P}_S$ of almost Poisson vector fields on $S$ is closed under addition and multiplication by numbers, it follows that

$$\mathcal{H}_S = \{P_{f_S}(s) \mid s \in S \text{ and } f_S \in C^\infty(S)\}. \tag{7}$$

Proposition 3.2.12. The almost Poisson bracket on $C^\infty(S)$ gives rise to a unique antisymmetric bilinear form $\varpi_S$ on $\mathcal{H}_S$ such that

$$\varpi_S(P_{f_S}, P_{k_S}) = \{k_S, f_S\} \tag{8}$$

for every $f_S, k_S \in C^\infty(S)$.

Proof. For each $s \in S$, set

$$\varpi_S(P_{f_S}(s), P_{k_S}(s)) = \{k_S, f_S\}(s). \tag{9}$$

Suppose that $f_S$ and $f'_S$ in $C^\infty(S)$ are such that $P_{f_S}(s) = P_{f'_S}(s)$. Let $\overline{f}$ and $\overline{f}'$ be functions in $C^\infty(\overline{D})$ such that their restrictions to $S$ are $f_S$ and $f'_S$, respectively. Then $P_{f_S}$ and $P_{f'_S}$ are the restrictions to $S$ of $P_{\overline{f}}$ and $P_{\overline{f}'}$, respectively. Moreover, if $\overline{k} \in C^\infty(\overline{D})$ is an extension of $k_S$, then

$$\begin{aligned}
\{k_S, f_S\}(s) &= \{\overline{k}, \overline{f}\}(s) = -(P_{\overline{f}}\,\overline{k})(s) = -P_{\overline{f}}(s)\overline{k} = -P_{f_S}(s)\overline{k} = -P_{f'_S}(s)\overline{k} \\
&= -P_{\overline{f}'}(s)\overline{k} = -(P_{\overline{f}'}\,\overline{k})(s) = \{\overline{k}, \overline{f}'\}(s) = \{k_S, f'_S\}(s).
\end{aligned}$$

Hence, the right hand side of equation (9) depends on $f_S$ only through $P_{f_S}(s)$. Similarly, one can show that the right hand side of equation (9) depends on $k_S$ only through $P_{k_S}(s)$. Hence, $\varpi_S(P_{f_S}(s), P_{k_S}(s))$ is well defined by equation (9). Bilinearity and antisymmetry of $\varpi_S(s)$ follow from bilinearity and antisymmetry of the almost Poisson bracket.

Proposition 3.2.13. The antisymmetric bilinear form $\varpi_S$ on $\mathcal{H}_S$ is nondegenerate, so that $(\mathcal{H}_S, \varpi_S)$ is a symplectic distribution on $S$.

Proof. See the argument leading to proposition 1.7.1.26 of chapter 1.

For each $f_S \in C^\infty(S)$, we denote by $Y_{f_S}$ the almost Hamiltonian vector field on $S$ defined in terms of the symplectic distribution $(\mathcal{H}_S, \varpi_S)$. In other words, $Y_{f_S}$ is the unique vector field on $S$ with values in $\mathcal{H}_S$ such that $Y_{f_S} \lrcorner\, \varpi_S = \partial_{\mathcal{H}_S} f_S$, where $\partial_{\mathcal{H}_S}$ denotes the restriction of $d$ to $\mathcal{H}_S$. Equation (8) gives

$$\varpi_S(P_{f_S}, P_{k_S}) = \{k_S, f_S\} = P_{k_S} f_S = \langle df_S \mid P_{k_S} \rangle.$$

Since the vector fields $P_{k_S}$, for $k_S \in C^\infty(S)$, span $\mathcal{H}_S$ and $\varpi_S$ is nondegenerate, it follows that $P_{f_S} \lrcorner\, \varpi_S = \partial_{\mathcal{H}_S} f_S$. Thus, we have proved

Proposition 3.2.14. For each $f_S \in C^\infty(S)$, the almost Poisson vector field $P_{f_S}$ coincides with the almost Hamiltonian vector field $Y_{f_S}$, that is, $P_{f_S} = Y_{f_S}$.

By definition the Hamiltonian $h$ of the original distributional Hamiltonian system $(D, \mathcal{H}, \varpi, h)$ is $G$-invariant. Hence, there exists a unique $\overline{h} \in C^\infty(\overline{D})$ such that $h = \rho^*\overline{h}$. The pull back of $\overline{h}$ to a stratum $S$ of $\overline{D}$ is a function $h_S \in C^\infty(S)$. Thus, each stratum $S$ of $\overline{D}$ carries the structure $(S, \mathcal{H}_S, \varpi_S, h_S)$ of a distributional Hamiltonian system.

Motions of the distributional Hamiltonian system $(D, \mathcal{H}, \varpi, h)$ are given by integral curves of the almost Hamiltonian vector field $Y_h$ of $h$. Let $c : (a, b) \to D$ be an integral curve of $Y_h$. It follows from the discussion in §7 of chapter 2 that $\overline{c} = \rho \circ c : (a, b) \to \overline{D}$ is an integral curve of the vector field $\overline{Y}_h = \rho_* Y_h$. This implies that $\overline{c} : (a, b) \to \overline{D}$ is contained in a stratum $S$ of $\overline{D}$. Moreover, for each $\overline{f} \in C^\infty(\overline{D})$,

$$\rho^*(\overline{Y}_h \overline{f}) = Y_h \rho^*\overline{f} = \varpi(Y_{\rho^*\overline{f}}, Y_h) = \{h, \rho^*\overline{f}\} = \rho^*\{\overline{h}, \overline{f}\} = \rho^* P_{\overline{h}}\,\overline{f}.$$

This implies that $\overline{Y}_h = P_{\overline{h}}$.
Since the restriction of P_{h̄} to the stratum S of D̄ is equal to the almost Hamiltonian vector field Y_{h_S} of the restriction h_S of h̄ to S, the curve c̄ = ρ ∘ c is an integral curve of the almost Hamiltonian vector field Y_{h_S}. We summarize the above discussion in

Theorem 3.2.15. (singular nonholonomic reduction) Given a distributional Hamiltonian system (D, H, ϖ, h) with a symmetry group G which acts properly on D, the space D̄ = D/G of G-orbits on D is a subcartesian differential space stratified by reduced distributional Hamiltonian systems (S, H_S, ϖ_S, h_S). The orbit map ρ : D → D̄ sends each trajectory of the original distributional Hamiltonian system (D, H, ϖ, h) to a trajectory of one of the reduced distributional Hamiltonian systems (S, H_S, ϖ_S, h_S).

Remark 3.2.16. Singular Hamiltonian reduction. Singular Hamiltonian reduction is a special case of singular nonholonomic reduction. A Hamiltonian system (P, ω, h) can be considered as an integrable distributional Hamiltonian system (P, H, ϖ, h), where H = TP and ϖ = ω. If G is a symmetry group of (P, ω, h) acting properly on P, then it is a symmetry group of the distributional Hamiltonian system. The orbit space P̄ = P/G is a subcartesian space stratified by reduced distributional Hamiltonian systems (S, H_S, ϖ_S, h_S). In this case, for each stratum S, the distribution H_S is involutive and, for each integral manifold M of H_S, the form ϖ_S restricted to TM gives a symplectic form ω_M on M. Thus, each stratum S of P̄ is singularly foliated by reduced Hamiltonian systems (M, ω_M, h_M), where h_M is the pull back of h̄ to M.

Remark 3.2.17. Singular symplectic reduction. Let G be a Lie group, which acts properly and symplectically on the smooth symplectic manifold (M, ω). The algebra of smooth functions on (M, ω) is C∞(M). We now describe the process of singular symplectic reduction. We start by noting that the space P = M/G of G-orbits on M is a differential space with differential structure C∞(P) = {h : P → ℝ | π^*h ∈ C∞(M)}. Here π : M → P is the G-orbit map. Since the action of G on M preserves the symplectic form ω on M, the induced action on C∞(M) preserves the Poisson bracket.
Hence C∞(M)^G is a Poisson subalgebra of C∞(M). Using the ring isomorphism π^* : C∞(P) → C∞(M)^G we can pull back the Poisson bracket on C∞(M)^G to a Poisson bracket on C∞(P), namely, for each h_1, h_2 ∈ C∞(P) their Poisson bracket {h_1, h_2} is given by π^*{h_1, h_2} = {π^*h_1, π^*h_2}. For each h ∈ C∞(P) define the Poisson derivation Y_h corresponding to h by Y_h(h′) = {h, h′} for every h′ ∈ C∞(P). Each Poisson derivation is a vector field on P in the sense that it generates a local one parameter group of local diffeomorphisms of P.
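The way the bracket descends to invariant functions can be checked on a toy case. The sketch below is not from the text: it assumes the hypothetical example of Z_2 acting on ℝ² by (q, p) ↦ (−q, −p), and the sign convention {f, g} = f_q g_p − f_p g_q, which may differ from the book's by a sign. It evaluates the brackets of the invariant generators q², p², qp at a point and at its image under the group action.

```python
from fractions import Fraction

def bracket(grad_f, grad_g, q, p):
    """Canonical bracket {f, g} = f_q g_p - f_p g_q at the point (q, p)."""
    fq, fp = grad_f(q, p)
    gq, gp = grad_g(q, p)
    return fq * gp - fp * gq

# Generators of the polynomials invariant under (q, p) -> (-q, -p),
# together with their gradients (d/dq, d/dp), written out by hand.
def s1(q, p): return q * q
def s2(q, p): return p * p
def s3(q, p): return q * p
def ds1(q, p): return (2 * q, 0)
def ds2(q, p): return (0, 2 * p)
def ds3(q, p): return (p, q)

def brackets_of_invariants(q, p):
    """Return ({s1, s2}, {s1, s3}, {s2, s3}) evaluated at (q, p)."""
    return (bracket(ds1, ds2, q, p),
            bracket(ds1, ds3, q, p),
            bracket(ds2, ds3, q, p))

# Evaluate at a sample point and at its image under the Z_2-action.
point = (Fraction(3, 2), Fraction(-5, 7))
print(brackets_of_invariants(*point))
print(brackets_of_invariants(-point[0], -point[1]))
```

In this example the brackets close on the generators, {s1, s2} = 4 s3, {s1, s3} = 2 s1 and {s2, s3} = −2 s2, so they are again invariant and define a bracket on functions of the orbit space.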


Let 𝒫(P) ⊆ 𝒳(P) be the collection of all Poisson derivations of C∞(P). Here 𝒳(P) is the set of all vector fields on P. Recall that P is a stratified space ⋃_{p∈P} P_p where the stratum P_p through the point p of P is an orbit of the family 𝒳(P) of all vector fields on P. The space C∞(P_p) of smooth functions on P_p is generated by the restriction to P_p of smooth functions on P. The restrictions to P_p of vector fields on P are tangent to P_p, because P_p is an orbit of 𝒳(P). Hence C∞(P_p) inherits the structure of a Poisson algebra from C∞(P). Thus P_p is a Poisson manifold. The space 𝒫(P_p) of all Poisson vector fields on P_p coincides with the space of restrictions to P_p of Poisson vector fields on P. The orbit through p ∈ P_p of the family 𝒫(P_p) of Poisson vector fields on P_p is the symplectic leaf (P_p^symp, ω_p) through p. Using the Poisson structure on C∞(P), the symplectic form ω_p on P_p^symp is defined as follows: ω_p(Y_{h_1}(p), Y_{h_2}(p)) = {h_1, h_2}(p), for every h_1, h_2 ∈ C∞(P). See [107]. Note that the symplectic form ω_p on the symplectic leaf P_p^symp is completely determined by the Poisson structure on C∞(P).

Let R be a subcartesian differential space such that its differential structure C∞(R) is a Poisson algebra. Then the orbits of the family 𝒳(R) of vector fields on R are Poisson manifolds, which are singularly foliated by symplectic manifolds. Suppose that ϕ : (P, C∞(P)) → (R, C∞(R)) is a diffeomorphism of differential spaces such that ϕ^* : C∞(R) → C∞(P) is an isomorphism of Poisson algebras. Then for each p ∈ P the restriction of ϕ to the symplectic leaf P_p^symp through p in P is a symplectomorphism of (P_p^symp, ω_p) onto the symplectic leaf of R through r = ϕ(p).

Remark 3.2.18. Singular reduction by stages. Suppose that the Lie group G acts properly and symplectically on a smooth symplectic manifold (M, ω).
If G has a normal subgroup H, then we may first reduce by the action of H, obtaining the space Q = M/H of H-orbits on M. The space Q is a stratified subcartesian differential space with differential structure C∞(Q) = {h_Q : Q → ℝ | ρ^*h_Q ∈ C∞(M)}, where ρ : M → Q is the H-orbit map. C∞(Q) has a Poisson algebra structure and for each q ∈ Q the stratum Q_q through q is a smooth Poisson manifold, which is singularly foliated by symplectic leaves. Now consider the action of the quotient group L = G/H on Q = M/H. Since M is locally compact and the G-action is proper, it follows that the action of L on Q is proper. In addition the space R = Q/L of L-orbits on Q is a locally compact, subcartesian differential space with respect to the differential structure C∞(R) = {h_R : R → ℝ | σ^*h_R ∈ C∞(Q)}. Here σ : Q → R is the L-orbit map. See [72]. The definition of C∞(R) implies that C∞(R) = {h_R : R → ℝ | (σ ∘ ρ)^*h_R ∈ C∞(M)^G}. Hence (σ ∘ ρ)^* : C∞(R) → C∞(M)^G is a ring homomorphism, which we can use to pull back the Poisson algebra structure on C∞(M)^G to C∞(R).

Proposition 3.2.19. There is a unique isomorphism ϕ : (P, C∞(P)) → (R, C∞(R)) of differential spaces such that ϕ ∘ π = σ ∘ ρ. Moreover, ϕ^* : C∞(R) → C∞(P) is an isomorphism of Poisson algebras.

Proof. Both the maps π : M → P and σ ∘ ρ : M → R are surjective. Let x_0 ∈ M, let p_0 = π(x_0), q_0 = ρ(x_0), and r_0 = σ(q_0) be points on P, Q, and R, respectively. The fiber π^{-1}(p_0) is the orbit G x_0 of G through x_0. The action of L on Q associates to each element gH ∈ G/H in L and each q ∈ Q the point (gH)q = ρ(gx) for any x ∈ ρ^{-1}(q). Hence (σ ∘ ρ)^{-1}(r_0) = ρ^{-1}(σ^{-1}(r_0)) = ρ^{-1}(L q_0) = ρ^{-1}({ρ(g x_0) | g ∈ G}) = G x_0. Consequently, π^{-1}(π(x_0)) = (σ ∘ ρ)^{-1}(σ(ρ(x_0))) for every x_0 ∈ M. Hence there is a unique bijection ϕ : P → R such that ϕ(π(x)) = σ(ρ(x)) for every x ∈ M.

To show that ϕ is smooth, note that ϕ ∘ π = σ ∘ ρ implies that (σ ∘ ρ)^*h_R ∈ C∞(M)^G for every h_R ∈ C∞(R). Therefore there is a unique h_P ∈ C∞(P) such that (σ ∘ ρ)^*h_R = π^*h_P. But ϕ ∘ π = σ ∘ ρ implies that h_P = ϕ^*h_R. Moreover, for every h_P ∈ C∞(P) we have π^*h_P = f for some f ∈ C∞(M)^G. On the other hand f = (σ ∘ ρ)^*h_R for some h_R ∈ C∞(R). Hence ϕ^* maps C∞(R) onto C∞(P), which implies that ϕ : (P, C∞(P)) → (R, C∞(R)) is a diffeomorphism. The Poisson algebra structures on C∞(P) and C∞(R) are induced by the Poisson algebra structure on C∞(M)^G and the ring homomorphisms π^* : C∞(P) → C∞(M)^G and (σ ∘ ρ)^* : C∞(R) → C∞(M)^G. This implies that ϕ^* : C∞(R) → C∞(P) is an isomorphism of Poisson algebras.

Taking into account remark 3.2.17 on singular symplectic reduction we have proved

Corollary 3.2.20. For every x ∈ M, the restriction of the map ϕ to the symplectic leaf P^symp_{π(x)} of the stratum P_{π(x)} of P is a symplectomorphism of (P^symp_{π(x)}, ω_{π(x)}) onto the symplectic leaf through r = ϕ(π(x)) = σ(ρ(x)) of the orbit through r of the family 𝒳(R) of all vector fields on R.


Informally, the above results say that, for a proper action of a Lie group on a symplectic manifold, Poisson reduction by stages (proposition 3.2.19) implies symplectic reduction by stages (corollary 3.2.20).

Corollary 3.2.21. Suppose that the reduced space has one stratum, that is, S = D̄, and that the reduced distributional system (D̄, H_D̄, ϖ_D̄, h_D̄) is simple, that is, D̄ is the only accessible set of H_D̄. Then the only Casimir functions of the reduced Poisson algebra (C∞(D̄), { , }) are the constant functions.

Remark 3.2.22. The conclusion of corollary 3.2.21 is in contrast with G-invariant Hamiltonian systems, where the momentum functions associated to the symmetries are nonconstant Casimirs for the reduced Poisson algebra.

3.3

Nonholonomic reduction for a free and proper action

Suppose that Φ : G × D → D is a free and proper action, which is a symmetry of the distributional Hamiltonian system (D, H, ϖ, h). In this section we relate the construction of the reduced distributional Hamiltonian system given above to regular nonholonomic reduction. Because the G-action is free and proper, the space D̄ of G-orbits on D is a smooth manifold and the G-orbit map ρ : D → D̄ exhibits D as a principal G-bundle. In other words, the orbit type stratification 𝒮 of D̄ has only one stratum S = D̄. Hence, the reduced distributional system (S, H_S, ϖ_S, h_S) will be denoted by (D̄, H̄, ϖ̄, h̄). For each u ∈ D let

\[ U_u = \operatorname{span}\{\, Y_f(u) \in H_u \mid f \in C^\infty(D)^G \,\}. \tag{10} \]

From lemma 1.7.1.24 of chapter 1 it follows that u ↦ U_u is a generalized distribution on D.

Proposition 3.3.23. Let (ker T_uρ)° = {α_u ∈ T_u^*D | α_u|ker T_uρ = 0}. Then u ↦ (ker T_uρ)° is a distribution on D and

\[ (\ker T_u\rho)^{\circ} = \operatorname{span}\{\, df(u) \in T_u^*D \mid f \in C^\infty(D)^G \,\}. \tag{11} \]

Proof. Since ρ : D → D̄ is a principal G-bundle, dim ker T_uρ = dim 𝔤, where 𝔤 is the Lie algebra of G. Therefore dim (ker T_uρ)° = dim D − dim 𝔤. Clearly, u ↦ (ker T_uρ)° is smooth and therefore defines a distribution on D.

Let f ∈ C∞(D)^G. For every u ∈ D and every ξ ∈ 𝔤 we have

\[ \langle df(u) \mid X^\xi(u) \rangle = (L_{X^\xi} f)(u) = \frac{d}{dt}\Big|_{t=0} f(\Phi_u(\exp t\xi)) = \frac{d}{dt}\Big|_{t=0} f(u) = 0, \]

where the third equality holds since f ∈ C∞(D)^G. Therefore df(u)|ker T_uρ = 0, since ker T_uρ = span{X^ξ(u) | ξ ∈ 𝔤}. So df(u) ∈ (ker T_uρ)°. Consequently,

\[ \operatorname{span}\{\, df(u) \in T_u^*D \mid f \in C^\infty(D)^G \,\} \subseteq (\ker T_u\rho)^{\circ}. \]

Let ū = ρ(u). Since D̄ is a smooth manifold, T_ū^*D̄ = span{df̄(ū) | f̄ ∈ C∞(D̄)}.

Because T_uρ : T_uD → T_ūD̄ is surjective, its transpose (T_uρ)^t : T_ū^*D̄ → T_u^*D is injective. But then

\[ (T_u\rho)^t(T_{\bar u}^*\bar D) = \operatorname{span}\{ (T_u\rho)^t(d\bar f(\bar u)) \mid \bar f \in C^\infty(\bar D) \} = \operatorname{span}\{ d(\rho^*\bar f)(u) \mid \bar f \in C^\infty(\bar D) \} = \operatorname{span}\{ df(u) \mid f \in C^\infty(D)^G \}. \tag{12} \]

Now

\[ \dim (\ker T_u\rho)^{\circ} = \dim T_u^*D - \dim \mathfrak{g} = \dim T_{\bar u}^*\bar D \le \dim \operatorname{span}\{ df(u) \mid f \in C^\infty(D)^G \}. \]

The last inequality above follows from (12). Therefore (11) holds.

Corollary 3.3.24. The map u ↦ U_u (10) is a distribution on D.

Proof. Since the linear map ϖ_u^♭ is invertible we have dim D − dim 𝔤 = dim ϖ_u^♭(span{df(u) ∈ T_u^*D | f ∈ C∞(D)^G}) = dim span{ϖ_u^♭(df(u))|H_u | f ∈ C∞(D)^G} = dim span{Y_f(u) ∈ H_u | f ∈ C∞(D)^G} = dim U_u. Therefore u ↦ U_u is a distribution.

An alternative characterization of U is given in the following

Proposition 3.3.25. For each u ∈ D we have

\[ U_u = \{\, w_u \in H_u \mid \varpi_u(v_u, w_u) = 0 \text{ for every } v_u \in H_u \cap \ker T_u\rho \,\}. \tag{13} \]

Proof. We compute

\[
\begin{aligned}
\varpi_u^\flat(U_u) &= \varpi_u^\flat(\operatorname{span}\{ Y_f(u) \in H_u \mid f \in C^\infty(D)^G \}), && \text{by definition of } U_u \\
&= \operatorname{span}\{ df(u)|H_u \mid f \in C^\infty(D)^G \}, && \text{by definition of almost Hamiltonian vector field} \\
&= \operatorname{span}\{ \alpha_u \in H_u^* \mid \alpha_u|\ker T_u\rho = 0 \}, && \text{by (11)} \\
&= \varpi_u^\flat(\operatorname{span}\{ w_u \in H_u \mid \varpi_u(w_u, v_u) = 0 \text{ for every } v_u \in H_u \cap \ker T_u\rho \}),
\end{aligned}
\]

since α_u = ϖ_u^♭(w_u) for a unique w_u ∈ H_u. Because ϖ^♭ is invertible, equation (13) holds.

Let H̄ = Tρ(U). Since, for each f ∈ C∞(D)^G, the vector field Y_f is G-invariant, it follows that it projects to a vector field Ȳ_f on D̄. Hence, H̄ is a generalized distribution on D̄. In §3.3 of chapter 5 we show that the reduced generalized distribution for Carathéodory's sleigh is not a distribution.

For each u ∈ D the linear map T_uρ|U_u : U_u ⊆ H_u → H̄_ū is surjective with kernel equal to U_u ∩ (H_u ∩ ker T_uρ) = U_u ∩ U_u^{ϖ_u}. Therefore T_uρ|U_u induces a bijective linear map ρ_u : U_u/(U_u ∩ U_u^{ϖ_u}) → H̄_ū. The symplectic form ϖ_u on H_u induces a nondegenerate skew symmetric bilinear form ϖ̂_u on U_u/(U_u ∩ U_u^{ϖ_u}) defined by

\[ \hat\varpi_u(v_u + U_u \cap U_u^{\varpi_u},\, v_u' + U_u \cap U_u^{\varpi_u}) = \varpi_u(v_u, v_u'), \]

for every v_u, v_u′ ∈ U_u. To see this suppose that 0 = ϖ̂_u(v_u + U_u ∩ U_u^{ϖ_u}, v_u′ + U_u ∩ U_u^{ϖ_u}) for every v_u′ ∈ U_u. Then 0 = ϖ_u(v_u, v_u′) for every v_u′ ∈ U_u. This implies that v_u ∈ U_u^{ϖ_u}. But v_u ∈ U_u by hypothesis. So v_u ∈ U_u ∩ U_u^{ϖ_u}. Therefore v_u + U_u ∩ U_u^{ϖ_u} = 0 ∈ U_u/(U_u ∩ U_u^{ϖ_u}). In other words, ϖ̂_u is nondegenerate. On H̄_ū define ϖ̄_ū by ϖ̄_ū = ((ρ_u)^{-1})^*ϖ̂_u. Since ρ_u and ϖ̂_u^♭ are invertible, so is ϖ̄_ū^♭. Therefore ϖ̄_ū is nondegenerate. This proves

Proposition 3.3.26. For each ū ∈ D̄ the skew symmetric bilinear form ϖ̄_ū on H̄_ū is nondegenerate.
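The linear algebra behind this quotient construction can be tried on a small example. The following is a minimal sketch, not the authors' construction: it assumes the canonical symplectic form on ℝ⁴ and a hypothetical subspace U spanned by the q1, q2 and p1 directions. The rank of the Gram matrix of the form on U is the dimension of U/(U ∩ U^ω), on which the induced form is nondegenerate.

```python
from fractions import Fraction

def omega(v, w):
    """Canonical symplectic form on R^4 with coordinates (q1, q2, p1, p2)."""
    return v[0] * w[2] + v[1] * w[3] - v[2] * w[0] - v[3] * w[1]

def rank(rows):
    """Rank of a matrix (given as a list of rows) by exact Gaussian elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rk = 0
    for col in range(len(m[0]) if m else 0):
        pivot = next((i for i in range(rk, len(m)) if m[i][col] != 0), None)
        if pivot is None:
            continue
        m[rk], m[pivot] = m[pivot], m[rk]
        for i in range(len(m)):
            if i != rk and m[i][col] != 0:
                factor = m[i][col] / m[rk][col]
                m[i] = [a - factor * b for a, b in zip(m[i], m[rk])]
        rk += 1
    return rk

def quotient_dimensions(spanning_set):
    """Return (dim U, dim of U ∩ U^omega, dim of U / (U ∩ U^omega))."""
    gram = [[omega(a, b) for b in spanning_set] for a in spanning_set]
    dim_u = rank(spanning_set)
    dim_quotient = rank(gram)   # rank of omega restricted to U
    return dim_u, dim_u - dim_quotient, dim_quotient

# Hypothetical example: U is spanned by the q1, q2 and p1 directions.
U = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0)]
print(quotient_dimensions(U))   # (3, 1, 2)
```

Here the radical U ∩ U^ω is the q2 direction, and the two-dimensional quotient carries the nondegenerate form induced from the (q1, p1) pair.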

For each f̄ ∈ C∞(D̄), we denote by Y_f̄ the almost Hamiltonian vector field corresponding to the symplectic distribution (H̄, ϖ̄) on D̄. In other words, Y_f̄ is the unique vector field on D̄ with values in H̄ such that Y_f̄ ⌟ ϖ̄ = ∂_H̄ f̄.

Proposition 3.3.27. For each f ∈ C∞(D)^G the push forward Ȳ_f of the almost Hamiltonian vector field Y_f on D by the orbit map ρ : D → D̄ is equal to the almost Hamiltonian vector field Y_f̄ where ρ^*f̄ = f.

Proof. Since f = ρ^*f̄ is G-invariant, it follows that Y_f has values in the distribution U on D defined by equation (10). Hence, its push forward Ȳ_f to D̄ has values in H̄ = Tρ(U). Given u ∈ D and w ∈ U_u, let ū = ρ(u) and w̄ = Tρ(w) ∈ H̄_ū. Then,

\[ \langle Y_{\bar f}(\bar u) \mathbin{\lrcorner} \bar\varpi \mid \bar w \rangle = \langle \partial_{\bar H}\bar f \mid \bar w \rangle = \langle d\bar f \mid \bar w \rangle = \langle df \mid w \rangle. \]

On the other hand, Y_f ⌟ ϖ = ∂_H f implies

\[ \langle df \mid w \rangle = \langle \partial_H f \mid w \rangle = \langle Y_f(u) \mathbin{\lrcorner} \varpi \mid w \rangle = \varpi(Y_f(u), w) = \bar\varpi(T\rho(Y_f(u)), T\rho(w)) = \bar\varpi(\bar Y_f(\bar u), \bar w) = \langle \bar Y_f(\bar u) \mathbin{\lrcorner} \bar\varpi \mid \bar w \rangle. \]

Hence, Ȳ_f(ū) ⌟ ϖ̄ = Y_f̄(ū) ⌟ ϖ̄ for every ū ∈ D̄. Since ϖ̄ is nondegenerate, it follows that Ȳ_f = Y_f̄.

The Hamiltonian h of our distributional Hamiltonian system (D, H, ϖ, h) is G-invariant and it pushes forward to a function h̄ ∈ C∞(D̄). Hence, we obtain a reduced distributional Hamiltonian system (D̄, H̄, ϖ̄, h̄). On the other hand, singular reduction applied to our case leads to a distributional Hamiltonian system (S, H_S, ϖ_S, h_S) on each stratum of D̄. Since the action of G on D is free and proper, the orbit space D̄ is a manifold. This means that the stratification 𝒮 of D̄ by orbit type has only one stratum S = D̄. We are going to show that H_S = H̄, ϖ_S = ϖ̄ and h_S = h̄.

Lemma 3.3.28. For each f ∈ C∞(D)^G the push forward Ȳ_f of the almost Hamiltonian vector field Y_f on D by the orbit map ρ : D → D̄ is equal to the almost Poisson vector field P_f̄ where ρ^*f̄ = f.

Proof. We compute. For every u ∈ D and every k̄ ∈ C∞(D̄) we have

\[
\begin{aligned}
\rho^*(L_{\bar Y_f}\bar k)(u) &= \langle d\bar k(\bar u) \mid \bar Y_f(\bar u) \rangle, \quad \text{where } \bar u = \rho(u) \\
&= \langle d\bar k(\bar u) \mid T_u\rho\, Y_f(u) \rangle \\
&= \langle (T_u\rho)^t d\bar k(\rho(u)) \mid Y_f(u) \rangle = \langle \rho^*(d\bar k)(u) \mid Y_f(u) \rangle \\
&= \langle d(\rho^*\bar k)(u) \mid Y_f(u) \rangle = \langle dk(u) \mid Y_f(u) \rangle, \quad \text{where } \rho^*\bar k = k \\
&= \{f, k\}(u) = \{\rho^*\bar f, \rho^*\bar k\}(u) = \rho^*\{\bar f, \bar k\}(u) = \rho^*(L_{P_{\bar f}}\bar k)(u).
\end{aligned}
\]

But ρ^* : C∞(D̄) → C∞(D)^G : f̄ ↦ f̄ ∘ ρ is injective, since ρ : D → D̄ is surjective. Therefore L_{P_f̄} k̄ = L_{Ȳ_f} k̄ for every k̄ ∈ C∞(D̄), that is, the vector fields P_f̄ and Ȳ_f on D̄ are equal.

Equation (7) yields

\[
\begin{aligned}
H_S &= \{ P_{f_S}(s) \mid s \in S \text{ and } f_S \in C^\infty(S) \} \\
&= \{ P_{\bar f}(\bar u) \mid \bar u \in \bar D \text{ and } \bar f \in C^\infty(\bar D) \}, && \text{since } S = \bar D \\
&= \{ Y_{\bar f}(\bar u) \mid \bar u \in \bar D \text{ and } \bar f \in C^\infty(\bar D) \}, && \text{by proposition 3.2.12} \\
&= \{ T\rho(Y_f(u)) \mid u \in D \text{ and } f \in C^\infty(D)^G \} \\
&= T\rho(U_u) = \bar H_{\bar u}.
\end{aligned}
\]

Moreover, since S = D̄, we can write f̄ for f_S in C∞(S). Proposition 3.2.11 and proposition 3.2.12 give ϖ_S(P_{f_S}, P_{k_S}) = ϖ_S(P_f̄, P_k̄) = ϖ_S(Y_f̄, Y_k̄). On the other hand, equation (8) yields ϖ_S(P_{f_S}, P_{k_S}) = {k_S, f_S} = {k̄, f̄} = P_k̄ f̄ = Y_k̄ f̄ = ϖ̄(Y_f̄, Y_k̄). Hence, ϖ_S = ϖ̄. We summarize our results with

Theorem 3.3.29. (regular nonholonomic reduction) Given a distributional Hamiltonian system (D, H, ϖ, h) which has a symmetry group G that acts freely and properly on D, there is a reduced distributional Hamiltonian system (D̄, H̄, ϖ̄, h̄) where (H̄, ϖ̄) is a generalized symplectic distribution on the G-orbit space D̄ and ρ^*h̄ = h. Here ρ : D → D̄ is the G-orbit map. The reduced distributional Hamiltonian system (D̄, H̄, ϖ̄, h̄) coincides with the reduced distributional Hamiltonian system obtained by singular reduction.

3.4

Chaplygin systems

Let Φ̃ be a smooth free and proper action of a Lie group G on a smooth manifold Q with orbit map π : Q → Q/G = R.

Suppose that we have a nonholonomically constrained dynamical system (D, ℓ) with constraint distribution D on Q and Lagrangian ℓ on TQ. Suppose that this system has the following properties.

1. The G-action Φ : G × TQ → TQ : (g, (q, q̇)) ↦ T_qΦ̃_g q̇ leaves the distribution D invariant.
2. For each q ∈ Q, we have T_qQ = D_q ⊕ 𝔤 · q, where 𝔤 · q = span{X_ξ(q) = d/dt|_{t=0} Φ̃_q(exp tξ) ∈ T_qQ | ξ ∈ 𝔤}.
3. The Lagrangian ℓ : TQ → ℝ is invariant under the G-action Φ.

We call (D, ℓ, G) a Chaplygin system.

Because the G-action Φ̃ is free and proper, the G-orbit map π : Q → R defines a principal G-bundle. The lifted G-action Φ on TQ is proper, as is its restriction to the G-invariant distribution D, which is a closed subset of TQ because it is a submanifold. The restricted action Φ|D : G × D → D is free. Thus Tπ|D : D → TR is a smooth fibration.

On Q we have two smooth distributions: ver and hor with ver_q = 𝔤 · q = ker T_qπ and hor_q = D_q. By property 2 we have T_qQ = ver_q ⊕ hor_q. Moreover T_qπ : hor_q → T_{π(q)}R is a bijective linear map. Therefore we have an Ehresmann connection on Q with horizontal distribution D, which for short we call the connection D. From properties 1 and 3 it follows that the connection D is G-invariant. For each w_r ∈ T_rR with r = π(q) there is a unique v_q ∈ D_q such that T_qπ v_q = w_r. The vector v_q = lift_q w_r is called the horizontal lift of w_r with respect to the connection D.

Lemma 3.4.30. The G-invariant map Tπ|D : D → TR induces a diffeomorphism ψ : D/G → TR.

Proof. Let q′ ∈ Q. Then q′ ∈ G · q if and only if there is a unique g ∈ G such that q′ = g · q, since the G-action Φ̃ on Q is free. Because D is G-invariant, for every w_r ∈ T_rR with r = π(q) = π(q′) we have

\[ \operatorname{lift}_{q'} w_r = T_q\tilde\Phi_g(\operatorname{lift}_q w_r). \]

Therefore each fiber of Tπ|D is a G-orbit on D. Consequently, the map Tπ|D : D → TR induces a diffeomorphism ψ : D/G → TR.

In the rest of this section we identify the orbit space D/G with TR. Because the Lagrangian ℓ : TQ → ℝ is Φ-invariant, there is a unique function ℓ̄ : TR → ℝ such that ℓ|D = (Tπ|D)^*ℓ̄. We call ℓ̄ the reduced Lagrangian on TR. In fact, for every w_r ∈ T_rR with r = π(q) we have ℓ̄(r, w_r) = ℓ(q, lift_q w_r).

For each (q, q̇) ∈ TQ, the Lagrange derivative δℓ = δℓ(q, q̇) of ℓ at (q, q̇) is a linear function on T_qQ. The Lagrange-d'Alembert principle in §3.2 of chapter 1 states that δℓ|D_q = 0. From properties 1 and 3 of a Chaplygin system it follows that δℓ|D is a G-invariant function on D. Therefore there is a unique mapping \overline{δℓ} : TR → T^*R such that

\[ \langle \overline{\delta\ell}(r, \dot r) \mid w_r \rangle = \langle \delta\ell(q, \dot q) \mid v_q \rangle, \]

where (r, ṙ) = Tπ(q, q̇), w_r = T_qπ v_q, and v_q ∈ D_q. The last two of the preceding conditions just say that v_q = lift_q w_r. We call \overline{δℓ} the reduced Lagrange derivative of ℓ.


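We now compare the reduced Lagrange derivative \overline{δℓ} with the Lagrange derivative δℓ̄ of the reduced Lagrangian. To do this we use a trivialization of TR. Suppose that we have an isomorphism λ : R × ℝ^p → TR of vector bundles. Then the map

\[ \tilde\lambda : Q \times (\mathbb{R}^p \times \mathfrak{g}) \to TQ : (q, (c, \xi)) \mapsto \operatorname{lift}_q \lambda(\pi(q), c) + X_\xi(q) \tag{14} \]

is an isomorphism of vector bundles. Using the notation of proposition 1.4.6 of chapter 1, we see that c̃^Q_q = (c, 0)^Q_q = lift_q c^R_{π(q)} lies in D_q. Therefore λ̃|(Q × (ℝ^p × {0})) is the inverse of a trivialization of D, thought of as a vector subbundle of TQ. Let γ : ℝ → Q be a smooth curve. Define a smooth curve ℝ → ℝ^p × {0} : t ↦ c̃(t) = (c(t), 0) by requiring that γ′(t) = λ̃(γ(t), c̃(t)) lie in D_{γ(t)} for every t ∈ ℝ. Applying the formula for the Lagrange derivative in a trivialization from proposition 1.4.6 of chapter 1 to L̃ = ℓ ∘ λ̃ we obtain

\[
\begin{aligned}
\langle \delta\tilde L(\gamma(t), \tilde c(t)) \mid \tilde a \rangle ={}& \Big\langle \frac{d}{dt}\,\frac{\partial \tilde L}{\partial \tilde c}(\gamma(t), \tilde c)\Big|_{\tilde c = \tilde c(t)} \,\Big|\, \tilde a \Big\rangle - \Big\langle \frac{\partial \tilde L}{\partial q}(q, \tilde c(t))\Big|_{q = \gamma(t)} \,\Big|\, \tilde a^Q_{\gamma(t)} \Big\rangle \\
& - \Big\langle \frac{\partial \ell}{\partial \dot q}(\gamma(t), \dot q)\Big|_{\dot q = \tilde c^Q_{\gamma(t)}} \,\Big|\, [\tilde c(t)^Q, \tilde a^Q]_{\gamma(t)} \Big\rangle ;
\end{aligned}
\tag{15}
\]

while applying the same formula to L̄ = ℓ̄ ∘ λ gives

\[
\begin{aligned}
\langle \delta\bar L(\bar\gamma(t), c(t)) \mid a \rangle ={}& \Big\langle \frac{d}{dt}\,\frac{\partial \bar L}{\partial c}(\bar\gamma(t), c)\Big|_{c = c(t)} \,\Big|\, a \Big\rangle - \Big\langle \frac{\partial \bar L}{\partial r}(r, c(t))\Big|_{r = \bar\gamma(t)} \,\Big|\, a^R_{\bar\gamma(t)} \Big\rangle \\
& - \Big\langle \frac{\partial \bar\ell}{\partial \dot r}(\bar\gamma(t), \dot r)\Big|_{\dot r = c^R_{\bar\gamma(t)}} \,\Big|\, [c(t)^R, a^R]_{\bar\gamma(t)} \Big\rangle .
\end{aligned}
\tag{16}
\]

Here γ̄(t) = π(γ(t)), c(t)^R_{γ̄(t)} = T_{γ(t)}π c̃(t)^Q_{γ(t)}, and a^R_{γ̄(t)} = T_{γ(t)}π ã^Q_{γ(t)}. Since

\[ \tilde L(q, \tilde c) = \ell(q, \tilde c^Q_q) = \ell(q, \operatorname{lift}_q c^R_r) = \bar\ell(r, c^R_r) = \bar L(r, c), \quad \text{where } r = \pi(q), \]

the first two terms in (15) and (16) are equal.

The tangent vector [c̃(t)^Q, ã^Q]_{γ(t)} to Q at γ(t) need not lie in D_{γ(t)}, even though the vector fields c̃(t)^Q = lift c(t)^R and ã^Q = lift a^R on Q have values in D. Because (Tπ|D)_* c̃(t)^Q = c(t)^R and (Tπ|D)_* ã^Q = a^R, it follows that (Tπ|D)_*[c̃(t)^Q, ã^Q] = [c(t)^R, a^R]. Therefore

\[ [\tilde c(t)^Q, \tilde a^Q]_{\gamma(t)} = \operatorname{lift}_{\gamma(t)} [c(t)^R, a^R]_{\bar\gamma(t)} + X_\xi(\gamma(t)), \tag{17} \]

for some ξ ∈ 𝔤.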

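We now determine ξ. First recall that a 𝔤-valued connection 1-form Θ corresponding to the connection D on the G-principal bundle π : Q → R = Q/G is a linear map Θ(q) : T_qQ → 𝔤, which depends smoothly on q ∈ Q, such that

1. ker Θ(q) = D_q, for every q ∈ Q;
2. ⟨Θ(q) | X_ξ(q)⟩ = ξ, for every ξ ∈ 𝔤.

Applying Θ_{γ(t)} to both sides of (17) and using the fact that the horizontal lift of [c(t)^R, a^R]_{γ̄(t)} lies in D_{γ(t)} = ker Θ_{γ(t)}, we obtain

\[ \xi = \langle \Theta_{\gamma(t)} \mid [\tilde c(t)^Q, \tilde a^Q]_{\gamma(t)} \rangle = -(d\Theta)_{\gamma(t)}(\tilde c(t)^Q, \tilde a^Q), \tag{18} \]

because c̃(t)^Q, ã^Q ∈ D_{γ(t)} = ker Θ_{γ(t)} and

\[ d\Theta(X, Y) = X \mathbin{\lrcorner} d(Y \mathbin{\lrcorner} \Theta) - Y \mathbin{\lrcorner} d(X \mathbin{\lrcorner} \Theta) - [X, Y] \mathbin{\lrcorner} \Theta \]

for all vector fields X, Y on Q with values in D. The 𝔤-valued 2-form dΘ applied to a pair of horizontal vector fields on Q is called the curvature 2-form Ω of the connection D. Therefore we may rewrite (18) as ξ = −Ω_{γ(t)}(c̃(t)^Q, ã^Q). Because

\[ \Big\langle \frac{\partial \ell}{\partial \dot q}(\gamma(t), \tilde c)\Big|_{\tilde c = \tilde c(t)^Q_{\gamma(t)}} \,\Big|\, \operatorname{lift}_{\gamma(t)} [c(t)^R, a^R]_{\bar\gamma(t)} \Big\rangle = \Big\langle \frac{\partial \bar\ell}{\partial \dot r}(\bar\gamma(t), c)\Big|_{c = c(t)} \,\Big|\, [c(t)^R, a^R]_{\bar\gamma(t)} \Big\rangle, \]

we may rewrite (15) as

\[
\begin{aligned}
\langle \overline{\delta\ell}(\bar\gamma(t), c(t)) \mid a^R \rangle ={}& \langle \delta\bar\ell(\bar\gamma(t), c(t)) \mid a^R \rangle \\
& + \Big\langle \frac{\partial \ell}{\partial \dot q}(\gamma(t), \tilde c)\Big|_{\tilde c = \tilde c(t)^Q_{\gamma(t)}} \,\Big|\, \big(0, \Omega_{\gamma(t)}(\operatorname{lift}_{\gamma(t)} c(t)^R, \operatorname{lift}_{\gamma(t)} a^R)\big)^Q \Big\rangle ,
\end{aligned}
\tag{19}
\]

where q̇ = c̃(t)^Q = lift_{γ(t)} c(t)^R. Since the correction term in (19) is a homogeneous quadratic polynomial in c^R, the vector fields on R defined by \overline{δℓ} = 0 and δℓ̄ = 0 have the same zeroes and linearizations at each zero.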
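The identity ξ = ⟨Θ | [X, Y]⟩ = −dΘ(X, Y) for horizontal vector fields can be checked on a simple example. The sketch below is an illustration under stated assumptions and is not taken from the text: Q = ℝ³ with G = ℝ acting by translation in z, and the hypothetical connection form Θ = dz − y dx, whose kernel D is spanned by X = ∂/∂x + y ∂/∂z and Y = ∂/∂y. Because these fields are affine, a central difference with step 1 gives their derivatives exactly.

```python
from fractions import Fraction

def X(q):
    """Horizontal lift of d/dx: X = d/dx + y d/dz."""
    x, y, z = q
    return (Fraction(1), Fraction(0), Fraction(y))

def Y(q):
    """Horizontal lift of d/dy: Y = d/dy."""
    return (Fraction(0), Fraction(1), Fraction(0))

def derivative_along(field, q, v):
    """Derivative of `field` at q in the direction v (exact for affine fields)."""
    plus = field(tuple(a + b for a, b in zip(q, v)))
    minus = field(tuple(a - b for a, b in zip(q, v)))
    return tuple((p - m) / 2 for p, m in zip(plus, minus))

def lie_bracket(A, B, q):
    """[A, B](q) = (dB)(q) A(q) - (dA)(q) B(q)."""
    dB_A = derivative_along(B, q, A(q))
    dA_B = derivative_along(A, q, B(q))
    return tuple(b - a for a, b in zip(dA_B, dB_A))

def theta(q, v):
    """Connection 1-form Theta = dz - y dx; its kernel is D."""
    x, y, z = q
    return v[2] - y * v[0]

def d_theta(q, v, w):
    """Exterior derivative dTheta = dx ^ dy evaluated on the pair (v, w)."""
    return v[0] * w[1] - v[1] * w[0]

q = (2, 3, 5)
print(lie_bracket(X, Y, q))         # vertical component only
print(theta(q, lie_bracket(X, Y, q)), -d_theta(q, X(q), Y(q)))
```

In this example the bracket of the two horizontal fields is −∂/∂z, which is vertical, so D is not integrable and the curvature does not vanish.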

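The correction term in (19) may be written as

\[ \Big\langle \frac{\partial \ell}{\partial \dot q}(\gamma(t), \tilde c)\Big|_{\tilde c = \tilde c(t)^Q_{\gamma(t)}} \,\Big|\, \big(0, \Omega_{\gamma(t)}(\tilde c(t), \tilde a)\big)^Q \Big\rangle , \tag{20} \]

which is the momentum applied to the curvature.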

Proposition 3.4.31. Let (D, ℓ, G) be a Chaplygin system. If the distribution D is integrable, then \overline{δℓ} = δℓ̄.

Proof. Since D is integrable, it follows that Ω = 0. The proposition follows because the correction term in (19) vanishes since it vanishes in (20).

Counterexample 3.4.32. Chaplygin claimed¹ that the converse to proposition 3.4.31 is valid. We give the following counterexample. Suppose that G acts freely and properly on a smooth manifold Q. Then there is a G-invariant Riemannian metric k on Q. Let π : Q → R = Q/G be the G-orbit map. Then q ↦ ver_q = 𝔤 · q = {X_ξ(q) ∈ T_qQ | ξ ∈ 𝔤} defines a vertical distribution ver on the bundle π, because ver_q = ker T_qπ. Let D_q = ver_q^⊥ be the k_q-orthogonal complement to ver_q in T_qQ. Then q ↦ D_q defines a horizontal distribution D on Q, which is invariant under the G-action Φ : G × TQ → TQ : (g, (q, q̇)) ↦ T_qΦ̃_g q̇, so that T_qQ = ver_q ⊕ D_q.

More concretely, let k̄ be a Riemannian metric on R and let k_𝔤 be any inner product on 𝔤. Let D be a G-invariant distribution on Q which is complementary to the distribution ver and for which Tπ|D : D → TR is a bundle isomorphism. In other words, D is a connection on the bundle π : Q → R = Q/G. For any w_r, w_r′ ∈ T_rR and any ξ, ξ′ ∈ 𝔤 let

\[ k(q)(\operatorname{lift}_q w_r + X_\xi(q),\, \operatorname{lift}_q w_r' + X_{\xi'}(q)) = \bar k(r)(w_r, w_r') + k_{\mathfrak{g}}(\xi, \xi'), \quad \text{where } r = \pi(q). \]

Then k is a Riemannian metric on Q which is G-invariant. Moreover, D_q^⊥ = ver_q = 𝔤 · q for every q ∈ Q. Let ℓ : TQ → ℝ : q̇ ↦ ½ k(q)(q̇, q̇) = ½ ⟨k^♭(q)(q̇) | q̇⟩ be the kinetic energy associated to the metric k. Then (D, ℓ, G) is a Chaplygin system. Since ∂ℓ/∂q̇ = k^♭(q)(q̇), the correction term in (19) equals

\[ k(\gamma(t))\big(\tilde c(t)^Q_{\gamma(t)},\, (0, \Omega_{\gamma(t)}(\tilde c(t), \tilde a))^Q_{\gamma(t)}\big) = 0, \]

because c̃(t)^Q_{γ(t)} ∈ D_{γ(t)} while the vector determined by Ω_{γ(t)}(c̃(t), ã) ∈ 𝔤 lies in ver_{γ(t)}. From the fact that in general D is not integrable, it follows that Ω ≠ 0, in general.

¹ See equation (7) in Chaplygin [23].
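A concrete instance of this counterexample can be worked out by hand and checked numerically. The sketch below rests on assumptions that are not in the text: Q = ℝ³ with G = ℝ acting by translation in z, the horizontal frame X = ∂/∂x + y ∂/∂z, Y = ∂/∂y, the vertical field V = ∂/∂z, and the hypothetical metric k that makes (X, Y, V) orthonormal. With the kinetic energy of k, the correction term is k(lift c, X_ξ) with ξ given by the curvature, and it vanishes although the curvature does not.

```python
from fractions import Fraction

def frame_components(q, v):
    """Components of v in the frame X = d/dx + y d/dz, Y = d/dy, V = d/dz."""
    x, y, z = q
    return (v[0], v[1], v[2] - y * v[0])

def metric(q, v, w):
    """Metric k making (X, Y, V) orthonormal, so D = span{X, Y} is orthogonal to ver."""
    return sum(a * b for a, b in zip(frame_components(q, v), frame_components(q, w)))

def horizontal_lift(q, w):
    """Lift of w = (w_x, w_y), a tangent vector of R = Q/G, to D_q."""
    x, y, z = q
    return (w[0], w[1], y * w[0])

def curvature(q, v, w):
    """dTheta = dx ^ dy for Theta = dz - y dx, evaluated on horizontal v, w."""
    return v[0] * w[1] - v[1] * w[0]

def correction_term(q, c, a):
    """k(lift c, X_xi) with xi = curvature(lift c, lift a) and X_xi = xi d/dz."""
    c_lift, a_lift = horizontal_lift(q, c), horizontal_lift(q, a)
    xi = curvature(q, c_lift, a_lift)
    return metric(q, c_lift, (0, 0, xi))

q = (Fraction(1), Fraction(2), Fraction(3))
c, a = (1, 4), (-2, 5)
print(curvature(q, horizontal_lift(q, c), horizontal_lift(q, a)))  # nonzero
print(correction_term(q, c, a))                                    # zero
```

The metric does not depend on z, so it is invariant under the assumed translation action, and D is its orthogonal complement to the vertical direction, as in the construction above.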


3.5

Orbit types and reduction

Let (D, H, ϖ, h) be a distributional Hamiltonian system with a symmetry G which acts properly on the smooth manifold D. In this section we want to describe the reduced almost Poisson algebra (C∞(D̄), { , }) in terms of the reduced almost Hamiltonian systems on orbit types in D̄. From lemma 1.7.1.23 of chapter 1, it follows that the distribution H on D is given by u ↦ H_u = span{Y_f(u) | f ∈ C∞(D)}. Let K be the isotropy group G_u of the G-action Φ on D at u. The K-symmetry type² D_K = {v ∈ D | G_v = K} is a smooth submanifold of D. Since the distribution H on D is G-invariant, T_uΦ_g H_u = H_{g·u} for every (g, u) ∈ G × D. Therefore, T_uΦ_k H_u = H_u for every k ∈ K. This implies that T_uD_K = ⋂_{k∈K} ker(T_uΦ_k − id) and that K acts trivially on D_K.

Lemma 3.5.33. For each u ∈ D, the subspace H_u^K = T_uD_K ∩ H_u of (H_u, ϖ_u) is symplectic.

Proof. For each k ∈ K, we know that A_k = T_uΦ_k is a linear map of H_u into itself. Moreover, K → Gl(H_u, ℝ) : k ↦ A_k is a homomorphism of groups. Because K is compact and H_u^K is invariant under A_k for every k ∈ K, there is a subspace F of H_u, which is invariant under A_k for every k ∈ K such that H_u = H_u^K ⊕ F. From the fact that ϖ_u is Φ_g-invariant for every g ∈ G, it follows that Φ_k^*ϖ_u = ϖ_u for every k ∈ K. For k ∈ K, w_u ∈ H_u and v_u ∈ H_u^K we have

\[ \varpi_u(v_u, A_k w_u - w_u) = \varpi_u(v_u, A_k w_u) - \varpi_u(v_u, w_u) = \varpi_u(A_k^{-1} v_u, w_u) - \varpi_u(v_u, w_u) = \varpi_u((A_{k^{-1}} - \mathrm{id})v_u, w_u) = 0. \]

Therefore H_u^K ⊆ ((A_k − id)H_u)^{ϖ_u}. Conversely, if w_u ∈ ((A_k − id)H_u)^{ϖ_u}, then for every v_u ∈ H_u we have 0 = ϖ_u(w_u, (A_k − id)v_u) = ϖ_u((A_{k^{-1}} − id)w_u, v_u). Because ϖ_u is nondegenerate on H_u, we conclude that w_u ∈ ker(A_{k^{-1}} − id). Since this conclusion holds for every k ∈ K, we obtain w_u ∈ H_u^K. Consequently, H_u^K = ((A_k − id)H_u)^{ϖ_u} for every k ∈ K. But

\[ (A_k - \mathrm{id})H_u = (A_k - \mathrm{id})H_u^K + (A_k - \mathrm{id})F = (A_k - \mathrm{id})F \subseteq F. \]

So F^{ϖ_u} ⊆ ((A_k − id)H_u)^{ϖ_u} = H_u^K. This implies that {0} ⊆ F ∩ F^{ϖ_u} ⊆ F ∩ H_u^K = {0}. Therefore ϖ_u|F is nondegenerate, which implies that ϖ_u|H_u^K is nondegenerate, since ϖ_u|H_u is nondegenerate.

² Also called the K-isotropy type.
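The statement of lemma 3.5.33 can be illustrated on a small linear example. The following sketch is an assumption-laden illustration, not the authors' argument: it takes H_u = ℝ⁴ with the canonical symplectic form and the hypothetical isotropy group K = Z_2 generated by the map A that flips the signs of q2 and p2. Since A is diagonal in the standard basis, the fixed subspace is spanned by the basis vectors that A leaves unchanged.

```python
def omega(v, w):
    """Canonical symplectic form on R^4 with coordinates (q1, q2, p1, p2)."""
    return v[0] * w[2] + v[1] * w[3] - v[2] * w[0] - v[3] * w[1]

def A(v):
    """Generator of K = Z_2: flips the sign of the q2 and p2 components."""
    return (v[0], -v[1], v[2], -v[3])

BASIS = [(1, 0, 0, 0), (0, 1, 0, 0), (0, 0, 1, 0), (0, 0, 0, 1)]

# A is diagonal in BASIS, so the fixed basis vectors span the fixed subspace H^K.
FIXED = [e for e in BASIS if A(e) == e]

# Images of the basis under A - id; they span (A - id)H.
MOVED = [tuple(x - y for x, y in zip(A(e), e)) for e in BASIS]

def is_symplectic_map():
    """Check that A preserves omega on all pairs of basis vectors."""
    return all(omega(A(v), A(w)) == omega(v, w) for v in BASIS for w in BASIS)

def fixed_space_is_orthogonal_to_moved():
    """Check that H^K is omega-orthogonal to (A - id)H."""
    return all(omega(v, m) == 0 for v in FIXED for m in MOVED)

def gram_determinant_on_fixed():
    """Determinant of the Gram matrix of omega on the two-dimensional fixed space."""
    (a, b), (c, d) = [[omega(v, w) for w in FIXED] for v in FIXED]
    return a * d - b * c

print(FIXED)
print(gram_determinant_on_fixed())   # nonzero, so omega restricted to H^K is nondegenerate
```

Here the fixed subspace is the (q1, p1) plane, the complement F is the (q2, p2) plane, and both carry nondegenerate restrictions of the form, in line with the proof above.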

Lemma 3.5.34. If f ∈ C∞(D)^G, then Y_f(u) ∈ H_u^K for every u ∈ D. Moreover, the condition ϖ_u(Y_f(u), v) = ⟨df(u) | v⟩ for every v ∈ H_u^K, determines Y_f(u) uniquely.

Proof. Because Φ_g^*Y_f = Y_{Φ_g^*f} for every g ∈ G and f ∈ C∞(D)^G, it follows that the almost Hamiltonian vector field Y_f on D is G-invariant, that is, T_uΦ_g Y_f(u) = Y_f(Φ_g(u)) for every (g, u) ∈ G × D. Therefore for every k ∈ K we have Y_f(u) ∈ ker(T_uΦ_k − id) ∩ H_u. In other words, Y_f(u) ∈ H_u^K ∩ H_u. For every v ∈ H_u we have ϖ_u(Y_f(u), v) = ⟨df(u) | v⟩. But ϖ_u|(H_u^K ∩ H_u) is nondegenerate and Y_f(u) ∈ H_u^K ∩ H_u. Therefore ϖ_u(Y_f(u), v) = ⟨df(u) | v⟩ for every v ∈ H_u^K ∩ H_u determines Y_f(u) uniquely.

Corollary 3.5.35. Let f, k ∈ C∞(D)^G and u ∈ D. Then {f|D_K, k|D_K} = {f, k}|D_K, where K = G_u.

Proof. By definition of the almost Poisson bracket { , } on C∞(D)^G, for each u ∈ D_K we have

\[
\begin{aligned}
\{f, k\}(u) &= -\langle dk(u) \mid Y_f(u) \rangle \\
&= -\langle dk(u)|(H_u \cap H_u^K) \mid Y_f(u)|(H_u \cap H_u^K) \rangle, \quad \text{since } Y_f(u) \in H_u \cap H_u^K \\
&= -\langle d(k|D_K)(u) \mid Y_{f|D_K}(u) \rangle = \{f|D_K, k|D_K\}(u).
\end{aligned}
\]

Let M = M_K be a connected component of the symmetry type D_K. Then M is a smooth submanifold of D with smooth distribution H_M = {H_u | u ∈ M} and smooth symplectic form ϖ|M. The proper G-action Φ on D induces a free and proper action of G_K = N(K)/K on D_K, since K acts trivially on D_K. Here N(K) is the normalizer of K in G. Let G_M = {g ∈ G_K | g · m ∈ M for every m ∈ M}. Then G_M is a closed subgroup of G_K which acts freely and properly on M. Therefore G_M is a symmetry group of the distributional Hamiltonian system (M, H_M, ϖ_M, h|M).

Lemma 3.5.36. Let f_M : M → ℝ be a smooth G_M-invariant function on M. Then there is a smooth G-invariant function f_D : D → ℝ such that f_D|M = f_M.

Proof. Let u ∈ M with G_u = K. Let S_u be a slice to the G_K-action through u such that V = (G_K · u) × S_u is a G_K-invariant open neighborhood of u in D_K. Since M is a smooth submanifold of D_K, there is an open neighborhood of u in D_K of the form W × U, where W is a connected open neighborhood of u in M. By hypothesis f_M ∈ C∞(M)^{G_M}. Let f_{S_u ∩ W} = f_M|(S_u ∩ W). Extend f_{S_u ∩ W} to a smooth function f_{S_u} on S_u, which is constant on the fibers of the projection map (S_u ∩ W) × (S_u × U) → S_u ∩ W. Extend f_{S_u} to a smooth G_K-invariant function f_V on V. Using a C∞(D_K)^{N(K)}-partition of unity on D_K, there is a function f_{D_K} ∈ C∞(D_K)^{N(K)} and an open neighborhood W′ ⊆ W of u in M such that f_{D_K}|W′ = f_M|W′. Since f_{D_K} is N(K)-invariant, it is G_K-invariant.

Because G acts properly on D, there is a slice S̃_u at u ∈ M ⊆ D_K such that Ṽ = (G · u) × S̃_u is a G-invariant open neighborhood of u in D. Since D_K is a smooth submanifold of D, there is an open neighborhood of u in D of the form W̃ × Ũ, where W̃ is an open neighborhood of u in D_K. We may assume that S̃_u = (S̃_u × W̃) × (S̃_u × Ũ). So Ṽ = (G · u) × (S̃_u × W̃) × (S̃_u × Ũ) is a G-invariant open neighborhood of u in D. From the argument of the first paragraph we have found f_{D_K} ∈ C∞(D_K)^{G_K} which extends f_M ∈ C∞(M)^{G_M}. Let f̃_{S̃_u × W̃} = f_{D_K}|(S̃_u × W̃). Extend f̃_{S̃_u × W̃} to a smooth function f̃_{S̃_u} on S̃_u, which is constant on the fibers of the projection map (S̃_u × W̃) × (S̃_u × Ũ) → (S̃_u × W̃). Extend f̃_{S̃_u} to a smooth G-invariant function on the G-invariant open neighborhood Ṽ. Using a C∞(D)^G-partition of unity on D, we obtain f̃_D ∈ C∞(D)^G and an open neighborhood W̃′ ⊆ W̃ of u in D_K such that f̃_D|(W̃′ ∩ W′) = f_{D_K}|(W̃′ ∩ W′) = f_M|(W̃′ ∩ W′), which proves the lemma.

From lemma 3.5.36 and corollary 3.5.35 we obtain

Proposition 3.5.37. (C∞(M_K), { , }) is an almost Poisson algebra.

Because G_{M_K} is a symmetry of the distributional Hamiltonian system (M_K, H_{M_K}, ϖ_{M_K}, h|M_K) which acts freely and properly on M_K, we may apply regular nonholonomic reduction, see theorem 3.3.29, to obtain a reduced distributional Hamiltonian system (M̄_K, H̄_{M̄_K}, ϖ̄_{M̄_K}, h̄|M̄_K) on the orbit type M̄_K = ρ(M_K) in the G-orbit space M̄. Recall that (M̄_K, C∞(M̄_K)) is a smooth submanifold of the differential space (M̄, C∞(M̄)). From corollary 3.5.35 and the construction of the reduced almost Poisson algebra (C∞(M̄_K), { , }) on M̄_K in §2, it follows that the inclusion map i_K : M̄_K → M̄, which is a smooth map between the differential spaces (M̄_K, C∞(M̄_K)) and (M̄, C∞(M̄)), induces a surjective almost Poisson map of the almost Poisson algebra (C∞(M̄), { , }) onto the almost Poisson algebra (C∞(M̄_K), { , }), that is, for every f̄, k̄ ∈ C∞(M̄) we have i_K^*{f̄, k̄} = {i_K^*f̄, i_K^*k̄}, or equivalently, {f̄, k̄}|M̄_K = {f̄|M̄_K, k̄|M̄_K}. Moreover, the restriction to M̄_K of the reduced almost Hamiltonian vector field Y_h̄ on the locally compact subcartesian differential space (M̄, C∞(M̄)) is the almost Hamiltonian vector field Y_{h̄|M̄_K} on the smooth symplectic manifold (M̄_K, ϖ̄_{M̄_K}).

3.6

Conservation laws

In this section we consider a distributional Hamiltonian system (D, H, ϖ, h), which has conservation laws in addition to preservation of energy h. From the nonholonomic Noether theorem, see theorem 1.11.47 of chapter 1, a smooth function f on D is a conservation law for the distributional Hamiltonian vector field Y_h if and only if L_{Y_h} f = 0, that is, f is constant on integral curves of Y_h.

3.6.1

Momentum map

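Suppose that the distributional Hamiltonian system $(D, H, \varpi, h)$ has a symmetry group $G$ and that the action $\Phi : G \times D \to D$ is proper. Also assume that the distributional Hamiltonian vector field $Y_f$ associated to the conservation law $f$ has a flow $\varphi^f_t : D \to D : u \mapsto \Phi_u(\exp t\xi)$ for some $\xi \in \mathfrak{g}$, the Lie algebra of $G$. In other words, $Y_f = X^\xi$ for some $\xi \in \mathfrak{g}$. Recall that $X^\xi(u) = \frac{d}{dt}\big|_{t=0} \Phi_u(\exp t\xi)$ for every $u \in D$. Since the maps $\mathfrak{g} \to \mathcal{X}^\infty(D) : \xi \mapsto X^\xi$ and $C^\infty(D) \to \mathcal{X}^\infty(D) : f \mapsto Y_f$ are linear, we obtain that

$$\mathfrak{h} = \{\xi \in \mathfrak{g} \mid \text{there is } f \in C^\infty(D) \text{ such that } X^\xi \,\lrcorner\, \varpi = \partial_H f\}$$

is a vector subspace of $\mathfrak{g}$.

Lemma 3.6.1.38. $\mathfrak{h}$ is an ideal in $\mathfrak{g}$.

Proof. To prove the lemma it suffices to show that if $X^\xi \,\lrcorner\, \varpi = \partial_H f$ then

$$X^{\mathrm{Ad}_g \xi} \,\lrcorner\, \varpi = \partial_H (f \circ \Phi_{g^{-1}}) \tag{21}$$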

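for every $g \in G$. To prove equation (21) let $Z(u) \in H_u$ where $u \in D$. We now calculate

$$\begin{aligned}
\langle X^{\mathrm{Ad}_g \xi}(g \cdot u) \,\lrcorner\, \varpi(g \cdot u) \mid T\Phi_g Z(u) \rangle
&= \varpi(g \cdot u)\big(T\Phi_g X^\xi(u), T\Phi_g Z(u)\big) \\
&= (\Phi_g^* \varpi)(u)\big(X^\xi(u), Z(u)\big) = \varpi(u)\big(X^\xi(u), Z(u)\big), && \text{since } \varpi \text{ is } G\text{-invariant} \\
&= (\partial_H f(u)) Z(u) = \langle df(u) \mid Z(u) \rangle, && \text{by hypothesis and since } Z(u) \in H_u \\
&= \langle \Phi^*_{g^{-1}}(df)(g \cdot u) \mid T\Phi_g Z(u) \rangle = \langle d(\Phi^*_{g^{-1}} f)(g \cdot u) \mid T\Phi_g Z(u) \rangle \\
&= \langle \partial_H (f \circ \Phi_{g^{-1}})(g \cdot u) \mid T\Phi_g Z(u) \rangle,
\end{aligned}$$

where the last equality holds because the $G$-invariance of $H$ implies that $T\Phi_g Z(u) \in H_{g \cdot u}$.

Let $\mathfrak{h}^*$ be the dual space of $\mathfrak{h}$ and let $J : D \to \mathfrak{h}^*$ be a smooth map such that for each $\xi \in \mathfrak{h}$,

$$X^\xi \,\lrcorner\, \varpi = \partial_H \langle J \mid \xi \rangle. \tag{22}$$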

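Here $\langle J \mid \xi \rangle(u) = \langle J(u) \mid \xi \rangle$ for every $u \in D$. We shall refer to $J$ as a momentum map for the $G$-action $\Phi$ on $D$. For each $\xi \in \mathfrak{h}$, the function $J^\xi = \langle J \mid \xi \rangle$ is the momentum associated to $\xi$. From the nonholonomic Noether theorem, it follows that for every $\xi \in \mathfrak{h}$ the momentum $J^\xi$ is a constant of motion for every $G$-invariant Hamiltonian.

Theorem 3.6.1.39. For every accessible set $L$ of the distribution $H$, there exists an affine action $A_L : G \times \mathfrak{h}^* \to \mathfrak{h}^*$ such that

$$J(g \cdot u) = A_L(g, J(u)) \tag{23}$$

for every $g \in G$ and every $u \in L$.

Proof. Since $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, the adjoint action $\mathrm{Ad} : G \times \mathfrak{g} \to \mathfrak{g}$ preserves $\mathfrak{h}$ and hence induces a $G$-action on $\mathfrak{h}$ which we also denote by $\mathrm{Ad}$. The coadjoint action $\mathrm{Ad}^* : G \times \mathfrak{h}^* \to \mathfrak{h}^*$ on $\mathfrak{h}^*$ is defined by

$$\langle \mathrm{Ad}^*_g \alpha \mid \xi \rangle = \langle \alpha \mid \mathrm{Ad}_{g^{-1}} \xi \rangle \tag{24}$$

for every $g \in G$, every $\alpha \in \mathfrak{h}^*$, and every $\xi \in \mathfrak{h}$. Next we show that, for each $u \in L$, the mapping

$$\sigma_L : G \to \mathfrak{h}^* : g \mapsto \sigma_L(g) = J(\Phi_g(u)) - \mathrm{Ad}^*_g J(u) \tag{25}$$

does not depend on the choice of $u \in L$, that is, $\sigma_L(g)$ is constant on $L$. Let $\{\xi_i\}$ be a basis of $\mathfrak{h}$. For each $g \in G$, we can write $\mathrm{Ad}_g \xi_i = \sum_j c_{ij}(g)\, \xi_j$.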


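Using equation (21) we obtain

$$\begin{aligned}
\partial_H \langle J \circ \Phi_{g^{-1}} \mid \xi_i \rangle (g \cdot u)
&= \partial_H (J^{\xi_i} \circ \Phi_{g^{-1}})(g \cdot u) = T\Phi_g (X^{\xi_i}(u)) \,\lrcorner\, \varpi \\
&= X^{\mathrm{Ad}_g \xi_i}(g \cdot u) \,\lrcorner\, \varpi = \sum_j c_{ij}(g)\, X^{\xi_j}(g \cdot u) \,\lrcorner\, \varpi \\
&= \sum_j c_{ij}(g)\, \partial_H J^{\xi_j}(g \cdot u) = \partial_H \Big\langle J \,\Big|\, \sum_j c_{ij}(g)\, \xi_j \Big\rangle (g \cdot u) \\
&= \partial_H \langle J \mid \mathrm{Ad}_g \xi_i \rangle (g \cdot u) = \partial_H \langle \mathrm{Ad}^*_{g^{-1}} J \mid \xi_i \rangle (g \cdot u).
\end{aligned}$$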

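Hence, $\partial_H \{J \circ \Phi_{g^{-1}} - \mathrm{Ad}^*_{g^{-1}} J\}(u) = 0$ for every $u \in L$. Since every pair of points of $L$ can be connected by a piecewise smooth curve whose tangent vector lies in $H$, it follows that $\sigma_L(g)$ is constant on $L$. Thus the map $\sigma_L$ (25) is well defined. For every $g, g' \in G$ the following calculation shows that

$$\sigma_L(gg') = \sigma_L(g) + \mathrm{Ad}^*_g(\sigma_L(g')).$$

We compute

$$\begin{aligned}
\sigma_L(gg') &= J \circ \Phi_{gg'} - \mathrm{Ad}^*_{gg'} J = J \circ \Phi_g \circ \Phi_{g'} - \mathrm{Ad}^*_g \mathrm{Ad}^*_{g'} J \\
&= (J \circ \Phi_g - \mathrm{Ad}^*_g J) \circ \Phi_{g'} + \mathrm{Ad}^*_g (J \circ \Phi_{g'} - \mathrm{Ad}^*_{g'} J) \\
&= (J \circ \Phi_g - \mathrm{Ad}^*_g J) + \mathrm{Ad}^*_g (J \circ \Phi_{g'} - \mathrm{Ad}^*_{g'} J) = \sigma_L(g) + \mathrm{Ad}^*_g \sigma_L(g').
\end{aligned}$$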

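Define the map $A_L : G \times \mathfrak{h}^* \to \mathfrak{h}^*$ by

$$A_L(g, \alpha) = \mathrm{Ad}^*_g \alpha + \sigma_L(g). \tag{26}$$

We now show that $A_L$ is a $G$-action. If $e$ is the identity element of $G$, then $A_L(e, \alpha) = \mathrm{Ad}^*_e \alpha + \sigma_L(e) = \alpha + J \circ \Phi_e - \mathrm{Ad}^*_e J = \alpha$. Moreover,

$$\begin{aligned}
A_L(gg', \alpha) &= \mathrm{Ad}^*_{gg'} \alpha + \sigma_L(gg') = \mathrm{Ad}^*_g \mathrm{Ad}^*_{g'} \alpha + \sigma_L(g) + \mathrm{Ad}^*_g \sigma_L(g') \\
&= \mathrm{Ad}^*_g (\mathrm{Ad}^*_{g'} \alpha + \sigma_L(g')) + \sigma_L(g) = A_L(g, A_L(g', \alpha)).
\end{aligned}$$

Hence, $A_L$ is a $G$-action on $\mathfrak{h}^*$. To finish the proof of the theorem it remains to show that equation (23) is satisfied. We have

$$A_L(g, J(u)) = \mathrm{Ad}^*_g J(u) + \sigma_L(g) = \mathrm{Ad}^*_g J(u) + J \circ \Phi_g(u) - \mathrm{Ad}^*_g J(u) = J(g \cdot u).$$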
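As an illustration of the affine action just defined, and not something taken from the text, consider the special case in which the cocycle is a coboundary. The sketch assumes a fixed element $\mu \in \mathfrak{h}^*$ (a hypothetical choice) with $\sigma_L(g) = \mu - \mathrm{Ad}^*_g \mu$. It checks the cocycle identity directly and shows that the affine action is then the coadjoint action conjugated by translation by $\mu$.

```latex
% Assumption (illustrative only): \sigma_L(g) = \mu - \mathrm{Ad}^*_g \mu
% for a fixed \mu \in \mathfrak{h}^*.
\begin{align*}
\sigma_L(gg') &= \mu - \mathrm{Ad}^*_{g}\mathrm{Ad}^*_{g'}\mu \\
  &= (\mu - \mathrm{Ad}^*_g \mu) + \mathrm{Ad}^*_g(\mu - \mathrm{Ad}^*_{g'}\mu) \\
  &= \sigma_L(g) + \mathrm{Ad}^*_g \sigma_L(g'), \\[4pt]
A_L(g,\alpha) &= \mathrm{Ad}^*_g \alpha + \mu - \mathrm{Ad}^*_g \mu
  = \mathrm{Ad}^*_g(\alpha - \mu) + \mu .
\end{align*}
```

Under this assumption the equivariance property of the theorem says that the shifted map $J - \mu$ is equivariant on $L$ with respect to the coadjoint action, since $(J - \mu)(g \cdot u) = \mathrm{Ad}^*_g (J - \mu)(u)$.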


Let $L$ be an accessible set of the distribution $H$. We assume that $\mathrm{Stab}_L = \{g \in G \mid g \cdot u \in L \text{ for every } u \in L\}$ is a closed subgroup of $G$ and hence is a Lie group. Because $G$ acts properly on $D$, it follows that $\mathrm{Stab}_L$ acts properly on $L$. Let $K$ be a compact subgroup of $G$ and set $L_K = \{u \in L \mid G_u = K\}$. Then $L_K$ is the intersection of the symmetry type $D_K$ with $L$. Let $\mathrm{Stab}_{L_K} = \{g \in \mathrm{Stab}_L \mid g \cdot u \in L_K \text{ for every } u \in L_K\}$. Then $\mathrm{Stab}_{L_K}$ is a closed subgroup of $\mathrm{Stab}_L$ and hence is a Lie group. Moreover, $K$ is a closed normal subgroup of both $\mathrm{Stab}_{L_K}$ and $\mathrm{Stab}_L$.

Fact 3.6.1.40. $\mathrm{Stab}_{L_K}$ is a subgroup of the normalizer $N(K)$ in $G$.

Proof. Suppose that $g \in \mathrm{Stab}_{L_K}$ and $u \in L_K$. Then $g \cdot u \in L_K$, which implies $K = G_{g \cdot u} = g G_u g^{-1} = g K g^{-1}$, since $u \in L_K$ implies $G_u = K$. Therefore $g \in N(K)$.

Since $K$ acts trivially on $D_K$, it follows that $G_{D_K} = N(K)/K$ acts freely and properly on $D_K$, see theorem 2.3.3.7 of chapter 2. Consequently, $G_{L_K} = \mathrm{Stab}_{L_K}/K$ acts freely and properly on $L_K$. This result can be refined a bit. Let $H_{L_K} = \{v_u \in H \mid u \in L_K\}$. Then $H_{L_K}$ is a smooth distribution on $L_K$ with symplectic form $\varpi_{L_K} = i_K^* \varpi$, where $i_K : L_K \to D$ is the inclusion map. Let $M$ be an accessible set of the distribution $H_{L_K}$ on $L_K$ and set $\mathrm{Stab}_M = \{g \in \mathrm{Stab}_{L_K} \mid g \cdot m \in M \text{ for every } m \in M\}$. Then $\mathrm{Stab}_M$ is a closed subgroup of $\mathrm{Stab}_{L_K}$ and hence is a Lie group. From $K \subseteq \mathrm{Stab}_M \subseteq \mathrm{Stab}_{L_K}$ and the fact that $K$ is a closed normal subgroup of $\mathrm{Stab}_M$, it follows that $G_M = \mathrm{Stab}_M/K$ is a Lie group. Because $K$ acts trivially on $M$ and $G_M$ is a closed subgroup of $G_{L_K}$, we obtain that $G_M$ acts freely and properly on $M$.

For each accessible set $L$ of the distribution $H$, each compact subgroup $K$ of $G$, each accessible set $M$ of $H_{L_K}$, and every $\alpha \in \mathfrak{h}^*$, the set $M \cap J^{-1}(\alpha)$ is preserved by the distributional Hamiltonian vector field $Y_{h_M}$ associated to the $G_M$-invariant Hamiltonian function $h_M = h|M$ of the $G_M$-invariant distributional Hamiltonian system $(M, H_M, \varpi_M, h_M)$.

Theorem 3.6.1.41. For every accessible set $M$ of the distribution $H_{L_K}$ and every $\alpha \in \mathfrak{h}^*$, each connected component of $J^{-1}(\alpha) \cap M$ is a smooth submanifold of $M$.

The remainder of this section is devoted to the proof of theorem 3.6.1.41.


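Above we have shown that the group $G_M = \mathrm{Stab}_M/K$ acts freely and properly on $M$. Let $\mathfrak{k}$, $\mathfrak{s}_M$, and $\mathfrak{g}_M = \mathfrak{s}_M/\mathfrak{k}$ be the Lie algebras of the Lie groups $K$, $\mathrm{Stab}_M$, and $G_M$, respectively. Let

$$\mathfrak{h}_M = (\mathfrak{s}_M \cap \mathfrak{h})/(\mathfrak{k} \cap \mathfrak{h}) \subseteq \mathfrak{g}_M \tag{27}$$

and let $\eta : (\mathfrak{s}_M \cap \mathfrak{h}) \to \mathfrak{h}_M$ be the canonical projection map.

Proposition 3.6.1.42. The action of $G_M$ on $M$ preserves the symplectic distribution $(H_M, \varpi_M)$ and has a momentum map $J_M : M \to \mathfrak{h}^*_M$. For every $\alpha_M \in \mathfrak{h}^*_M$ every connected component of $J_M^{-1}(\alpha_M)$ is a smooth submanifold of $M$.

Proof. Clearly, the action of $\mathrm{Stab}_M$ on $M$ preserves the symplectic distribution $(H_M, \varpi_M)$. Since $G_M = \mathrm{Stab}_M/K$ and $K$ acts trivially on $M$, the $G_M$-action preserves the symplectic distribution $(H_M, \varpi_M)$. For each $\xi \in \mathfrak{h}$, we have $X^\xi \,\lrcorner\, \varpi = \partial_H J^\xi$. However, $\xi \in \mathfrak{k}$ implies that $X^\xi(u) = 0$ for all $u \in M \subseteq L_K$. Hence $\partial_H J^\xi = 0$ on $M$. Since $M$ is connected and is contained in an accessible set of $H$, it follows that the restriction of $J^\xi$ to $M$ is constant. Therefore, for every $\xi \in \mathfrak{k} \cap \mathfrak{h}$ the restriction of $J^\xi$ to $M$ is constant.

Let $\kappa : \mathfrak{k} \cap \mathfrak{h} \to \mathfrak{h}$ be the inclusion map and let $\kappa^* : \mathfrak{h}^* \to (\mathfrak{k} \cap \mathfrak{h})^*$ be its transpose. The restriction of $J^\xi$ to $\xi$ in $\mathfrak{k} \cap \mathfrak{h}$ is given by $\kappa^* \circ J : D \to (\mathfrak{k} \cap \mathfrak{h})^*$. Let $J|M$ be the restriction of $J$ to points in $M$. From the argument in the paragraph above, we deduce that $\kappa^* \circ J|M$ is constant. Let $\mu : \mathfrak{k} \cap \mathfrak{h} \to \mathfrak{s}_M \cap \mathfrak{h}$ be the inclusion map. Its transpose $\mu^* : (\mathfrak{s}_M \cap \mathfrak{h})^* \to (\mathfrak{k} \cap \mathfrak{h})^*$ is onto. Hence, there exists $j_M \in (\mathfrak{s}_M \cap \mathfrak{h})^*$ such that $\mu^*(j_M) = \kappa^* \circ J|M$. Let $\nu : \mathfrak{s}_M \cap \mathfrak{h} \to \mathfrak{h}$ be the inclusion map and let $\nu^* : \mathfrak{h}^* \to (\mathfrak{s}_M \cap \mathfrak{h})^*$ be its transpose. Every element of the kernel $\mathfrak{k} \cap \mathfrak{h}$ of the canonical projection map $\eta : (\mathfrak{s}_M \cap \mathfrak{h}) \to \mathfrak{h}_M$ is mapped to zero by $(\nu^* \circ J|M - j_M) : M \to (\mathfrak{s}_M \cap \mathfrak{h})^*$. Hence there exists a unique map $J_M : M \to \mathfrak{h}^*_M$ such that

$$\eta^* J_M = \nu^* \circ J|M - j_M. \tag{28}$$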

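We now show that $J_M$ is a momentum map for the action of $G_M$ on $M$. For each $\xi \in \mathfrak{s}_M \cap \mathfrak{h}$, we have $\nu(\xi) \in \mathfrak{h}$. Moreover, the $G$-action on $M$ restricted to the one parameter group $t \mapsto \exp t\nu(\xi)$ is generated by the vector field $X^{\nu(\xi)}$ restricted to $M$. Here $X^{\nu(\xi)} \,\lrcorner\, \varpi = \partial_H \langle J \mid \nu(\xi) \rangle$. Similarly, the $G_M$-action on $M$ restricted to the one parameter subgroup $t \mapsto \exp t\eta(\xi)$ is generated by the vector field $X_M^{\eta(\xi)}$ on $M$. Since on $M$ the $G$-action of $\exp t\nu(\xi)$ and the $G_M$-action of $\exp t\eta(\xi)$ coincide, it follows that $X_M^{\eta(\xi)}$ is the restriction of $X^{\nu(\xi)}$ to $M$. Restricting the equation $X^{\nu(\xi)} \,\lrcorner\, \varpi = \partial_H \langle J \mid \nu(\xi) \rangle$ to $H_M$ we get

$$\begin{aligned}
X_M^{\eta(\xi)} \,\lrcorner\, \varpi_M &= \partial_{H_M} \langle (J|M) \mid \nu(\xi) \rangle = \partial_{H_M} \langle \nu^* \circ (J|M) \mid \xi \rangle \\
&= \partial_{H_M} \langle (\eta^* J_M + j_M) \mid \xi \rangle = \partial_{H_M} \langle J_M \mid \eta(\xi) \rangle,
\end{aligned}$$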

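where the last equality above follows because $j_M$ is constant. Hence $J_M : M \to \mathfrak{h}^*_M$ is a momentum map for the action of $G_M$ on $M$.

Since the action of $G_M$ on $M$ is free, it follows that $X_M^\xi(u) \neq 0$ for every nonzero $\xi \in \mathfrak{h}_M$ and every $u \in M$. This implies that $\partial_{H_M} \langle J_M \mid \xi \rangle(u) \neq 0$ for every nonzero $\xi \in \mathfrak{h}_M$ and every $u \in M$. Hence, the map $\partial_{H_M} J_M : H_M \to \mathfrak{h}^*_M$ is onto. Therefore, for every $\alpha_M \in \mathfrak{h}^*_M$ which lies in the image of $J_M$, the set $J_M^{-1}(\alpha_M)$ is a local submanifold of $M$.

For each $u \in M$, let $H_u = H \cap T_u D$, $H_M = \{H_u \mid u \in M\}$, $H_{M,u} = H_M \cap T_u D$, and

$$H^\varpi_{M,u} = \{v \in H_u \mid \varpi(v, w) = 0 \text{ for all } w \in H_{M,u}\}. \tag{29}$$

Since $M$ is an accessible set of $H_{L_K}$, it follows that $H_M$ is the restriction of $H_{L_K} = H \cap TL_K$ to vectors whose base point lies in $M$. In other words, $H_{M,u} = H_u \cap T_u L_K$. Because $H_{L_K}$ is a symplectic distribution on $L_K$, we see that $H^\varpi_{M,u} \cap T_u L_K = \{0\}$. Furthermore, $H^\varpi_{M,u}$ is $K$-invariant, which implies that $H^\varpi_{M,u} \subseteq T_u^\perp L_K$, where $T_u^\perp L_K = \{v \in T_u L \mid \int_K T_u\Phi_k v \, dk = 0\}$. Here $dk$ is Haar measure on $K$ with $\mathrm{vol}\, K = 1$. Thus, $H^\varpi_{M,u} = H_u \cap T_u^\perp L_K$. From

$$T_u L = T_u L_K \oplus T_u^\perp L_K, \tag{30}$$

we obtain the decomposition

$$H_u = H_{M,u} \oplus H^\varpi_{M,u}, \tag{31}$$

where $H_{M,u} \subseteq T_u M$ and $H^\varpi_{M,u} \subseteq T_u^\perp M = \{v \in T_u D \mid \overline{v} = 0\}$.

To prove (30) we argue as follows. Each $u \in L_K$ is fixed by the action of $K \subseteq \mathrm{Stab}_{L_K}$. Therefore $T_u\Phi_k : T_u L_K \to T_u L_K$ for every $k \in K$. For $v \in T_u L$ let $\overline{v} = \int_K T_u\Phi_k v \, dk$. Then $T_u\Phi_k \overline{v} = \overline{v}$ for every $k \in K$. So $\overline{v} \in T_u L_K$. Write $v = \overline{v} + (v - \overline{v})$. Then $\overline{v - \overline{v}} = 0$. So $v - \overline{v} \in T_u^\perp L_K$. Therefore (30) holds, because if $w \in T_u L_K \cap T_u^\perp L_K$, then $w = \overline{w} = 0$.

Since $\mathfrak{h}$ is an ideal in $\mathfrak{g}$, it is invariant under the adjoint action of $K \subseteq G$ on $\mathfrak{g}$. For each $\xi \in \mathfrak{h}$, its average $\overline{\xi} = \int_K \mathrm{Ad}_k(\xi) \, dk$ is in $\mathfrak{h}$. In addition, the average of $\xi - \overline{\xi}$ is zero. Let

$$\mathfrak{m} = \{\xi \in \mathfrak{h} \mid \mathrm{Ad}_k \xi = \xi \text{ for all } k \in K\} \quad \text{and} \quad \mathfrak{n} = \{\zeta \in \mathfrak{h} \mid \overline{\zeta} = 0\}.$$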


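From $\xi = \overline{\xi} + (\xi - \overline{\xi})$ and the fact that $\mathfrak{m} \cap \mathfrak{n} = \{0\}$, it follows that

$$\mathfrak{h} = \mathfrak{m} \oplus \mathfrak{n}. \tag{32}$$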
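The averaging construction behind this splitting can be made concrete in a small example. The following sketch is an illustration under assumptions that are not in the text: take $G = SO(3)$ with $\mathfrak{h} = \mathfrak{g} = \mathfrak{so}(3)$ identified with $\mathbb{R}^3$ so that $\mathrm{Ad}_k \xi = k\xi$, and let $K$ be the circle of rotations $R_\theta$ about the axis $e_3$, with normalized Haar measure $d\theta/2\pi$.

```latex
% Assumptions (illustrative only): \mathfrak{h} = \mathfrak{so}(3) \cong \mathbb{R}^3,
% \mathrm{Ad}_k \xi = k\xi, K = \{R_\theta\} rotations about e_3, dk = d\theta / 2\pi.
\begin{align*}
\overline{\xi} &= \frac{1}{2\pi}\int_0^{2\pi} R_\theta\, \xi \, d\theta
  = \frac{1}{2\pi}\int_0^{2\pi}
    \begin{pmatrix}
      \xi_1\cos\theta - \xi_2\sin\theta \\
      \xi_1\sin\theta + \xi_2\cos\theta \\
      \xi_3
    \end{pmatrix} d\theta
  = \begin{pmatrix} 0 \\ 0 \\ \xi_3 \end{pmatrix}, \\[4pt]
\mathfrak{m} &= \{\xi \mid R_\theta \xi = \xi \text{ for all } \theta\}
  = \operatorname{span}\{e_3\}, \qquad
\mathfrak{n} = \{\zeta \mid \overline{\zeta} = 0\}
  = \operatorname{span}\{e_1, e_2\}.
\end{align*}
```

In this example the decomposition $\xi = \overline{\xi} + (\xi - \overline{\xi})$ is just the splitting of a vector into its component along the rotation axis and its component in the orthogonal plane.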

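Lemma 3.6.1.43. Let $M \subseteq L_K$ be an accessible set of $H_{L_K}$ and let $u \in M$. Then $Y_{J^\xi}(u) \in H_{M,u}$ for every $\xi \in \mathfrak{m}$ and $Y_{J^\zeta}(u) \in H^\varpi_{M,u}$ for every $\zeta \in \mathfrak{n}$.

Proof. To verify the first assertion we argue as follows. Because $u \in M \subseteq L_K$ and $k \in K$, we have $\Phi_k(u) = u$. We now compute

$$\begin{aligned}
\Psi^u_k(Y_{J^\xi}(u)) &= \Psi^u_k(X^\xi(u)) = T\Phi_k(X^\xi(u)) = X^{T_e L_k \xi}(u) \\
&= \frac{d}{dt}\Big|_{t=0} \Phi_{\exp(t T_e L_k \xi)}(u) = \frac{d}{dt}\Big|_{t=0} \Phi_{L_k \exp t\xi}(u) \\
&= \frac{d}{dt}\Big|_{t=0} \Phi_{k \exp(t\xi) k^{-1}}(u) = \frac{d}{dt}\Big|_{t=0} \Phi_{\exp t\mathrm{Ad}_k(\xi)}(u) \\
&= X^{\mathrm{Ad}_k(\xi)}(u) = Y_{J^{\mathrm{Ad}_k(\xi)}}(u).
\end{aligned} \tag{33}$$

If $\xi \in \mathfrak{m}$, then $\mathrm{Ad}_k(\xi) = \xi$ for all $k \in K$. Hence $Y_{J^\xi}(u)$ is $K$-invariant, which implies that $Y_{J^\xi}(u) \in T_u L_K$. On the other hand, $Y_{J^\xi}(u) \in H_u$, which using $H_{M,u} = H_u \cap T_u L_K$ shows that $Y_{J^\xi}(u) \in H_{M,u}$.

We now verify the second assertion of the lemma. For $\zeta \in \mathfrak{n}$ the average $\overline{\zeta}$ is zero. Since the map $\zeta \mapsto Y_{J^\zeta}(u)$ is linear, equation (33) implies that

$$\overline{Y_{J^\zeta}(u)} = \int_K \Psi^u_k(Y_{J^\zeta}(u)) \, dk = \int_K Y_{J^{\mathrm{Ad}_k(\zeta)}}(u) \, dk = Y_{J^{\overline{\zeta}}}(u) = 0. \tag{34}$$

Hence, (34) ensures that $Y_{J^\zeta}(u) \in T_u^\perp L_K$. But $Y_{J^\zeta}(u) \in H_u$. So using $H^\varpi_{M,u} = H_u \cap T_u^\perp L_K$ we see that $Y_{J^\zeta}(u) \in H^\varpi_{M,u}$.

Proposition 3.6.1.44. For each $u \in M$, the connected components of $M \cap J^{-1}(J(u))$ and $J_M^{-1}(J_M(u))$ containing $u$ coincide.

Proof. Equation (28) yields $\eta^* J_M(u) = \nu^*(J(u)) - j_M(u)$. In the proof of proposition 3.6.1.42 we have shown that $j_M$ is locally constant. Since $\eta^* : \mathfrak{h}^*_M \to (\mathfrak{s}_M \cap \mathfrak{h})^*$ is injective and $\nu^* : \mathfrak{h}^* \to (\mathfrak{s}_M \cap \mathfrak{h})^*$ is surjective, it follows that the connected component of $M \cap J^{-1}(J(u))$ containing $u$ is a subset of the connected component of $J_M^{-1}(J_M(u))$ containing $u$.

By proposition 3.6.1.42, the connected components of $J_M^{-1}(J_M(u))$ are submanifolds of $M$. Hence, it suffices to show that $J$ is constant on connected components of $J_M^{-1}(J_M(u))$. According to equation (32) we may decompose $J$ into components $J^{\mathfrak{m}} : M \to \mathfrak{m}^*$ and $J^{\mathfrak{n}} : M \to \mathfrak{n}^*$. If $\xi = \xi_{\mathfrak{m}} + \xi_{\mathfrak{n}}$ with $\xi \in \mathfrak{h}$, $\xi_{\mathfrak{m}} \in \mathfrak{m}$, and $\xi_{\mathfrak{n}} \in \mathfrak{n}$, then

$$\langle J \mid \xi \rangle = J^\xi = \langle J^{\mathfrak{m}} \mid \xi_{\mathfrak{m}} \rangle + \langle J^{\mathfrak{n}} \mid \xi_{\mathfrak{n}} \rangle.$$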


For $\xi \in \mathfrak{m}$, lemma 3.6.1.43 implies that the vector field $Y_{J^\xi}$ is tangent to $M$. Hence, the $G$-action $\Phi$ restricted to the one parameter group $t \mapsto \exp t\xi$ preserves $M$. Consequently, $\xi$ is in the Lie algebra $\mathfrak{s}_M$ of the stability group $\mathrm{Stab}_M$ of $M$. Hence $\xi \in \mathfrak{s}_M \cap \mathfrak{h}$. Using equation (28) we find that

$$\langle J^{\mathfrak{m}} \mid \xi \rangle = J^\xi = \langle J_M \mid \eta(\xi) \rangle + \langle j_M \mid \xi \rangle,$$

where $\eta : \mathfrak{s}_M \cap \mathfrak{h} \to \mathfrak{h}_M = (\mathfrak{s}_M \cap \mathfrak{h})/(\mathfrak{k} \cap \mathfrak{h})$ is the canonical projection. The above equation holds for all $\xi \in \mathfrak{m}$, which implies that

$$J^{\mathfrak{m}} = \eta^* J_M + j_M. \tag{35}$$

Since $j_M$ is constant on $M$, it follows that $J^{\mathfrak{m}}$ is constant on the level sets of $J_M$.

For $\xi \in \mathfrak{n}$, lemma 3.6.1.43 implies that $Y_{J^\xi}(u) \in H^\varpi_{M,u}$ for all $u \in M$. Since $Y_{J^\xi} \,\lrcorner\, \varpi = dJ^\xi$, it follows that, for every $w \in H_{M,u}$,

$$\langle dJ^\xi(u) \mid w \rangle = \varpi(Y_{J^\xi}(u), w) = 0,$$

where the last equality is a consequence of the definition (29) of $H^\varpi_{M,u}$. Because the above equation holds for every $\xi \in \mathfrak{n}$, we deduce that $dJ^{\mathfrak{n}}$ vanishes on $TM$. Hence, $J^{\mathfrak{n}}$ is constant on connected components of the level sets of $J_M : M \to \mathfrak{h}^*_M$. Since $J = J^{\mathfrak{m}} + J^{\mathfrak{n}}$ and both $J^{\mathfrak{m}}$ and $J^{\mathfrak{n}}$ are constant on connected components of level sets of $J_M$, it follows that $J$ is constant on connected components of level sets of $J_M$.

Proof of theorem 3.6.1.41. Proposition 3.6.1.44 ensures that for each $u \in M$ the connected components of $M \cap J^{-1}(J(u))$ and $J_M^{-1}(J_M(u))$ containing $u$ coincide. By proposition 3.6.1.42, the connected components of $J_M^{-1}(J_M(u))$ are manifolds. Hence, the connected components of $M \cap J^{-1}(J(u))$ are also manifolds. This proves theorem 3.6.1.41.

3.6.2 Gauge momenta

A smooth function f ∈ C ∞ (D) is called a gauge momentum for the action Φ of G on D if it is G-invariant and its distributional Hamiltonian vector field Yf is tangent to each G-orbit in D. Fact 3.6.2.45. Gauge momenta are constant on integral curves of each distributional Hamiltonian vector field Yk where k ∈ C ∞ (D)G .


Proof. Let $f \in C^\infty(D)$ be a gauge momentum function. Then for each $v \in D$ we have $Y_f(u) \in T_u(G \cdot v)$ if and only if there is a $\xi \in \mathfrak{g}$ such that $Y_f(u) = X^\xi(u)$ for every $u \in D$. Therefore for every $u \in D$ we have

$$\begin{aligned}
(L_{Y_k} f)(u) &= -(L_{Y_f} k)(u) = -\langle dk(u) \mid Y_f(u) \rangle = -\langle dk(u) \mid X^\xi(u) \rangle \\
&= -\frac{d}{dt}\Big|_{t=0} k(\Phi_u(\exp t\xi)) = -\frac{d}{dt}\Big|_{t=0} k(u) = 0.
\end{aligned}$$

Hence gauge momenta are constant on every accessible set $N$ of the distribution spanned by the distributional Hamiltonian vector fields of $G$-invariant functions. Since a gauge momentum function is $G$-invariant, it pushes forward to a smooth function on the orbit space $\overline{D} = D/G$ which is constant on the image of $N$ under the $G$-orbit map.

From §1 we know that the space $C^\infty(D)^G$ of $G$-invariant functions on $D$ is an almost Poisson subalgebra of $(C^\infty(D), \{\, ,\})$. Following the terminology used for Poisson algebras, we say that $f \in C^\infty(D)^G$ is a Casimir of the almost Poisson algebra $(C^\infty(D)^G, \{\, ,\})$ if $\{f, f'\} = 0$ for all $f' \in C^\infty(D)^G$.

Proposition 3.6.2.46. A function $f \in C^\infty(D)^G$ is a gauge momentum if and only if it is a Casimir in $(C^\infty(D)^G, \{\, ,\})$.

Proof. If $f$ is a gauge momentum, then $Y_f$ is tangent to each $G$-orbit in $D$. Hence, $\{f, f'\} = L_{Y_f} f' = 0$ for every $f' \in C^\infty(D)^G$. This implies that $f$ is a Casimir in $(C^\infty(D)^G, \{\, ,\})$. Conversely, if $f \in C^\infty(D)^G$ is a Casimir, then $L_{Y_f} f' = 0$ for all $f' \in C^\infty(D)^G$. This implies that $Y_f$ is tangent to each $G$-orbit in $D$. Thus $f$ is a gauge momentum.

3.7 Lifted actions and the momentum equation

3.7.1 Lifted actions

In this subsection we assume that the $G$-action

$$\Phi : G \times D \to D : (g, u) \mapsto T_{\tau_Q(u)} \widetilde{\Phi}_g \, u$$

is the restriction to $D$ of the lift to $TQ$ of an action $\widetilde{\Phi} : G \times Q \to Q$ of $G$ on $Q$. Here $\tau_Q : TQ \to Q$ is the tangent bundle projection map. In other words, $D \subseteq TQ$ is a $G$-invariant distribution on $Q$, and $\Phi_g(u) = T\widetilde{\Phi}_g(u)$ for every $g \in G$ and every $u \in D$. In addition we assume that the action $\widetilde{\Phi}$ is free and proper. The results of this subsection will be used to prove the momentum equation for free and proper actions in subsection two.

The assumption that the $G$-action $\widetilde{\Phi}$ on $Q$ is free and proper implies that $Q$ is a principal bundle with structure group $G$. Let $\overline{Q}$ be the space of orbits of the $G$-action on $Q$ and let $\pi : Q \to \overline{Q}$ be the $G$-orbit map, which is the bundle projection of the principal bundle. The bundle $\mathrm{ver}\, TQ$ of vertical vectors of $Q$ consists of vectors tangent to the fibers of $\pi : Q \to \overline{Q}$. Let $k$ be a $G$-invariant Riemannian metric on $Q$ and let the horizontal distribution $\mathrm{hor}\, TQ = (\mathrm{ver}\, TQ)^\perp$ be the $k$-orthogonal complement of $\mathrm{ver}\, TQ$. We obtain the bundle direct sum decomposition

$$TQ = \mathrm{ver}\, TQ \oplus_Q \mathrm{hor}\, TQ. \tag{36}$$

Since $k$ is $G$-invariant, the distribution $\mathrm{hor}\, TQ$ is $G$-invariant. Thus it defines a connection on the principal bundle $Q$. The connection form $\vartheta$ of this connection is a $\mathfrak{g}$-valued 1-form on $Q$ with kernel $\ker \vartheta = \mathrm{hor}\, TQ$ such that $\langle \vartheta \mid X_\xi \rangle = \xi$ for every $\xi \in \mathfrak{g}$. Here $X_\xi$ is the vector field on $Q$ whose flow is $\varphi_t(q) = \widetilde{\Phi}_q(\exp t\xi)$.

The restriction of the metric $k$ to $\mathrm{hor}\, TQ$ pushes forward to a metric $k_{\overline{Q}}$ on $\overline{Q}$. In other words, $k(\mathrm{hor}\, u_1, \mathrm{hor}\, u_2) = k_{\overline{Q}}(T\pi(u_1), T\pi(u_2))$. The tangent bundle $TQ$ has a 1-form $\theta$ such that $\langle \theta \mid w \rangle = k(\tau_{TQ}(w), T\tau_Q(w))$ for every $w \in T(TQ)$. Here $\tau_Q : TQ \to Q$ and $\tau_{TQ} : T(TQ) \to TQ$ are tangent bundle projection maps. The exterior derivative $d\theta$ of $\theta$ is the 2-form $\omega$, which is a symplectic form on $TQ$.

Let $D$ be a $\widetilde{\Phi}$-invariant distribution on $Q$. The decomposition (36) of $TQ$ gives rise to

$$D = \mathrm{ver}\, D \oplus_Q \big((\mathrm{ver}\, D)^\perp \cap D\big), \tag{37}$$

where $\mathrm{ver}\, D = D \cap \mathrm{ver}\, TQ$. We now make the following

Hypothesis 3.7.1.47. $\mathrm{ver}\, D$ is a distribution on $Q$. In other words, $\mathrm{ver}\, D$ is a subbundle of $\mathrm{ver}\, TQ$.

Let $D_{\overline{Q}} = T\pi(D)$. In the next few propositions we describe the structure of the orbit space $\overline{D} = D/G$.

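For each $q \in Q$, let $\overline{q} = \pi(q)$. We denote the horizontal lift from $T_{\overline{q}}\overline{Q}$ to $T_q Q$ by $\mathrm{lift}_q : T_{\overline{q}}\overline{Q} \to T_q Q$. In other words, for each $\overline{u} \in T_{\overline{q}}\overline{Q}$, $\mathrm{lift}_q(\overline{u})$ is the unique vector in $\mathrm{hor}\, T_q Q$ such that $T\pi(\mathrm{lift}_q(\overline{u})) = \overline{u}$.

Proposition 3.7.1.48. For each $q \in Q$, there exists a unique linear map $\Lambda_q : (D_{\overline{Q}})_{\pi(q)} \to \big((\mathrm{ver}\, D)^\perp \cap D\big)_q$ such that

$$\mathrm{lift}_q(T\pi(u)) + \Lambda_q(T\pi(u)) = u \tag{38}$$

for every $u \in \big((\mathrm{ver}\, D)^\perp \cap D\big)_q$. Consequently,

$$(\mathrm{ver}\, D)^\perp \cap D = (\mathrm{lift} \oplus \Lambda)(D_{\overline{Q}}). \tag{39}$$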

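Moreover, the map $\Lambda : Q \to L(D_{\overline{Q}}, (\mathrm{ver}\, D)^\perp \cap D) : q \mapsto \Lambda_q$ is smooth, has its image contained in $\mathrm{ver}\, TQ$, and is $G$-equivariant, that is, $\Phi_g \circ \Lambda_q = \Lambda_{g \cdot q}$ for every $g \in G$ and every $q \in Q$.

Proof. We have $T\pi(\mathrm{ver}\, D) = \{0\}$ and $T\pi\big((\mathrm{ver}\, D)^\perp \cap D\big) = D_{\overline{Q}}$. Moreover, $\mathrm{ver}\big((\mathrm{ver}\, D)^\perp \cap D\big) = \big((\mathrm{ver}\, D)^\perp \cap D\big) \cap (\mathrm{ver}\, TQ) \subseteq (\mathrm{ver}\, D)^\perp \cap (\mathrm{ver}\, D) = \{0\}$. Hence the map $\Lambda$ satisfies (38). Equation (38) ensures that $\big((\mathrm{ver}\, D)^\perp \cap D\big)_q = (\mathrm{ver}\, D)_q \oplus (\mathrm{lift}_q + \Lambda_q)\big((D_{\overline{Q}})_{\pi(q)}\big)$, which gives (39). For $u \in T_q Q$, $\mathrm{hor}\, u = \mathrm{lift}_q(T\pi(u))$. If $u \in (\mathrm{ver}\, D)^\perp \cap D$, then $\mathrm{ver}\, u = u - \mathrm{hor}\, u = u - \mathrm{lift}_q(T\pi(u)) = \Lambda_q(T\pi(u))$. Thus the range of $\Lambda$ is contained in $\mathrm{ver}\, TQ$. For $g \in G$ and $u \in \big((\mathrm{ver}\, D)^\perp \cap D\big)_q \cap T_q Q$, we have

$$\Phi_g(\Lambda_q(T\pi(u))) = \Phi_g(\mathrm{ver}\, u) = \mathrm{ver}(\Phi_g(u)),$$

which is an element of $\big((\mathrm{ver}\, D)^\perp \cap D\big)_{g \cdot q} \cap \mathrm{ver}\, T_{g \cdot q} Q$. However, the set $\big((\mathrm{ver}\, D)^\perp \cap D\big)_{g \cdot q} \cap \mathrm{ver}\, T_{g \cdot q} Q$ has a unique element $\Lambda_{g \cdot q}(T\pi(\Phi_g(u))) = \Lambda_{g \cdot q}(T\pi(u))$. Hence $\Phi_g \circ \Lambda_q = \Lambda_{g \cdot q}$. Finally, for every vector field $\overline{X}$ on $\overline{Q}$ with values in $D_{\overline{Q}}$, we know that $\Lambda \circ \overline{X} \circ \pi$ is a smooth vector field on $Q$. This ensures that $\Lambda$ is smooth.

Proposition 3.7.1.49. The bundle $(TQ)/G$ of $G$-orbits on $TQ$ over the space $\overline{Q}$ of $G$-orbits on $Q$ is isomorphic to the fiber product $Q[\mathfrak{g}] \times_{\overline{Q}} T\overline{Q}$ of bundles over $\overline{Q}$. Here $Q[\mathfrak{g}]$ is the adjoint bundle of $Q$ over $\overline{Q}$.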


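Proof. In the category of vector bundles, fiber products and direct sums are equivalent. Hence, the decomposition of $TQ$ given in (36) is equivalent to $TQ = \mathrm{ver}\, TQ \times_Q \mathrm{hor}\, TQ$. The lifted action of $G$ on $TQ$ preserves $\mathrm{ver}\, TQ$ and $\mathrm{hor}\, TQ$ separately. Hence

$$(TQ)/G = \big((\mathrm{ver}\, TQ)/G\big) \times_{\overline{Q}} \big((\mathrm{hor}\, TQ)/G\big).$$

Let $\mathrm{Ad} : G \times \mathfrak{g} \to \mathfrak{g}$ be the adjoint action of $G$ on its Lie algebra $\mathfrak{g}$. The mapping $Q \times \mathfrak{g} \to \mathrm{ver}\, TQ : (q, \xi) \mapsto X_\xi(q)$ is a bundle isomorphism intertwining the action

$$G \times (Q \times \mathfrak{g}) \to Q \times \mathfrak{g} : (g, (q, \xi)) \mapsto (g \cdot q, \mathrm{Ad}_g \xi)$$

and the restriction to $\mathrm{ver}\, TQ$ of the lifted action of $G$ on $TQ$. But the $G$-orbit space $(Q \times \mathfrak{g})/G = Q[\mathfrak{g}]$ is the adjoint bundle of $Q$. Hence, the bundle $(\mathrm{ver}\, TQ)/G$ is isomorphic to the adjoint bundle $Q[\mathfrak{g}]$. The restriction to $\mathrm{hor}\, TQ$ of $T\pi : TQ \to T\overline{Q}$ maps $\mathrm{hor}\, TQ$ onto $T\overline{Q}$, and it is constant along $G$-orbits in $\mathrm{hor}\, TQ$. It induces a vector bundle isomorphism $(\mathrm{hor}\, TQ)/G \to T\overline{Q}$.

Using proposition 3.7.1.49 we can identify $(TQ)/G$ with $Q[\mathfrak{g}] \times_{\overline{Q}} T\overline{Q}$. Let $\pi_1 : (TQ)/G \to Q[\mathfrak{g}]$ be the projection onto the first factor of $Q[\mathfrak{g}] \times_{\overline{Q}} T\overline{Q}$. By hypothesis 3.7.1.47, $\mathrm{ver}\, D$ is a subbundle of $\mathrm{ver}\, TQ$. Moreover, it is $G$-invariant. Hence the space $S = \mathrm{ver}\, D/G$ of $G$-orbits on $\mathrm{ver}\, D$ is a subbundle of the adjoint bundle $Q[\mathfrak{g}]$.

Proposition 3.7.1.50. There is a smooth map $\Sigma : D_{\overline{Q}} \to Q[\mathfrak{g}]$ such that the space $\overline{D}$ of $G$-orbits on $D$ is diffeomorphic to $S \times_{\overline{Q}} \mathrm{graph}\, \Sigma$, which equals

$$\big\{(s, \Sigma(\overline{u}), \overline{u}) \in S \times Q[\mathfrak{g}] \times D_{\overline{Q}} \mid \pi_S(s) = \tau_{\overline{Q}}(\overline{u})\big\}. \tag{40}$$

Proof. Let $\rho : TQ \to (TQ)/G$ be the orbit map for the lifted $G$-action $\Phi$ on $TQ$. Given $\overline{u} \in (D_{\overline{Q}})_{\overline{q}}$, where $\overline{q} \in \overline{Q}$, take any $q \in \pi^{-1}(\overline{q}) \subseteq Q$ and set

$$\Sigma(\overline{u}) = \pi_1(\rho(\Lambda_q(\overline{u}))) \in Q[\mathfrak{g}]. \tag{41}$$

Since $\rho(\Lambda_{g \cdot q}(\overline{u})) = \rho(\Phi_g(\Lambda_q(\overline{u}))) = \rho(\Lambda_q(\overline{u}))$ for every $g \in G$, it follows that $\Sigma(\overline{u})$ does not depend on the choice of $q \in \pi^{-1}(\overline{q})$. Hence the map $\Sigma : D_{\overline{Q}} \to Q[\mathfrak{g}]$ is well defined. For every vector field $\overline{X}$ on $\overline{Q}$ with values in $D_{\overline{Q}}$, we know that $\Lambda \circ \overline{X} \circ \pi$ is a smooth $G$-invariant vector field on $Q$ with values in $\mathrm{ver}\, TQ$. Hence it gives rise to a smooth section $\Sigma \circ \overline{X} = \pi_1 \circ \rho \circ \Lambda \circ \overline{X} : \overline{Q} \to Q[\mathfrak{g}]$ of the adjoint bundle $Q[\mathfrak{g}] \to \overline{Q}$. This implies that the map $\Sigma : D_{\overline{Q}} \to Q[\mathfrak{g}]$ is a smooth vector bundle morphism. Equation (38) ensures that the space $\overline{D}$ of $G$-orbits in $D$ is diffeomorphic to $S \times_{\overline{Q}} \mathrm{graph}\, \Sigma$.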

3.7.2 Momentum equation

In this subsection we shall rewrite the reduced dynamics in terms of the momentum equation and the second order differential equation condition, see §8.2 of chapter 1. From results of §2 it follows that the reduced space $\overline{D}$ is endowed with a symplectic distribution $(H_{\overline{D}}, \varpi_{\overline{D}}, h_{\overline{D}})$ such that the reduced equations of motion are given by the distributional Hamiltonian vector field $Y_{h_{\overline{D}}}$. In other words, the reduced motion $t \mapsto \overline{u}(t) = \rho(u(t))$ is an integral curve of $Y_{h_{\overline{D}}}$. Each curve is uniquely determined by

$$\frac{d\overline{f}(\overline{u}(t))}{dt} = (L_{Y_{h_{\overline{D}}}} \overline{f})(\overline{u}(t)) \quad \text{for every } \overline{f} \in C^\infty(\overline{D}) \tag{42}$$

and the initial condition. The space of smooth functions $C^\infty(\overline{D})$ on the smooth manifold $\overline{D}$ consists of the push forwards $\rho_* f$ of $G$-invariant functions $f$ on $D$. In other words, $C^\infty(\overline{D}) = \{\rho_* f \mid f \in C^\infty(D)^G\}$. Hence equation (42), which determines $t \mapsto \overline{u}(t)$, is equivalent to

$$\frac{df(u(t))}{dt} = (L_{Y_h} f)(u(t)) \quad \text{for every } f \in C^\infty(D)^G \tag{43}$$

for $t \mapsto u(t)$ where $\overline{u}(t) = \rho(u(t))$.

Recall that a curve $t \mapsto u(t)$ satisfies the momentum equation for a smooth vector field $Z$ on $Q$ if and only if

$$\frac{dP_Z(u(t))}{dt} = -(L_{X_{P_Z}} h)(u(t)), \tag{44}$$

where $P_Z(u) = k(u, Z(\tau_Q(u)))$ and $X_{P_Z}$ is the Hamiltonian vector field corresponding to the function $P_Z$ on the symplectic manifold $(TQ, \omega)$. A curve $t \mapsto u(t)$ is said to satisfy the second order differential equation condition for $f \in C^\infty(Q)$ if

$$\frac{d\, \tau_Q^* f(u(t))}{dt} = \langle df \mid u(t) \rangle. \tag{45}$$

Theorem 3.7.2.51. The reduced equations of motion (42) are equivalent to the momentum equation (44) for $G$-invariant vector fields on $Q$ with values in $D$ and the second order differential equation condition (45) for $G$-invariant functions $f \in C^\infty(Q)$.

Proof. It follows from theorem 1.8.2.36 of chapter 1 that the evolution $t \mapsto u(t)$ of the constrained system satisfies the momentum equation for all smooth vector fields on $Q$ with values in $D$ together with the second order differential equation condition for every function $f \in C^\infty(Q)$. Let $t \mapsto u'(t)$ be another curve in $D$ such that $\rho(u'(t)) = \rho(u(t))$ for all $t$. Then $u'(t) = g(t) \cdot u(t)$ for some smooth curve $t \mapsto g(t) \in G$. For every $G$-invariant vector field $Z$ on $Q$, the momentum $P_Z$ is $G$-invariant. Also $L_{X_{P_Z}} h$ is $G$-invariant. Hence

$$\begin{aligned}
\frac{dP_Z(u'(t))}{dt} &= \frac{dP_Z(g(t) \cdot u(t))}{dt} = \frac{dP_Z(u(t))}{dt} \\
&= -(L_{X_{P_Z}} h)(u(t)) = -(L_{X_{P_Z}} h)(g^{-1}(t) \cdot u'(t)) = -(L_{X_{P_Z}} h)(u'(t)).
\end{aligned}$$

Thus $t \mapsto u'(t)$ satisfies the momentum equation for all smooth $G$-invariant vector fields on $Q$ with values in $D$. Similarly, for every $G$-invariant function $f \in C^\infty(Q)$,

$$\frac{d\, \tau_Q^* f(u'(t))}{dt} = \frac{d\, \tau_Q^* f(g(t) \cdot u(t))}{dt} = \frac{d\, \tau_Q^* f(u(t))}{dt} = \langle df \mid u(t) \rangle = \langle df \mid u'(t) \rangle.$$

Hence the curve $t \mapsto u'(t)$ satisfies the second order differential equation condition for all $G$-invariant functions $f \in C^\infty(Q)$.

We now prove the converse. For each $\overline{q} \in \overline{Q}$ there exists an open neighborhood $\overline{U}_1$ of $\overline{q}$ and smooth functions $\{\overline{f}_i\}_{i=1}^m$ on $\overline{Q}$ whose restrictions $\{\overline{f}_i | \overline{U}_1\}_{i=1}^m$ form a chart for $\overline{Q}$. Let $\dim(\mathrm{ver}\, D) = k$. There exists a neighborhood $\overline{U}_2$ of $\overline{q}$ contained in $\overline{U}_1$ such that $\pi^{-1}(\overline{U}_1) = G \times \overline{U}_1$. We can choose $G$-invariant vector fields $\{Z'_i\}_{i=1}^k$ with values in $\mathrm{ver}\, D$ which are linearly independent in $\pi^{-1}(\overline{U}_2)$. If $\dim D_{\overline{Q}} = \ell$, there is a neighborhood $\overline{U}$ of $\overline{q}$ contained in $\overline{U}_2$ and $\ell$ vector fields $\{\overline{Z}_i\}_{i=1}^\ell$ on $\overline{Q}$ with values in $D_{\overline{Q}}$ which are linearly independent on $\overline{U}$. For each $i = 1, \ldots, \ell$, let

$$Z''_i = \Lambda \circ \overline{Z}_i + \mathrm{lift}\, \overline{Z}_i.$$

From equation (39) it follows that the vector fields $\{Z''_i\}_{i=1}^\ell$ have values in $(\mathrm{ver}\, D)^\perp \cap D$. Since $\Lambda$ and $\mathrm{lift}$ are $G$-invariant mappings, we find that the $Z''_i$ are $G$-invariant. Thus $\{Z'_1, \ldots, Z'_k, Z''_1, \ldots, Z''_\ell\}$ are $G$-invariant vector fields on $Q$ with values in $D$ which are linearly independent in $\pi^{-1}(\overline{U})$. Let $n = \dim D$. Since the metric $k$ is $G$-invariant, we can use the Gram-Schmidt orthogonalization together with a partition of unity to construct $n = k + \ell$ vector fields $\{Z_i\}_{i=1}^n$ on $Q$ with values in $D$ which are $G$-invariant and such that at every point $q' \in \pi^{-1}(\overline{U})$, $\{Z_i(q')\}_{i=1}^n$ is an orthonormal basis in $D_{q'}$. Hence for each $q' \in \pi^{-1}(\overline{U})$, every vector $u \in D_{q'}$ is of the form

$$u = c_1 Z_1(q') + \cdots + c_n Z_n(q').$$

Since the vector fields $\{Z_i\}_{i=1}^n$ are orthonormal, it follows that

$$c_i = k(u, Z_i(q')) = P_{Z_i}(u)$$

for every $i = 1, \ldots, n$. Since the momenta $P_Z$ are $G$-invariant, they push forward to functions $\rho_* P_Z$ on $(TQ)/G$. From proposition 3.7.1.50 it follows that $\overline{D}$ is a fiber bundle over $\overline{Q}$. Let $\chi : \overline{D} \to \overline{Q}$ be its projection map. For each $u \in D$, we have $\chi(\rho(u)) = \pi(\tau_Q(u))$. The functions $\{\rho^*\chi^*\overline{f}_1, \ldots, \rho^*\chi^*\overline{f}_m, P_{Z_1}, \ldots, P_{Z_n}\}$ are smooth, $G$-invariant, and independent on $D \cap T(\pi^{-1}(\overline{U}))$. By proposition 3.7.1.48, $\dim D/G = \dim \overline{Q} + \dim(\mathrm{ver}\, D) + \dim D_{\overline{Q}} = m + k + \ell = m + n$. Hence the $m + n$ functions $\{\chi^*\overline{f}_1, \ldots, \chi^*\overline{f}_m, \rho_* P_{Z_1}, \ldots, \rho_* P_{Z_n}\}$ define a chart. This means that, for every function $\overline{f} \in C^\infty(\overline{D})$, there exists a smooth function $F$ on $\mathbb{R}^{m+n}$ such that the restriction of $\overline{f}$ to $\chi^{-1}(\overline{U})$ satisfies

$$\overline{f} \,|\, \chi^{-1}(\overline{U}) = F(\chi^*\overline{f}_1, \ldots, \chi^*\overline{f}_m, \rho_* P_{Z_1}, \ldots, \rho_* P_{Z_n}) \,|\, \chi^{-1}(\overline{U}).$$

Every $G$-invariant smooth function $f$ on $D$ is a pull back of a function $\overline{f} \in C^\infty(\overline{D})$ by the orbit map $\rho : D \to \overline{D}$, that is, $f = \rho^* \overline{f}$. Hence the derivative of $f$ along $t \mapsto u(t) \in D \cap T(\pi^{-1}(\overline{U}))$ is given by

$$\begin{aligned}
\frac{df(u(t))}{dt} = \frac{d\overline{f}(\rho(u(t)))}{dt} &= \frac{d}{dt} F(\chi^*\overline{f}_1, \ldots, \chi^*\overline{f}_m, \rho_* P_{Z_1}, \ldots, \rho_* P_{Z_n})(\rho(u(t))) \\
&= \frac{d}{dt} F(\rho^*\chi^*\overline{f}_1, \ldots, \rho^*\chi^*\overline{f}_m, P_{Z_1}, \ldots, P_{Z_n})(u(t)).
\end{aligned}$$

But $\chi \circ \rho = \pi \circ \tau_Q$ implies that $\rho^*\chi^*\overline{f}_i = \tau_Q^*\pi^*\overline{f}_i$ for all $i = 1, \ldots, m$. Thus $\frac{df}{dt}(u(t))$ is uniquely determined by the momentum equations for the $G$-invariant vector fields $\{Z_i\}_{i=1}^n$ on $Q$ and the second order differential equation condition for the $G$-invariant functions $\{\pi^*\overline{f}_i\}_{i=1}^m$ on $Q$.

From the proof of theorem 3.7.2.51 we see that to determine the reduced equations of motion locally, it suffices to take into account the second order differential equation conditions for $G$-invariant functions $\{\pi^*\overline{f}_i\}_{i=1}^m$ on $Q$ which push forward to local coordinates on $\overline{Q}$ and the momentum equations for $G$-invariant vector fields $\{Z_i\}_{i=1}^n$ on $Q$ which form a local moving frame for the distribution $D$.

Corollary 3.7.2.52.
Let $D$ be a distribution on a Lie group $G$, $k$ a Riemannian metric on $G$, and $V$ a smooth function on $G$ which are invariant under the action of $G$ on itself by left multiplication. Then the $G$-reduced equations of motion for the constrained mechanical system $(\ell, D)$ with Lagrangian $\ell = \tfrac{1}{2}k - V$, where the kinetic energy is the one associated to the kinetic energy metric $k$, $V$ is the potential energy, and $D$ is the constraint distribution, are given by the momentum equations corresponding to left invariant vector fields on $G$ spanning $D$.

Proof. Since the action of $G$ on itself is transitive, it follows that there are no nontrivial second order differential equation conditions. Hence theorem 3.7.2.51 implies that the $G$-reduced equations of motion are given by the momentum equations corresponding to invariant vector fields on $G$ spanning $D$.

3.8 Notes

When $G$ is compact, proposition 3.1.4.4 follows from proposition A4 of Field [44]. The conclusion of corollary 3.1.4.6 should be compared with theorem 1 of Michel [80], which states that for every smooth action of a compact Lie group $G$ on $M$, every smooth $G$-invariant function has $m$ as a critical point if and only if the orbit $G \cdot m$ is an isolated element of the orbit type $M_{G_m}/G$ in the orbit space $M/G$. We note that Michel's theorem holds for proper actions.

We now give a brief history of the theory of reduction for Hamiltonian and nonholonomic systems. For Hamiltonian systems two cases were considered: the regular case where the symmetry group acts properly and freely on phase space with a momentum map and the singular case where the group acts properly and has a momentum map. The regular case is classical, see [77], [79], [109], and [68]; whereas the singular case is of much more recent origin, see [6], [102], [5], [30], [12], [85], and [87].

In the regular case, the theory of reduction for nonholonomic systems was pioneered by Bates and Śniatycki [13], who showed that the distributional Hamiltonian formulation of the equations of motion is preserved under reduction, see also Cushman and Śniatycki [35]. One difficulty with nonholonomic systems is that symmetries need not give rise to conserved quantities. Thus for a nonholonomic system the reduced space is the space of orbits of the symmetry acting on phase space, whereas for Hamiltonian systems it is the space of orbits of the symmetry on the inverse image of a coadjoint orbit under the momentum map. Bloch, Krishnaprasad, Marsden and Murray [16] have formulated a Lagrangian theory of symmetry reduction, which does not preserve the form of the equations of motion.

For the singular case of reduction for nonholonomic systems the text follows the approach of Śniatycki [104], which is based on Cushman and Śniatycki [33] for the Hamiltonian case. It uses the concepts of an accessible set of a generalized distribution, the Stefan-Sussmann theorem [111], [112], and differential spaces [100] to reduce the proof to the regular case. Theorem 3.6.1.41 is the nonholonomic analogue of a theorem of Ortega [85] in the Hamiltonian case. The use of the category of subcartesian spaces in the theory of reduction is taken from [106].

In §3 of chapter 3 Neimark and Fufaev [82] call a system satisfying 1–3 of §3 with $G = \mathbb{R}^n$ a nonholonomic Chaplygin system. The generalization to an arbitrary Lie group $G$ and all the results of this section are due to Koiller [59]. In the case of a classical mechanical system with $\ell = k - V$, where the kinetic energy $k$ is defined by a Riemannian metric $k$ on $Q$ and the infinitesimal $G$-action is $k$-orthogonal to $G$, $D$ is called the mechanical connection of the system, see Marsden [75]. The observation that the vector fields associated to the Lagrange derivative of the reduced Lagrangian and that to the reduced Lagrange derivative have the same zeroes and linearization at each zero is due to Korteweg [62], see §13 and §14.

Theorem 3.6.1.39 is due to Souriau [109], when $L$ is a symplectic manifold, $H = TL$, and $\mathfrak{h} = \mathfrak{g}$.


Chapter 4

Reconstruction, relative equilibria and relative periodic orbits

4.1

Reconstruction

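Suppose that $\Phi : G \times M \to M$ is a smooth proper action of a Lie group $G$ on a smooth manifold $M$. Let $V$ be a smooth $G$-invariant vector field on $M$ with flow $\varphi_t : D_t \to M : m \mapsto \varphi_t(m)$, where $\gamma : I_m \to M : t \mapsto \varphi_t(m)$ is an integral curve of $V$ starting at $m$ with maximal domain of definition $I_m$. Here $D_t = \{ m \in M \mid t \in I_m \}$. Since $V$ is $G$-invariant, $\Phi_g \circ \varphi_t = \varphi_t \circ \Phi_g$ for every $g \in G$. Therefore the flow $\varphi_t$ induces a reduced flow $\overline{\varphi}_t : \overline{D}_t \to \overline{M} = M/G$ on the space $\overline{M}$ of $G$-orbits on $M$. Here $\overline{D}_t = \{ \overline{m} \in \overline{M} \mid t \in I_{\overline{m}} = I_m, \text{ where } m \in \overline{m} \}$. The map $\overline{\varphi}_t$ is the flow of a reduced vector field $\overline{V}$ on $\overline{M}$, thought of as the differential space $(\overline{M}, C^{\infty}(\overline{M}))$, see §4 of chapter 2.

4.1.1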

Reconstruction for proper free actions

Assume that the $G$-action $\Phi$ is free. Then the reduced space $\overline{M}$ is a smooth manifold and the reduced vector field $\overline{V}$ satisfies $T_m\pi\, V(m) = \overline{V}(\pi(m))$ for every $m \in M$. Here $\pi : M \to \overline{M} : m \mapsto \overline{m}$ is the $G$-orbit map.
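As a hedged illustration (this example is ours and is not taken from the text), let $G = S^1$ act on $M = \mathbb{C} \setminus \{0\}$ by rotations. This action is free and proper, and the orbit space can be identified with $(0, \infty)$ through $\pi(z) = |z|$. For a rotation-invariant vector field built from two assumed smooth real functions $a$ and $b$ of $|z|$, and with $z^{\ast}$ denoting the complex conjugate of $z$, the relation between $V$ and the reduced vector field reads:

```latex
\begin{aligned}
V(z) &= \bigl(a(|z|) + i\,b(|z|)\bigr)\,z, \\
T_z\pi\,V(z) &= \frac{\operatorname{Re}\bigl(z^{\ast}\,V(z)\bigr)}{|z|} = a(|z|)\,|z|, \\
\overline{V}(r) &= a(r)\,r, \qquad r = \pi(z) = |z| .
\end{aligned}
```

Only the radial part $a$ survives in the reduced vector field; the angular part $b$ is the information that reconstruction has to recover.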

Suppose that $\overline{\gamma} : I_{\overline{m}} \to \overline{M}$ is an integral curve of the reduced vector field $\overline{V}$ on $\overline{M}$ starting at $\overline{m}$. The purpose of reconstruction is to find an integral curve $\gamma : I_m \to M$ of $V$ which starts at $m \in \overline{m}$ such that $\pi \circ \gamma(t) = \overline{\gamma}(t)$ for every $t \in I_m$. To do this we start with an arbitrary smooth curve $\beta : I_m \to M$ such that $\pi(\beta(t)) = \overline{\gamma}(t)$. For each $t \in I_m$ there is a unique element $g(t) \in G$ such that

$$\gamma(t) = g(t) \cdot \beta(t), \tag{1}$$

because the action of $G$ on $M$ is free. Moreover the curve $I_m \to G : t \mapsto g(t)$ is smooth with $g(0) = e$, the identity element of $G$. Using the inverse $G \times \mathfrak{g} \to TG : (g, \xi) \mapsto T_eL_g\,\xi$ of the standard left trivialization of $TG$, where $L_g : G \to G : g' \mapsto gg'$ is left multiplication by $g$, we transfer the tangent vector $\dot{g}(t) \in T_{g(t)}G$ to an element $\xi(t)$ of the Lie algebra $\mathfrak{g} = T_eG$. Differentiating (1) with respect to $t$ gives

$$\dot{\gamma}(t) = T_{g(t)}\Phi_{\beta(t)}\bigl(T_eL_{g(t)}\,\xi(t)\bigr) + T_{\beta(t)}\Phi_{g(t)}\,\dot{\beta}(t). \tag{2}$$

But $t \mapsto \gamma(t)$ is an integral curve of $V$. So

$$\dot{\gamma}(t) = V(\gamma(t)) = V(g(t) \cdot \beta(t)) = T_{\beta(t)}\Phi_{g(t)}\,V(\beta(t)), \tag{3}$$

since the vector field $V$ is $G$-invariant. Moreover,

$$T_{g(t)}\Phi_{\beta(t)}\bigl(T_eL_{g(t)}\,\xi(t)\bigr) = X_{\mathrm{Ad}_{g(t)}\xi(t)}\bigl(g(t) \cdot \beta(t)\bigr) = T_{\beta(t)}\Phi_{g(t)}\,X_{\xi(t)}(\beta(t)). \tag{4}$$

Substituting (3) and (4) into (2) and cancelling $T_{\beta(t)}\Phi_{g(t)}$ gives

$$X_{\xi(t)}(\beta(t)) = V(\beta(t)) - \dot{\beta}(t). \tag{5}$$

In order to solve equation (5) for $\xi(t)$ we use a connection 1-form $\Theta$ for the $G$-principal bundle $\pi : M \to \overline{M}$. Specifically, $\Theta$ is a smooth $\mathfrak{g}$-valued 1-form on $M$ satisfying

1. for every $(m, \xi) \in M \times \mathfrak{g}$ we have $\Theta(m)(X_\xi(m)) = \xi$;
2. for every $(m, g) \in M \times G$ and every $v_m \in T_mM$, we have $\Theta(g \cdot m)(T_m\Phi_g\, v_m) = \mathrm{Ad}_g\bigl(\Theta(m)(v_m)\bigr)$.

The linear subspace $\mathrm{hor}_m = \ker \Theta(m)$ in $T_mM$ is complementary to $\mathrm{ver}_m = T_m(G \cdot m) = \ker T_m\pi$ in $T_mM$. Moreover, $\mathrm{hor} : M \to TM : m \mapsto \mathrm{hor}_m$ and $\mathrm{ver} : M \to TM : m \mapsto \mathrm{ver}_m$ are smooth vector subbundles of $TM$ called the horizontal and vertical distributions, respectively. From property 2 it follows that $\mathrm{hor}$ is invariant under the $G$-action on $TM$ induced from the $G$-action on $M$. Assuming that $M$ is paracompact, we can piece together connection 1-forms on local trivializations of the bundle $\pi$ by means of a partition of unity in $\overline{M}$ to construct a connection 1-form $\Theta$ on the $G$-principal bundle $M$. Applying $\Theta(\beta(t))$ to both sides of (5) gives

$$\xi(t) = \Theta(\beta(t))\bigl(X_{\xi(t)}(\beta(t))\bigr) = \Theta(\beta(t))\bigl(V(\beta(t)) - \dot{\beta}(t)\bigr). \tag{6}$$

Substituting (6) into (5) gives a differential equation satisfied by β, which is what we wanted.
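As a sketch of how (5) and (6) can be used in practice, assume (this choice is ours and is not made in the text) that the curve $\beta$ starts at $m$ and is horizontal, that is, $\Theta(\beta(t))(\dot{\beta}(t)) = 0$ for every $t$. Then (6) gives $\xi(t)$ explicitly, and $g(t)$ is recovered by solving a differential equation on $G$:

```latex
\begin{aligned}
\xi(t) &= \Theta(\beta(t))\bigl(V(\beta(t))\bigr), \\
\dot{g}(t) &= T_eL_{g(t)}\,\xi(t), \qquad g(0) = e, \\
\gamma(t) &= g(t)\cdot\beta(t).
\end{aligned}
```

When $G$ is connected and commutative, for instance a torus, the second equation integrates to $g(t) = \exp \int_0^t \xi(s)\, ds$.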

4.1.2

Reconstruction for nonfree proper actions

Let $H$ be a compact subgroup of $G$ and let $N$ be a connected component of the isotropy type $M_H = \{ m \in M \mid G_m = H \}$. By item 1 of theorem 2.3.3.7, the submanifold $N$ is smooth. It is also invariant under the flow of the vector field $V$. Because the subgroup $N(H) = \{ g \in G \mid gHg^{-1} = H \}$ of $G$ preserves $M_H$, the free and proper action of $G_{M_H} = N(H)/H$ on $M_H$ is a symmetry of $V|M_H$. More precisely, if $G_N = \{ g \in G_{M_H} \mid g \cdot n \in N \text{ for every } n \in N \}$, then $G_N$ acts freely and properly on $N$ and is a symmetry of $V|N$. Thus there is a unique smooth reduced vector field $\overline{V} = \pi_*V$ on $\overline{N} = \pi(N)$, which is contained in the orbit type $\overline{M}_{(H)} = \pi(M_{(H)})$ in the orbit space $\overline{M}$, such that $\overline{V}(\pi(n)) = T_n\pi\, V(n)$ for every $n \in N$. If $H$ is conjugate to $H'$ by an element $g \in G$, then the flows of the vector fields $V|N$ and $V|\Phi_g N$ are conjugate by $\Phi_g$. Therefore the reduced vector field $\overline{V}|\overline{N}$ does not depend on the choice of $H$ in the conjugacy class $(H)$. The process of reconstruction described in §1.1 can now be applied to each connected component of an orbit type in the orbit space.

4.1.3

Application to nonholonomic systems

Let us now apply the above theory of reconstruction to a distributional Hamiltonian system $(D, H, \varpi, h)$ which has a symmetry group $G$ that acts properly on $D$. Let $\gamma : I \to D : t \mapsto u(t)$ be an integral curve of the distributional Hamiltonian vector field $Y_h$, which starts at $u_0 \in D$. Let $L$ be an accessible set of the distribution $H$ on $D$, which passes through $u_0$. The image of $\gamma$ lies in $L$. Thus we can restrict our attention to the distributional Hamiltonian system $(D_L, H_L, \varpi_L, h|L)$. Let $K = G_{u_0}$. Then the stability group $\mathrm{stab}_L = \{ g \in G \mid g \cdot u \in L \text{ for every } u \in L \}$, which by hypothesis is a closed subgroup of $G$, acts properly on $L$. The group $K$ is a subgroup of $\mathrm{stab}_L$. Let $N$ be a connected component containing $u_0$ of the submanifold $L_K$ of points of $L$ whose isotropy group in $\mathrm{stab}_L$ is $K$. Then the integral curve $\gamma$ of $Y_h|N$ is contained in $N$.

Additional restrictions on the motion $\gamma$ are provided by the level set of the momentum map $J : D \to \mathfrak{h}^*$. Let $\alpha_N = J(u_0)$. Let $\mathrm{stab}_N$ be the stability group of $N$, which by hypothesis is a closed subgroup of $G$. The group $G_N = \mathrm{stab}_N/K$ acts freely and properly on $N$ and is a symmetry group of the distributional Hamiltonian system $(N, H_N, \varpi_N, h|N)$. Moreover, $G_N$ has a momentum map $J_N : N \to \mathfrak{h}_N^*$ such that the connected component $M$ of $N \cap J^{-1}(\alpha_N)$ containing $u_0$ and the connected component of $J_N^{-1}(\alpha_N)$ coincide. The distributional Hamiltonian system $(M, H_M, \varpi_M, h|M)$ has a symmetry group $G_M = \mathrm{stab}_M/K$, which acts freely and properly on $M$ and has a momentum map $J_M : M \to \mathfrak{h}_M^*$. Let $\beta_M = J_M(u_0)$. The image of the curve $\gamma$ is contained in a connected component $P$ of $J^{-1}(\beta_M)$. The isotropy group $G_{\beta_M} = \{ g \in G_M \mid \mathrm{Ad}^t_{g^{-1}}\beta_M = \beta_M \}$ acts freely and properly on $P$ with smooth orbit space $\overline{P}$ and orbit map $\rho_P : P \to \overline{P}$, which is a surjective submersion. Thus $G_{\beta_M}$ is a symmetry of the distributional Hamiltonian system $(P, H_P, \varpi_P, h|P)$, which may be reduced to give a distributional Hamiltonian system $(\overline{P}, H_{\overline{P}}, \varpi_{\overline{P}}, h_{\overline{P}})$, where $H_{\overline{P}}$ is a generalized $\varpi_{\overline{P}}$-symplectic distribution on $\overline{P}$. Therefore the integral curve $\gamma$ of $Y_{h|P}$ on $P$ projects under the $G_{\beta_M}$-orbit map $\rho_P$ to an integral curve $\overline{\gamma} : I \to \overline{P} : t \mapsto \overline{u}(t)$ of the reduced distributional Hamiltonian vector field $Y_{h_{\overline{P}}}$ on $\overline{P}$, which starts at $\overline{u}_0$. This curve can be reconstructed using the procedure described in §1.1.

4.2

Relative equilibria

In this section we study the relative equilibria of a G-invariant vector field V on a smooth manifold M . We assume that the G-action on M is proper.

4.2.1

Basic properties

Before we can state what a relative equilibrium is we prove

Lemma 4.2.1.1. Suppose that $G$ has countably many connected components. The following statements are equivalent.

1. The vector $V(m)$ is tangent at $m$ to the $G$-orbit $G \cdot m$ through $m$, that is, there is a $\xi \in \mathfrak{g}$ such that $V(m) = X_\xi(m)$.
2. The integral curve $\gamma_m : I_m \to M$ of $V$, which starts at $m$, is equal to the $G$-action restricted to a one parameter subgroup $t \mapsto \exp t\xi$ of $G$. In other words, $I_m = \mathbb{R}$ and $\gamma_m(t) = \Phi_m(\exp t\xi) = \exp t\xi \cdot m$ for every $t \in I_m$.
3. $\{ \gamma_m(t) \mid t \in \mathbb{R} \} \subseteq G \cdot m$.
4. $\overline{m} = G \cdot m$ is an equilibrium point of the reduced flow on $\overline{M}$, that is, $\overline{\gamma}_{\overline{m}}(t) = \overline{m}$ for every $t \in I_{\overline{m}}$, where $m \in \overline{m}$.

Proof. item 1 $\iff$ item 2. Suppose that $V(m) = X_\xi(m)$. Let $\varphi_t : D_t \to M$ be the flow of the vector field $V$. Then $X_\xi(\varphi_t(m)) = T_m\varphi_t\, X_\xi(m) = T_m\varphi_t\, V(m) = V(\varphi_t(m))$, where the first equality follows because $\Phi_{\exp s\xi} \circ \varphi_t = \varphi_t \circ \Phi_{\exp s\xi}$ for every $s \in \mathbb{R}$. Therefore

$$\frac{d\varphi_t(m)}{dt} = V(\varphi_t(m)) = X_\xi(\varphi_t(m))$$

and $\varphi_0(m) = m$ implies $\varphi_t(m) = \Phi_{\exp t\xi}(m)$ for every $t \in I_m$. Because $t \mapsto \exp t\xi$ is defined for all $t \in \mathbb{R}$, we deduce that $I_m = \mathbb{R}$. Therefore item 1 $\Rightarrow$ item 2. Differentiating $\gamma_m(t) = \Phi_m(\exp t\xi)$ with respect to $t$ and then setting $t = 0$ gives $V(m) = X_\xi(m)$. Therefore item 2 $\Rightarrow$ item 1.

That item 2 $\Rightarrow$ item 3 is obvious. As is item 3 $\iff$ item 4.

The implication item 3 $\Rightarrow$ item 1 is obvious if the $G$-orbit $G \cdot m$ is an embedded submanifold of $M$. To prove it in general we argue as follows. Let $\mathfrak{q}$ be a complement to $\mathfrak{g}_m = T_eG_m$ in $\mathfrak{g}$. Let $Q$ be a submanifold of $G$ such that $e \in Q$ and $T_eQ = \mathfrak{q}$. Similarly, let $E$ be a complementary subspace in $T_mM$ to $T_m(G \cdot m) = \mathrm{span}\{ X_\xi(m) \mid \xi \in \mathfrak{g} \}$. Let $\mathcal{E}$ be a smooth submanifold of $M$ containing $m$ such that $T_m\mathcal{E} = E$. Then the tangent at $(e, m)$ of the map $\psi : Q \times \mathcal{E} \to M : (q, x) \mapsto q \cdot x$ is a bijective linear map of $\mathfrak{q} \times E$ onto $T_mM$. Shrinking $Q$ and $\mathcal{E}$ if necessary, from the inverse function theorem it follows that $\psi$ is a diffeomorphism of $Q \times \mathcal{E}$ onto an open neighborhood $U$ of $m$ in $M$.

Suppose that $\{x_j\}$ is a sequence in $\mathcal{E} \cap (G \cdot m)$ which converges to $x \in \mathcal{E} \cap (G \cdot m)$ both in the topology of $\mathcal{E}$ and in the orbit topology of $G \cdot m$. The latter means that if $x_j = g_j \cdot m$ for a sequence $\{g_j\} \subseteq G$ which converges in $G$ to $g \in G$, then $x = g \cdot m$. Because the tangent to the map $\psi_x : Q \to M : q \mapsto q \cdot x$ at $e$ is injective, $\mathfrak{q} \cap \mathfrak{g}_x = \{0\}$. Now $x = g \cdot m$ implies that $\mathfrak{g}_x = g\, \mathfrak{g}_m\, g^{-1}$. Therefore $\dim \mathfrak{g}_x = \dim \mathfrak{g}_m = \dim \mathfrak{g} - \dim \mathfrak{q}$. Thus the map $\rho : Q \times G_x \to G : (q, h) \mapsto q \cdot h$ is a diffeomorphism from an open neighborhood of $(e, e)$ onto an open neighborhood of $e$ in $G$. Because $g_j g^{-1} \to e$ in $G$ as $j \to \infty$, for sufficiently large $j$ we can write $g_j g^{-1} = q_j h_j$ where $q_j \in Q$ and $h_j \in G_x$. Moreover, both of the sequences $\{q_j\}$ and $\{h_j\}$ converge to $e$.
So

$$e \cdot x_j = x_j = g_j \cdot m = g_j g^{-1} \cdot x = q_j h_j \cdot x = q_j \cdot x, \tag{7}$$

where the last equality follows because $h_j \in G_x$. Using the injectivity of the mapping $\psi$, from (7) we conclude that $x_j = x$ for sufficiently large $j$.

Suppose that $\mathcal{E}_0$ is a compact neighborhood of $m$ in $\mathcal{E}$. Let $K$ be a compact subset of $G$. If $\mathcal{E}_0 \cap (K \cdot m)$ is infinite, there is an infinite sequence $\{g_j\}$ of elements of $K$ such that $x_j = g_j \cdot m$ are distinct elements of $\mathcal{E}_0$. Passing to a subsequence we may assume that $g_j \to g$ for some $g \in G$. The sequence $\{x_j\}$ converges to $x \in \mathcal{E}_0$, because $\mathcal{E}_0$ is compact. But this contradicts the conclusion that $x_j = x$ if $j$ is sufficiently large. Because $G$ has a countable number of connected components, it is the union of a countable number of compact subsets $K_n$, see theorem 1.9.1 of [42]. Therefore for each positive integer $n$ we know that $\mathcal{E}_0 \cap (K_n \cdot m)$ is a finite set. Thus $\mathcal{E}_0 \cap (G \cdot m)$ is countable.

There is an open interval $I \subseteq I_m$ containing $0$ and smooth curves $I \to Q : t \mapsto q(t)$ and $I \to \mathcal{E}_0 : t \mapsto x(t)$ such that $\varphi_t(m) = q(t) \cdot x(t)$ for every $t \in I$. Under the assumption that $\gamma_m(t) = \varphi_t(m) \in G \cdot m$ for every $t \in I_m$, it follows that $x(t) \in \mathcal{E}_0 \cap (G \cdot m)$ for every $t \in I$. Because $\mathcal{E}_0 \cap (G \cdot m)$ is countable, this implies that $t \mapsto x(t)$ is constant and therefore $x(t) = x(0) = m$. Hence

$$\varphi_t(m) = q(t) \cdot m \tag{8}$$

for every $t \in I$. Differentiating (8) with respect to $t$ and then setting $t = 0$ gives $V(m) = X_\xi(m)$, where $\xi = q'(0) \in \mathfrak{q} \subseteq \mathfrak{g}$. Therefore item 3 $\Rightarrow$ item 1.

A point $m \in M$ (and also the integral curve $\gamma_m$ of the $G$-invariant vector field $V$ on $M$ which starts at $m$) is a relative equilibrium of $V$ if and only if one (and hence each) of the conditions 1 – 4 of lemma 4.2.1.1 is satisfied. If item 1 is satisfied then we say that $\xi \in \mathfrak{g}$ is a generator of the relative equilibrium $m$ of the $G$-invariant vector field $V$.

Corollary 4.2.1.2. An equilibrium point $m$ of $V$ with $V(m) = X_\xi(m)$ is a relative equilibrium if and only if $\xi \in \mathfrak{g}_m$.

Proof. $\Rightarrow$. Take $\xi = 0$ in item 1. $\Leftarrow$. If $m$ is an equilibrium point of $V$, which is a relative equilibrium, then $0 = X_\xi(m)$. So $\exp t\xi \cdot m = m$ for every $t \in \mathbb{R}$, that is, $\exp t\xi \in G_m$ for every $t \in \mathbb{R}$. Therefore $\xi \in \mathfrak{g}_m$, which is the Lie algebra of the isotropy group $G_m$.

Corollary 4.2.1.3. If $m$ is a relative equilibrium of $V$ generated by $\xi \in \mathfrak{g}$ and the one parameter group $\mathbb{R} \to G : t \mapsto \exp t\xi$ is periodic with set of periods $\mathrm{Per}_m = \{ t \in \mathbb{R} \mid \exp t\xi \cdot m = m \}$, then $\mathrm{Per}_m$ is equal to i) $\mathbb{R}$, or ii) $\mathbb{Z} \cdot \tau$ for a unique $\tau > 0$, or iii) $\{0\}$.

Proof. Note that $\mathrm{Per}_m$ is a closed subgroup of $\mathbb{R}$. Hence it is one of the three possibilities listed in the statement of the corollary. In case 1 the integral curve $\gamma_m$ of $V$ starting at $m$ is an equilibrium point of $V$. In case 2

it is a nonconstant periodic integral curve with primitive period $\tau$. In case 3 the relative equilibrium $\gamma_m$ is not periodic.

4.2.2

Quasiperiodic relative equilibria

A smooth curve $\gamma : \mathbb{R} \to M$ is quasiperiodic with at most $k$ frequencies if and only if there is a smooth map $\Gamma : \mathbb{R}^k/\mathbb{Z}^k \to M$ from the standard $k$-torus $\mathbb{R}^k/\mathbb{Z}^k$ to $M$ and a vector $\nu \in \mathbb{R}^k$, called the frequency vector, such that $\gamma(t) = \Gamma(t\nu + \mathbb{Z}^k)$ for every $t \in \mathbb{R}$. Let $T$ be a torus, that is, a connected, compact, commutative Lie group. Suppose that $\mathfrak{t}$ is its Lie algebra. Then the exponential map $\exp : \mathfrak{t} \to T$ is a surjective homomorphism of Lie groups from the additive group $(\mathfrak{t}, +)$ onto $T$. Its kernel $\Lambda = \ker \exp$ is an additive subgroup of the vector space $\mathfrak{t}$. There exist $\{\lambda_j\}_{j=1}^k \subseteq \Lambda$, which are linearly independent over $\mathbb{R}$, such that every element $\lambda$ of $\Lambda$ can be written uniquely as $\lambda = \sum_{j=1}^k z_j\lambda_j$ for some $z = (z_1, \ldots, z_k) \in \mathbb{Z}^k$. The mapping $\mathbb{Z}^k \to \Lambda : z \mapsto \sum_{j=1}^k z_j\lambda_j$ is bijective and is called a $\mathbb{Z}$-basis of $\Lambda$. Because $\mathbb{Z}^k$ is an integer lattice of $\mathbb{R}^k$, we call $\Lambda$ an integer lattice of $\mathfrak{t}$.
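The definition of a quasiperiodic curve can be made concrete with a small numerical sketch. The map `torus_map` below is a hypothetical choice of $\Gamma$ (an embedding of the 2-torus in $\mathbb{R}^4$), and the frequency vectors are illustrative; none of these names come from the text.

```python
import math

def torus_map(theta):
    """Hypothetical smooth map Gamma: R^2/Z^2 -> R^4 (an embedded 2-torus).

    Each angle theta_j enters only through cos and sin of 2*pi*theta_j, so
    the value depends only on the class theta + Z^2.
    """
    return tuple(
        f(2.0 * math.pi * th) for th in theta for f in (math.cos, math.sin)
    )

def quasiperiodic_curve(nu, t):
    """gamma(t) = Gamma(t * nu + Z^k) for the frequency vector nu."""
    return torus_map([t * nu_j for nu_j in nu])

def close(p, q, tol=1e-9):
    """Componentwise comparison of two points up to a tolerance."""
    return all(abs(a - b) <= tol for a, b in zip(p, q))

# Frequency vector with rationally independent entries (illustrative choice).
nu = (1.0, math.sqrt(2.0))

# Gamma is well defined on the torus: integer shifts of theta do not matter.
theta = (0.3, 0.7)
shifted = (theta[0] + 2, theta[1] - 5)
well_defined = close(torus_map(theta), torus_map(shifted))

# With the rationally dependent vector (1, 1) the curve has period 1.
periodic = close(quasiperiodic_curve((1.0, 1.0), 0.25),
                 quasiperiodic_curve((1.0, 1.0), 1.25))
```

The same curve can be quasiperiodic with different numbers of frequencies; the curve with frequency vector `(1.0, 1.0)` above is already periodic, so one frequency suffices for it.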

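Let $\ell = \dim \mathfrak{t}$. We extend $\{\lambda_j\}_{j=1}^k$ to an $\mathbb{R}$-basis $\{\lambda_j\}_{j=1}^{\ell}$ of $\mathfrak{t}$. Consider the mapping $\widetilde{\psi} : \mathbb{R}^{\ell} \to T : (\theta_1, \ldots, \theta_{\ell}) \mapsto \exp \sum_{j=1}^{\ell} \theta_j\lambda_j$. Then $\widetilde{\psi}$ descends to a bijective homomorphism of Lie groups from $\mathbb{R}^k/\mathbb{Z}^k \times \mathbb{R}^{\ell-k}$ onto $T$, whose inverse is continuous. Hence $\mathbb{R}^k/\mathbb{Z}^k \times \mathbb{R}^{\ell-k}$ is compact, which implies $\ell = k$. Thus $\widetilde{\psi}$ induces an isomorphism $\psi$ from the Lie group $\mathbb{R}^k/\mathbb{Z}^k$ onto $T$. Consequently, every $\mathbb{Z}$-basis of $\Lambda$ is an $\mathbb{R}$-basis of $\mathfrak{t}$. Therefore $\{\lambda'_i\}_{i=1}^{k'}$ is a $\mathbb{Z}$-basis of $\Lambda$ if and only if $k' = \dim \mathfrak{t} = k$ and $\lambda'_i = \sum_{j=1}^k A_{ji}\lambda_j$ for every $1 \le i \le k$. The entries $(A_{ij})$ of the $k \times k$ matrix $A$ are integers. Moreover, $A$ is invertible and its inverse has integer entries. Therefore $\det A = \pm 1$. In other words, $A \in \mathrm{Gl}(k, \mathbb{Z})$. The mapping $\widetilde{\psi}$ above is not unique; the freedom is precisely the choice of a $\mathbb{Z}$-basis of $\Lambda$.

Lemma 4.2.2.4. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. If $\xi \in \mathfrak{g}$, then $\gamma : \mathbb{R} \to G : t \mapsto \exp t\xi$ is either a dense one parameter subgroup of a torus subgroup $T$ of $G$ or the mapping $\gamma$ is proper.

Proof. Let $T$ be the closure in $G$ of the set $S = \{ \exp t\xi \mid t \in \mathbb{R} \}$. Because $S$ is a commutative, connected subgroup of $G$ so is $T$. Since $T$ is a closed subgroup of $G$, it is a Lie group, which is isomorphic to $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^{\ell}$.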

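Because the image of $\gamma$ lies in $T$, it follows that $\xi = \gamma'(0)$ lies in the Lie algebra $\mathfrak{t}$ of $T$. Let $(\eta, \zeta) \in \mathbb{R}^k \times \mathbb{R}^{\ell}$ be the corresponding element in the Lie algebra of $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^{\ell}$. Then $\exp t\xi$ corresponds to $(t\eta + \mathbb{Z}^k, t\zeta)$ in $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^{\ell}$. Since $S$ is dense in $T$, it follows that $U = \{ (t\eta + \mathbb{Z}^k, t\zeta) \mid t \in \mathbb{R} \}$ is dense in $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^{\ell}$. If $\zeta \neq 0$, then the map $\mathbb{R} \to (\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^{\ell} : t \mapsto (t\eta + \mathbb{Z}^k, t\zeta)$ is proper, because the projection $\mathbb{R}^{\ell} \to \mathbb{R} : t\zeta \mapsto t$ is proper. Hence the mapping $\mathbb{R} \to T : t \mapsto \exp t\xi$ is proper. Because $T$ is a closed subset of $G$ it follows that $\gamma$ is proper. If $\zeta = 0$, then $U$ is contained in the torus subgroup $(\mathbb{R}^k/\mathbb{Z}^k) \times \{0\}$ of $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^{\ell}$. Because $U$ is dense in $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^{\ell}$, it follows that $\ell = 0$. Therefore $T$ is a torus subgroup of $G$.

An element $\xi \in \mathfrak{g}$ is elliptic if and only if the image of $\gamma : \mathbb{R} \to G : t \mapsto \exp t\xi$ is dense in a torus subgroup of $G$.

Lemma 4.2.2.5. Let $\xi$ be an elliptic element of the Lie algebra $\mathfrak{g}$ of the Lie group $G$. Then

1. The closure in $G$ of $\{ \exp t\xi \mid t \in \mathbb{R} \}$ is a torus subgroup $T$ of $G$. Also $\xi \in \mathfrak{t}$, the Lie algebra of $T$.
2. For every $\mathbb{Z}$-basis $\{\lambda_j\}_{j=1}^k$ of the integer lattice of $\mathfrak{t}$, the mapping $\widetilde{\psi} : \mathbb{R}^k \to T : (\theta_1, \ldots, \theta_k) \mapsto \exp \sum_{j=1}^k \theta_j\lambda_j$ induces an isomorphism $\psi : \mathbb{R}^k/\mathbb{Z}^k \to T$ of Lie groups from the standard $k$-torus $\mathbb{R}^k/\mathbb{Z}^k$ onto $T$.
3. Let $\nu = (\nu_1, \ldots, \nu_k) \in \mathbb{R}^k$. Then $U = \{ t\nu + \mathbb{Z}^k \mid t \in \mathbb{R} \}$ is dense in $\mathbb{R}^k/\mathbb{Z}^k$ if and only if the real numbers $\{\nu_j\}_{j=1}^k$ are linearly independent over $\mathbb{Q}$.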

4. Let $\xi = (\xi_1, \ldots, \xi_k)$ with respect to the $\mathbb{R}$-basis $\{\lambda_j\}_{j=1}^k$ of $\mathfrak{t}$. Then the real numbers $\{\xi_j\}_{j=1}^k$ are linearly independent over $\mathbb{Q}$ and $U = \{ t\xi + \mathbb{Z}^k \mid t \in \mathbb{R} \}$ is dense in $\mathbb{R}^k/\mathbb{Z}^k$.

Proof. 1. This is just the definition of elliptic element.

2. This follows from lemma 4.2.2.4.

3. Suppose that $0 = \sum_{j=1}^k q_j\nu_j$ for some $q_j \in \mathbb{Q}$, not all zero. Clearing denominators gives $0 = \sum_{j=1}^k z_j\nu_j$ for $z_j \in \mathbb{Z}$ not all zero. Therefore we obtain a nonzero linear map $\vartheta : \mathbb{R}^k \to \mathbb{R} : (\theta_1, \ldots, \theta_k) \mapsto \sum_{j=1}^k z_j\theta_j$, which maps $\mathbb{Z}^k$ into $\mathbb{Z}$. So $\vartheta$ induces a nonzero homomorphism $\widetilde{\vartheta} : \mathbb{R}^k/\mathbb{Z}^k \to \mathbb{R}/\mathbb{Z}$. Clearly, $U \subseteq \ker \widetilde{\vartheta}$. But $\ker \widetilde{\vartheta}$ is a closed subgroup of $\mathbb{R}^k/\mathbb{Z}^k$ of dimension $k - 1$. Consequently, $U$ is not dense in $\mathbb{R}^k/\mathbb{Z}^k$. Conversely, if $U$ is not dense in $\mathbb{R}^k/\mathbb{Z}^k$, then item 1 implies that the closure of $U$ in $\mathbb{R}^k/\mathbb{Z}^k$ is a subtorus $S$ of $\mathbb{R}^k/\mathbb{Z}^k$ of dimension $\ell < k$. Therefore $(\mathbb{R}^k/\mathbb{Z}^k)/S$ is a torus of dimension $k - \ell > 0$. Composing the isomorphism $\lambda : (\mathbb{R}^k/\mathbb{Z}^k)/S \to \mathbb{R}^{k-\ell}/\mathbb{Z}^{k-\ell}$ with the projection $\mathbb{R}^{k-\ell}/\mathbb{Z}^{k-\ell} \to \mathbb{R}/\mathbb{Z}$ on the first factor gives a surjective homomorphism $\mu$ of $(\mathbb{R}^k/\mathbb{Z}^k)/S$ onto $\mathbb{R}/\mathbb{Z}$. Therefore $\vartheta = \mu \circ \lambda : \mathbb{R}^k/\mathbb{Z}^k \to \mathbb{R}/\mathbb{Z}$ is a surjective homomorphism such that $S \subseteq \ker \vartheta$. The tangent map $T\vartheta$ of $\vartheta$ is a linear map from $\mathbb{R}^k$ to $\mathbb{R}$ which sends $\mathbb{Z}^k$ to $\mathbb{Z}$. Therefore $T\vartheta = (z_1, \ldots, z_k)$ where $z_j \in \mathbb{Z}$ for $1 \le j \le k$. Because $\nu \in \mathfrak{s}$, the Lie algebra of $S$, and $\mathfrak{s} \subseteq \ker T\vartheta$, it follows that $0 = \sum_{j=1}^k z_j\nu_j$. This completes the proof of item 3.

4. This follows from item 3, $L = \psi^{-1}(S)$, and the fact that $\psi^{-1}$ is continuous.

We are now in position to state the following proposition about quasiperiodic relative equilibria.

Proposition 4.2.2.6. Suppose that $m$ is a relative equilibrium of a $G$-invariant vector field $V$ on a smooth manifold $M$ generated by $\xi \in \mathfrak{g}$. If $\xi$ is an elliptic element of $\mathfrak{g}$, then the integral curve $\gamma : I_m \to M$ of $V$ which starts at $m$ is quasiperiodic. Moreover, the minimal number of frequencies of $\gamma$ is equal to the dimension of the closure in $M$ of $\{ \gamma(t) \in M \mid t \in \mathbb{R} \}$.

Proof. Because $m$ is a relative equilibrium generated by $\xi$, it follows that $\gamma(t) = \exp t\xi \cdot m = \Phi_m(\exp t\xi)$. Therefore the closure of $\{ \gamma(t) \in M \mid t \in \mathbb{R} \}$ in $M$ is the image under $\Phi_m$ of the closure $T$ in $G$ of the one parameter subgroup generated by $\xi$. Since $\xi$ is elliptic, $T$ is a torus. If $\{\lambda_j\}_{j=1}^k$, where $k = \dim T$, is a $\mathbb{Z}$-basis of the integer lattice $\Lambda = \ker \exp$ of $\mathfrak{t}$, the Lie algebra of $T$, then we have a mapping

$$\Gamma : \mathbb{R}^k/\mathbb{Z}^k \to M : (\theta_1, \ldots, \theta_k) \mapsto \exp\Bigl(\sum_{j=1}^k \theta_j\lambda_j\Bigr) \cdot m.$$

With respect to the $\mathbb{R}$-basis $\{\lambda_j\}_{j=1}^k$ of $\mathfrak{t}$ write $\xi = \sum_{j=1}^k \nu_j\lambda_j$. Then the real numbers $\{\nu_j\}_{j=1}^k$ are linearly independent over $\mathbb{Q}$, since $\xi$ is elliptic, see item 3 in lemma 4.2.2.5. From item 4 of lemma 4.2.2.5 it follows that $\{ t\nu + \mathbb{Z}^k \mid t \in \mathbb{R} \}$ is dense in $\mathbb{R}^k/\mathbb{Z}^k$. Because $\gamma(t) = \exp t\xi \cdot m = \Gamma(t\nu + \mathbb{Z}^k)$, we see that $\{ \gamma(t) \in M \mid t \in \mathbb{R} \}$ is dense in $\Gamma(\mathbb{R}^k/\mathbb{Z}^k)$. Since $\Gamma(\mathbb{R}^k/\mathbb{Z}^k)$ is a compact, and therefore closed, subset of $M$, $\Gamma(\mathbb{R}^k/\mathbb{Z}^k)$ is equal to the closure of $\{ \gamma(t) \in M \mid t \in \mathbb{R} \}$ in $M$.

Let $\widetilde{\psi} : \mathbb{R}^k \to T$ be the map constructed in item 2 of lemma 4.2.2.5. Then $\Gamma^{-1}(m) = \widetilde{\psi}^{-1}(T \cap G_m)$ is a sublattice $\Lambda_0$ of the integer lattice $\Lambda$ of $\mathfrak{t}$. Thus there is an induced isomorphism $\vartheta$ of Lie groups from $T_0 = \mathfrak{t}/\Lambda_0$ onto $T/(T \cap G_m)$. Because $T/(T \cap G_m)$ is a compact, connected and commutative group, it is a torus. In fact, $\widetilde{\Gamma} = \Phi_m \circ \vartheta$ is an injective immersion of $T_0$ into $M$. Since $T_0$ is compact, $\widetilde{\Gamma}$ is an embedding.

Let $\ell$ be the minimal number of frequencies attained by a smooth map $\delta : \mathbb{R}^{\ell}/\mathbb{Z}^{\ell} \to M$ with $\gamma(t) = \delta(t\omega + \mathbb{Z}^{\ell})$, where $\omega \in \mathbb{R}^{\ell}$. The minimality of $\ell$ implies that $\{ t\omega + \mathbb{Z}^{\ell} \mid t \in \mathbb{R} \}$ is dense in $\mathbb{R}^{\ell}/\mathbb{Z}^{\ell}$, using the proof of item 3 in lemma 4.2.2.5. Therefore the image of $\delta$ is equal to the closure of $\{ \gamma(t) \in M \mid t \in \mathbb{R} \}$ in $M$. So $\delta(\mathbb{R}^{\ell}/\mathbb{Z}^{\ell}) = \widetilde{\Gamma}(T_0)$. By Sard's theorem the set of regular values of $\delta$ has full measure. Thus there is a point in $\mathbb{R}^{\ell}/\mathbb{Z}^{\ell}$ where the tangent map of $\delta$ is surjective. Consequently, $\ell \ge \dim T_0$. Thus $\dim T_0$ is the minimal number of frequencies.

Corollary 4.2.2.7. If the $G$-action is free at $m$, then $T_0 = \mathbb{R}^k/\mathbb{Z}^k$, $\widetilde{\Gamma} = \Gamma$, and the minimal number of frequencies is $k$.

4.2.3

Runaway relative equilibria

A continuous curve $\gamma : I \to M$ with $I$ an open interval in $\mathbb{R}$ is a runaway curve in $M$ if and only if the map $\gamma$ is proper, that is, if $K$ is a compact subset of $M$, then $\gamma^{-1}(K)$ is a compact subset of $I$. Because $\gamma$ is continuous, the inverse image of every closed subset of $M$ is a closed subset of $I$. Since compact subsets of $I$ are closed subsets which are contained in a closed interval $[a, b]$ for some $a, b \in I$, it follows that $\gamma$ is proper if and only if for every compact subset $K$ of $M$ there are $a, b \in I$ such that $\gamma(t) \in K$ implies $t \in [a, b]$. Equivalently, $t \in I \setminus [a, b]$ implies $\gamma(t) \notin K$. In other words, $t \mapsto \gamma(t)$ runs out of every compact subset of $M$ as $t$ runs to either end of $I$.

Proposition 4.2.3.8. Let $m$ be a relative equilibrium of a $G$-invariant vector field $V$ on a smooth manifold $M$ generated by $\xi \in \mathfrak{g}$. Suppose that the $G$-action is proper. The following statements are equivalent.

1. $\xi$ is not an elliptic element of $\mathfrak{g}$.
2. An integral curve $\gamma : I_m \to M$ of the vector field $V$ starting at $m$ is a runaway curve on $M$.
3. $\gamma$ is not quasiperiodic.

Proof. item 1 $\Rightarrow$ item 2. Suppose that $\xi$ is not elliptic. Then the one parameter group $\lambda : \mathbb{R} \to G : t \mapsto \exp t\xi$ is proper. Because the map $\Phi_m$ is proper by hypothesis, it follows that the map $\Phi_m \circ \lambda : \mathbb{R} \to M : t \mapsto \exp t\xi \cdot m = \gamma(t)$ is proper. Therefore item 1 $\Rightarrow$ item 2.

$\neg$ item 3 $\Rightarrow$ $\neg$ item 2.¹ If $\gamma$ is quasiperiodic, then its image is contained in $\Gamma(\mathbb{R}^k/\mathbb{Z}^k)$, see the proof of proposition 4.2.2.6. But $\Gamma(\mathbb{R}^k/\mathbb{Z}^k)$ is compact and the map $\Gamma$ is continuous. Therefore $\gamma$ is not a runaway curve.

$\neg$ item 1 $\Rightarrow$ $\neg$ item 3. This follows from item 1 of lemma 4.2.2.5.

Example 4.2.3.9. Let $G = \mathbb{R} \times \mathbb{C}$ with multiplication $(\xi, \zeta) \cdot (\xi', \zeta') = (\xi + \xi', e^{i\xi}\zeta' + \zeta)$. The only elliptic element of $\mathfrak{g}$ is $0$. Let $M = \mathbb{R} \times \mathbb{R} \times \mathbb{C}$ on which $G$ acts by

$$\bigl((\xi, \zeta), (a, x, z)\bigr) \mapsto (a, \xi + x, e^{i\xi}z + \zeta).$$

The action of $G$ on $M$ is proper and free. The $G$-orbit space $\overline{M}$ is $\mathbb{R}$ with orbit map $\pi : M \to \overline{M} : (a, x, z) \mapsto a$. On $M$ define a vector field $V$ by $V(a, x, z) = (0, a, e^{ix})$. Then $V$ is $G$-invariant. An integral curve $\gamma : \mathbb{R} \to M : t \mapsto (a(t), x(t), z(t))$ of $V$ starting at $m = (a(0), x(0), z(0))$ is given by

$$a(t) = a(0), \qquad x(t) = x(0) + t\,a(0), \qquad z(t) = z(0) + \begin{cases} \dfrac{1}{i\,a(0)}\bigl(e^{i a(0) t} - 1\bigr)e^{i x(0)}, & \text{if } a(0) \neq 0 \\[1ex] t\,e^{i x(0)}, & \text{if } a(0) = 0. \end{cases}$$
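The closed-form integral curve displayed above can be checked numerically. The following is a minimal sketch: it integrates $V(a, x, z) = (0, a, e^{ix})$ with a classical Runge-Kutta scheme and compares the result with the formula. The function names, the step count and the sample initial conditions are illustrative assumptions, not part of the text.

```python
import cmath

def vector_field(a, x, z):
    """V(a, x, z) = (0, a, e^{ix}) from Example 4.2.3.9 (z is not used)."""
    return 0.0, a, cmath.exp(1j * x)

def exact_curve(a0, x0, z0, t):
    """Closed-form integral curve of V starting at (a0, x0, z0)."""
    if a0 != 0:
        z = z0 + (cmath.exp(1j * a0 * t) - 1) * cmath.exp(1j * x0) / (1j * a0)
    else:
        z = z0 + t * cmath.exp(1j * x0)
    return a0, x0 + t * a0, z

def rk4_curve(a0, x0, z0, t, steps=2000):
    """Integrate V numerically with the classical Runge-Kutta scheme."""
    h = t / steps
    state = (a0, x0, z0)
    for _ in range(steps):
        k1 = vector_field(*state)
        k2 = vector_field(*(s + 0.5 * h * k for s, k in zip(state, k1)))
        k3 = vector_field(*(s + 0.5 * h * k for s, k in zip(state, k2)))
        k4 = vector_field(*(s + h * k for s, k in zip(state, k3)))
        state = tuple(
            s + h * (p + 2 * q + 2 * r + w) / 6
            for s, p, q, r, w in zip(state, k1, k2, k3, k4)
        )
    return state

def max_error(a0, x0, z0, t):
    """Largest componentwise gap between the RK4 and closed-form curves."""
    return max(abs(u - v) for u, v in zip(rk4_curve(a0, x0, z0, t),
                                          exact_curve(a0, x0, z0, t)))
```

Calling `max_error(0.5, 0.2, 1 + 2j, 3.0)` and `max_error(0.0, 0.2, 1 + 2j, 3.0)` exercises both branches of the formula; both gaps should be at the level of the integration error.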

The integral curve $t \mapsto \gamma(t)$ is a relative equilibrium of $V$. The $\mathbb{R}$-action on $M$ defined by the flow of $V$ is not proper. For if $a(0) \neq 0$ and $t = 2\pi/a(0)$, then $a(t) = a(0)$, $x(t) = x(0) + 2\pi$ and $z(t) = z(0)$. Let $a(0) \to 0$ and keep $x(0)$ and $z(0)$ bounded. Then $\{(a(0), x(0), z(0))\}$ remains in a compact subset of $M$, whereas $t$ does not remain in any compact subset of $\mathbb{R}$.

4.2.4

Relative equilibria when the action is not free

The results in §2.2 are not optimal when the action of $G$ at the relative equilibrium $m \in M$ is not free. In order to obtain stronger results we prove

Lemma 4.2.4.10. If $m \in M$, $t \in I_m$, $g(t) \in G$, and the integral curve $\gamma : I_m \to M$ of the $G$-invariant vector field $V$ starting at $m$ satisfies $\gamma(t) = g(t) \cdot m$ for each $t \in I_m$, then $g(t) \in N(G_m)$.

¹ If $p$ is a proposition, then $\neg p$ is its negation.

Proof. Let $H = G_m$. Then $m \in M_H$. Since the $H$-isotropy type $M_H$ is invariant under the flow of $V$, it follows that $g \cdot m$ lies in $M_H$, where $g = g(t)$. Therefore $H = G_{g \cdot m} = gG_mg^{-1} = gHg^{-1}$. So $g \in N(H)$.

Let $H = G_m$, $\mathfrak{h} = \mathfrak{g}_m$, and $\mathfrak{n}$ be the Lie algebra of $N(H)$. Suppose that $m \in M$ is a relative equilibrium point of the vector field $V$ generated by $\xi \in \mathfrak{n}$. Setting $g = \exp t\xi$, from lemma 4.2.4.10 it follows that $\exp t\xi \in N(H)$ for every $t \in \mathbb{R}$. Therefore $\widetilde{\xi} = \xi + \mathfrak{h} \in \mathfrak{n}/\mathfrak{h}$ belongs to the Lie algebra $\mathfrak{g}_{M_H}$ of $G_{M_H} = N(H)/H$. Because $G_{M_H}$ acts freely and properly on the isotropy type $M_H$, which contains $m$, we obtain the following proposition whose statement is the same as proposition 4.2.2.6 with $G$ being replaced by $G_{M_H}$.

Proposition 4.2.4.11. Suppose that $\widetilde{\xi}$ is an elliptic element of $\mathfrak{g}_{M_H}$. Then the corresponding relative equilibrium $\gamma : I_m \to M_H$, which is an integral curve of $V$ starting at $m$, is quasiperiodic. Moreover, the minimal number of frequencies of $\gamma$ is equal to the dimension of the closure in $M_H$ of $\{ \gamma(t) \in M_H \mid t \in \mathbb{R} \}$.

4.2.5

Other relative equilibria in a G-orbit

Let $m$ be a relative equilibrium of a $G$-invariant vector field $V$ on a smooth manifold $M$, which is generated by $\xi \in \mathfrak{g}$. Let $\gamma_m : I_m \to M$ be an integral curve of $V$ starting at $m$. For every $t \in I_m$ we have

$$\gamma_{g \cdot m}(t) = g \cdot \gamma_m(t) = (g \exp t\xi) \cdot m = (g \exp t\xi\, g^{-1}) \cdot (g \cdot m) = \exp(t\,\mathrm{Ad}_g\xi) \cdot (g \cdot m).$$

Thus we have proved

Lemma 4.2.5.12. Let $m$ be a relative equilibrium of $V$ generated by $\xi \in \mathfrak{g}$. Then $g \cdot m$ is a relative equilibrium of $V$ generated by $\mathrm{Ad}_g\xi$.

4.2.5.1

When the G-action is free

Suppose that the $G$-action $\Phi$ on $M$ is free at the point $m \in M$. Then the orbit map $\Phi_m : G \to M : g \mapsto g \cdot m$ is a diffeomorphism onto $G \cdot m$, using the orbit topology on $G \cdot m$.

Lemma 4.2.5.1.13. The orbit map $\Phi_m$ intertwines the flow $\varphi^{\xi}_t$ of the vector field $X_\xi$ on $G$ with the flow $\varphi_t$ of the vector field $V$ on $G \cdot m$.

Proof. For every $g \in G$ we have

$$\begin{aligned} \varphi_t(\Phi_m(g)) = \varphi_t(g \cdot m) &= g \cdot \varphi_t(m), && \text{since } V \text{ is } G\text{-invariant} \\ &= g \cdot (\exp t\xi \cdot m), && \text{since } \xi \text{ generates the relative equilibrium } m \\ &= (g \exp t\xi) \cdot m = \varphi^{\xi}_t(g) \cdot m = \Phi_m(\varphi^{\xi}_t(g)). \end{aligned} \tag{9}$$
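Equation (9) can be tested numerically in a concrete setting. The sketch below assumes (this example is ours, not from the text) that $M = G = SO(3)$ with $G$ acting by left multiplication, and that $V(m) = m\,\widehat{\eta}$ is a left-invariant vector field, so that $\varphi_t(m) = m \exp t\widehat{\eta}$ and every point $m$ is a relative equilibrium generated by $\xi = \mathrm{Ad}_m\eta$. All function names are hypothetical.

```python
import math

def matmul(a, b):
    """Product of two 3x3 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def matvec(a, v):
    """Apply a 3x3 matrix to a vector in R^3."""
    return [sum(a[i][k] * v[k] for k in range(3)) for i in range(3)]

def hat(w):
    """Identify w in R^3 with the skew-symmetric matrix of so(3)."""
    return [[0.0, -w[2], w[1]], [w[2], 0.0, -w[0]], [-w[1], w[0], 0.0]]

def exp_so3(w, t):
    """Rodrigues' formula for exp(t * hat(w)) in SO(3)."""
    norm = math.sqrt(sum(c * c for c in w))
    if norm == 0.0:
        return [[float(i == j) for j in range(3)] for i in range(3)]
    k = hat([c / norm for c in w])
    k2 = matmul(k, k)
    s, c = math.sin(norm * t), 1.0 - math.cos(norm * t)
    return [[float(i == j) + s * k[i][j] + c * k2[i][j] for j in range(3)]
            for i in range(3)]

def flow(m, eta, t):
    """Flow of the left-invariant field V(m) = m * hat(eta) on SO(3)."""
    return matmul(m, exp_so3(eta, t))

def generator(m, eta):
    """xi = Ad_m(eta) as a vector in R^3, using Ad_m hat(eta) = hat(m eta)."""
    return matvec(m, eta)

def intertwining_gap(g, m, eta, t):
    """Max entry of phi_t(Phi_m(g)) - Phi_m(g exp(t xi)), cf. equation (9)."""
    lhs = flow(matmul(g, m), eta, t)
    rhs = matmul(matmul(g, exp_so3(generator(m, eta), t)), m)
    return max(abs(lhs[i][j] - rhs[i][j]) for i in range(3) for j in range(3))
```

For rotations `g` and `m` built with `exp_so3` and any `eta`, `intertwining_gap(g, m, eta, t)` should vanish up to rounding, which is the content of (9) in this example.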

Suppose that $\xi$ is an elliptic element of $\mathfrak{g}$. Then the closure $T$ in $G$ of $\{ \exp t\xi \in G \mid t \in \mathbb{R} \}$ is a torus. Using (9) with $g = e$ we see that the image of $T$ under $\Phi_m$ is diffeomorphic to the closure $\Gamma_m$ in $M$ of $\{ \varphi_t(m) \in M \mid t \in \mathbb{R} \}$. For an arbitrary $g \in G$, equation (9) implies that the image of the coset $gT$ under $\Phi_m$ is diffeomorphic to the closure $\Gamma_{g \cdot m}$ in $G \cdot m$ of the integral curve of $V$ starting at $g \cdot m$. Thus we get

Proposition 4.2.5.1.14. The image of $G/T = \coprod_{g \in G} gT$ under the $G$-orbit map $\Phi_m$ is diffeomorphic to the closure $\Gamma_{G \cdot m}$ in $G \cdot m$ of the set of integral curves of $V$ which start at a point in $G \cdot m$. This latter set is $G$-invariant.

Consider the map $\mu : T \times G \to G : (s, g) \mapsto gs$. Since $T$ is a commutative group, $\mu$ defines an action of $T$ on $G$. In more detail, for every $s, s' \in T$ and every $g \in G$ we have

$$\mu(s's, g) = gs's = gss' = \mu(s', gs) = \mu(s', \mu(s, g)).$$

The action $\mu$ is free, for if $gs = g$, then $s = e$. Moreover, $\mu$ is a proper action since $T$ is compact. Therefore the $T$-orbit map $\rho : G \to G/T$ realizes $G$ as a $T$-principal bundle over the $T$-orbit space $G/T$, which is a smooth manifold.

Corollary 4.2.5.1.15. The $G$-orbit map $\Phi_m : G \to G \cdot m$ induces an isomorphism of the $T$-principal bundle $\rho$ with the bundle $\Gamma_{G \cdot m} \to G \cdot m$, whose total space $\Gamma_{G \cdot m}$ is the closure in $G \cdot m$ of the set of integral curves of $V$ which start at a point in $G \cdot m$ and whose fiber $\Gamma_{g \cdot m}$ over $g \cdot m$ is the closure in $G \cdot m$ of the integral curve of $V$ starting at $g \cdot m$.

4.2.5.2

When the G-action is not free

Now suppose that the $G$-action $\Phi$ is not free at the point $m \in M$. Let $H = G_m$ and let $M_H$ be the $H$-isotropy type. Recall that the Lie group $G_{M_H} = N(H)/H$ acts freely and properly on $M_H$. Here $N(H)$ is the normalizer of $H$ in $G$. The $G_{M_H}$-orbit map

$$\phi_m : G_{M_H} \to G_{M_H} \cdot m = \{ \ell H \cdot m \in M_H \mid \ell \in N(H) \} = N(H) \cdot m \tag{10}$$

is a diffeomorphism, using the orbit topology on $G \cdot m$. Since $M_H$ is invariant under the flow $\varphi_t$ of $V$, it follows that $V|M_H$ is a vector field on $M_H$ with flow $\varphi_t|M_H$.

Suppose that $m$, which lies in $M_H$, is a relative equilibrium of the $N(H)$-invariant vector field $V|M_H$, which is generated by $\xi \in \mathfrak{n}$, the Lie algebra of $N(H)$. Since $\mathfrak{g}_{M_H}$, the Lie algebra of $G_{M_H}$, is isomorphic to $\mathfrak{n}/\mathfrak{h}$, where $\mathfrak{h}$ is the Lie algebra of $H$, we may write $\widetilde{\xi} = \xi + \mathfrak{h}$ for the image of $\xi$ under the projection mapping $\mathfrak{n} \to \mathfrak{n}/\mathfrak{h}$. Analogous to lemma 4.2.5.1.13 we have

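Lemma 4.2.5.2.16. The map $\phi_m$ (10) intertwines the flow $\varphi^{\widetilde{\xi}}_t$ of $X_{\widetilde{\xi}}$ on $G_{M_H}$ with the flow $\varphi_t|M_H$ of $V|M_H$ on $M_H$.

Proof. For every $\ell \in N(H)$ we have

$$\begin{aligned} \varphi_t(\phi_m(\ell H)) = \varphi_t(\ell H \cdot m) &= \varphi_t(\ell \cdot m), && \text{since } H = G_m \\ &= \ell \cdot \varphi_t(m), && \text{because } V \text{ is } N(H)\text{-invariant} \\ &= \ell \cdot (\exp t\xi \cdot m), && \text{since } m \text{ is a relative equilibrium of } V|M_H \text{ generated by } \xi \in \mathfrak{n} \\ &= \bigl((\ell \exp t\xi)H\bigr) \cdot m = \phi_m\bigl(\varphi^{\widetilde{\xi}}_t(\ell H)\bigr).^{2} \end{aligned} \tag{11}$$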

Suppose that $\xi$ is an elliptic element of $\mathfrak{n}$. Then $\widetilde{\xi}$ is an elliptic element of $\mathfrak{g}_{M_H}$. So the closure in $G_{M_H}$ of $\{ \exp t\widetilde{\xi} \in G_{M_H} \mid t \in \mathbb{R} \}$ is a torus $T$ in $G_{M_H}$. Using (11) we obtain the following analogue of proposition 4.2.5.1.14.

Proposition 4.2.5.2.17. The image of $G_{M_H}/T = \coprod_{g \in G_{M_H}} gT$ under the map $\phi_m$ (10) is the closure $\Gamma_{G_{M_H} \cdot m}$ in $G_{M_H} \cdot m$ of the set of all integral curves of $V|M_H$, which start at a point in $G_{M_H} \cdot m$.

In addition, we obtain the following analogue of corollary 4.2.5.1.15.

Corollary 4.2.5.2.18. The $G_{M_H}$-orbit map $\phi_m : G_{M_H} \to G_{M_H} \cdot m$ induces an isomorphism of the $T$-principal bundle $\pi : G_{M_H} \to G_{M_H}/T$ with the bundle $\Gamma_{G_{M_H} \cdot m} \to G_{M_H} \cdot m$, whose fiber over $\ell H \cdot m$ is the closure $\Gamma_{\ell H \cdot m}$ of the integral curve of $V|M_H$ which starts at $\ell H \cdot m$.

² We justify the last equality in (11) when $\mathfrak{n}$ (and hence $\mathfrak{h}$) is a Lie subalgebra of $\mathfrak{gl}$. Then $\exp(\xi + \mathfrak{h}) = (\exp \xi)H$, using the Campbell–Baker–Hausdorff formula and the fact that $\mathfrak{h}$ is an ideal in the Lie algebra $\mathfrak{n}$.

Now consider the action of $N(H)$ on $G \times G_{M_H}$ defined by

$$N(H) \times (G \times G_{M_H}) \to G \times G_{M_H} : (n, (g, \ell H)) \mapsto (gn^{-1}, n\ell H), \tag{12}$$

where $n, \ell \in N(H)$ and $g \in G$. Let us look at the map

$$\widetilde{\vartheta}_m : G \times G_{M_H} \to G \cdot m : (g, \ell H) \mapsto g\ell H \cdot m = g\ell \cdot m.$$

Since $\widetilde{\vartheta}_m$ is invariant under the $N(H)$-action (12), it induces the map

$$\vartheta_m : G \times_{N(H)} G_{M_H} \to G \cdot m : \overline{(g, \ell H)} \mapsto g\ell \cdot m, \tag{13}$$

where $G \times_{N(H)} G_{M_H}$ is the orbit space of the $N(H)$-action (12) with orbit map $G \times G_{M_H} \to G \times_{N(H)} G_{M_H} : (g, \ell H) \mapsto \overline{(g, \ell H)}$.

Lemma 4.2.5.2.19. The map $\vartheta_m$ (13) is a diffeomorphism.

Proof. Clearly $\vartheta_m$ is smooth and surjective. Because $\dim G \times_{N(H)} G_{M_H} = \dim G \cdot m$, it follows that $\vartheta_m$ is a local diffeomorphism. To see that it is a diffeomorphism, it suffices to check that a fiber of $\widetilde{\vartheta}_m$ is a single $N(H)$-orbit. Towards this goal suppose that $g\ell \cdot m = g'\ell' \cdot m$ for some $g, g' \in G$ and $\ell, \ell' \in N(H)$. Then $(g^{-1}g') \cdot (\ell' \cdot m) = \ell \cdot m$. But $\ell \cdot m$ and $\ell' \cdot m$ lie in $M_H$. So $g^{-1}g' = n^{-1} \in N(H)$. Now $(g\ell)^{-1}(g'\ell') \cdot m = m$ implies that $\ell^{-1}(g^{-1}g')\ell' = h \in H$. Therefore $\ell' = (g^{-1}g')^{-1}\ell h$, which implies

$$\ell' H = n\ell hH = n(\ell H). \tag{14}$$

Equation (14) together with $g' = gn^{-1}$ shows that $(g, \ell H)$ and $(g', \ell' H)$ lie in the same $N(H)$-orbit. Hence the fiber of $\widetilde{\vartheta}_m$ is a single $N(H)$-orbit.
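The dimension count used in the proof above can be spelled out as follows. This is a short sketch, assuming only that the $N(H)$-action (12) is free and proper (it is already so on the factor $G$) and that $G_m = H$:

```latex
\begin{aligned}
\dim\bigl(G\times_{N(H)} G_{M_H}\bigr)
  &= \dim G + \dim G_{M_H} - \dim N(H) \\
  &= \dim G + \bigl(\dim N(H) - \dim H\bigr) - \dim N(H) \\
  &= \dim G - \dim H = \dim G/G_m = \dim G\cdot m .
\end{aligned}
```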

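Recall that $\mathfrak{g}_{M_H} = \mathfrak{n}/\mathfrak{h}$ is the Lie algebra of $G_{M_H}$ and that $\xi \in \mathfrak{n}$. Set $\widetilde{\xi} = \xi + \mathfrak{h}$. The flow $\varphi^{\widetilde{\xi}}_t$ of the vector field $X_{\widetilde{\xi}}$ on $G_{M_H}$ induces a flow $\widetilde{\theta}^{\widetilde{\xi}}_t$ on $G \times G_{M_H}$ defined by $\widetilde{\theta}^{\widetilde{\xi}}_t(g, \ell H) = (g, \ell(\exp t\xi)H)$. Since $\widetilde{\theta}^{\widetilde{\xi}}_t$ commutes with the $N(H)$-action (12), it induces a flow $\theta^{\widetilde{\xi}}_t$ on $G \times_{N(H)} G_{M_H}$.

Lemma 4.2.5.2.20. The diffeomorphism $\vartheta_m$ (13) intertwines the flow $\theta^{\widetilde{\xi}}_t$ on $G \times_{N(H)} G_{M_H}$ with the flow $\varphi_t$ of the vector field $V$ on $G \cdot m$.

Proof. Let $g \in G$ and $\ell \in N(H)$. Then

$$\begin{aligned} \vartheta_m\bigl(\theta^{\widetilde{\xi}}_t\,\overline{(g, \ell H)}\bigr) = \vartheta_m\bigl(\overline{(g, \ell \exp t\xi\, H)}\bigr) &= g\ell \exp t\xi \cdot m \\ &= g\ell \cdot \varphi_t(m), && \text{since } \xi \text{ generates the relative equilibrium } m \\ &= \varphi_t(g\ell \cdot m), && \text{since } V \text{ is } G\text{-invariant} \\ &= \varphi_t\bigl(\vartheta_m\,\overline{(g, \ell H)}\bigr). \end{aligned} \tag{15}$$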

Since $\xi$ is an elliptic element of $\mathfrak{n}$ by hypothesis, it follows that $\widetilde{\xi} = \xi + \mathfrak{h}$ is an elliptic element of $\mathfrak{g}_{M_H} = \mathfrak{n}/\mathfrak{h}$. Therefore the closure $T$ in $G_{M_H}$ of $\{ \exp t\widetilde{\xi} \in G_{M_H} \mid t \in \mathbb{R} \}$ is a torus. From (15) it follows that the image under $\vartheta_m$ (13) of an orbit of the $T$-action

$$\ast : T \times (G \times_{N(H)} G_{M_H}) \to G \times_{N(H)} G_{M_H} : \bigl(sH, \overline{(g, \ell H)}\bigr) \mapsto \overline{(g, \ell sH)} \tag{16}$$

is diffeomorphic to the closure $\Gamma_m$ in $G \cdot m$ of an integral curve of $V$ which starts at $m$. Since $\xi \in \mathfrak{n}$ is elliptic, the closure of $\{ \exp t\xi \in N(H) \mid t \in \mathbb{R} \}$ in $N(H)$ is a torus. Note that every element of the torus in $G_{M_H}$ may be written as $sH$, where $s$ is an element of the torus in $N(H)$, because $\exp \widetilde{\xi} = (\exp \xi)H$. Define a $T$-action on $G \cdot m$ by

$$\circ : T \times (G \cdot m) \to G \cdot m : (sH, g \cdot m) \mapsto gs \cdot m. \tag{17}$$

Note that the $T$-action (17) on $G \cdot m$ is free, for if $sH \circ (g \cdot m) = g \cdot m$, then $gs \cdot m = g \cdot m$, which implies $s \cdot m = m$. So $s \in H$. Therefore $sH = H$, which is the identity element of $T$. Moreover, the $T$-action $\circ$ is proper, because $T$ is compact. From

$$\vartheta_m\bigl(sH \ast \overline{(g, \ell H)}\bigr) = \vartheta_m\bigl(\overline{(g, \ell sH)}\bigr) = g\ell s \cdot m = sH \circ (g\ell \cdot m) = sH \circ \vartheta_m\bigl(\overline{(g, \ell H)}\bigr) \tag{18}$$

it follows that $\Gamma_m$ is an orbit of the $T$-action (17) on $G \cdot m$. Thus we obtain

Proposition 4.2.5.2.21. The image under the map $\vartheta_m$ (13) of all the orbits of the $T$-action (16) is diffeomorphic to the closure $\Gamma_{G \cdot m}$ in $G \cdot m$ of the union of all integral curves of $V$, which start at some point in $G \cdot m$. Moreover, the set $\Gamma_{G \cdot m}$ is the union of all the orbits of the $T$-action (17), which pass through some point in $G \cdot m$.

Showing that the $T$-action (16) is free and proper is a special case of the following situation. Let $L$ be a closed subgroup of a Lie group $G$ and let $K$ be another Lie group. Suppose that $G$ and $K$ act by $\cdot$ and $\ast$, respectively, on a smooth manifold $Q$ and that the actions of $L$ and $K$ commute. Let $L$ act on $G \times Q$ by
$$\cdot : L \times (G \times Q) \to G \times Q : (\ell, (g, q)) \mapsto (g\ell^{-1}, \ell \cdot q)$$
with orbit space $G \times_L Q$ and let $K$ act on $G \times Q$ by
$$\ast : K \times (G \times Q) \to G \times Q : (k, (g, q)) \mapsto (g, k \ast q).$$
Then the $K$- and $L$-actions on $G \times Q$ commute, since the $L$- and $K$-actions on $Q$ commute. Therefore there is an induced $K$-action $\ast$ on the $L$-orbit space $G \times_L Q$.

Lemma 4.2.5.2.22. If the $K$-action on $Q$ is free and proper, then so is the induced $K$-action on $G \times_L Q$.


Proof. Suppose that $(g, q) = k \ast (g\ell^{-1}, \ell \cdot q)$ for some $\ell \in L$. Then $(g, q) = (g\ell^{-1}, \ell \cdot (k \ast q))$. But this implies $\ell = e$ and $q = k \ast q$. But $K$ acts freely on $Q$. Therefore $k = e$. Consequently, the induced $K$-action $\ast$ on $G \times_L Q$ is free.

To show that this action is proper we argue as follows. Let $\rho : G \times Q \to G \times_L Q$ be the orbit map of the $L$-action. Suppose that $\{x_j\}$ is a sequence in $G \times_L Q$ which converges to $x \in G \times_L Q$. Then there is a sequence $\{(g_j, q_j)\}$ in $G \times Q$ with $\rho(g_j, q_j) = x_j$, which converges to $(g, q) \in G \times Q$ with $\rho(g, q) = x$. Suppose that there is a sequence $\{k_j\}$ in $K$ such that $k_j \ast x_j \to x'$. So there is a sequence $\{\ell_j\}$ in $L$ such that $k_j \ast \ell_j \cdot (g_j, q_j) \to (g', q')$, where $\rho(g', q') = x'$. Then $g_j \ell_j^{-1} \to g'$ and $\ell_j \cdot k_j \ast q_j \to q'$. Because $g_j \to g$, we see that $\ell_j = (g_j \ell_j^{-1})^{-1} g_j \to (g')^{-1} g = \ell$, which lies in $L$ since $L$ is a closed subgroup of $G$. Since the action of $K$ on $Q$ is proper, there is a subsequence $\{k_{j_n}\}$ of $\{k_j\}$ which converges in $K$ to $k$. Therefore $k_{j_n} \ast (\ell_{j_n} \cdot q_{j_n}) \to k \ast (\ell \cdot q) = q'$. Consequently, the induced $K$-action on $G \times_L Q$ is proper.

Applying lemma 4.2.5.2.22 with $L = N(H)$, $K = T$, and $Q = G_{M_H}$, we conclude that the $T$-action (16) on $G \times_{N(H)} G_{M_H}$ is free and proper. In addition, the $T$-action (17) is free and proper. Therefore we obtain the following corollary of proposition 4.2.5.2.21.

Corollary 4.2.5.2.23. The diffeomorphism $\vartheta_m$ (13) induces an isomorphism of the $T$-principal bundle $\widetilde{\rho} : G \times_{N(H)} G_{M_H} \to (G \times_{N(H)} G_{M_H})/T$ with the $T$-principal bundle $\rho : G \cdot m \to (G \cdot m)/T$, whose fiber over $\rho(g \cdot m)$ is $\Gamma_{g \cdot m}$, the closure in $G \cdot m$ of the integral curve of $V$ which starts at $g \cdot m$.

4.2.6 Smooth families of quasiperiodic relative equilibria

Let $F$ be the image under the $G$-orbit map $\pi$ of the set of relative equilibrium points of a $G$-invariant vector field $V$ on a smooth manifold $M$. In this subsection we find a smooth manifold in $F$, whose preimage under $\pi$ is a smooth family of quasiperiodic integral curves of $V$.

To see some of the difficulties involved in finding smooth families of quasiperiodic orbits, consider the following situation. Let $T$ be a torus in $G$ with Lie algebra $\mathfrak{t}$. For each $\xi \in \mathfrak{t}$ let $T_\xi$ be the closure in $T$ of $\{\exp t\xi \mid t \in \mathbb{R}\}$. Then $T_\xi$ is the smallest closed subgroup of $T$ such that $\xi$ belongs to its Lie algebra $\mathfrak{t}_\xi$. Suppose that $\dim T > 1$. Then for every $1 \le k \le \dim T$ the set $\{\xi \in \mathfrak{t} \mid \dim T_\xi = k\}$ is dense in $\mathfrak{t}$. This shows that the mapping $\xi \mapsto T_\xi$ is highly discontinuous; even the mapping $\xi \mapsto \dim T_\xi$ is everywhere discontinuous. So if $G$ contains tori $T$ of dimension greater than 1, then one needs a very detailed description of the dependence of $T_\xi$ on the parameter $\xi$ to show that the closure of a quasiperiodic orbit of the $G$-invariant vector field $V$ corresponding to a relative equilibrium generated by $\xi$ depends smoothly on $\xi$. If one drops the requirement that such quasiperiodic orbits depend on a minimal number of frequencies, then, under mild hypotheses, we shall find smooth families of tori on which the motion is quasiperiodic.

4.2.6.1 Elliptic, regular, and stably elliptic elements of $\mathfrak{g}$

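We start by defining certain subspaces of the Lie algebra $\mathfrak{g}$ of the Lie group $G$, which will play an important role later on. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. For any $\xi \in \mathfrak{g}$, the centralizer $\mathfrak{g}_\xi$ of $\xi$ in $\mathfrak{g}$ is $\{\eta \in \mathfrak{g} \mid [\xi, \eta] = 0\}$. Note that $\mathfrak{g}_\xi$ is the Lie algebra of the centralizer $G_\xi$ of $\xi$ in $G$, namely, $\{g \in G \mid \mathrm{Ad}_g \xi = \xi\}$. An element $\xi \in \mathfrak{g}$ is regular if and only if there is an open neighborhood $U$ of $\xi$ in $\mathfrak{g}$ such that $\dim \mathfrak{g}_\xi \le \dim \mathfrak{g}_\eta$ for every $\eta \in U$. Because $\dim \mathfrak{g}_\eta \le \dim \mathfrak{g}_\xi$ for every $\eta \in \mathfrak{g}$ near $\xi$, it follows that $\xi$ is a regular element of $\mathfrak{g}$ if and only if $\dim \mathfrak{g}_\eta$ is constant for every $\eta \in \mathfrak{g}$ near $\xi$. Recall that $\xi \in \mathfrak{g}$ is elliptic if and only if $\{\exp t\xi \in G \mid t \in \mathbb{R}\}$ is dense in a torus subgroup of $G$. An element $\xi$ in $\mathfrak{g}$ is stably elliptic if there is an open neighborhood $U$ of $\xi$ in $\mathfrak{g}$ such that every $\eta \in U$ is an elliptic element of $\mathfrak{g}$.

Lemma 4.2.6.1.24. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$ and let $\xi \in \mathfrak{g}$. The following statements are equivalent.
1. $\xi$ is a regular, stably elliptic element of $\mathfrak{g}$.
2. The centralizer $\mathfrak{g}_\xi$ of $\xi$ in $\mathfrak{g}$ is equal to the Lie algebra of a torus subgroup of $G$.

Proof. item 1 ⇒ item 2. Let $\mathrm{ad}_\xi : \mathfrak{g} \to \mathfrak{g} : \eta \mapsto [\xi, \eta]$. Then $\exp t\,\mathrm{ad}_\xi = \mathrm{Ad}_{\exp t\xi}$ for all $t \in \mathbb{R}$, see theorem 1.5.2a) of [42]. If $\xi$ is an elliptic element of $\mathfrak{g}$, then the closure in $G$ of $\{\exp t\xi \in G \mid t \in \mathbb{R}\}$ is a torus subgroup of $G$. Because the adjoint representation $\mathrm{Ad} : G \to \mathrm{Gl}(\mathfrak{g}, \mathbb{R}) : g \mapsto \mathrm{Ad}_g$ is a continuous homomorphism, it follows that $\{\exp t\,\mathrm{ad}_\xi \in \mathrm{Gl}(\mathfrak{g}, \mathbb{R}) \mid t \in \mathbb{R}\}$ is contained in a compact subgroup $K$ of $\mathrm{Gl}(\mathfrak{g}, \mathbb{R})$. Averaging an arbitrary inner product on $\mathfrak{g}$ over $K$, we obtain a $K$-invariant inner product $\beta$ on $\mathfrak{g}$, see corollary 4.2.2 in [42]. Consequently, $t \mapsto \exp t\,\mathrm{ad}_\xi$ is a one parameter group of $\beta$-orthogonal transformations of $(\mathfrak{g}, \beta)$. Therefore $\mathrm{ad}_\xi$ is a $\beta$-antisymmetric linear transformation. This implies that $\operatorname{im} \mathrm{ad}_\xi$ is the $\beta$-orthogonal complement of $\ker \mathrm{ad}_\xi$ in $\mathfrak{g}$. Therefore
$$\mathfrak{g} = \operatorname{im} \mathrm{ad}_\xi \oplus \ker \mathrm{ad}_\xi \tag{19}$$
and $\mathrm{ad}_\xi|\operatorname{im} \mathrm{ad}_\xi$ is a bijective linear map of $\operatorname{im} \mathrm{ad}_\xi$ into itself.

Consider the mapping $\psi : G \times \mathfrak{g}_\xi \to \mathfrak{g} : (g, \zeta) \mapsto \mathrm{Ad}_g \zeta$. Then its tangent at $(e, \xi)$ is $T_{(e,\xi)}\psi : \mathfrak{g} \times \mathfrak{g}_\xi \to \mathfrak{g} : (\rho, \eta) \mapsto [\rho, \xi] + \eta$. Because of (19) we see that $T_{(e,\xi)}\psi$ is surjective. Hence, by the implicit function theorem there is an open neighborhood $U$ of $\xi$ in $\mathfrak{g}$ and real analytic mappings $\varphi : U \subseteq \mathfrak{g} \to G$ and $\chi : U \subseteq \mathfrak{g} \to \mathfrak{g}_\xi$ such that $\varphi(\xi) = e$, $\chi(\xi) = \xi$, and
$$\eta = \mathrm{Ad}_{\varphi(\eta)} \chi(\eta) \tag{20}$$
for every $\eta \in U$. So every element $\eta$ of $\mathfrak{g}$ near $\xi$ is conjugate by an element $\varphi(\eta)^{-1}$ in the identity component $G^0$ of $G$ to an element $\chi(\eta)$ in $\mathfrak{g}_\xi$.

Let $\zeta \in \mathfrak{g}_\xi$. Then the linear mappings $\mathrm{ad}_\xi$ and $\mathrm{ad}_\zeta$ commute. Therefore $\mathrm{ad}_\zeta$ leaves $\mathfrak{g}_\xi = \ker \mathrm{ad}_\xi$ and $\operatorname{im} \mathrm{ad}_\xi$ invariant. If $\zeta$ is sufficiently close to $\xi$ in $\mathfrak{g}_\xi$, then $\mathrm{ad}_\zeta|\operatorname{im} \mathrm{ad}_\xi$ is bijective. Consequently, $\mathfrak{g}_\zeta \subseteq \mathfrak{g}_\xi$. From the hypothesis that $\xi$ is a regular element of $\mathfrak{g}$, it follows that $\mathfrak{g}_\zeta = \mathfrak{g}_\xi$ for all $\zeta \in \mathfrak{g}_\xi$ sufficiently close to $\xi$. Let $\eta \in \mathfrak{g}_\xi$ be sufficiently close to 0. Then $\zeta = \xi + \eta$ is sufficiently close to $\xi$ in $\mathfrak{g}_\xi$ so that $\mathfrak{g}_\zeta = \mathfrak{g}_\xi$. Thus for $\rho \in \mathfrak{g}_\xi \subseteq \mathfrak{g}_\zeta$ we have $0 = [\zeta, \rho] = [\xi + \eta, \rho] = [\eta, \rho]$, since $\rho \in \mathfrak{g}_\xi$. Because the map $\mathfrak{g}_\xi \to \mathfrak{g} : \eta \mapsto [\eta, \rho]$ is linear, it follows that $\mathfrak{g}_\xi$ is commutative.

Since $T = \exp \mathfrak{g}_\xi$ is the identity component of a closed subgroup of $G$, it is also a closed subgroup of $G$. If for $\zeta \in \mathfrak{g}_\xi$, the closure in $G$ of $\{\exp t\zeta \mid t \in \mathbb{R}\}$ is a torus subgroup of $G$, then it is a torus subgroup of $T$. If $\xi$ is a regular stably elliptic element of $\mathfrak{g}$, then it is a stably elliptic element of $\mathfrak{t}$. Because $T$ is a connected commutative Lie group, it is isomorphic to $(\mathbb{R}/\mathbb{Z})^k \times \mathbb{R}^\ell$. But the Lie algebra of $(\mathbb{R}/\mathbb{Z})^k \times \mathbb{R}^\ell$ contains stably elliptic elements only if $\ell = 0$. This proves item 1 ⇒ item 2.

item 2 ⇒ item 1. Suppose that item 2 holds. Then for every $\eta \in \mathfrak{g}_\xi$ we have $\exp t\eta \in T$ for every $t \in \mathbb{R}$. Hence the closure in $T$ of $\{\exp t\eta \mid t \in \mathbb{R}\}$ is a connected commutative subgroup of $T$ and hence is a torus subgroup of $T$. Consequently, $\eta$ is an elliptic element of $\mathfrak{g}$. Because $\xi \in \mathfrak{g}_\xi$, it follows that $\xi$ is an elliptic element of $\mathfrak{g}$. Because every element $\zeta$ of $\mathfrak{g}$ near $\xi$ is conjugate to an element of $\mathfrak{g}_\xi$, we conclude that $\zeta$ is elliptic. So $\xi$ is stably elliptic, which implies that $\mathfrak{g}_\xi \subseteq \mathfrak{g}_\eta$ for every $\eta \in \mathfrak{g}_\xi$. Since $T$ is commutative, $\mathfrak{g}_\xi$ is commutative. Therefore $\xi$ is a regular element of $\mathfrak{g}$. This proves item 2 ⇒ item 1.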
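The closure of a one parameter subgroup in a torus, which appears in the proof above and in the remark on the discontinuity of $\xi \mapsto \dim T_\xi$ at the start of this subsection, can be made concrete for the standard torus $\mathbb{R}^n/\mathbb{Z}^n$. There the dimension of the closure of $\{t\xi \bmod \mathbb{Z}^n\}$ equals the dimension of the $\mathbb{Q}$-span of the components of $\xi$ (Kronecker's theorem). The sketch below is illustrative only: the function names and the encoding of each frequency as $p + q\sqrt{2}$ with rational $p, q$ are assumptions made here so that the rank can be computed exactly.

```python
from fractions import Fraction


def rational_rank(rows):
    """Rank over Q of a list of rational vectors, by Gauss-Jordan elimination."""
    m = [[Fraction(x) for x in row] for row in rows]
    rank = 0
    ncols = len(m[0]) if m else 0
    for col in range(ncols):
        # Find a row at or below `rank` with a nonzero entry in this column.
        pivot = next((r for r in range(rank, len(m)) if m[r][col] != 0), None)
        if pivot is None:
            continue
        m[rank], m[pivot] = m[pivot], m[rank]
        for r in range(len(m)):
            if r != rank and m[r][col] != 0:
                factor = m[r][col] / m[rank][col]
                m[r] = [a - factor * b for a, b in zip(m[r], m[rank])]
        rank += 1
    return rank


def closure_dimension(xi):
    """Dimension of the closure of {t*xi mod Z^n} in R^n/Z^n.

    Hypothetical encoding: each component of xi is a pair (p, q) standing for
    p + q*sqrt(2) with p, q rational. Since 1 and sqrt(2) are linearly
    independent over Q, the Q-dimension of the span of the components equals
    the rank of the matrix of coefficient pairs.
    """
    return rational_rank([list(component) for component in xi])


# Frequencies (1, 3/7): rationally dependent, so the closure is a circle.
periodic = [(1, 0), (Fraction(3, 7), 0)]
# Frequencies (1, sqrt(2)): rationally independent, so the orbit is dense.
dense = [(1, 0), (0, 1)]

print(closure_dimension(periodic), closure_dimension(dense))
```

Since frequency vectors of the first kind and of the second kind both lie arbitrarily close to any given $\xi$, the example reflects why $\xi \mapsto \dim T_\xi$ cannot be continuous once $\dim T > 1$.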


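Let $\mathfrak{g}^{se}$, $\mathfrak{g}^{reg}$, and $\mathfrak{g}^{rse}$ be the set of stably elliptic, regular, and regular stably elliptic elements of $\mathfrak{g}$, respectively.

Lemma 4.2.6.1.25. $\mathfrak{g}^{rse}$ is an open dense subset of $\mathfrak{g}^{se}$. Moreover, for every $\xi \in \mathfrak{g}^{rse}$ there is an open neighborhood $U$ of $\xi$ in $\mathfrak{g}$ and a real analytic mapping $\theta : U \subseteq \mathfrak{g} \to G$ with $\theta(\xi) = e$ such that $\mathfrak{g}_\zeta = \mathfrak{g}_\xi$, where $\zeta = \mathrm{Ad}_{\theta(\eta)} \eta$ and $\eta \in U$.

Proof. By definition the sets $\mathfrak{g}^{se}$ and $\mathfrak{g}^{reg}$ are open subsets of $\mathfrak{g}$. If $\xi \in \mathfrak{g}^{se}$ but is not an element of $\mathfrak{g}^{reg}$, then there is an $\eta \in \mathfrak{g}$ arbitrarily close to $\xi$ such that $\dim \mathfrak{g}_\eta < \dim \mathfrak{g}_\xi$. Because $\dim \mathfrak{g}$ is finite, we can repeat this argument only finitely many times, after which we obtain an element $\eta \in \mathfrak{g}$, which is arbitrarily close to $\xi$, such that locally $\dim \mathfrak{g}_\eta$ is minimal. Therefore $\eta \in \mathfrak{g}^{reg}$. So $\mathfrak{g}^{rse}$ is an open dense subset of $\mathfrak{g}^{se}$. If we write $\theta(\eta) = \varphi(\eta)^{-1}$, then from $\eta = \mathrm{Ad}_{\varphi(\eta)} \chi(\eta)$ (20) it follows that $\zeta = \mathrm{Ad}_{\theta(\eta)} \eta = \chi(\eta) \in \mathfrak{g}_\xi$.

Let $C$ be a connected component of $\mathfrak{g}^{rse}$.

Corollary 4.2.6.1.26. For any $\xi, \eta \in C$ there is an element $g$ in the identity component $G^0$ of $G$ such that $\mathfrak{g}_\eta = \mathrm{Ad}_g \mathfrak{g}_\xi$. In particular, for $\xi \in C$, the dimension of $\mathfrak{g}_\xi$ is constant, say $k$, and the map $\xi \mapsto \mathfrak{g}_\xi$ from $C$ into the Grassmannian manifold of $k$-dimensional subspaces of $\mathfrak{g}$ is real analytic.

Proof. We say that two elements $\xi$ and $\eta$ of $\mathfrak{g}$ are related if $\mathfrak{g}_\xi$ is conjugate to $\mathfrak{g}_\eta$ by $g \in G^0$. Being related is an equivalence relation on $\mathfrak{g}$. By lemma 4.2.6.1.25 nearby elements of $C$ are equivalent. Since $C$ is connected, it follows that $\mathfrak{g}_\eta$ is conjugate to $\mathfrak{g}_\xi$ for every $\xi, \eta \in C$ by an element of $G^0$. Therefore $\dim \mathfrak{g}_\xi$ is constant for every $\xi \in C$. For $\eta$ near $\xi$ in $C$, we know that $\mathfrak{g}_\eta$ is conjugate to $\mathfrak{g}_\xi$ by $g \in G^0$, which depends real analytically on $\eta$. Therefore the mapping $\eta \mapsto \mathfrak{g}_\eta$ is real analytic in an open neighborhood of $\xi$ in $C$. In particular, the map $\xi \mapsto \mathfrak{g}_\xi$ is real analytic on $C$.

Example 4.2.6.1.27.
1. If $G^0$ is a compact subgroup of $G$, then $\mathfrak{g}^{se} = \mathfrak{g}$. Moreover, $\mathfrak{g}^{rse} = \mathfrak{g}^{reg}$. If $\xi \in \mathfrak{g}^{reg}$, then $\mathfrak{g}_\xi$ is a maximal abelian subspace of $\mathfrak{g}$ and $\exp \mathfrak{g}_\xi$ is a maximal torus in $G^0$. All maximal tori in $G^0$ are conjugate by an element of $G^0$. In particular, they have the same dimension, called the rank of $G^0$, see theorem 3.7.1 of [42]. Conversely, if every element of $\mathfrak{g}$ is elliptic, then $G^0$ is compact.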


2. A proper Euclidean motion of $\mathbb{R}^2$ is an affine mapping $(A, a) : \mathbb{R}^2 \to \mathbb{R}^2 : x \mapsto Ax + a$, where $(A, a) \in \mathrm{SO}(2) \times \mathbb{R}^2$. The set of all proper Euclidean motions forms the Euclidean group $E(2)$ whose multiplication is given by composition of affine maps. The Lie algebra of $E(2)$ is $\mathfrak{e}(2)$, which is the set of all infinitesimal Euclidean motions, namely, the set of all affine maps $(\Xi, \xi) : \mathbb{R}^2 \to \mathbb{R}^2 : x \mapsto \Xi x + \xi$, where $(\Xi, \xi) \in \mathfrak{so}(2) \times \mathbb{R}^2$. If $\Xi \ne 0$, then $\Xi$ is a bijective linear map and $(\Xi, \xi)$ is an infinitesimal rotation about $p = -\Xi^{-1}\xi$. Let $E(2)$ act on $\mathfrak{e}(2)$ by conjugation. Then every element of the isotropy group $E(2)_{(\Xi,\xi)}$ leaves $p$ fixed. Therefore $E(2)_{(\Xi,\xi)}$ is the group of rotations about $p$, which is a circle. Consequently, $(\Xi, \xi) \in \mathfrak{e}(2)^{rse}$ if and only if $\Xi \ne 0$. This shows that $\mathfrak{e}(2)^{rse}$ is an open dense subset of $\mathfrak{e}(2)$.
3. Let $\mathrm{Sl}(2, \mathbb{R})$ be the set of $2 \times 2$ real matrices with determinant 1. Its Lie algebra $\mathfrak{sl}(2, \mathbb{R})$ is the set of $2 \times 2$ real matrices $\xi = \begin{pmatrix} a & b \\ c & -a \end{pmatrix}$. Then
$$\mathfrak{sl}(2, \mathbb{R})^{rse} = \{\xi \in \mathfrak{sl}(2, \mathbb{R}) \mid a^2 + bc < 0\}$$
is the disjoint union of two open convex cones in $\mathbb{R}^3$. If $a^2 + bc = 0$, then $\xi$ is elliptic if and only if $\xi = 0$.

4.2.6.2 When the G-action is free and proper

Suppose that $\Phi$ is a free and proper action of a Lie group $G$ on a smooth manifold $M$ with orbit map $\pi : M \to \overline{M} = M/G : m \mapsto \overline{m}$. Let $V$ be a $G$-invariant smooth vector field on $M$ with flow $\varphi_t$. Since $\Phi$ is free and proper, the $G$-orbit space $\overline{M}$ is a smooth manifold and the $G$-invariant vector field $V$ induces a smooth vector field $\overline{V}$ on $\overline{M}$ with flow $\overline{\varphi}_t$. Let
$$F = \{\overline{m} \in \overline{M} \mid \overline{V}(\overline{m}) = 0\} = \{\overline{m} \in \overline{M} \mid \overline{\varphi}_t(\overline{m}) = \overline{m}\}.$$
Then $F$ is a smooth submanifold of $\overline{M}$. Therefore $\pi^{-1}(F)$ is a $G$-invariant, locally closed smooth submanifold of $M$. So $\pi|\pi^{-1}(F) : \pi^{-1}(F) \to F$ is a principal $G$-bundle. Because $\pi^{-1}(F)$ consists of all of the relative equilibria of $V$, it is invariant under the flow of $V$. Additionally, $V|\pi^{-1}(F)$ is complete.

Lemma 4.2.6.2.28. Let $m \in \pi^{-1}(F)$ and let $\xi(m)$ be the unique element of $\mathfrak{g}$ which generates the relative equilibrium of $V|\pi^{-1}(F)$ through $m$. Then the mapping $\pi^{-1}(F) \to \mathfrak{g} : m \mapsto \xi(m)$ is smooth.

Proof. Write $N = \pi^{-1}(F)$. Let $D$ be the smooth distribution on $N$ defined by $m \mapsto D_m = \mathrm{span}\{X_\xi(m) \in T_m N \mid \xi \in \mathfrak{g}\} = T_e\Phi_m \mathfrak{g}$. Because $V$ is a $G$-invariant vector field, $V|N$ is a smooth map from $N$ into $TN$, which has values in $D$. The map $\alpha : N \times \mathfrak{g} \to D : (m, \xi) \mapsto X_\xi(m)$ is smooth. $\alpha$ is surjective by definition. It is injective, for if $X_\xi(m) = X_\eta(n)$, then applying the tangent bundle projection $\tau : TN \to N$ to both sides of the preceding equation gives $m = n$, since $X_\xi$ and $X_\eta$ are smooth vector fields on $N$. But then $X_\xi(m) = X_\eta(m)$, which implies $X_{\xi - \eta}(m) = 0$. So for all $t \in \mathbb{R}$ we have $\exp t(\xi - \eta) \cdot m = m$. However, by hypothesis the $G$-action on $M$ is free. Therefore $\xi = \eta$. Additionally, $\alpha$ has a bijective tangent map. Consequently, the map $\alpha$ is a diffeomorphism. Let $\pi_2 : N \times \mathfrak{g} \to \mathfrak{g} : (m, \xi) \mapsto \xi$. Then the mapping $\pi_2 \circ \alpha^{-1} \circ V|N$, which sends $m$ to $\xi(m)$, is smooth.

Let $F^{rse} = \{f \in F \mid \xi(m) \in \mathfrak{g}^{rse} \text{ for some } m \in \pi^{-1}(\{f\})\}$.

Because $\mathfrak{g}^{rse}$ is invariant under conjugation by elements of $G$ and $\xi(g \cdot m) = \mathrm{Ad}_g \xi(m)$, it follows that $\xi(m) \in \mathfrak{g}^{rse}$ for every $m \in \pi^{-1}(\{f\})$, where $f \in F^{rse}$. Since the mapping $\pi^{-1}(F) \to \mathfrak{g} : m \mapsto \xi(m)$ is continuous, we deduce that $\xi^{-1}(\mathfrak{g}^{rse})$ is a $G$-invariant open subset of $\pi^{-1}(F)$. Therefore, $F^{rse} = \pi(\xi^{-1}(\mathfrak{g}^{rse}))$ is an open subset of $F$.

Proposition 4.2.6.2.29. Let $m_0 \in \pi^{-1}(F)$ such that $f_0 = \pi(m_0) \in F^{rse}$. Set $\mathfrak{t} = \mathfrak{g}_{\xi(m_0)}$ and let $T = \exp \mathfrak{t}$. Then there is an open neighborhood $U$ of $f_0$ in $F^{rse}$ and a smooth section $\sigma : U \subseteq F^{rse} \to \pi^{-1}(U)$ of the bundle $\pi|\pi^{-1}(U) : \pi^{-1}(U) \to U$ such that $\xi(\sigma(f)) \in \mathfrak{t}$ for every $f \in U$.

Proof. Because $\pi|\pi^{-1}(F) : \pi^{-1}(F) \to F$ is a smooth fibration, there is an open neighborhood $U_1$ of $f_0$ in $F^{rse}$ and a smooth section $\sigma_1 : U_1 \to \pi^{-1}(U_1)$ of $\pi|\pi^{-1}(U_1) : \pi^{-1}(U_1) \to U_1$. From lemma 4.2.6.1.25 there is an open neighborhood $\widetilde{V}$ of $\xi(m_0)$ in $\mathfrak{g}$ and a real analytic mapping $\theta : \widetilde{V} \subseteq \mathfrak{g} \to G$ such that $\mathrm{Ad}_{\theta(\xi')} \xi' \in \mathfrak{t}$ for every $\xi' \in \widetilde{V}$. Take $m = \sigma_1(f)$, $\xi' = \xi(\sigma_1(f))$, and $g = \theta(\xi')$. Set $\sigma(f) = \theta(\xi') \cdot \sigma_1(f) = g \cdot m$ and let $U = \sigma_1^{-1}(\xi^{-1}(\widetilde{V}))$. Then $U$ is an open neighborhood of $f_0$ in $F^{rse}$ such that the map $\sigma : U \to \pi^{-1}(U) : f \mapsto \sigma(f)$ is defined and is smooth. Because
$$\pi(g \cdot m) = \pi(m) = \pi(\sigma_1(f)) = f,$$
we see that $\sigma$ is a smooth section of the bundle $\pi|\pi^{-1}(U)$. For every $f \in U$ we have $\sigma(f) = \theta(\xi') \cdot \sigma_1(f)$, which implies $\xi(\sigma(f)) = \mathrm{Ad}_{\theta(\xi')}(\xi') \in \mathfrak{t}$.

We use the notation of proposition 4.2.6.2.29. We have a free and proper action of the torus $T$ on $U \times G$ defined by
$$\ast : T \times (U \times G) \to U \times G : (s, (f, g)) \mapsto (f, gs) \tag{21}$$


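and a free and proper action of $T$ on $\pi^{-1}(U)$ defined by
$$\bullet : T \times \pi^{-1}(U) \to \pi^{-1}(U) : (s, g \cdot \sigma(f)) \mapsto gs \cdot \sigma(f). \tag{22}$$
The mapping
$$\lambda : U \times G \to \pi^{-1}(U) : (f, g) \mapsto g \cdot \sigma(f) \tag{23}$$
is a diffeomorphism, which intertwines the $T$-actions $\ast$ and $\bullet$ because
$$\lambda(s \ast (f, g)) = \lambda(f, gs) = gs \cdot \sigma(f) = s \bullet (g \cdot \sigma(f)) = s \bullet \lambda(f, g).$$

Lemma 4.2.6.2.30. The mapping $\lambda$ (23) intertwines the $\mathbb{R}$-action
$$\circ : \mathbb{R} \times (U \times G) \to U \times G : (t, (f, g)) \mapsto \bigl(f, g \exp t\,\xi(\sigma(f))\bigr) \tag{24}$$
with the flow $\varphi_t|\pi^{-1}(U)$ of the $G$-invariant vector field $V|\pi^{-1}(U)$.

Proof. For $(t, (f, g)) \in \mathbb{R} \times (U \times G)$ we have
$$\begin{aligned}
\lambda(t \circ (f, g)) &= \lambda\bigl(f, g \exp t\,\xi(\sigma(f))\bigr) = g \exp t\,\xi(\sigma(f)) \cdot \sigma(f) \\
&= g \cdot \varphi_t(\sigma(f)), \quad \text{since } \sigma(U) \subseteq \pi^{-1}(U) \\
&= \varphi_t(g \cdot \sigma(f)), \quad \text{since $V|\pi^{-1}(U)$ is $G$-invariant} \\
&= \varphi_t(\lambda(f, g)).
\end{aligned}$$

Thus there is a smooth mapping $U \subseteq F^{rse} \to \mathfrak{t} : f \mapsto \xi(\sigma(f))$ such that $V(\sigma(f)) = X_{\xi(\sigma(f))}(\sigma(f))$ for every $f \in U$.

Let $C$ be the connected component of $F^{rse}$ which contains $f_0$ and let $N$ be the connected component of $\{m \in \pi^{-1}(C) \mid \xi(m) \in \mathfrak{t}\}$ which contains $m_0$. Then $N$ is invariant under the restriction $\widetilde{\Phi}$ of the $G$-action $\Phi : G \times M \to M$ to $N(T) \times N$, because for $(n, m) \in N(T) \times N$ we have
$$\xi(n \cdot m) = \mathrm{Ad}_n \xi(m) \in \mathrm{Ad}_n \mathfrak{t} = \mathfrak{t}. \tag{25}$$
The last equality in (25) holds because $n \in N(T)$. Since $N$ is a locally closed subset of $M$, the proper free $G$-action $\Phi$ induces a proper free $N(T)$-action $\widetilde{\Phi}$ on $N$ with orbit map $\widetilde{\pi} : N \to C$.

Lemma 4.2.6.2.31. $\widetilde{\pi} : N \to C$ is a principal $N(T)$-bundle.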

Proof. We describe the bundle $\widetilde{\pi}$ using local trivializations and transition maps on their overlaps as follows. Let $\widetilde{m} \in N$ and $\widetilde{f} = \widetilde{\pi}(\widetilde{m}) \in C$. Let $C_\sigma$ be the open subset of $C$ containing $\widetilde{f}$ and let $\sigma : C_\sigma \to \pi^{-1}(C_\sigma)$ be the smooth section of $\pi|\pi^{-1}(C_\sigma)$ constructed in proposition 4.2.6.2.29. Then for every $f \in C_\sigma$, we have $\xi(\sigma(f)) \in \mathfrak{g}_{\xi(\widetilde{m})} = \mathfrak{t}$, since $\sigma(f) \in N$. Consequently, $\sigma(C_\sigma) \subseteq N$. Because there is a unique $G$-orbit through $\sigma(f)$, we have $\widetilde{\pi}(\sigma(f)) = \pi(\sigma(f)) = f$, where the last equality holds because $\sigma$ is a section of $\pi|\pi^{-1}(C_\sigma)$. Therefore $\sigma(C_\sigma) \subseteq \widetilde{\pi}^{-1}(C_\sigma)$ and $\sigma$ is a smooth section of $\widetilde{\pi}|\widetilde{\pi}^{-1}(C_\sigma)$. Consider the map
$$\lambda_\sigma : C_\sigma \times N(T) \to \widetilde{\pi}^{-1}(C_\sigma) : (f, n) \mapsto n \cdot \sigma(f). \tag{26}$$
Then $\lambda_\sigma$ is a diffeomorphism, which is a local trivialization of $\widetilde{\pi}|\widetilde{\pi}^{-1}(C_\sigma)$ because
$$\widetilde{\pi} \circ \lambda_\sigma(f, n) = \widetilde{\pi}(n \cdot \sigma(f)) = \widetilde{\pi}(\sigma(f)) = f$$
is the projection map $C_\sigma \times N(T) \to C_\sigma$.

Since $\{C_\sigma\}$ forms an open covering of $C$, to finish the description of the bundle $\widetilde{\pi}$ it remains to find the transition map between the overlap of two local trivializations $\lambda_\sigma$ and $\lambda_\tau$. This we do as follows. First note that $\xi(N)$ is a connected component of $\mathfrak{g}^{rse}$. Using lemma 4.2.6.1.25 we see that for $f \in C_\sigma \cap C_\tau$ there is a $g = g_{\tau\sigma}(f) \in G^0$, which depends smoothly on $f$ with $g_{\tau\sigma}(\widetilde{f}) = e$, such that $\xi(\tau(f)) = \mathrm{Ad}_g \xi(\sigma(f))$. Therefore
$$\mathfrak{t} = \mathfrak{g}_{\xi(\tau(f))} = \mathrm{Ad}_g\, \mathfrak{g}_{\xi(\sigma(f))} = \mathrm{Ad}_g \mathfrak{t}, \tag{27}$$
since $\sigma(f), \tau(f) \in N$ implies that $\xi(\sigma(f))$ and $\xi(\tau(f))$ both lie in $\mathfrak{t} \subseteq \mathfrak{g}^{rse}$. From (27) and $\exp(\mathrm{Ad}_g \xi) = \mathrm{Ad}_g(\exp \xi)$, we get $g T g^{-1} = T$, which implies $g \in N(T)$. Because $\sigma$ and $\tau$ are sections of $\widetilde{\pi}|\widetilde{\pi}^{-1}(C_\sigma \cap C_\tau)$, for $f \in C_\sigma \cap C_\tau$ we have $\widetilde{\pi}(\sigma(f)) = \widetilde{\pi}(\tau(f)) = f$. Therefore $\sigma(f)$ and $\tau(f)$ lie in the same $N(T)$-orbit. So there is an $n = n(f) \in N(T)$, depending smoothly on $f$, such that $\tau(f) = n \cdot \sigma(f)$. But then $\xi(\tau(f)) = \mathrm{Ad}_n(\xi(\sigma(f)))$. Using the fact that $\xi(\tau(f)) = \mathrm{Ad}_g(\xi(\sigma(f)))$, we get $\mathrm{Ad}_{g^{-1}n}(\xi(\sigma(f))) = \xi(\sigma(f))$, which implies
$$X_{\xi(\sigma(f))}(g^{-1}n \cdot \sigma(f)) = X_{\mathrm{Ad}_{g^{-1}n}(\xi(\sigma(f)))}(\sigma(f)) = X_{\xi(\sigma(f))}(\sigma(f)). \tag{28}$$
Applying the tangent bundle projection map $TN \to N$ to both sides of (28) gives $g^{-1}n \cdot \sigma(f) = \sigma(f)$. But the $N(T)$-action on $N$ is free, so $g^{-1}n = e$, that is, $n(f) = g_{\tau\sigma}(f)$ for every $f \in C_\sigma \cap C_\tau$. The transition map between the overlap of $C_\tau \times N(T)$ and $C_\sigma \times N(T)$ is
$$\varphi_{\tau\sigma} = \lambda_\sigma^{-1} \circ \lambda_\tau : (C_\tau \cap C_\sigma) \times N(T) \to (C_\tau \cap C_\sigma) \times N(T) : (f, n) \mapsto (f, n \cdot g_{\tau\sigma}(f)),$$
because
$$\lambda_\tau(f, n) = n \cdot \tau(f) = n g_{\tau\sigma}(f) \cdot \sigma(f) = \lambda_\sigma(f, n g_{\tau\sigma}(f)).$$
Note that the map $g_{\tau\sigma} : C_\tau \cap C_\sigma \to N(T) : f \mapsto g_{\tau\sigma}(f)$ is smooth.
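As a consistency check on these transition maps, one may verify the usual cocycle identity on triple overlaps. The following worked computation is a sketch that is not part of the original argument: the third section $\kappa$, defined on an open set $C_\kappa$, is a hypothetical name introduced here, and the only inputs are the relation $\tau(f) = g_{\tau\sigma}(f) \cdot \sigma(f)$ obtained in the proof and the freeness of the $N(T)$-action on $N$.

```latex
\begin{align*}
\kappa(f) &= g_{\kappa\tau}(f)\cdot\tau(f)
           = g_{\kappa\tau}(f)\,g_{\tau\sigma}(f)\cdot\sigma(f),
  & \kappa(f) &= g_{\kappa\sigma}(f)\cdot\sigma(f), \\
\intertext{so freeness of the $N(T)$-action gives, for
$f \in C_\sigma \cap C_\tau \cap C_\kappa$,}
g_{\kappa\sigma}(f) &= g_{\kappa\tau}(f)\,g_{\tau\sigma}(f),
  & \varphi_{\tau\sigma}\circ\varphi_{\kappa\tau}(f,n)
    &= \bigl(f,\; n\,g_{\kappa\tau}(f)\,g_{\tau\sigma}(f)\bigr)
     = \varphi_{\kappa\sigma}(f,n).
\end{align*}
```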


Corollary 4.2.6.2.32. Corresponding to the principal $N(T)$-bundle $\widetilde{\pi} : N \to C$ is the $G$-bundle $\widehat{\pi} : G \cdot N \to C$, where $\widehat{\pi} = \pi|(G \cdot N)$.

Proof. Take the local trivializations to be
$$\mu_\sigma : C_\sigma \times G \to \widehat{\pi}^{-1}(C_\sigma) : (f, g) \mapsto g \cdot \sigma(f)$$
and the transition map between the overlap of $C_\sigma \times G$ and $C_\tau \times G$ to be
$$\phi_{\tau\sigma} : (C_\sigma \cap C_\tau) \times G \to (C_\sigma \cap C_\tau) \times G : (f, g) \mapsto (f, g g_{\tau\sigma}(f)).$$

We now construct a $T$-bundle associated to the $G$-bundle $\widehat{\pi}$ constructed in corollary 4.2.6.2.32. On $G \cdot N$ consider the $T$-action $\bullet$ defined by
$$\bullet : T \times (G \cdot N) \to G \cdot N : (s, g \cdot m) \mapsto gs \cdot m. \tag{29}$$
Let $\pi : G \cdot N \to C$ be its orbit map. The set $\{f\} \times gT = \{(f, gs) \in C_\sigma \times G \mid s \in T\}$ is an orbit of the $T$-action (29) on $G \cdot N$ and thus is a fiber of the bundle $\pi$. The map
$$\mu_\sigma : C_\sigma \times gT \to \pi^{-1}(C_\sigma) : (f, gT) \mapsto gT \cdot \sigma(f)$$
is a diffeomorphism and is in fact a local trivialization of the bundle $\pi$ because
$$\pi \circ \mu_\sigma(f, gT) = \pi(gT \cdot \sigma(f)) = \pi(\sigma(f)) = \widehat{\pi}(\sigma(f)) = f.$$
The second to last equality above follows because the $T$-orbit of $\sigma(f) \in N$ is contained in the unique $G$-orbit through $\sigma(f)$. The last equality follows because $\sigma$ is a smooth section of $\widehat{\pi}|\widehat{\pi}^{-1}(C)$. To complete the description of the $T$-bundle $\pi$, we note that the transition map between the overlap of $C_\sigma \times gT$ and $C_\tau \times g g_{\tau\sigma} T$ is given by
$$\psi_{\tau\sigma} = \mu_\sigma^{-1} \circ \mu_\tau : (C_\sigma \cap C_\tau) \times gT \to (C_\sigma \cap C_\tau) \times g g_{\tau\sigma} T : (f, gT) \mapsto (f, g g_{\tau\sigma}(f) T),$$
since
$$\begin{aligned}
\mu_\tau(f, gT) &= gT \cdot \tau(f) = gT g_{\tau\sigma}(f) \cdot \sigma(f) \\
&= g g_{\tau\sigma}(f) T \cdot \sigma(f), \quad \text{because } g_{\tau\sigma}(f) \in N(T) \\
&= \mu_\sigma(f, g g_{\tau\sigma}(f) T).
\end{aligned}$$

The $T$-bundle $\pi$ need not be a principal bundle. We now develop a criterion to say when it is. Let $Z(T) = \{g \in G \mid g s g^{-1} = s \text{ for every } s \in T\}$ be the centralizer of $T$ in $G$. Clearly $Z(T)$ is a closed subgroup of $G$. From $sTs^{-1} = T$ for every $s \in Z(T)$ it follows that $Z(T)$ is a subgroup of $N(T)$. In addition, $Z(T)$ is a normal subgroup of $N(T)$.³ On $G \cdot N$ define an action of $Z(T)$ by
$$\circ : Z(T) \times (G \cdot N) \to G \cdot N : (z, g \cdot m) \mapsto g z^{-1} \cdot m. \tag{30}$$
This action is free and proper. Let $\widetilde{C}$ be the orbit space $(G \cdot N)/Z(T)$ of the action $\circ$ with orbit map $\pi^{\circ} : G \cdot N \to \widetilde{C}$. The free and proper action of $N(T)$ on $G \cdot N$ given by
$$\bullet : N(T) \times (G \cdot N) \to G \cdot N : (n, g \cdot m) \mapsto ng \cdot m \tag{31}$$
commutes with the $Z(T)$-action $\circ$ and therefore induces an $N(T)$-action on $\widetilde{C}$, whose orbit map is $\check{\pi} : \widetilde{C} \to C$. Let $W(T) = N(T)/Z(T)$. Observe that $\check{\pi}$ is a $W(T)$-bundle over $C$, because
$$\widetilde{C}/W(T) = \bigl((G \cdot N)/Z(T)\bigr) \big/ \bigl(N(T)/Z(T)\bigr) = (G \cdot N)/N(T) = C.$$
We have

Lemma 4.2.6.2.33. $W(T)$ is a discrete group.

Proof. The Lie algebra of $N(T)$ is $\{\eta \in \mathfrak{g} \mid [\xi, \eta] \in \mathfrak{t} \text{ for every } \xi \in \mathfrak{t}\}$. If $\xi$ is a regular element of $\mathfrak{t}$, then $\mathfrak{g} = \operatorname{im} \mathrm{ad}_\xi \oplus \mathfrak{t}$ and $\mathrm{ad}_\xi|\operatorname{im} \mathrm{ad}_\xi$ is a bijective linear map of $\operatorname{im} \mathrm{ad}_\xi$ onto itself. Therefore $\mathrm{ad}_\xi \eta \in \mathfrak{t}$ if and only if $\eta \in \mathfrak{t}$. Thus the Lie algebra of $N(T)$ is $\mathfrak{t}$. This implies that the identity component of $N(T)$ is $T$. Therefore $N(T)/T$ is a discrete group and consequently $Z(T)/T$ is also. Hence $\bigl(N(T)/T\bigr) \big/ \bigl(Z(T)/T\bigr) = N(T)/Z(T) = W(T)$ is a discrete group.

$W(T)$ is the Weyl group of $T$. Because $N(T)$ acts on $T$ by conjugation, there is an induced effective action of $W(T)$ on $T$. Therefore $W(T)$ may be viewed as the automorphism group $\mathrm{Aut}(T)$ of $T$. Because $W(T)$ is discrete, we obtain

Corollary 4.2.6.2.34. $\check{\pi} : \widetilde{C} \to C$ is a covering map with fiber $W(T)$.

³ To see this we argue as follows. For every $z \in Z(T)$, every $n \in N(T)$, and every $t \in T$ we have
$$\begin{aligned}
(nzn^{-1})\,t\,(nzn^{-1})^{-1} &= nz(n^{-1}tn)z^{-1}n^{-1} \\
&= n(zt'z^{-1})n^{-1}, \quad \text{where } t' = n^{-1}tn \in T \text{ since } n \in N(T) \\
&= nt'n^{-1}, \quad \text{since } z \in Z(T) \\
&= t.
\end{aligned}$$
Therefore $nzn^{-1} \in Z(T)$, which implies $nZ(T)n^{-1} \subseteq Z(T)$ for every $n \in N(T)$. But then $Z(T) \subseteq n^{-1}Z(T)n = (n^{-1})Z(T)(n^{-1})^{-1} \subseteq Z(T)$. So $(n^{-1})Z(T)(n^{-1})^{-1} = Z(T)$ for every $n^{-1} \in N(T)$, which proves the assertion.


Suppose that $\gamma : [0, 1] \to C$ is a closed curve which begins and ends at $f \in C$. Because $W(T)$ is discrete, $\gamma$ lifts to a unique curve $\widetilde{\gamma} : [0, 1] \to \widetilde{C}$, where $\widetilde{\gamma}(0) = \widetilde{f}_1$ and $\widetilde{\gamma}(1) = \widetilde{f}_2$ with $\widetilde{f}_1, \widetilde{f}_2 \in \check{\pi}^{-1}(\{f\})$. Since $\check{\pi}^{-1}(\{f\}) = W(T)$, there is a unique element $w(\gamma) \in W(T)$ such that $\widetilde{f}_2 = w(\gamma)\widetilde{f}_1$. If $\gamma_1$ and $\gamma_2$ are two homotopic closed curves in $C$ which begin and end at $f$, then their lifts $\widetilde{\gamma}_1$ and $\widetilde{\gamma}_2$ are homotopic curves in $\widetilde{C}$ which begin at $\widetilde{f}_1$ and end at $\widetilde{f}_2$. Therefore, $w(\gamma_1) = w(\gamma_2)$. So $w(\gamma)$ depends only on the homotopy class $[\gamma]$ of $\gamma$ based at $f$. If $\gamma_2\gamma_1$ is the concatenation of $\gamma_1$ with $\gamma_2$, formed by joining the end point of $\gamma_1$ to the beginning point of $\gamma_2$, then $w(\gamma_2\gamma_1) = w(\gamma_2)\,w(\gamma_1)$. So we have a homomorphism
$$\mu_{\check{\pi}} : \pi_1(C, f) \to W(T) : [\gamma] \mapsto w([\gamma]),$$
called the monodromy homomorphism of the $W(T)$-bundle $\check{\pi}$.

Corollary 4.2.6.2.35. The bundle $\check{\pi} : \widetilde{C} \to C$ is a principal $W(T)$-bundle if and only if its monodromy homomorphism $\mu_{\check{\pi}}$ is trivial.

Proof. ⇐. If $\mu_{\check{\pi}}$ is trivial, then $\widetilde{C}$ is a product bundle $C \times W(T)$, which is clearly a principal $W(T)$-bundle.

⇒. Suppose that $\check{\pi}$ is a principal $W(T)$-bundle. Then the monodromy homomorphism $\mu_{\check{\pi}} : \pi_1(C, f) \to W(T) : [\gamma] \mapsto w([\gamma])$ is defined and is continuous. Since $W(T)$ is discrete, it follows from continuity that $\mu_{\check{\pi}}$ is constant. Because $\mu_{\check{\pi}}$ is a homomorphism, this constant is the identity element $e$ of $W(T)$. In other words, $\mu_{\check{\pi}}(\pi_1(C, f)) = e$, that is, $\mu_{\check{\pi}}$ is trivial.

We return to looking for a criterion to decide when the $T$-bundle $\pi : G \cdot N \to C$ is a principal bundle. Since the $N(T)$- and $Z(T)$-actions (31) and (30), respectively, commute with the $T$-action (29), the $T$-bundle $\pi$ can be factored into the composition of the $Z(T)$-bundle $\pi^{\circ} : G \cdot N \to \widetilde{C}$ followed by the $W(T)$-bundle $\check{\pi} : \widetilde{C} \to C$. In other words, $\pi = \check{\pi} \circ \pi^{\circ}$. Because $\pi^{\circ}$ is a principal bundle by construction, $\pi$ is a principal $T$-bundle if and only if $\check{\pi}$ is a principal $W(T)$-bundle. But this is the case if and only if the monodromy homomorphism $\mu_{\check{\pi}}$ is trivial.

We now recall how to define the monodromy homomorphism of the $T$-bundle $\pi$. Let $\gamma$ be a closed curve in $C$ starting and ending at $f$. Transporting a fiber $T$ of the bundle $\pi$ along $\gamma$ gives an automorphism $\mu_\pi([\gamma])$ of the fiber $\pi^{-1}(\{f\}) = T$, which depends only on the homotopy class $[\gamma]$ of $\gamma$. The map $\mu_\pi : \pi_1(C, f) \to \mathrm{Aut}(T) : [\gamma] \mapsto \mu_\pi([\gamma])$ is a continuous homomorphism, called the monodromy homomorphism of $\pi$.


Next we find a relation between the monodromy homomorphism $\mu_\pi$ of the bundle $\pi$ and the monodromy homomorphism $\mu_{\check{\pi}}$ of the bundle $\check{\pi}$. Because $\pi^{\circ}(\pi^{-1}(\gamma)) = \check{\pi}^{-1}(\gamma)$, the $T$-automorphism $\mu_\pi([\gamma])$ induces the $W(T)$-automorphism $\mu_{\check{\pi}}([\gamma])$. Since $\mathrm{Aut}(T) = W(T)$, every $T$-automorphism $\mu_\pi([\gamma])$ is of the form $\mathrm{Ad}_{\widetilde{w}([\gamma])}$ for a unique $\widetilde{w}([\gamma]) \in W(T)$. Because this latter $T$-automorphism induces the $W(T)$-automorphism given by multiplication by $\widetilde{w}([\gamma])$, it follows that $\mu_\pi([\gamma]) = \mathrm{Ad}_{\mu_{\check{\pi}}([\gamma])}$. Therefore $\mu_\pi$ is trivial if and only if $\mu_{\check{\pi}}$ is trivial. So we have proved

Proposition 4.2.6.2.36. $\pi : G \cdot N \to C$ is a principal $T$-bundle if and only if its monodromy homomorphism $\mu_\pi$ is trivial.
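To make the action of $W(T) = N(T)/Z(T)$ on $T$ by conjugation more tangible, here is a small numerical sketch. It is an illustration chosen here, not an example from the text: the group $U(2)$, its diagonal torus, and the helper names are all assumptions. The swap matrix normalizes the diagonal torus and exchanges the two angles, so it lies in $N(T)$ but not in $Z(T)$ and represents a nontrivial automorphism of $T$.

```python
import cmath


def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]


def torus_element(theta1, theta2):
    """diag(exp(i*theta1), exp(i*theta2)), an element of the diagonal torus of U(2)."""
    return [[cmath.exp(1j * theta1), 0], [0, cmath.exp(1j * theta2)]]


# The swap matrix is unitary and is its own inverse.
SWAP = [[0, 1], [1, 0]]


def conjugate_by_swap(s):
    """Ad_SWAP(s) = SWAP * s * SWAP^{-1}, using SWAP^{-1} = SWAP."""
    return matmul(matmul(SWAP, s), SWAP)


def close(a, b, tol=1e-12):
    """Entrywise comparison of two 2x2 matrices up to a tolerance."""
    return all(abs(a[i][j] - b[i][j]) < tol for i in range(2) for j in range(2))


s = torus_element(0.3, 1.1)
image = conjugate_by_swap(s)

# Conjugation keeps the element in the torus but exchanges the two angles.
print(close(image, torus_element(1.1, 0.3)), close(image, s))
```

In the setting of the proposition, a loop in $C$ whose monodromy is such a nontrivial conjugation would be an obstruction to $\pi$ being a principal $T$-bundle.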

Example 4.2.6.2.37.
1. Let $G$ be a compact noncommutative Lie group with Lie algebra $\mathfrak{g}$. Then $\mathfrak{g}^{se} = \mathfrak{g}$ and $\mathfrak{g}^{rse} = \mathfrak{g}^{reg}$. For $\xi \in \mathfrak{g}^{reg}$ the tori $T = \exp \mathfrak{g}_\xi$ are the maximal tori of $G$, which are conjugate to one another, see theorem 3.7.1 in [42]. The Weyl group $W(T)$ is a finite group, which is nontrivial.
2. If $G = E(2)$, the group of proper Euclidean motions of the plane, see example 2 in §2.6.1.4, then a maximal torus $T$ is a circle group and maximal tori are conjugate to one another. Thus the Weyl group is trivial. Therefore the torus fibration $\pi^{-1}(\mathfrak{g}^{rse})$ is a principal torus bundle.

4.2.6.3 When the G-action is proper but not free

Suppose that the $G$-action $\Phi$ on $M$ is proper, but not necessarily free. In this subsection we give results which are analogous to those found in §2.6.2. Most of the proofs are omitted, because they are similar to the ones given in §2.6.2.

Let $m \in M$ and let $H = G_m$. The $H$-isotropy type $M_H$ is a smooth submanifold of $M$, on which the Lie group $G_{M_H} = N(H)/H$ acts freely and properly. Let $\pi_{M_H}$ be the $G_{M_H}$-orbit map. The image $\overline{M}_H$ of $M_H$ under $\pi_{M_H}$ is a smooth submanifold of the differential space $(\overline{M}, C^\infty(\overline{M}))$, where $\overline{M}$ is the space of $G$-orbits on $M$. Let $F_{M_H}$ be the smooth submanifold of $\overline{M}_H$ consisting of all fixed points of the reduced flow of the $G_{M_H}$-invariant vector field $V|M_H$. Here $V$ is a $G$-invariant vector field on $M$. Then $\pi_{M_H}^{-1}(F_{M_H})$ is the set of all relative equilibria of $V|M_H$ on $M_H$ and thus is invariant under the $G_{M_H}$-action. Let $\mathfrak{g}_{M_H}$ be the Lie algebra of $G_{M_H}$. For each $m \in \pi_{M_H}^{-1}(F_{M_H})$ let $\widetilde{\xi}(m) \in \mathfrak{g}_{M_H}$ be the generator of the relative equilibrium $m$ of the vector field $V|M_H$. Then

Lemma 4.2.6.3.38. The map $\pi_{M_H}^{-1}(F_{M_H}) \to \mathfrak{g}_{M_H} : m \mapsto \widetilde{\xi}(m)$ is smooth.

Proof. The proof is analogous to that of lemma 4.2.6.2.28.

Let $F^{rse}_{M_H}$ be $\{f \in F_{M_H} \mid \widetilde{\xi}(m) \in \mathfrak{g}^{rse}_{M_H} \text{ for some } m \in \pi_{M_H}^{-1}(f)\}$. Then $F^{rse}_{M_H}$ is an open subset of $F_{M_H}$. Analogous to proposition 4.2.6.2.29 we have

Proposition 4.2.6.3.39. Let $m_0 \in \pi_{M_H}^{-1}(F_{M_H})$ such that $f_0 = \pi_{M_H}(m_0) \in F^{rse}_{M_H}$. Let $\mathfrak{t}$ be the centralizer of $\widetilde{\xi}(m_0)$ in $\mathfrak{g}_{M_H}$ and let $T = \exp \mathfrak{t}$. Then there is an open neighborhood $U$ of $f_0$ in $F^{rse}_{M_H}$ and a smooth section $\sigma : U \subseteq F^{rse}_{M_H} \to \pi_{M_H}^{-1}(U)$ of the bundle $\pi_{M_H}|\pi_{M_H}^{-1}(U)$ such that $\widetilde{\xi}(\sigma(f)) \in \mathfrak{t}$ for every $f \in U$.

Proof. See the proof of proposition 4.2.6.2.29.

Let
$$\lambda : U \times G_{M_H} \to \pi_{M_H}^{-1}(U) : (f, g) \mapsto g \cdot \sigma(f).$$
Then $\lambda$ intertwines the $\mathbb{R}$-action
$$\mathbb{R} \times (U \times G_{M_H}) \to U \times G_{M_H} : (t, (f, g)) \mapsto \bigl(f, g \exp t\,\widetilde{\xi}(\sigma(f))\bigr)$$
with the flow $\varphi_t|\pi_{M_H}^{-1}(U)$ of the $G_{M_H}$-invariant vector field $V|\pi_{M_H}^{-1}(U)$.

Let $C^{M_H}$ be the connected component of $F^{rse}_{M_H}$ which contains $f_0$ and let $N_{M_H}$ be the connected component of $\{m \in \pi_{M_H}^{-1}(C^{M_H}) \mid \widetilde{\xi}(m) \in \mathfrak{t}\}$ which contains $m_0$. Since $N_{M_H}$ is a locally closed subset of $M_H$, the proper free action of $G_{M_H}$ on $M_H$ induces a proper free action on $N_{M_H}$ of the normalizer $N(T)$ of $T$ in $G_{M_H}$. Let $\widetilde{\pi}_{M_H} : N_{M_H} \to C^{M_H}$ be the $N(T)$-orbit map. Analogous to lemma 4.2.6.2.31 we have

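Lemma 4.2.6.3.40. $\widetilde{\pi}_{M_H} : N_{M_H} \to C^{M_H}$ is a principal $N(T)$-bundle.

Corresponding to the $N(T)$-bundle $\widetilde{\pi}_{M_H}$ we have a $G_{M_H}$-bundle $\widehat{\pi}_{M_H} : G \cdot N_{M_H} \to C^{M_H}$ with projection map $\widehat{\pi}_{M_H} = \pi_{M_H}|(G \cdot N_{M_H})$ and local trivializations
$$\mu_\sigma : C^{M_H}_\sigma \times G_{M_H} \to \pi_{M_H}^{-1}(C^{M_H}_\sigma) : (f, g) \mapsto g \cdot \sigma(f).$$
Here $C^{M_H}_\sigma$ is the open set $U$ containing $\widetilde{f} = \pi_{M_H}(\widetilde{m})$ with $\widetilde{m} \in N_{M_H}$ and $\sigma : C^{M_H}_\sigma \to \widetilde{\pi}_{M_H}^{-1}(C^{M_H}_\sigma)$ the smooth section constructed in proposition 4.2.6.3.39. Moreover, $\widetilde{\xi}(\sigma(f)) \in \mathfrak{t}$ for every $f \in C^{M_H}_\sigma$. The transition map between the overlap of $C^{M_H}_\sigma \times G_{M_H}$ and $C^{M_H}_\tau \times G_{M_H}$ is
$$\phi_{\tau\sigma} : (C^{M_H}_\sigma \cap C^{M_H}_\tau) \times G_{M_H} \to (C^{M_H}_\sigma \cap C^{M_H}_\tau) \times G_{M_H} : (f, g) \mapsto (f, g g_{\tau\sigma}(f)),$$
where
$$g_{\tau\sigma} : (C^{M_H}_\sigma \cap C^{M_H}_\tau) \to N(T) : f \mapsto g_{\tau\sigma}(f)$$
is the smooth map such that the transition map between the overlap of the local trivializations $\lambda_\sigma : C^{M_H}_\sigma \times N(T) \to \widetilde{\pi}_{M_H}^{-1}(C^{M_H}_\sigma)$ and $\lambda_\tau : C^{M_H}_\tau \times N(T) \to \widetilde{\pi}_{M_H}^{-1}(C^{M_H}_\tau)$ of the principal $N(T)$-bundle $\widetilde{\pi}_{M_H} : N_{M_H} \to C^{M_H}$ is given by
$$\varphi_{\tau\sigma} : (C^{M_H}_\sigma \cap C^{M_H}_\tau) \times N(T) \to (C^{M_H}_\sigma \cap C^{M_H}_\tau) \times N(T) : (f, n) \mapsto (f, n g_{\tau\sigma}(f)).$$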

As in §2.6.2 we can construct a $T$-bundle on $G_{M_H} \cdot N_{M_H}$ associated to the $G_{M_H}$-bundle $\widehat{\pi}_{M_H}$ using the $T$-action
$$\bullet : T \times (G_{M_H} \cdot N_{M_H}) \to G_{M_H} \cdot N_{M_H} : (sH, g \cdot m) \mapsto gs \cdot m.$$
Let $\pi^{M_H}$ be the $T$-orbit map of the action $\bullet$. As before the $T$-bundle $\pi^{M_H} : G_{M_H} \cdot N_{M_H} \to C^{M_H}$ need not be a principal bundle. Indeed, an analogous argument shows that $\pi^{M_H}$ is a principal $T$-bundle if and only if its monodromy homomorphism $\mu_{\pi^{M_H}} : \pi_1(C^{M_H}, f) \to \mathrm{Aut}(T)$ is trivial.

4.3 Relative periodic orbits

In this section we study the relative periodic orbits of a $G$-invariant vector field $V$ on a smooth manifold $M$. We assume that the $G$-action on $M$ is proper and that $\pi : M \to \overline{M}$ is its $G$-orbit map.

4.3.1 Basic properties

Before we define what a relative periodic orbit of $V$ is, we prove

Lemma 4.3.1.41. Let $\tau \in I_m \setminus \{0\}$, where $I_m$ is the maximal open interval in $\mathbb{R}$ on which the integral curve $\gamma_m : I_m \to M$ of $V$ starting at $m$ is defined. The following statements are equivalent.

1. There is a $\tau \neq 0$ and an element $s \in G$ such that
$$\varphi_\tau(m) = s \cdot m. \tag{32}$$
Here $\varphi : D \subseteq \mathbb{R} \times M \to M : (t, m) \mapsto \gamma_m(t)$ is the flow of $V$.

2. $I_m = \mathbb{R}$, $\tau > 0$, and there is an element $s \in G$ ⁴ such that
$$\varphi_t(m) = s^n \cdot \varphi_{t - n\tau}(m) = \varphi_{t - n\tau}(s^n \cdot m) \tag{33}$$
for every $t \in [n\tau, (n+1)\tau]$ and every $n \in \mathbb{Z}$.⁵

3. $\overline{m} \in \overline{M}$ is a periodic point of period $\tau$ of the reduced flow $\overline{\varphi}_t$ of $\overline{V}$ on $\overline{M}$, that is, $\overline{\varphi}_\tau(\overline{m}) = \overline{m}$.

Proof. We only need to prove item 1 ⇒ item 2, as the other implications are straightforward. Suppose that item 1 holds and that $\tau > 0$. Observe that $t \in [n\tau, (n+1)\tau]$ if and only if $t - n\tau \in [0, \tau]$. The right hand side of (33) defines a piece of an integral curve $\gamma$ of $V$ on $[n\tau, (n+1)\tau]$ such that $\gamma(n\tau) = s^n \cdot m$ and $\gamma((n+1)\tau) = s^n \cdot \varphi_\tau(m) = s^n \cdot s \cdot m = s^{n+1} \cdot m$. As $n$ ranges over $\mathbb{Z}$, these pieces fit together to form an integral curve $\gamma$ of $V$ defined on $\mathbb{R}$ such that $\gamma(0) = m$. When $\tau < 0$ a similar argument can be used.

An integral curve $\gamma$ of the vector field $V$, which starts at $m \in M$, is a relative periodic orbit of $V$ of relative period $\tau$ if one of the statements in lemma 4.3.1.41 holds. If item 1 holds, then we call $s \in G$ the shift element of the relative periodic orbit. In addition, we say that $m$ is a relative periodic point of relative period $\tau$.

4.3.2 Quasiperiodic relative periodic orbits
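As a concrete illustration of lemma 4.3.1.41, here is a minimal numerical sketch. The system is an assumption chosen for illustration and does not come from the text: $M = \mathbb{C}^2$, $G = S^1$ acting by rotation on the second coordinate, and the linear flow $\varphi_t(z_1, z_2) = (e^{i\omega_1 t} z_1, e^{i\omega_2 t} z_2)$. The names `flow`, `act` and `shift_element` are hypothetical. For $z_1 \neq 0$ the reduced flow has period $\tau = 2\pi/\omega_1$, and the shift element is $s = e^{i\omega_2\tau}$.

```python
import cmath
import math

# Hypothetical example system (not from the text): two uncoupled rotations.
W1 = 1.0
W2 = math.sqrt(2.0)


def flow(t, m):
    """Flow phi_t of the S^1-invariant linear vector field on C^2."""
    z1, z2 = m
    return (cmath.exp(1j * W1 * t) * z1, cmath.exp(1j * W2 * t) * z2)


def act(g, m):
    """Action of g in S^1 (a unit complex number) on the second coordinate."""
    z1, z2 = m
    return (z1, g * z2)


def shift_element():
    """Relative period tau and shift element s with phi_tau(m) = s . m."""
    tau = 2.0 * math.pi / W1
    return tau, cmath.exp(1j * W2 * tau)


def close(a, b, tol=1e-9):
    """Componentwise comparison of two points of C^2."""
    return all(abs(x - y) < tol for x, y in zip(a, b))
```

With `tau, s = shift_element()`, one can check equation (32) as `close(flow(tau, m), act(s, m))` and equation (33) as `close(flow(t, m), act(s**n, flow(t - n*tau, m)))` for `t` in `[n*tau, (n+1)*tau]`.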

Let $G$ be a Lie group with $s \in G$. Let $\langle s \rangle = \{ s^n \in G \mid n \in \mathbb{Z} \}$. Then $\langle s \rangle$ is the smallest subgroup of $G$ containing $s$. Let $S$ be the closure in $G$ of $\langle s \rangle$. We say that $S$ is the closed subgroup of $G$ generated by $s$.

Lemma 4.3.2.42. Either the mapping $\rho : \mathbb{Z} \to S \subseteq G : n \mapsto s^n$ is proper or $S$ is a compact subgroup of $G$.

Proof. Suppose that $\rho$ is a proper map. Then for every compact subset $K$ of $G$ the set $\rho^{-1}(K)$ is a compact, and thus finite, subset $F$ of $\mathbb{Z}$. Let $n$ be an integer not in $F$. Then $s^n \notin K$. Therefore $S$ is not contained in any compact subset of $G$. Consequently, $S$ is not a compact subgroup of $G$. This shows that if $S$ is a compact subgroup of $G$, then $\rho$ is not a proper mapping.

⁴ The element $s$ can be chosen to be equal to the element $s$ in item 1 and is unique up to multiplication on the right by an element of $G_m$.

⁵ If $\tau < 0$ then $\varphi_t(m) = s^{-n} \cdot \varphi_{t + n\tau}(m) = \varphi_{t + n\tau}(s^{-n} \cdot m)$.
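The dichotomy in lemma 4.3.2.42 can be seen numerically in the Euclidean group of the plane. The following is a minimal sketch under our own assumptions: an element $s = (e^{i\alpha}, a)$ is modelled as the map $z \mapsto e^{i\alpha} z + a$ on $\mathbb{C}$, and the helper name `power_translations` is hypothetical. For a nontrivial rotation the translation parts of the powers $s^n$ stay bounded, while for a pure translation they grow linearly, so $n \mapsto s^n$ leaves every compact set. Bounded iterates are only numerical evidence of compactness of $S$, not a proof.

```python
import cmath


def power_translations(alpha, a, n_max):
    """Translation parts a_n of s^n, n = 1..n_max, for s: z -> exp(i*alpha)*z + a.

    Composing s with s^n gives the recursion a_{n+1} = exp(i*alpha)*a_n + a.
    """
    rot = cmath.exp(1j * alpha)
    parts = []
    a_n = 0j
    for _ in range(n_max):
        a_n = rot * a_n + a
        parts.append(a_n)
    return parts


def rotation_bound(alpha, a):
    """Bound 2|a| / |1 - exp(i*alpha)| on |a_n|, valid when exp(i*alpha) != 1."""
    return 2.0 * abs(a) / abs(1.0 - cmath.exp(1j * alpha))
```

For example, `power_translations(1.0, 1 + 0j, 2000)` stays within `rotation_bound(1.0, 1 + 0j)`, whereas `power_translations(0.0, 1 + 0j, 50)[-1]` has modulus 50.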

Proposition 4.3.2.43. Suppose that $S$ is a compact subgroup of $G$. Then

1. $T = S^0$ is a torus subgroup of $G$.

2. There is a unique minimal positive integer $n_0$ such that $s^{n_0} \in T$. The map $\psi : \mathbb{Z} \to S/T : n \mapsto s^n T$ induces an isomorphism $\overline{\psi} : \mathbb{Z}/n_0\mathbb{Z} \to S/T$. The dimension of $T$ is 0 if and only if $s^{n_0} = e$. When this is the case $S = \mathbb{Z}/n_0\mathbb{Z}$.

3. Let $\mathfrak{t}$ be the Lie algebra of $T$. Then $s^{n_0} = \exp \xi$ for some $\xi \in \mathfrak{t}$. The element $\xi$ is determined modulo the integer lattice $\Lambda = \ker \exp$ in $\mathfrak{t}$. If $\{\lambda_j\}_{j=1}^k$ is a $\mathbb{Z}$-basis of $\Lambda$ and $\xi = \sum_{j=1}^k \xi_j \lambda_j$, then the elements of $\{\xi_j\}_{j=1}^k$ are linearly independent real numbers over $\mathbb{Q}$.

Proof. 1. As $S$ is a closed subgroup of $G$, it is a Lie group. Moreover, $S$ is commutative, since $\langle s \rangle$ is. The identity component $S^0$ of $S$ is connected. Therefore $S^0$ is a connected commutative Lie subgroup of $G$ and consequently is a torus $T$.

2. Suppose that the homomorphism $\psi$ is injective and that $\{n_i\}$ is a sequence of integers which converges to $\pm\infty$ such that $s^{n_i}$ converges to some element $g \in G$. Since $S$ is closed, $g \in S$. Because $gT$ is an open neighborhood of $g$ in $S$, we have $s^{n_i} \in gT$ for all $i$ sufficiently large. Therefore $g \in s^{n_i} T$ for $i$ sufficiently large. Since the sets $s^n T$ are connected components of $S$, we obtain $\psi(n_i) = \psi(n_j)$ for all sufficiently large $i$ and $j$. But this contradicts the hypothesis that $\psi$ is injective. Therefore the map $\rho : \mathbb{Z} \to G : n \mapsto s^n$ is proper if $\psi$ is injective. But this contradicts the hypothesis that $S$ is a compact subgroup of $G$, using lemma 4.3.2.42. Therefore the map $\psi$ is not injective. So there is a unique positive integer $n_0$ such that $\ker \psi = n_0\mathbb{Z}$. In other words, $n_0$ is the smallest positive integer such that $t = s^{n_0} \in T$. Because $\langle s \rangle$ is dense in $S$, each connected component of $S$ contains an $s^n$ for some $n \in \mathbb{Z}$. Therefore the homomorphism $\psi$ is surjective and induces an isomorphism $\overline{\psi} : \mathbb{Z}/n_0\mathbb{Z} \to S/T$. We have $s^n \in T$ if and only if $n \in n_0\mathbb{Z}$. Because $\langle s \rangle$ is dense in $S$, the group $\langle t \rangle$ is dense in $T$.

Since $T$ is a connected commutative Lie group, $T$ is isomorphic to $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^\ell$ for some $(k, \ell) \in \mathbb{Z}_{\ge 0} \times \mathbb{Z}_{\ge 0}$. Let $(a + \mathbb{Z}^k, b)$ be the element of $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^\ell$ which corresponds to $t$ under the given isomorphism. Then for every $n \in \mathbb{Z}$, the element $(na + \mathbb{Z}^k, nb)$ corresponds to $t^n$. If $b \neq 0$ then $\{(na + \mathbb{Z}^k, nb)\}_{n \in \mathbb{Z}}$ is not dense in $(\mathbb{R}^k/\mathbb{Z}^k) \times \mathbb{R}^\ell$. But this contradicts the density of $\langle t \rangle$ in $T$. Therefore $b = 0$. Thus $T$ is a torus subgroup of $G$. The other assertions in statement 2 are straightforward to prove.


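3. Because $\exp : \mathfrak{t} \to T$ is a surjective homomorphism, there is a unique $\xi \in \mathfrak{t}$ modulo $\Lambda = \ker \exp$ such that $t = s^{n_0} = \exp \xi$. Since $\exp$ induces an isomorphism from $\mathfrak{t}/\Lambda$ onto $T$, it follows that $\{n\xi + \Lambda\}_{n \in \mathbb{Z}}$ is dense in $\mathfrak{t}/\Lambda$. However, the map
$$\mathbb{R}^k/\mathbb{Z}^k \to \mathfrak{t}/\Lambda : (\theta_1, \ldots, \theta_k) \mapsto \sum_{j=1}^k \theta_j \lambda_j$$
is an isomorphism. Arguing as in the proof of item 4 of lemma 4.2.2.5, we conclude that $\{\xi_j\}_{j=1}^k$ are linearly independent over $\mathbb{Q}$.

An element $s$ of $G$ is elliptic if the closed subgroup of $G$ generated by $s$ is compact.

Proposition 4.3.2.44. Suppose that $\gamma : \mathbb{R} \to M : t \mapsto \varphi_t(m)$ is a relative periodic orbit of $V$, which starts at $m$, has relative period $\tau > 0$, and has a shift element $s$, which is elliptic. Then

1. $\gamma$ is quasiperiodic with at least $k + 1$ frequencies, where $k = \dim T$. In particular, we have a smooth map
$$\Gamma : \mathbb{R}^{k+1}/\mathbb{Z}^{k+1} \to M : \theta = (\theta_1, \ldots, \theta_{k+1}) \mapsto \exp\Big(\sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\xi\Big) \cdot \varphi_{\theta_{k+1} n_0 \tau}(m) \tag{34}$$
with frequency vector $\nu = (\nu_1, \ldots, \nu_{k+1})$, where $\nu_j = \xi_j/(n_0\tau)$ for $1 \le j \le k$ and $\nu_{k+1} = 1/(n_0\tau)$.

2. The closure of $\gamma(\mathbb{R})$ in $M$ is the image of $\Gamma$. As $\Gamma^{-1}(\{m\})$ is a closed subgroup of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$, the group $T_0$ given by the quotient of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ by $\Gamma^{-1}(\{m\})$ is a torus. The map $\Gamma$ (34) induces a smooth embedding $\overline{\Gamma} : T_0 \to M$, whose image is the closure in $M$ of $\gamma(\mathbb{R})$. The minimal number of frequencies of $\gamma$ is $\dim T_0$.

Proof. 1. For $1 \le j \le k + 1$ let $\theta_j' = \theta_j + n_j$ with $n_j \in \mathbb{Z}$. Then
$$\begin{aligned}
\Gamma(\theta') &= \exp\Big(\sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\xi\Big) \exp(-n_{k+1}\xi) \cdot \varphi_{\theta_{k+1} n_0 \tau}\big(\varphi_{n_{k+1} n_0 \tau}(m)\big) \\
&= \exp\Big(\sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\xi\Big) \cdot \varphi_{\theta_{k+1} n_0 \tau}\big(s^{-n_{k+1} n_0} \cdot \varphi_{n_{k+1} n_0 \tau}(m)\big), \quad \text{since } s^{n_0} = \exp \xi \\
&= \exp\Big(\sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\xi\Big) \cdot \varphi_{\theta_{k+1} n_0 \tau}(m) = \Gamma(\theta).
\end{aligned}$$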


The third equality above follows using (33) with $t = n\tau$ and $n = n_{k+1} n_0$. Therefore $\Gamma$ is a well defined smooth map from $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ into $M$.

2. If $\theta_j = t\xi_j/(n_0\tau)$ for $1 \le j \le k$ and $\theta_{k+1} = t/(n_0\tau)$, then
$$\sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\xi = \frac{t}{n_0\tau}\Big(\sum_{j=1}^k \xi_j \lambda_j - \xi\Big) = 0. \tag{35}$$
In view of the definition of the map $\Gamma$ (34), equation (35) implies that $\gamma(t) = \Gamma(t\nu + \mathbb{Z}^{k+1})$ for all $t \in \mathbb{R}$, where $\nu$ has components $\nu_j = \xi_j/(n_0\tau)$ for $1 \le j \le k$ and $\nu_{k+1} = 1/(n_0\tau)$. Because $\{1, \{\xi_j\}_{j=1}^k\}$ are linearly independent over $\mathbb{Q}$, it follows that the components of $\nu$ are linearly independent over $\mathbb{Q}$. Therefore, using item 4 of lemma 4.2.2.5, we see that $\{t\nu + \mathbb{Z}^{k+1} \in \mathbb{R}^{k+1}/\mathbb{Z}^{k+1} \mid t \in \mathbb{R}\}$ is dense in $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$. Because the map $\Gamma$ is continuous, it follows that $\Gamma(\mathbb{R}^{k+1}/\mathbb{Z}^{k+1})$ is equal to the closure of $\gamma(\mathbb{R})$ in $M$. Since $\gamma(0) = m$, the map
$$\Xi : (T \times \mathbb{R}) \times M \to M : ((g, t), m) \mapsto g \cdot \varphi_t(m) = \varphi_t(g \cdot m)$$
defines a local smooth action of $T \times \mathbb{R}$ on $M$. Therefore its orbit map $\Xi_m : T \times \mathbb{R} \to M : (g, t) \mapsto \Xi((g, t), m)$ induces a smooth immersion of $(T \times \mathbb{R})/(T \times \mathbb{R})_m$ into $M$. Consequently, $\Gamma$ induces a smooth embedding of $T_0 = (\mathbb{R}^{k+1}/\mathbb{Z}^{k+1})/\Gamma^{-1}(m)$ into $M$. An argument similar to the one used in the proof of proposition 4.2.2.6 shows that the minimal number of frequencies is $\dim T_0$.

Corollary 4.3.2.45. If $\tau = \inf\{t \in \mathbb{R}_{>0} \mid \gamma(t) \in G \cdot m\}$ and the $G$-action at $m$ is free, then the map $\Gamma$ (34) is a smooth embedding and the minimal number of frequencies is $k + 1$.

Proof. Because the $G$-action is free at $m$, so is the $T$-action. Then $\Gamma(\theta) = m$ implies $\varphi_{\theta_{k+1} n_0 \tau}(m) \in T \cdot m \subseteq G \cdot m$. Since the $T$-action is free at $m$, there is an $n \in \mathbb{Z}$ such that $\theta_{k+1} n_0 \tau = n\tau$, which implies that $\varphi_{\theta_{k+1} n_0 \tau}(m) = s^n \cdot m \in T \cdot m$. Again because of the freeness of the $T$-action at $m$, there is a $q \in \mathbb{Z}$ such that $n = q n_0$. Consequently, $\theta_{k+1} = q \in \mathbb{Z}$. However, $\exp \xi = s^{n_0}$ and therefore $\exp(-\theta_{k+1}\xi) = s^{-q n_0}$. From $\varphi_{\theta_{k+1} n_0 \tau}(m) = s^n \cdot m = s^{q n_0} \cdot m$ and the definition of the map $\Gamma$ (34), we get $\exp\big(\sum_{j=1}^k \theta_j \lambda_j\big) \cdot m = m$. Using the freeness of the $T$-action at $m$ once more, we deduce that $\theta_j \in \mathbb{Z}$ for every $1 \le j \le k$. Therefore $\Gamma^{-1}(\{m\}) = \{e\}$, so $T_0 = \mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$.
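The density statements used above (that $\{n\xi + \Lambda\}$ is dense in $\mathfrak{t}/\Lambda$ when the frequencies are rationally independent) can be explored numerically in the simplest case $k = 1$, $\mathfrak{t}/\Lambda = \mathbb{R}/\mathbb{Z}$. The sketch below is our own illustration, with the hypothetical helper `max_gap`. It measures the largest gap left on the circle by the first $n$ points $j\xi \bmod 1$. For rational $\xi$ the gap stabilizes at $1/q$. For irrational $\xi$ it shrinks as $n$ grows, which is consistent with density but is only numerical evidence.

```python
import math


def max_gap(xi, n):
    """Largest gap on the circle R/Z left by the points j*xi mod 1, j = 0..n-1."""
    pts = sorted((j * xi) % 1.0 for j in range(n))
    gaps = [b - a for a, b in zip(pts, pts[1:])]
    # Close the circle: gap between the last point and the first one.
    gaps.append(1.0 - pts[-1] + pts[0])
    return max(gaps)


# Assumed sample values: a rational rotation number and the golden mean.
RATIONAL_XI = 0.25
GOLDEN_XI = (math.sqrt(5.0) - 1.0) / 2.0
```

Comparing `max_gap(RATIONAL_XI, 1000)` with `max_gap(GOLDEN_XI, 1000)` shows the difference between a finite orbit and an orbit that keeps filling the circle.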

4.3.3 Runaway relative periodic orbits

Proposition 4.3.3.46. Let $m$ be a relative periodic point of relative period $\tau > 0$ with shift element $s \in G$. Then the following statements are equivalent.

1. The element $s$ is not elliptic.

2. The integral curve $\gamma : \mathbb{R} \to M$ of the $G$-invariant vector field $V$, which starts at $m$, is a runaway curve in $M$.

3. The integral curve $\gamma$ of $V$ starting at $m$ is not quasiperiodic.

Proof. item 1 ⇒ item 2. If item 1 holds, then from lemma 4.3.2.42 it follows that the mapping $\psi : \mathbb{Z} \to M : n \mapsto s^n \cdot m$ is proper. Let $K$ be a compact subset of $M$. Then $K' = \{\varphi_{-u}(x) \in M \mid x \in K,\ u \in [0, \tau]\}$ is equal to the image of the compact set $[-\tau, 0] \times K$ under the continuous map $\varphi|([-\tau, 0] \times K)$. Therefore $K'$ is a compact subset of $M$. Because the mapping $\psi$ is proper, there is a positive integer $N$ such that $|n| \le N$ whenever $s^n \cdot m \in K'$. Consequently, if $t \in [n\tau, (n+1)\tau]$ and $\varphi_t(m) \in K$, then $s^n \cdot m = \varphi_{-(t - n\tau)}(\varphi_t(m)) \in K'$. Thus $|n| \le N$, which implies $|t| = |(t - n\tau) + n\tau| \le \tau + N\tau$. Consequently, the integral curve $\gamma : \mathbb{R} \to M$ is a proper mapping. A similar argument works when $\tau < 0$.

¬ item 3 ⇒ ¬ item 2. If $\gamma$ is quasiperiodic, then its image is contained in the image of a torus under a continuous map, and thus is contained in a compact subset of $M$. Thus $\gamma$ is not a runaway curve.

item 3 ⇒ item 1. This implication follows from item 1 of proposition 4.3.2.44.

Without additional assumptions on the $G$-action, not much can be deduced that is in the spirit of proposition 4.3.3.46. For instance, let $V$ be a complete vector field on $M$ with flow $\varphi_t$ and let $\tau > 0$. Then $\Psi : \mathbb{Z} \times M \to M : (n, m) \mapsto \varphi_{n\tau}(m)$ defines an action of $G = (\mathbb{Z}, +)$, which is discrete. Clearly, every integral curve of $V$ is a $\mathbb{Z}$-relative periodic orbit with relative period $\tau$. But it is certainly not the case that for every dynamical system every integral curve is either quasiperiodic or runs away.

4.3.4 When the G-action is not free

The results of proposition 4.3.3.46 are not optimal when the proper $G$-action is not free at $m$. Let $H = G_m$. From (33) and lemma 4.2.5.12 with $t = \tau$ and $g = s$ it follows that $s \in N(H)$. Because the Lie group $G_{M_H} = N(H)/H$ acts freely on the $H$-isotropy type $M_H$ containing $m$, we obtain the following proposition from proposition 4.3.3.46.

Proposition 4.3.4.47. Suppose that $m$ is a relative periodic point with shift element $s$. Assume that $sH$ is an elliptic element of $G_{M_H}$. Then the integral curve $\gamma : \mathbb{R} \to M_H$ of the $G_{M_H}$-invariant vector field $V|M_H$, which starts at $m$, is quasiperiodic. If $\tau = \inf\{t \in \mathbb{R}_{>0} \mid \varphi_t(m) \in G \cdot m\}$, then the map $\Gamma$ (34) is a smooth embedding of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ into $M$. Here $k = \dim T$, where $T$ is the torus in $G_{M_H}$ formed by taking the identity component of the closure of $\langle sH \rangle$ in $G_{M_H}$. The minimal number of frequencies of $\gamma$ is $k + 1$.

Corollary 4.3.4.48. If $G_{M_H}$ is compact, then every relative periodic orbit of $V|M_H$ is quasiperiodic.

4.3.5 Other relative periodic orbits in the (G × R)-orbit

If $m$ is a relative periodic point of the vector field $V$ on $M$ with relative period $\tau$ and shift element $s$, then for $g \in G$ we have
$$\varphi_\tau(g \cdot m) = g \cdot \varphi_\tau(m) = g \cdot s \cdot m = (g s g^{-1}) \cdot g \cdot m. \tag{36}$$

This shows that $g \cdot m$ is a relative periodic point with relative period $\tau$ and with shift element $\mathrm{Ad}_g s = g s g^{-1}$. An integral curve of $V$ starting at $g \cdot m$ is periodic, quasiperiodic, or runaway, if and only if the integral curve of $V$ starting at $m$ has these properties, respectively. In addition, the closures in $M$ of $V$-orbits starting at a point in $G \cdot m$ define a smooth fibration of the closure of the $V$ flow out of $G \cdot m$, that is, the closure in $M$ of $\{\varphi_t(G \cdot m) \in M \mid t \in \mathbb{R}\}$.

Let us give some more detail when we are in the situation where $\tau = \inf\{t \in \mathbb{R}_{>0} \mid \varphi_t(m) \in G \cdot m\}$ and the $G$-action on $M$ is free at $m$. In other words, $G_m = \{e\}$. This is more complicated than the case of relative equilibria because the closure of a $V$-orbit is not generated by a torus subgroup of $G$, since a $V$-orbit is not contained in a $G$-orbit. For $g, g' \in G$ and $t, t' \in \mathbb{R}$ we have $g \cdot \varphi_t(m) = g' \cdot \varphi_{t'}(m)$ if and only if $\varphi_{t' - t}(m) = (g')^{-1} g \cdot m \in G \cdot m$ if and only if there is an $n \in \mathbb{Z}$ such that $t' - t = n\tau$ and $(g')^{-1} g = s^n$. Define a $\mathbb{Z}$-action on $G \times \mathbb{R}$ by
$$\mathbb{Z} \times (G \times \mathbb{R}) \to G \times \mathbb{R} : (n, (g, t)) \mapsto (g s^{-n}, t + n\tau). \tag{37}$$
Then the mapping $G \times \mathbb{R} \to M : (g, t) \mapsto g \cdot \varphi_t(m)$ induces a smooth embedding $\Psi : (G \times \mathbb{R})/\mathbb{Z} \to (G \times \mathbb{R}) \cdot m$. The space $(G \times \mathbb{R}) \cdot m$ is the $V$ flow out of the $G$-orbit $G \cdot m$. The $\mathbb{Z}$-orbit map $\rho : G \times \mathbb{R} \to (G \times \mathbb{R})/\mathbb{Z}$ intertwines the $G$-action on $G \times \mathbb{R}$, given by multiplication on the left in the first factor, with the induced smooth $G$-action on $(G \times \mathbb{R})/\mathbb{Z}$. Also it intertwines the $\mathbb{R}$-action on $G \times \mathbb{R}$, given by addition on the second factor, with a smooth $\mathbb{R}$-action on $(G \times \mathbb{R})/\mathbb{Z}$. The map $\Psi$ intertwines the $G$- and $\mathbb{R}$-actions on $(G \times \mathbb{R})/\mathbb{Z}$ with the $G$-action and the flow of $V$ on $(G \times \mathbb{R}) \cdot m$, respectively. Note that the projection map of $G \times \mathbb{R}$ onto the second factor induces the bundle projection map from $(G \times \mathbb{R})/\mathbb{Z}$ onto the circle $\mathbb{R}/\tau\mathbb{Z}$. Using the map $\Psi$, the preceding bundle can be identified with the image of the relative periodic orbits of $V$ starting at a point in $G \cdot m$ under the $G$-orbit map $\pi$.

Consider the $\mathbb{R}^{k+1}$-action on $G \times \mathbb{R}$ defined by
$$\bullet : \mathbb{R}^{k+1} \times (G \times \mathbb{R}) \to G \times \mathbb{R} : (\theta, (g, t)) \mapsto \Big(g \exp\Big(\sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\xi\Big),\ t + \theta_{k+1} n_0 \tau\Big). \tag{38}$$

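If $\theta \in \mathbb{Z}^{k+1}$, then using $\exp \xi = s^{n_0}$ we get $\theta \bullet (g, t) = (g s^{-\theta_{k+1} n_0}, t + \theta_{k+1} n_0 \tau)$, which is equal to $(\theta_{k+1} n_0) \circ (g, t)$, where $\circ$ is the $\mathbb{Z}$-action (37), namely $\circ : \mathbb{Z} \times (G \times \mathbb{R}) \to G \times \mathbb{R} : (n, (g, t)) \mapsto (g s^{-n}, t + n\tau)$. Therefore we have an induced action of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ on $(G \times \mathbb{R})/\mathbb{Z}$. Moreover, this $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action commutes with the $G$-action on $(G \times \mathbb{R})/\mathbb{Z}$ induced from the $G$-action on $G \times \mathbb{R}$ given by multiplication of the first factor on the left.

If $\theta \bullet (g, t) = (g s^{-n}, t + n\tau)$ for some $n \in \mathbb{Z}$, then
$$s^n = \exp\Big(-\sum_{j=1}^k \theta_j \lambda_j + \theta_{k+1}\xi\Big) \in T, \tag{39}$$
which implies that $n = q n_0$ for some $q \in \mathbb{Z}$. But then $n\tau = \theta_{k+1} n_0 \tau$ implies that $\theta_{k+1} = q$. Then (39) together with $\exp \xi = s^{n_0}$ implies that $\sum_{j=1}^k \theta_j \lambda_j \in \Lambda$, the integer lattice in $\mathfrak{t}$. Therefore $\theta_j \in \mathbb{Z}$ for every $1 \le j \le k$. This shows that the $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $(G \times \mathbb{R})/\mathbb{Z}$ is free. It is proper because $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ is compact.

Let $\{\xi_j\}_{j=1}^k$ be the components of $\xi \in \mathfrak{t}$ with respect to the $\mathbb{Z}$-basis $\{\lambda_j\}_{j=1}^k$ of the integer lattice $\Lambda$ of $\mathfrak{t}$. Then
$$\sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\xi = \sum_{j=1}^k (\theta_j - \theta_{k+1}\xi_j)\lambda_j.$$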


Consequently, the vector field $V$ on $(G \times \mathbb{R}) \cdot m$ is equal to the element $\dot\theta$ in the Lie algebra $\mathbb{R}^{k+1}$ of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$, which is defined by
$$\dot\theta_j = \begin{cases} \dot\theta_{k+1}\,\xi_j & \text{for } 1 \le j \le k \\ 1/(n_0\tau) & \text{for } j = k + 1. \end{cases} \tag{40}$$
Because $\{1, \{\xi_j\}_{j=1}^k\}$ are linearly independent over $\mathbb{Q}$, it follows from item 3 of lemma 4.2.2.5 that the orbits of (40) are dense in $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$. Therefore the orbits of the $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $(G \times \mathbb{R}) \cdot m$ are equal to the closures of the integral curves of $V$ in $(G \times \mathbb{R}) \cdot m$.

Now assume that the $G$-action is not free at $m$. Suppose that $sH$ is an elliptic element of $G_{M_H} = N(H)/H$. Then the $V$-flow out of $G \cdot m$ is a fiber bundle over $G/N(H)$ with typical fiber equal to the $V$-flow out of $N(H) \cdot m$. Because the action of $G_{M_H}$ is free at $m$, we can use the construction of the preceding paragraphs to show that the closures of the $V$-orbits starting on $N(H) \cdot m$ define a $G_{M_H}$-invariant smooth principal $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-bundle of the $V$ flow out of $N(H) \cdot m$. Using lemma 4.2.5.2.22 with $L = N(H)$, $K = \mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ and $Q = (G_{M_H} \times \mathbb{R})/\mathbb{Z}$, we obtain

Proposition 4.3.5.49. Suppose that $m$ is a relative periodic point with shift element $s \in G$. Suppose that $sH$ is an elliptic element of $G_{M_H}$. Then the closures of the $V$-orbits in the $V$ flow out $(G \times \mathbb{R}) \cdot m$ of $G \cdot m$ define a smooth $G$-invariant principal $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ fibration of $(G \times \mathbb{R}) \cdot m$. The vector field $V$ on $(G \times \mathbb{R}) \cdot m$ is equal to the element $\dot\theta$ (40) in the Lie algebra $\mathbb{R}^{k+1}$ of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$.

4.3.6 Smooth families of quasiperiodic relative periodic orbits

In this section we consider a smooth family of periodic orbits of the reduced flow $\overline{\varphi}_t$ on the orbit space $\overline{M}$ corresponding to a $G$-invariant vector field $V$ on $M$ with flow $\varphi_t$.

4.3.6.1 Elliptic, regular, and stably elliptic elements of G

In this subsubsection we prove group theoretic analogues of the lemmas 4.2.6.1.24 – 4.2.6.1.26. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. For $s \in G$, the centralizer $\mathfrak{g}_s$ of $s$ in $\mathfrak{g}$ is $\{\eta \in \mathfrak{g} \mid \mathrm{Ad}_s \eta = \eta\}$. The centralizer $G_s$ of $s$ in $G$ is $\{g \in G \mid g s = s g\}$. The group $G_s$ is a closed subgroup of $G$ and hence is a Lie group with Lie algebra $\mathfrak{g}_s$. The element $s$ is an elliptic element of $G$ if and only if the closure of $\{s^n \in G \mid n \in \mathbb{Z}\}$ is contained in a compact subgroup of $G$. The element $s$ is a regular element of $G$ if there is an open neighborhood $U$ of $s$ in $G$ such that $\dim \mathfrak{g}_s \le \dim \mathfrak{g}_{s'}$ for every $s' \in U$. Because $\dim \mathfrak{g}_{s'} \le \dim \mathfrak{g}_s$ for every $s'$ in $G$ near $s$, we see that $s$ is regular if and only if $\dim \mathfrak{g}_{s'}$ is constant for all $s'$ near $s$. The element $s$ is stably elliptic if there is an open neighborhood $U$ of $s$ in $G$ such that every $s' \in U$ is an elliptic element of $G$.

Lemma 4.3.6.1.50. Let $G$ be a Lie group with Lie algebra $\mathfrak{g}$. The following statements are equivalent.

1. $s$ is a regular and stably elliptic element of $G$.

2. The centralizer $\mathfrak{g}_s$ of $s$ in $\mathfrak{g}$ is the Lie algebra of a torus subgroup $T$ of $G$. Moreover, there is a nonzero integer $n$ such that $s^n \in T$.

Proof. The proof of this lemma is analogous to the proof of lemma 4.2.6.1.24. We give the details.

property 1 ⇒ property 2. Suppose that $s$ is an elliptic element of $G$. Then the closure $S$ of $\{s^n \in G \mid n \in \mathbb{Z}\}$ in $G$ is a compact subgroup of $G$. Because the map $\mathrm{Ad} : G \to \mathrm{Gl}(\mathfrak{g}, \mathbb{R})$ is a continuous homomorphism of Lie groups, the group $K = \mathrm{Ad}(S)$ is a compact subgroup of $\mathrm{Gl}(\mathfrak{g}, \mathbb{R})$. Averaging an arbitrary inner product on $\mathfrak{g}$ over $K$ gives a $K$-invariant inner product $\beta$ on $\mathfrak{g}$. Hence $\mathrm{Ad}_s$ is a $\beta$-orthogonal linear transformation of $(\mathfrak{g}, \beta)$ onto itself. Therefore, $\mathrm{im}(I - \mathrm{Ad}_s)$ is the $\beta$-orthogonal complement of $\ker(I - \mathrm{Ad}_s) = \mathfrak{g}_s$. So
$$\mathfrak{g} = \mathrm{im}(I - \mathrm{Ad}_s) \oplus \mathfrak{g}_s \tag{41}$$
and $I - \mathrm{Ad}_s$ is a bijective linear map of $\mathrm{im}(I - \mathrm{Ad}_s)$ onto itself. The tangent at $(e, 0)$ of the mapping
$$\Psi : G \times \mathfrak{g}_s \to G : (g, \xi) \mapsto g\,(\exp \xi)\, s\, g^{-1} s^{-1}$$
is
$$T_{(e,0)}\Psi : \mathfrak{g} \times \mathfrak{g}_s \to \mathfrak{g} : (\eta, \xi) \mapsto (\mathrm{Id} - \mathrm{Ad}_s)\eta + \xi.$$

In view of (41) it follows that $T_{(e,0)}\Psi$ is surjective. Thus by the implicit function theorem there is an open neighborhood $U$ of $e$ in $G$ and real analytic mappings $\varphi : U \subseteq G \to G$ and $\chi : U \subseteq G \to \mathfrak{g}_s$ such that $\varphi(e) = e$, $\chi(e) = 0$, and
$$u = \varphi(u)\,(\exp \chi(u)\, s)\,\varphi(u)^{-1} s^{-1} \tag{42}$$
for every $u \in U$. Writing $s' = us$, equation (42) implies that every element $s'$ of $G$ near $s$ is conjugate to an element of $G_s^0 s$ by means of an element of $G$ which depends real analytically on $s'$. Note that $G_s^0 s$ $(= s G_s^0)$ is the connected component of $G_s$ which contains $s$.

Let $s' \in G_s$. From the fact that $\mathrm{Ad} : G \to \mathrm{Gl}(\mathfrak{g}, \mathbb{R})$ is a group homomorphism, it follows that the linear mappings $\mathrm{Ad}_{s'}$ and $\mathrm{Ad}_s$ commute. Consequently, $\mathrm{Ad}_{s'}$ leaves $\mathfrak{g}_s$ and $\mathrm{im}(I - \mathrm{Ad}_s)$ invariant. If $s'$ is sufficiently close to $s$, then the restriction of $I - \mathrm{Ad}_{s'}$ to $\mathrm{im}(I - \mathrm{Ad}_s)$ is bijective. Therefore $\mathfrak{g}_{s'} = \ker(I - \mathrm{Ad}_{s'}) \subseteq \mathfrak{g}_s$. If $s$ is a regular element of $G$, then for $s' \in G_s$ sufficiently close to $s$ we have $\mathfrak{g}_{s'} = \mathfrak{g}_s$. So $\mathrm{Ad}(s' s^{-1}) = \mathrm{Ad}_{s'} \circ \mathrm{Ad}_{s^{-1}} = \mathrm{id}_{\mathfrak{g}_s}$. In other words, $\mathrm{Ad}(s' s^{-1})\eta = \eta$ for every $\eta \in \mathfrak{g}_s$. Differentiating the last equality in the direction of $\xi \in \mathfrak{g}_s$ gives $[\xi, \eta] = 0$ for every $\xi, \eta \in \mathfrak{g}_s$. Therefore $\mathfrak{g}_s$ is commutative, which implies that $G_s^0$ is commutative. Hence $G_s^0 = \exp \mathfrak{g}_s$, see theorem 1.12.1 of [42]. Because the identity component $G_s^0$ of $G_s$ is an open and closed subgroup of the closed subgroup $G_s$ of $G$, the subgroup $T = \exp \mathfrak{g}_s = G_s^0$ is a closed subgroup of $G$. Since $T \subseteq G_s$, we see that $H = \bigcup_{n \in \mathbb{Z}} s^n T$ is a commutative subgroup of $G$, which is the smallest subgroup of $G$ that contains both $s$ and $T$. In addition, the map
$$\lambda : \mathbb{Z} \to H/T : n \mapsto s^n T$$
is a homomorphism of groups. Recall that $S$ is the closed subgroup of $G$ generated by $s$. Because $s$ is elliptic, there is a smallest positive integer $n_0$ such that $s^{n_0} \in S^0$. Since $S^0 \subseteq G_s$ and $S^0$ is connected, it follows that $S^0 \subseteq G_s^0 = T$. Therefore $n_0 \in \ker \lambda$, which implies that $\ker \lambda$ is a nontrivial subgroup of $\mathbb{Z}$. So there is a unique positive integer $n_1$ such that $\ker \lambda = n_1\mathbb{Z}$. Consequently, the group $H/T$ is isomorphic to $\mathbb{Z}/(n_1\mathbb{Z})$. Note that $n_1$ is the smallest positive integer such that $s^{n_1} \in T$ and that $n_0$ is a positive multiple of $n_1$. Because H = 0≤n

k n1 is open. So elements of the form $h^{n_1}$ for all $h$ near $s$ in $G$ form an open neighborhood of $s^{n_1}$ in $T$. If $s$ is a stably elliptic element of $G$, and hence one of $H$, then $s^{n_1}$ is a stably elliptic element of $T$. Because $T$ is a connected, commutative Lie group, it is isomorphic to $(\mathbb{R}/\mathbb{Z})^k \times \mathbb{R}^\ell$ for some $k, \ell \in \mathbb{Z}_{\ge 0}$. But $(\mathbb{R}/\mathbb{Z})^k \times \mathbb{R}^\ell$ contains stably elliptic elements only if $\ell = 0$. Therefore $T$ is a torus. Thus property 1 ⇒ property 2.

Suppose that property 2 holds. Because $\mathfrak{t} = \mathfrak{g}_s$, for every $\xi \in \mathfrak{t}$ we have $s(\exp \xi)s^{-1} = \exp \mathrm{Ad}_s \xi = \exp \xi$. Thus $s$ commutes with every element of $T = \exp \mathfrak{t}$. It follows that $H = \bigcup_{n \in \mathbb{Z}} s^n T$ is a commutative group. $H$ is compact because there is a minimal positive integer $n_1$ such that $s^{n_1} \in T$, which implies that $H/T$ is isomorphic to $\mathbb{Z}/n_1\mathbb{Z}$. Therefore every element of $H$ is elliptic. In particular, $s$ is elliptic. Because every element of $G$ near $s$ is conjugate to an element $s' \in sT \subseteq H$, we deduce that every such element is elliptic. Therefore $s$ is a stably elliptic element of $G$. Since every element $s' \in H$ near $s$ commutes with $s$, we get that $\mathrm{Ad}_{s'}$ commutes with $\mathrm{Ad}_s$. Therefore $\mathrm{Ad}_{s'}$ maps $\mathrm{im}(I - \mathrm{Ad}_s)$ into itself. Since $I - \mathrm{Ad}_s$ is a bijective linear map of $\mathrm{im}(I - \mathrm{Ad}_s)$ into itself, it follows that $I - \mathrm{Ad}_{s'}$ is a bijective linear map of $\mathrm{im}(I - \mathrm{Ad}_s)$ into itself. Therefore $\dim \mathfrak{g}_{s'} \le \dim \mathfrak{g}_s$, if $s'$ is sufficiently close to $s$. Because every $s'' \in G$ near $s$ is conjugate to an $s' \in H$ near $s$, we have $\dim \mathfrak{g}_{s''} = \dim \mathfrak{g}_{s'} \le \dim \mathfrak{g}_s$. Therefore $s$ is a regular element of $G$. So property 2 ⇒ property 1.

Let $G^{se}$, $G^{reg}$, and $G^{rse}$ be the set of stably elliptic, regular, and regular stably elliptic elements of $G$, respectively. By definition $G^{se}$ and $G^{reg}$ are open subsets of $G$. Therefore $G^{rse} = G^{se} \cap G^{reg}$ is open. To show that $G^{rse}$ is dense in $G^{se}$ we argue as follows. Suppose that $s \in G^{se}$ but $s \notin G^{reg}$. Then there is an $s' \in G^{se}$ arbitrarily close to $s$ such that $\dim \mathfrak{g}_{s'} < \dim \mathfrak{g}_s$. As $\dim \mathfrak{g}_s$ is finite, we can repeat this argument only finitely many times. We eventually obtain an $s' \in G^{se}$ arbitrarily close to $s$ such that $\dim \mathfrak{g}_{s'}$ is minimal. Then $s' \in G^{reg}$.

Lemma 4.3.6.1.51. For every $s \in G^{rse}$ there is an open neighborhood $U$ of $s$ in $G$ and a real analytic mapping $\theta : U \to G$ such that for every $s' \in U$ with $s'' = \mathrm{Ad}_{\theta(s')} s'$ we have $G_{s''}^0 = G_s^0$ and $s'' G_{s''}^0 = s G_s^0$.

Proof. Write $s' = us$. Then by the implicit function theorem, see the proof of property 2 in lemma 4.3.6.1.50, there is an open neighborhood $\widetilde{U}$ of $e$ in $G$ and real analytic mappings $\varphi : \widetilde{U} \subseteq G \to G$ and $\chi : \widetilde{U} \subseteq G \to \mathfrak{g}_s$ such that $\varphi(e) = e$, $\chi(e) = 0$, and
$$u = \varphi(u)\,(\exp \chi(u))\, s\, \varphi(u)^{-1} s^{-1}. \tag{43}$$

Let $U = \widetilde{U} s$, $\theta(s') = \varphi(u)^{-1}$, and $s'' = \mathrm{Ad}_{\theta(s')} s'$. Then equation (43) is equivalent to $s'' = (\exp \chi(u))s = s(\exp \chi(u))$. Therefore $s'' \in s G_s^0$. If $s''$ is sufficiently close to $s$ in $G$, then $\mathfrak{g}_s = \mathfrak{g}_{s''} = \mathrm{Ad}_{\theta(s')}\mathfrak{g}_{s'}$, $G_s^0 = G_{s''}^0$, and $s G_s^0 = s'' G_s^0 = s'' G_{s''}^0$. The last equality follows because $s''$ belongs to the connected component $s G_s^0$ of $G_s$ which contains $s$.

Let $C$ be a connected component of $G^{rse}$.

Proposition 4.3.6.1.52. Let $s, s' \in C$. Then

1. There is a $g \in G^0$ such that $\mathfrak{g}_{s'} = \mathrm{Ad}_g \mathfrak{g}_s$.

2. $\dim \mathfrak{g}_s$ is constant, say $k$, for every $s \in C$. The assignment $s \mapsto \mathfrak{g}_s$ defines a real analytic mapping from $C$ into the Grassmann manifold of $k$-dimensional vector subspaces of $\mathfrak{g}$.

3. The positive integer $n_1$ such that $s^{n_1} \in G_s^0 = \exp \mathfrak{g}_s$ is the same for all $s \in C$.

Proof. 1. Two elements $s, s' \in C$ are said to be related if there is a $g \in G^0$ such that $\mathfrak{g}_{s'} = \mathrm{Ad}_g \mathfrak{g}_s$. Being related is an equivalence relation on $C$. By lemma 4.3.6.1.51, nearby elements of $C$ are related. Since $C$ is connected, we obtain that $\mathfrak{g}_{s'}$ is conjugate to $\mathfrak{g}_s$ by an element of $G^0$ for every $s, s' \in C$.

2. For $s'$ near $s$, we know that $\mathfrak{g}_{s'}$ is conjugate to $\mathfrak{g}_s$ by $g = g(s') \in G^0$, which depends real analytically on $s'$. Hence the mapping $s' \mapsto \mathfrak{g}_{s'}$ is real analytic in an open neighborhood of $s$. Therefore the map $s \mapsto \mathfrak{g}_s$ is real analytic on $C$.

3. The constancy of $n_1$ is proved in a way similar to that given in part 1 above.

Example 4.3.6.1.53. 1. If $G$ is a compact Lie group, then every element of $G$ is elliptic. Therefore $G^{se} = G$ and $G^{rse} = G^{reg}$. If $G$ is connected and $s \in G^{reg}$, then $s \in G_s^0$, see proposition 3.1.3 of [42]. Therefore the smallest positive integer $n$ such that $s^n \in G_s^0$ is equal to 1. Every $s \in G$ belongs to a maximal torus $T$ in $G_s$. Therefore $T \subseteq G_s^0$ and $T = G_s^0$ if $s \in G^{reg}$. If $G$ is compact and connected, then all tori in $G^{rse}$ are maximal tori of dimension equal to the rank of $G$.

2. Let $s = (A, a) \in G = E(2)$, the group of proper Euclidean motions of $\mathbb{R}^2$. Then $s$ is not elliptic if $A = I$ and $a \neq 0$. If $A \neq I$, then $I - A$ is invertible and $s$ has a unique fixed point $p = (I - A)^{-1} a$. In this case, the isotropy group of $s$ is equal to the group of all rotations about $p$, which is a circle subgroup of $E(2)$. Therefore $G^{rse} = \{(A, a) \in E(2) \mid A \neq I\}$ is an open dense subset of $G^{se} = G$. Therefore every element of $G^{rse}$ lies in a torus.

3. Let $G = \mathrm{Sl}(2, \mathbb{R})$. Then $G^{rse} = \{A \in \mathrm{Sl}(2, \mathbb{R}) \mid |\mathrm{tr}\, A| < 2\}$. Since the subgroup $\mathrm{SO}(2, \mathbb{R})$ meets each conjugacy class of elements of $G$ exactly once, $G^{rse}$ is connected and all the tori given by $\exp \mathfrak{g}_s$, where $s \in G^{rse}$, are conjugate to one another.

4.3.6.2 When the G-action is free

Suppose that the $G$-action $\Phi$ on $M$ is free and proper with orbit map $\pi : M \to \overline{M} = M/G : m \mapsto \overline{m}$. Let $V$ be a smooth $G$-invariant vector field on $M$ with flow $\varphi_t$. Since $\Phi$ is free and proper, the $G$-orbit space $\overline{M}$ is a smooth manifold and the vector field $V$ induces a smooth vector field $\overline{V}$ on $\overline{M}$ with flow $\overline{\varphi}_t$. Let $P$ be the set of points $\overline{m} \in \overline{M}$ such that $\mathbb{R} \to \overline{M} : t \mapsto \overline{\varphi}_t(\overline{m})$ is a nonconstant periodic integral curve of $\overline{V}$, which starts at $\overline{m}$ and has minimal positive period $\tau(\overline{m})$. Moreover, suppose that the map $\tau : P \to \mathbb{R}_{>0}$ is continuous.⁶ Then $P$ is a smooth submanifold of $\overline{M}$. So $\pi^{-1}(P)$ is a locally closed smooth submanifold of $M$, which is $G$-invariant. Therefore $\pi|\pi^{-1}(P) : \pi^{-1}(P) \to P$ is a principal $G$-bundle. Because $\pi^{-1}(P)$ is the set of all relative periodic orbits of $V$, it is invariant under the flow $\varphi_t$ of $V$. In addition, the vector field $V|\pi^{-1}(P)$ is complete. By definition $P$ is invariant under the reduced flow $\overline{\varphi}_t$ of the reduced vector field $\overline{V}$ and $\overline{V}$ does not have any zeroes on $P$.

Lemma 4.3.6.2.54. The integral curves of $\overline{V}|P$ define the fibers of a smooth principal $\mathbb{R}/\mathbb{Z}$-bundle $\nu : P \to \overline{P}$. Here $\overline{P}$ is the orbit space of the $\mathbb{R}/\mathbb{Z}$-action generated by the flow of $\overline{V}|P$. Moreover, there is a smooth function $\mu : \overline{P} \to \mathbb{R}_{>0}$ such that $\tau = \mu \circ \nu$.

Proof. Let $p \in P$. Since $\overline{V}(p) \neq 0$, there is a smooth codimension 1 submanifold $S$ of $P$ containing $p$ such that $T_p P = T_p S \oplus \mathrm{span}\{\overline{V}(p)\}$. By the implicit function theorem there is an open neighborhood $W$ of $p$ in $S$ and an $\varepsilon > 0$ such that
$$\overline{\varphi} : (-\varepsilon, \varepsilon) \times W \to U \subseteq P : (t, s) \mapsto \overline{\varphi}_t(s)$$
is a diffeomorphism onto $U$, which is an open neighborhood of $p$ in $P$. For any $u \in U$ we write $\overline{\varphi}^{-1}(u) = (t(u), s(u))$, where $t : U \to (-\varepsilon, \varepsilon) : u \mapsto t(u)$ and $s : U \to W : u \mapsto s(u)$ are smooth maps.

⁶ Note that if $\overline{m} \in P$, then $\overline{\varphi}_u(\overline{m}) \in P$, because $t \mapsto \overline{\varphi}_t(\overline{\varphi}_u(\overline{m}))$ is a periodic integral curve of $\overline{V}$ of minimal positive period $\tau(\overline{m})$.

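Because $\overline{\varphi}_{\tau(p)}(p) = p$ for every $p \in U$, the set $W_1 = \overline{\varphi}_{\tau(p)}^{-1}(U)$ is an open neighborhood of $p$ in $W$. For $\widetilde{s} \in W_1$, we have $\widetilde{u} = \overline{\varphi}_{\tau(p)}(\widetilde{s}) \in U$ and $s(\widetilde{u}) = \overline{\varphi}_{\mu(\widetilde{s})}(\widetilde{s}) \in W$, where $\mu(\widetilde{s}) = \tau(p) - t(\widetilde{u})$ is a smooth function of $\widetilde{s}$ with $\mu(p) = \tau(p)$. From $\overline{\varphi}_{\mu(\widetilde{s})}(\widetilde{s}) \in W$ and $\overline{\varphi}_{\tau(\widetilde{s})}(\widetilde{s}) = \widetilde{s}$, it follows that $\overline{\varphi}_{\mu(s) - \tau(s)}(s) \in W$ for every $s \in W_1$. Therefore $\mu(s) = \tau(s)$ if $|\mu(s) - \tau(s)| < \varepsilon$. Because $\mu - \tau$ is a continuous function on $W_1$, which vanishes at $p$, we conclude that $\tau$ is equal to the smooth function $\mu$ on some open neighborhood $W_2$ of $p$ in $W_1$. The mapping $\Psi : \mathbb{R}/\mathbb{Z} \times W_2 \to P : (t, s) \mapsto \overline{\varphi}_{\tau(s) t}(s)$ is well defined, smooth, injective, and has bijective tangent map at every point in its domain. Therefore by the inverse function theorem, $\Psi$ is a diffeomorphism onto an open neighborhood $U$ of $p$ in $P$. Moreover, for each $s \in W_2$, we see that $\Psi$ maps the circle $(\mathbb{R}/\mathbb{Z}) \times \{s\}$ onto the $\overline{V}|P$ orbit through $\Psi(0, s)$ in $U$. Because $\tau(\Psi(t, s)) = \tau(s)$ for all $(t, s) \in \mathbb{R}/\mathbb{Z} \times W_2$, we find that $\tau$ is a smooth function on $U$, which is constant on $\overline{V}|P$-orbits in $U$. Since every $p \in P$ has an open neighborhood $U$, which is invariant under the flow $\overline{\varphi}_t$ and on which $\tau$ is a smooth function, which is constant on $\overline{V}|P$-orbits, it follows that $\tau : P \to \mathbb{R}_{>0}$ is a smooth function, which is constant on $\overline{V}|P$-orbits. The mapping
$$\widetilde{\Psi} : \mathbb{R}/\mathbb{Z} \times P \to P : (t, p) \mapsto \overline{\varphi}_{\tau(p) t}(p)$$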

defines a proper free $\mathbb{R}/\mathbb{Z}$-action on $P$, whose orbits coincide with the orbits of $\overline{V}|P$.

Example 4.3.6.2.55. Let
$$R = \begin{pmatrix} \cos 2\pi\frac{p}{q} & -\sin 2\pi\frac{p}{q} \\ \sin 2\pi\frac{p}{q} & \cos 2\pi\frac{p}{q} \end{pmatrix},$$
where $q \in \mathbb{Z}_{\ge 2}$, $p \in \mathbb{Z}_{\ge 0}$, $0 < p < q$, and $p$ and $q$ have no common divisor greater than 1. On $\mathbb{R}^2 \times \mathbb{R}$ define a $\mathbb{Z}$-action by $\phi : \mathbb{Z} \times (\mathbb{R}^2 \times \mathbb{R}) \to \mathbb{R}^2 \times \mathbb{R} : (n, (x, t)) \mapsto (R^n x, t + n)$. Let $P$ be the $\mathbb{Z}$-orbit space $(\mathbb{R}^2 \times \mathbb{R})/\mathbb{Z}$ with orbit map $\pi : \mathbb{R}^2 \times \mathbb{R} \to P$. Then $\pi$ intertwines the vector field $V = \frac{\partial}{\partial t}$ with a real analytic vector field $\overline{V}$ on $P$, whose integral curve through $\pi(x(0), t(0))$ is $t \mapsto \pi(x(0), t(0) + t)$. When $x(0) \neq 0$, the integral curve of $\overline{V}$ on $P$ is periodic with minimal positive period $q$, because $R^n x(0) = x(0)$ only when $q$ divides $n$; whereas when $x(0) = 0$, the integral curve of $\overline{V}$ has minimal positive period 1. Thus the minimal positive period function $\tau : P \to \mathbb{R}_{>0}$ of $\overline{V}$ is discontinuous. To avoid this subperiodic behavior, we have made the hypothesis that the minimal positive period function is continuous on the set of relative periodic orbits.

Lemma 4.3.6.2.56. Let $s(m)$ be the shift element corresponding to the relative periodic orbit $t \mapsto \varphi_t(m)$ of relative period $\tau(\overline{m})$ through $m \in \pi^{-1}(P)$. Then the mapping $\pi^{-1}(P) \to G : m \mapsto s(m)$ is smooth.

Proof. Write $N = \pi^{-1}(P)$. The mapping $\Psi : G \times N \to N \times N : (g, m) \mapsto (m, g \cdot m)$ is an injective immersion, because the $G$-action is free and proper. Therefore $Q = \Psi(G \times N)$ is a smooth closed submanifold of $N \times N$, on which $\Psi$ has a smooth inverse $\Psi^{-1} : Q \to G \times N$. The map $\psi : N \to N \times N : m \mapsto (m, \varphi_{\tau(\overline{m})}(m))$ is smooth and its image is contained in $Q$. Therefore $\mu = \pi_1 \circ \Psi^{-1} \circ \psi$ is a smooth map from $N$ to $G$, where $\pi_1 : G \times N \to G$ is the projection map on the first factor. Moreover, $\mu$ is the assignment $m \mapsto s(m)$ as desired.

Let $P^{rse} = \{p \in P \mid s(m) \in G^{rse} \text{ for some } m \in \pi^{-1}(\{p\})\}$. Since $G^{rse}$ is invariant under conjugation by elements of $G$ and $s(g \cdot m) = \mathrm{Ad}_g s(m)$, it follows that $p \in P^{rse}$ if and only if $s(m) \in G^{rse}$ for every $m \in \pi^{-1}(\{p\})$. So $P^{rse}$ is an open subset of $P$. Let $\overline{P}^{rse} = \nu(P^{rse})$, where $\nu : P \to \overline{P}$ is the orbit map of the smooth fibration of $P$ into periodic orbits corresponding to relative periodic orbits of $V$, see lemma 4.3.6.2.54. Then $\overline{P}^{rse}$ is an open subset of $\overline{P}$. Because $P^{rse}$ is invariant under the reduced flow of $\overline{V}|P$, it follows that $P^{rse} = \nu^{-1}(\overline{P}^{rse})$.
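The period jump in example 4.3.6.2.55 can be checked with a few lines of code. This is a minimal sketch under our own modelling assumptions, with hypothetical helper names `rotate` and `minimal_period`. It uses the fact that the curve $t \mapsto \pi(x(0), t(0) + t)$ closes up at time $T$ exactly when $T = n$ is an integer with $R^n x(0) = x(0)$, so the minimal period is the smallest such positive $n$. Equality of points is tested with a floating point tolerance.

```python
import math


def rotate(x, p, q, n):
    """Apply R^n to x, where R is the rotation by the angle 2*pi*p/q."""
    angle = 2.0 * math.pi * p * n / q
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1])


def minimal_period(x, p, q, tol=1e-9, n_max=1000):
    """Smallest positive integer n with R^n x = x (up to tol), or None."""
    for n in range(1, n_max + 1):
        y = rotate(x, p, q, n)
        if math.hypot(y[0] - x[0], y[1] - x[1]) < tol:
            return n
    return None
```

For instance, with $p/q = 1/5$ the call `minimal_period((1.0, 0.0), 1, 5)` probes a point off the axis, and `minimal_period((0.0, 0.0), 1, 5)` probes the central orbit.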

Proposition 4.3.6.2.57. Let $\chi = \nu \circ \pi : M \to \overline{P}^{rse}$. For $m_0 \in M$ let $p_0 = \chi(m_0) \in \overline{P}^{rse}$. Let $s_0 = s(m_0)$ and set $T = G_{s_0}^0$. Then there is an open neighborhood $U$ of $p_0 \in \overline{P}^{rse}$ and a smooth section $\sigma : U \subseteq \overline{P}^{rse} \to \chi^{-1}(U) \subseteq M$ of the fibration $\chi|\chi^{-1}(U)$ such that for every $p \in U$ the identity component of the centralizer in $G$ of $s(\sigma(p))$ is the torus $T$ and $s(\sigma(p))T = s_0 T$.

Proof. Because $\chi|\chi^{-1}(\overline{P}^{rse})$ is a smooth fibration, there is an open neighborhood $U_1$ of $p_0$ in $\overline{P}^{rse}$ and a smooth section $\sigma_1 : U_1 \to \chi^{-1}(U_1)$ of $\chi|\chi^{-1}(U_1)$. From lemma 4.3.6.1.51 there is an open neighborhood $\widetilde{V}$ of $s_0$ in $G$ and a real analytic mapping $\theta : \widetilde{V} \to G$ such that if $s' \in \widetilde{V}$ and $s'' = \theta(s')\, s'\, \theta(s')^{-1}$, then $G_{s''}^0 = G_{s_0}^0 = T$ and $s'' T = s_0 T$. If we write $m = \sigma_1(p)$, $u = s(\sigma_1(p))$, $g = \theta(u)$, and set $\sigma(p) = \theta(s(\sigma_1(p))) \cdot \sigma_1(p)$, then $G_{s(\sigma(p))}^0 = T$ and $s(\sigma(p))T = s_0 T$ for every $p \in U = \{p' \in U_1 \mid s(\sigma_1(p')) \in \widetilde{V}\}$, because $s(g \cdot m) = \mathrm{Ad}_g s(m)$. Now $U$ is an open neighborhood of $p_0$ in $\overline{P}^{rse}$ and $\sigma : U \to \chi^{-1}(U)$ is a smooth section of $\chi|\chi^{-1}(U)$.

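The map
\[ G \times \mathbb{R} \times U \to \chi^{-1}(U) : (g, t, p) \mapsto g \cdot \varphi_t(\sigma(p)) \tag{44} \]
is a surjective local diffeomorphism onto $\chi^{-1}(U) = \pi^{-1}(\nu^{-1}(U))$. If
\[ g \cdot \varphi_t(\sigma(p)) = g' \cdot \varphi_{t'}(\sigma(p')), \tag{45} \]
then applying the bundle projection map $\chi|\chi^{-1}(\overline{P}^{rse})$ to both sides of (45) gives $p = p'$. Then (45) is equivalent to $(g')^{-1} g \cdot \sigma(p) = \varphi_{t'-t}(\sigma(p))$, which in turn is equivalent to $t' - t = n\,\tau(\pi(\sigma(p)))$ and $(g')^{-1} g = s(\sigma(p))^n$ for some $n \in \mathbb{Z}$. Using lemma 4.3.6.2.54 and the fact that $\sigma$ is a section of the bundle $\chi|\chi^{-1}(U)$, we get $\tau(\pi(\sigma(p))) = \mu(\nu(\pi(\sigma(p)))) = \mu(\chi(\sigma(p))) = \mu(p)$. Therefore the map (44) induces a diffeomorphism $\psi$ from $(G \times \mathbb{R} \times U)/\mathbb{Z}$ onto $\chi^{-1}(U)$, where $(G \times \mathbb{R} \times U)/\mathbb{Z}$ is the orbit space of the $\mathbb{Z}$-action
\[ \circ : \mathbb{Z} \times (G \times \mathbb{R} \times U) \to G \times \mathbb{R} \times U : (n, (g, t, p)) \mapsto \big( g\, s(\sigma(p))^{-n},\ t + n\,\mu(p),\ p \big). \tag{46} \]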

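Using the notation of proposition 4.3.6.2.57, let $\{\lambda_j\}_{j=1}^k$ be a $\mathbb{Z}$-basis for the integer lattice $\Lambda$ of $\mathfrak{t}$, the Lie algebra of $T$. In view of lemma 4.3.6.1.50 we have $s(\sigma(p))^{n_1} \in T$ for every $p \in U$. Because $\exp : \mathfrak{t} \to T$ is a local diffeomorphism, shrinking $U$ if necessary, we obtain a smooth map $\xi : U \to \mathfrak{t}$ by requiring $s(\sigma(p))^{n_1} = \exp \xi(p)$. Define an $\mathbb{R}^{k+1}$-action on $G \times \mathbb{R} \times U$ by
\[ \bullet : \mathbb{R}^{k+1} \times (G \times \mathbb{R} \times U) \to G \times \mathbb{R} \times U : (\theta, (g, t, p)) \mapsto \Big( g \cdot \exp\Big( \sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\, \xi(p) \Big),\ t + \theta_{k+1} n_1 \mu(p),\ p \Big). \tag{47} \]
If $\theta \in \mathbb{Z}^{k+1}$, then from $s(\sigma(p))^{n_1} = \exp \xi(p)$ and (47) it follows that
\[ \theta \bullet (g, t, p) = \big( g \cdot s(\sigma(p))^{-\theta_{k+1} n_1},\ t + \theta_{k+1} n_1 \mu(p),\ p \big) = \theta_{k+1} n_1 \circ (g, t, p), \]
where the last equality follows because $\theta_{k+1} n_1 \in \mathbb{Z}$.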

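Consequently, we have an induced $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $(G \times \mathbb{R} \times U)/\mathbb{Z}$, which commutes with the $G$-action on $(G \times \mathbb{R} \times U)/\mathbb{Z}$ induced by the $G$-action
\[ G \times (G \times \mathbb{R} \times U) \to G \times \mathbb{R} \times U : (g', (g, t, p)) \mapsto (g' g, t, p). \tag{48} \]
If $\theta \bullet (g, t, p) = \big( g \cdot s(\sigma(p))^{-n},\ t + n\,\mu(p),\ p \big)$ for some $n \in \mathbb{Z}$, then comparing with (47) gives
\[ s(\sigma(p))^{-n} = \exp\Big( \sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\, \xi(p) \Big) \in T, \tag{49} \]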

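which implies $n = q n_1$ for some $q \in \mathbb{Z}$. But then $q\, n_1 \mu(p) = \theta_{k+1} n_1 \mu(p)$, which implies $\theta_{k+1} = q$. Now (49) together with $\exp \xi(p) = s(\sigma(p))^{n_1}$ implies that $\sum_{j=1}^k \theta_j \lambda_j \in \Lambda$. Therefore $\theta_j \in \mathbb{Z}$ for every $1 \le j \le k$. Consequently, the induced $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $(G \times \mathbb{R} \times U)/\mathbb{Z}$ is free. This action is proper, because $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ is compact. The diffeomorphism $\psi : (G \times \mathbb{R} \times U)/\mathbb{Z} \to \chi^{-1}(U)$ intertwines the $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $(G \times \mathbb{R} \times U)/\mathbb{Z}$ with a uniquely defined $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $\chi^{-1}(U)$, which commutes with the $G$-action on $\chi^{-1}(U)$.

Let $\{\xi_j(p)\}_{j=1}^k$ be the components of $\xi(p)$ with respect to the $\mathbb{Z}$-basis $\{\lambda_j\}_{j=1}^k$ of the integer lattice $\Lambda$ of $\mathfrak{t}$. Then from
\[ \sum_{j=1}^k \theta_j \lambda_j - \theta_{k+1}\, \xi(p) = \sum_{j=1}^k \big( \theta_j - \theta_{k+1}\, \xi_j(p) \big) \lambda_j \]
it follows that on $\chi^{-1}(U)$ the vector field $V$ on $(G \times \mathbb{R}) \cdot m$ is equal to the element $\dot{\theta}$ of the Lie algebra $\mathbb{R}^{k+1}$ of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$, where
\[ \dot{\theta}_j = \begin{cases} \dot{\theta}_{k+1}\, \xi_j(p) & \text{when } 1 \le j \le k \\ 1/(n_1 \mu(p)) & \text{when } j = k+1. \end{cases} \tag{50} \]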

We summarize the above discussion in the following lengthy proposition.

Proposition 4.3.6.2.58. Let $C$ be a connected component of $\overline{P}^{rse}$, where $\nu : P^{rse} \to \overline{P}^{rse}$ is the principal $\mathbb{R}/\mathbb{Z}$-fibration of the set of relative periodic orbits in $P^{rse} \subseteq \overline{M}$. Let $\chi : \pi^{-1}(P^{rse}) \to \overline{P}^{rse}$ be $(\nu \circ \pi)|\pi^{-1}(P^{rse})$. For each $m \in \chi^{-1}(C)$ let $T(m) = G^0_{s(m)}$, where $s(m)$ is the shift element corresponding to the integral curve $t \mapsto \varphi_t(m)$ of the $G$-invariant vector field $V$ starting at $m$. The disjoint union over all $m \in \chi^{-1}(C)$ of the $V$-flow out of $T(m) \cdot m$, that is, $\varphi_t(T(m) \cdot m)$, defines a $G$-invariant smooth fibration of $\chi^{-1}(C)$, whose fiber is diffeomorphic to a torus of dimension $\dim T + 1$.

Let $m_0 \in M$ such that $p_0 = \chi(m_0) \in C$. Let $\mathfrak{t}$ be the Lie algebra of $T = T(m_0)$. Then there is an open neighborhood $U$ of $p_0$ in $C$, a smooth section $\sigma : U \to \chi^{-1}(U)$, and a smooth map $\xi : U \to \mathfrak{t}$ such that $G^0_{s(\sigma(p))} = T$ and $s(\sigma(p))^{n_1} = \exp \xi(p)$ for every $p \in U$. Here, for any $m \in \chi^{-1}(C)$, we let $n_1$ be the smallest positive integer such that $s(m)^{n_1} \in T$. Let $\{\lambda_j\}_{j=1}^k$ be a $\mathbb{Z}$-basis of the integer lattice $\Lambda$ of $\mathfrak{t}$ and let $\{\xi_j(p)\}_{j=1}^k$ be the components of $\xi(p)$ with respect to this basis. Then the $\mathbb{R}^{k+1}$-action (47) on $G \times \mathbb{R} \times U$ induces an $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $(G \times \mathbb{R} \times U)/\mathbb{Z}$, which is intertwined by the map $G \times \mathbb{R} \times U \to \chi^{-1}(U) : (g, t, p) \mapsto g \cdot \varphi_t(\sigma(p))$ with a $G$-invariant proper free action of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ on $\chi^{-1}(U)$, whose orbits are the fibers of the toral fibration given in the first paragraph above. In addition, the vector field $V$ on $\chi^{-1}(U)$ is $\dot{\theta}$, where $\dot{\theta}$ lies in the Lie algebra $\mathbb{R}^{k+1}$ of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ and is given by (50).

As in proposition 4.2.6.2.36, the Weyl group $W(T)$ of $T$ measures the obstruction to $\chi^{-1}(C)$ being a $G$-invariant principal $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-bundle.

4.3.6.3 When the G-action is not free

Suppose that the $G$-action on $M$ is proper, but is not necessarily free. Let $m \in M$ and let $H = G_m$. Let $M_H$ be the smooth submanifold of $M$ of points of isotropy $H$ and let $\overline{M}_H$ be its image in the $G$-orbit space $\overline{M}$ under the $G$-orbit map $\pi : M \to \overline{M} : m \mapsto \overline{m}$. Let $P_{M_H}$ be the smooth submanifold of points $\overline{m} \in \overline{M}_H$ which are starting points of nonconstant periodic orbits of minimal positive period $\tau(\overline{m})$ of the reduced flow of the $G_{M_H} = N(H)/H$-invariant vector field $V|M_H$. We also assume that $\tau : P_{M_H} \to \mathbb{R}_{>0}$ is continuous, and thus is smooth. Let $\nu_{M_H} : P_{M_H} \to \overline{P}_{M_H}$ be the principal $\mathbb{R}/\mathbb{Z}$-bundle whose fiber is a periodic orbit of the reduced flow of $V|M_H$. Analogous to the definition of $\overline{P}^{rse}$ in §3.6.2, we may define $\overline{P}^{rse}_{M_H}$. Let $C^{M_H}$ be a connected component of $\overline{P}^{rse}_{M_H}$ and let $\chi_{M_H} : \pi(C^{M_H}) \to \overline{P}^{rse}_{M_H}$ be $\nu_{M_H} \circ \pi_{M_H}$, where $\pi_{M_H} : M_H \to \overline{M}_H$ is the $G_{M_H}$-orbit map. Using the fact that $G_{M_H}$ acts freely and properly on $M_H$ and the results of §3.6.2 with $G$ replaced by $G_{M_H}$ and $M$ replaced by $M_H$, we obtain the following analogue of proposition 4.3.6.2.58.

Proposition 4.3.6.3.59. For each $m \in \chi_{M_H}^{-1}(C^{M_H})$ let $T(m) = (G_{M_H})^0_{s(m)}$, where $s(m)$ is the shift element corresponding to the integral curve $t \mapsto \varphi_t(m)$ of the $G_{M_H}$-invariant vector field $V|M_H$ starting at $m$. Then the disjoint union over all $m \in \chi_{M_H}^{-1}(C^{M_H})$ of the $V$-flow out of $T(m) \cdot m$, that is, $\varphi_t(T(m) \cdot m)$, defines a $G_{M_H}$-invariant smooth fibration of $\chi_{M_H}^{-1}(C^{M_H})$, whose fiber is diffeomorphic to a torus of dimension $\dim T(m) + 1$.

Let $m_0 \in M_H$ such that $p_0 = \chi_{M_H}(m_0) \in C^{M_H}$. Let $\mathfrak{t}$ be the Lie algebra of $T = T(m_0)$. Then there is an open neighborhood $U$ of $p_0$ in $C^{M_H}$, a smooth section $\sigma : U \to \chi_{M_H}^{-1}(U)$, and a smooth map $\widetilde{\xi} : U \to \mathfrak{t}$ such that $(G_{M_H})^0_{s(\sigma(p))} = T$ and $s(\sigma(p))^{n_1} = \exp \widetilde{\xi}(p)$ for every $p \in U$. Here $n_1$ is the smallest positive integer such that $s(m)^{n_1} \in T(m)$ for every $m \in \chi_{M_H}^{-1}(C^{M_H})$. Let $\{\lambda_j\}_{j=1}^k$ be a $\mathbb{Z}$-basis of the integer lattice $\Lambda$ of $\mathfrak{t}$ and let $\{\widetilde{\xi}_j(p)\}_{j=1}^k$ be the components of $\widetilde{\xi}(p)$ with respect to this basis. Then the $\mathbb{R}^{k+1}$-action (47) on $G_{M_H} \times \mathbb{R} \times U$ induces an $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-action on $G_{M_H} \times \mathbb{R} \times U$, which is intertwined by the map
\[ G_{M_H} \times \mathbb{R} \times U \to \chi_{M_H}^{-1}(U) : (g, t, p) \mapsto g \cdot \varphi_t(\sigma(p)) \]
with a $G_{M_H}$-invariant proper free action of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ on $\chi_{M_H}^{-1}(U)$, whose orbits are the fibers of the toral fibration given in the first paragraph above. In addition, the vector field $V$ on $G \times \mathbb{R} \times U$ is $\dot{\theta}$, where $\dot{\theta}$ lies in the Lie algebra $\mathbb{R}^{k+1}$ of $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$ and is given by (50) with $\xi$ replaced by $\widetilde{\xi}$.

As in proposition 4.2.6.2.36, the Weyl group $W(T)$ of $T$ measures the obstruction to $\chi_{M_H}^{-1}(C^{M_H})$ being a $G_{M_H}$-invariant principal $\mathbb{R}^{k+1}/\mathbb{Z}^{k+1}$-bundle.

4.4 Notes

The reconstruction procedure given in the text follows Marsden [75], see §7.2. Most of the results on relative equilibria and relative periodic orbits can be found in the literature, especially Field [44] and Krupa [64]. Our discussion is based on that of Duistermaat [41]. The denseness of a straight line on an affine torus with frequency vector having components which are linearly independent over the rational numbers is due to Kronecker [63]. When the Lie group G is compact, the statement of proposition 4.2.2.6 about the minimal number of frequencies of a quasiperiodic relative equilibrium generated by an elliptic element of the Lie algebra of G is due to proposition B1 of Field [44]. A proof of Sard’s theorem may be found in Sard [97]. Proposition 4.2.4.11 is a consequence of proposition B1 of Field [44]. When the Lie group G is compact, corollary 4.3.4.48 follows from


proposition B2 of Field [44]. For a free action of a connected compact Lie group proposition 4.3.6.2.58 was proved by Hermans [54], who applied it to the nonholonomically constrained system of a ball rolling in an axially symmetric bowl.

Chapter 5

Carathéodory's sleigh

In this chapter we will discuss the classical nonholonomically constrained system known as Carathéodory's sleigh. In order to illustrate the theory given in chapters 1, 2, and 4, we will derive the equations of motion in five different ways, construct the reduced system in three different ways, and carry out reconstruction explicitly.

5.1 Basic set up

Carathéodory's sleigh is a planar rigid body (where the plane contains the vertical axis) with a sharp strongly convex chisel edge in a vertical plane, which makes contact with a horizontal plane at its lowest point $P$. The sleigh is kept on the horizontal plane by a constant vertical gravitational force. It moves so that the instantaneous direction is in the vertical plane of the chisel edge, which does not slip sideways. In addition, we assume that the center of mass lies in the plane of the sleigh and is kept at a fixed height above the horizontal plane. This latter condition can be arranged by attaching to the sleigh two massless outriggers at an angle out of the plane of the sleigh, each having a single point of contact with the horizontal plane, which glide without friction. Since no forces act on the point of contact of either outrigger, the outriggers have no effect on the motion of the sleigh.

5.1.1 Configuration space

Suppose that in the reference position in $\mathbb{R}^3$ the sleigh is in the vertical $e_1$–$e_3$ plane with its center of mass at the origin $O$ and that its point of contact $P$ with the horizontal $e_1$–$e_2$ plane has coordinates $(-\ell, 0, 0)$.

[Fig. 5.1: Carathéodory's sleigh (without the outriggers). $\ell$ is the distance from the center of mass CM along the skate to a point directly over the point of contact $P$ of the chisel edge with the horizontal plane.]

The position of the moving sleigh is obtained by applying a proper Euclidean motion $(A, a) : \mathbb{R}^3 \to \mathbb{R}^3 : \xi \mapsto A\xi + a$ to the reference sleigh. Here $A$ is a rotation of $\mathbb{R}^3$ about the vertical $e_3$-axis and $a = (x, 0) \in \mathbb{R}^2 \times \{0\} \subseteq \mathbb{R}^3$. Corresponding to each $A$ is a counterclockwise rotation
\[ R_\theta = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \]
of the horizontal plane $\mathbb{R}^2 \times \{0\} = \mathbb{R}^2$ through an angle $\theta \in \mathbb{R}/2\pi\mathbb{Z} = S^1$. The map
\[ Q = S^1 \times \mathbb{R}^2 \to E(2) : q = (\theta, x) \mapsto (R_\theta, x) \]
is a diffeomorphism, which identifies the configuration space $Q$ of the sleigh with the Euclidean group $E(2)$.

5.1.2 Kinetic energy

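Let $\mu$ be the mass distribution of the sleigh in its reference position. Suppose that $\mu$ is a finite positive measure whose support $S$ is the sleigh, which is a compact subset of $\mathbb{R}^3$. Then $m = \int_S \mu(d\xi)$ is the mass of the sleigh. In addition, since the center of mass of the reference sleigh is the origin, we have $\int_S \xi\, \mu(d\xi) = 0$. Now $A\xi + a$ is a point on the moving sleigh which has velocity $\dot{A}\xi + \dot{a}$ and kinetic energy
\[ T = \tfrac{1}{2} \int_S \langle \dot{A}\xi + \dot{a}, \dot{A}\xi + \dot{a} \rangle\, \mu(d\xi). \]
Here $\langle\ ,\ \rangle$ is the Euclidean inner product on $\mathbb{R}^3$. From $\int_S \xi\, \mu(d\xi) = 0$ it follows that
\[ \int_S \dot{A}\xi\, \mu(d\xi) = \dot{A} \int_S \xi\, \mu(d\xi) = 0. \]
Therefore $T = T_{rot} + T_{trans}$, where
\[ T_{trans} = \tfrac{1}{2} \int_S \langle \dot{a}, \dot{a} \rangle\, \mu(d\xi) = \tfrac{1}{2} (\dot{x}_1^2 + \dot{x}_2^2) \int_S \mu(d\xi) = \tfrac{1}{2} m (\dot{x}_1^2 + \dot{x}_2^2) \]
is the translational kinetic energy of the center of mass and
\[ T_{rot} = \tfrac{1}{2} \int_S \langle \dot{A}\xi, \dot{A}\xi \rangle\, \mu(d\xi) \]
is the rotational kinetic energy of the sleigh about its center of mass. Since $A$ is a rotation,
\[ T_{rot} = \tfrac{1}{2} \int_S \langle A^{-1}\dot{A}\xi, A^{-1}\dot{A}\xi \rangle\, \mu(d\xi). \]
From $A = \begin{pmatrix} R_\theta & 0 \\ 0 & 1 \end{pmatrix}$ we get
\[ A^{-1}\dot{A}\xi = \dot{\theta} \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix} \xi = \dot{\theta}\, (e_3 \times \xi). \]
Here $\times$ is the vector product on $\mathbb{R}^3$. Therefore $T_{rot} = \tfrac{1}{2} I \dot{\theta}^2$, where $I = \int_S \langle e_3 \times \xi, e_3 \times \xi \rangle\, \mu(d\xi)$ is the moment of inertia of the reference sleigh about the vertical $e_3$-axis. Consequently, the kinetic energy of the moving sleigh is
\[ T = \tfrac{1}{2} I \dot{\theta}^2 + \tfrac{1}{2} m (\dot{x}_1^2 + \dot{x}_2^2). \tag{1} \]

5.1.3 Nonholonomic constraint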

Because the sleigh is rigid and its chisel edge has the same instantaneous direction as the vertical plane of the sleigh, the point of contact of the edge follows a pursuit curve relative to the center of mass of the sleigh, that is,
\[ \ell \dot{\theta} = -\dot{x}_1 \sin\theta + \dot{x}_2 \cos\theta. \tag{2} \]
Another way to formulate (2) is to say that the velocity of a motion $\mathbb{R} \to Q : t \mapsto q(t) = (\theta(t), x_1(t), x_2(t))$ of the sleigh must lie in the kernel of the constraint 1-form
\[ \varphi(q) = \varphi(\theta, x_1, x_2) = \ell\, d\theta + \sin\theta\, dx_1 - \cos\theta\, dx_2 \tag{3} \]
on configuration space $Q$. Physically, $\varphi$ is the infinitesimal work done on the edge of the sleigh to keep it from slipping sideways. The constraint distribution $D$ on configuration space $Q$ is the kernel $\ker \varphi$ of the constraint 1-form $\varphi$ (3). Evaluating $d\varphi = \cos\theta\, d\theta \wedge dx_1 + \sin\theta\, d\theta \wedge dx_2$ on the vectors
\[ v_1 = \frac{\partial}{\partial\theta} - \ell \sin\theta \frac{\partial}{\partial x_1} + \ell \cos\theta \frac{\partial}{\partial x_2} \quad\text{and}\quad v_2 = \cos\theta \frac{\partial}{\partial x_1} + \sin\theta \frac{\partial}{\partial x_2}, \]
which span $\ker \varphi$, gives $d\varphi(v_1, v_2) = 1$. Therefore $\varphi$ is a contact 1-form on $Q$. Hence $\ker \varphi$ is a contact structure on $Q$. Consequently the accessible set of the constraint distribution $D$ on $Q$ is all of $Q$.^1

^1 According to corollary 1.10.46 of chapter 1, this implies that the accessible set of the distribution $H$ on $D$ is all of $D$.
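The contact computation above can be checked numerically. The following is a minimal sketch, assuming coordinates $(\theta, x_1, x_2)$ and tangent vectors stored as triples; the function names `phi`, `dphi` and `frame` are illustrative and do not come from the text.

```python
import math

def phi(theta, ell, v):
    """Constraint 1-form (3) evaluated on v = (v_theta, v_x1, v_x2)."""
    return ell * v[0] + math.sin(theta) * v[1] - math.cos(theta) * v[2]

def dphi(theta, v, w):
    """d(phi) = cos(theta) dtheta^dx1 + sin(theta) dtheta^dx2 evaluated on (v, w)."""
    return (math.cos(theta) * (v[0] * w[1] - w[0] * v[1])
            + math.sin(theta) * (v[0] * w[2] - w[0] * v[2]))

def frame(theta, ell):
    """The vectors v1, v2 of section 5.1.3 as coordinate triples."""
    v1 = (1.0, -ell * math.sin(theta), ell * math.cos(theta))
    v2 = (0.0, math.cos(theta), math.sin(theta))
    return v1, v2

if __name__ == "__main__":
    theta, ell = 0.4, 0.7  # arbitrary sample values (assumption)
    v1, v2 = frame(theta, ell)
    print(phi(theta, ell, v1), phi(theta, ell, v2), dphi(theta, v1, v2))
```

Both vectors should be annihilated by $\varphi$ up to rounding, and $d\varphi(v_1, v_2)$ should equal 1 for every $\theta$ and $\ell$.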

5.2 Equations of motion

We derive the equations of motion of Carathéodory's sleigh in five different ways: first, using the Lagrange-d'Alembert equations of §3.2 of chapter 1; second, using nonholonomic Dirac brackets of §7.2 of chapter 1; third, using the Lagrange-d'Alembert equations in a trivialization of §4.1 of chapter 1; fourth, using almost Poisson brackets of §7.2 of chapter 1 computed in a trivialization; and finally using the distributional Hamiltonian formulation of §6 in chapter 1.

5.2.1 Lagrange-d'Alembert equations

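From the Lagrange-d'Alembert principle of §3.2 of chapter 1 it follows that the equation of motion of the sleigh is
\[ \delta L = \lambda\, \varphi. \tag{4} \]
Here $\delta L$ is the Lagrange derivative of the Lagrangian
\[ L : TQ \to \mathbb{R} : (q, \dot{q}) = (\theta, x_1, x_2, \dot{\theta}, \dot{x}_1, \dot{x}_2) \mapsto T = \tfrac{1}{2} I \dot{\theta}^2 + \tfrac{1}{2} m (\dot{x}_1^2 + \dot{x}_2^2) \tag{5} \]
and $\varphi$ is the constraint 1-form (3). The unknown multiplier $\lambda$ in (4) is determined by the nonholonomic constraint
\[ \ell \dot{\theta} + \dot{x}_1 \sin\theta - \dot{x}_2 \cos\theta = 0. \tag{6} \]
Writing (4) out we obtain
\[ \frac{d}{dt}\frac{\partial L}{\partial \dot{\theta}}\, d\theta + \frac{d}{dt}\frac{\partial L}{\partial \dot{x}_1}\, dx_1 + \frac{d}{dt}\frac{\partial L}{\partial \dot{x}_2}\, dx_2 = \lambda \ell\, d\theta + \lambda \sin\theta\, dx_1 - \lambda \cos\theta\, dx_2. \]
Using the definition of $L$ (5), the above equation is equivalent to the system
\[ I \ddot{\theta} = \lambda \ell, \qquad m \ddot{x}_1 = \lambda \sin\theta, \qquad m \ddot{x}_2 = -\lambda \cos\theta. \tag{7} \]
To determine the multiplier $\lambda$, we multiply the first equation in (7) by $\ell$ and then differentiate (6). We get
\[ \ell^2 \lambda = \ell I \ddot{\theta} = -I \dot{\theta} \dot{x}_1 \cos\theta - I \ddot{x}_1 \sin\theta - I \dot{\theta} \dot{x}_2 \sin\theta + I \ddot{x}_2 \cos\theta. \]
Using the second and third equations in (7) we obtain
\[ \lambda = -\frac{mI}{I + m\ell^2}\, \dot{\theta}\, (\dot{x}_1 \cos\theta + \dot{x}_2 \sin\theta). \tag{8} \]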
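As a sanity check on (8), the multiplier can be substituted back into (7) and the time derivative of the constraint (6) evaluated. The sketch below is illustrative only: the function names and the sample state are assumptions, not part of the text.

```python
import math

def multiplier(theta, dtheta, dx1, dx2, m, I, ell):
    """Lagrange multiplier lambda from equation (8)."""
    c1 = dx1 * math.cos(theta) + dx2 * math.sin(theta)
    return -m * I / (I + m * ell**2) * dtheta * c1

def accelerations(theta, dtheta, dx1, dx2, m, I, ell):
    """Accelerations (ddtheta, ddx1, ddx2) from system (7) with lambda from (8)."""
    lam = multiplier(theta, dtheta, dx1, dx2, m, I, ell)
    return (lam * ell / I,
            lam * math.sin(theta) / m,
            -lam * math.cos(theta) / m)

def constraint_rate(theta, dtheta, dx1, dx2, acc, ell):
    """Time derivative of ell*dtheta + dx1*sin(theta) - dx2*cos(theta)."""
    ddtheta, ddx1, ddx2 = acc
    return (ell * ddtheta
            + ddx1 * math.sin(theta) + dx1 * dtheta * math.cos(theta)
            - ddx2 * math.cos(theta) + dx2 * dtheta * math.sin(theta))

if __name__ == "__main__":
    state = (0.8, 1.3, -0.4, 0.9)   # theta, dtheta, dx1, dx2 (arbitrary sample)
    params = (2.0, 1.5, 0.7)        # m, I, ell (arbitrary sample)
    acc = accelerations(*state, *params)
    print(constraint_rate(*state, acc, params[2]))
```

The printed rate should vanish up to rounding, which is exactly the condition used in the text to solve for $\lambda$.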

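Thus the equations of motion of Carathéodory's sleigh are
\[ \begin{aligned} \ddot{\theta} &= -\frac{m\ell}{I + m\ell^2}\, \dot{\theta}\, (\dot{x}_1 \cos\theta + \dot{x}_2 \sin\theta) \\ \ddot{x}_1 &= -\frac{I}{I + m\ell^2}\, \dot{\theta} \sin\theta\, (\dot{x}_1 \cos\theta + \dot{x}_2 \sin\theta) \\ \ddot{x}_2 &= \frac{I}{I + m\ell^2}\, \dot{\theta} \cos\theta\, (\dot{x}_1 \cos\theta + \dot{x}_2 \sin\theta). \end{aligned} \tag{9} \]

5.2.2 Nonholonomic Dirac brackets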

Next we derive the equations of motion of the sleigh using nonholonomic Dirac brackets from §7.2 of chapter 1. The Hamiltonian of the unconstrained sleigh on $T^*Q$ with coordinates $(q, p) = (\theta, x_1, x_2, p_\theta, p_{x_1}, p_{x_2})$ and bundle projection map $\tau : T^*Q \to Q : (q, p) \mapsto q$ is
\[ h = \frac{1}{2m} (p_{x_1}^2 + p_{x_2}^2) + \frac{1}{2I}\, p_\theta^2. \tag{10} \]

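This Hamiltonian is obtained from the Lagrangian $L$ (5) using the Legendre transformation
\[ (\dot{\theta}, \dot{x}_1, \dot{x}_2) \mapsto (p_\theta, p_{x_1}, p_{x_2}) = \Big( \frac{\partial L}{\partial \dot{\theta}}, \frac{\partial L}{\partial \dot{x}_1}, \frac{\partial L}{\partial \dot{x}_2} \Big) = (I\dot{\theta}, m\dot{x}_1, m\dot{x}_2). \]
The canonical symplectic form $\omega = d\theta \wedge dp_\theta + dx_1 \wedge dp_{x_1} + dx_2 \wedge dp_{x_2}$ on $T^*Q$ induces the bundle isomorphism $\omega^\sharp : T^*Q \to TQ$, where
\[ \begin{aligned} \omega^\sharp(d\theta) &= -\frac{\partial}{\partial p_\theta}, & \omega^\sharp(dp_\theta) &= \frac{\partial}{\partial\theta}, \\ \omega^\sharp(dx_1) &= -\frac{\partial}{\partial p_{x_1}}, & \omega^\sharp(dp_{x_1}) &= \frac{\partial}{\partial x_1}, \\ \omega^\sharp(dx_2) &= -\frac{\partial}{\partial p_{x_2}}, & \omega^\sharp(dp_{x_2}) &= \frac{\partial}{\partial x_2}. \end{aligned} \]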

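The nonholonomic constraint (6) gives rise to a distribution $D$ on $Q$ defined by the kernel of the constraint 1-form $\varphi$ (3). As a submanifold of $T^*Q$ the pushforward $(\omega^\sharp)^{-1}(D)$ of the constraint distribution $D$ is the 0-level set of the constraint function
\[ C = \frac{m\ell}{I}\, p_\theta + p_{x_1} \sin\theta - p_{x_2} \cos\theta. \tag{11} \]
To find the equations of motion of the sleigh we use equation (88) of §7.2 of chapter 1 for nonholonomic Dirac brackets. In our case this formula reads
\[ \{ f|D, g|D \} = \{f, g\}|D + \frac{1}{\{C, \varphi\}} \big( -\{f, C\}\{g, \varphi\} + \{f, \varphi\}\{g, C\} \big) \big| D \tag{12} \]
for every $f, g \in C^\infty(T^*Q)$, where
\[ \{C, \varphi\}(d) = \omega(d)\big( \omega^\sharp(d)(dC(d)),\ \omega^\sharp(d)(\varphi(d)) \big) \]
by definition. But
\[ \omega^\sharp(d)(dC(d)) = \frac{m\ell}{I} \frac{\partial}{\partial\theta} - (p_{x_1} \cos\theta + p_{x_2} \sin\theta) \frac{\partial}{\partial p_\theta} + \sin\theta \frac{\partial}{\partial x_1} - \cos\theta \frac{\partial}{\partial x_2} \]
and
\[ \omega^\sharp(d)(\varphi(d)) = -\ell \frac{\partial}{\partial p_\theta} - \sin\theta \frac{\partial}{\partial p_{x_1}} + \cos\theta \frac{\partial}{\partial p_{x_2}}. \]
So $\{C, \varphi\}(d) = -\frac{I + m\ell^2}{I}$. Therefore (12) becomes
\[ \{ f|D, g|D \} = \{f, g\}|D - \frac{I}{I + m\ell^2} \big( -\{f, C\}\{g, \varphi\} + \{f, \varphi\}\{g, C\} \big) \big| D. \tag{13} \]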

This allows us to compute the almost Poisson bracket of two smooth functions on the constraint distribution $D$. A straightforward calculation using (13) gives the following nonholonomic almost Poisson brackets
\[ \begin{aligned} \{p_{x_1}, p_\theta\} &= -\tfrac{I}{I + m\ell^2} \sin\theta\, (p_{x_1} \cos\theta + p_{x_2} \sin\theta), & \{p_{x_2}, p_\theta\} &= \tfrac{I}{I + m\ell^2} \cos\theta\, (p_{x_1} \cos\theta + p_{x_2} \sin\theta), \\ \{x_1, p_{x_1}\} &= 1 - \tfrac{I}{I + m\ell^2} \sin^2\theta, & \{x_2, p_{x_1}\} &= \tfrac{I}{I + m\ell^2} \sin\theta \cos\theta, \\ \{x_1, p_{x_2}\} &= \tfrac{I}{I + m\ell^2} \sin\theta \cos\theta, & \{x_2, p_{x_2}\} &= 1 - \tfrac{I}{I + m\ell^2} \cos^2\theta, \\ \{x_1, p_\theta\} &= -\tfrac{\ell I}{I + m\ell^2} \sin\theta, & \{x_2, p_\theta\} &= \tfrac{\ell I}{I + m\ell^2} \cos\theta, \\ \{\theta, p_{x_1}\} &= -\tfrac{m\ell}{I + m\ell^2} \sin\theta, & \{\theta, p_{x_2}\} &= \tfrac{m\ell}{I + m\ell^2} \cos\theta, \\ \{\theta, p_\theta\} &= \tfrac{I}{I + m\ell^2}, \end{aligned} \]
where all the coordinate functions $(\theta, x_1, x_2, p_\theta, p_{x_1}, p_{x_2})$ on $T^*Q$ are considered to be functions on $D$. All other nonholonomic Dirac brackets between the restriction of the coordinate functions to $D$ vanish identically. Thus we have determined the structure matrix of the almost Poisson structure $\{\ ,\ \}$ on $C^\infty(D)$.

Using the Hamiltonian $h$ (10) for Carathéodory's sleigh and the above bracket relations, we find that its motions are the integral curves of the nonholonomic Hamiltonian vector field $Y_h$ on $D$, which satisfy
\[ \begin{aligned} \dot{\theta} &= \{\theta, h\} = \tfrac{1}{I}\, p_\theta \\ \dot{x}_1 &= \{x_1, h\} = \tfrac{1}{m}\, p_{x_1} \\ \dot{x}_2 &= \{x_2, h\} = \tfrac{1}{m}\, p_{x_2} \\ \dot{p}_\theta &= -\frac{I}{m(I + m\ell^2)} \big( -p_{x_1} \sin\theta + p_{x_2} \cos\theta \big) \big( p_{x_1} \cos\theta + p_{x_2} \sin\theta \big) \\ \dot{p}_{x_1} &= -\frac{I \sin\theta}{m\ell(I + m\ell^2)} \big( -p_{x_1} \sin\theta + p_{x_2} \cos\theta \big) \big( p_{x_1} \cos\theta + p_{x_2} \sin\theta \big) \\ \dot{p}_{x_2} &= \frac{I \cos\theta}{m\ell(I + m\ell^2)} \big( -p_{x_1} \sin\theta + p_{x_2} \cos\theta \big) \big( p_{x_1} \cos\theta + p_{x_2} \sin\theta \big). \end{aligned} \tag{14} \]
Because $p_\theta = \frac{I}{m\ell} \big( -p_{x_1} \sin\theta + p_{x_2} \cos\theta \big)$ on $D$, the fourth equation in (14) is automatically satisfied. Note that the first order equations of motion (14) give rise to the second order equations (9) obtained from the Lagrange-d'Alembert principle.

5.2.3 Lagrange-d'Alembert in a trivialization

In this subsection we calculate the equations of motion of Carathéodory's sleigh using a trivialization of the constraint distribution. Let
\[ \lambda_D : Q \times \mathbb{R}^2 \to D \subseteq TQ : (q, c) = ((\theta, x), c) \mapsto c^Q(\theta, x) = c_2 \frac{\partial}{\partial\theta} + (c_1 \cos\theta - \ell c_2 \sin\theta) \frac{\partial}{\partial x_1} + (c_1 \sin\theta + \ell c_2 \cos\theta) \frac{\partial}{\partial x_2}. \tag{15} \]
Since $c^Q(\theta, x) \,\lrcorner\, \varphi = 0$, the image of $\lambda_D$ is contained in $\ker \varphi = D$. But holding $q$ fixed, the map $\lambda_D$ becomes $(\lambda_D)_q : \mathbb{R}^2 \to D_q$, which is a linear isomorphism. So $\lambda_D$ is the inverse of a trivialization of the constraint distribution $D$. If we interpret tangent vectors to $Q$ as velocity vectors, then (15) amounts to
\[ \dot{\theta} = c_2, \qquad \dot{x}_1 = c_1 \cos\theta - \ell c_2 \sin\theta, \qquad \dot{x}_2 = c_1 \sin\theta + \ell c_2 \cos\theta. \tag{16} \]
Solving the last two equations in (16) for $c_1$ and $\ell c_2$ gives
\[ c_1 = \dot{x}_1 \cos\theta + \dot{x}_2 \sin\theta, \qquad \ell c_2 = -\dot{x}_1 \sin\theta + \dot{x}_2 \cos\theta = \ell \dot{\theta}. \tag{17} \]
The last equality in the second equation of (17) follows from the defining equation (6) of $D$. Therefore the Lagrangian $L$ (5) in the trivialization $\lambda_D^{-1}$ is the smooth function $\mathcal{L} : Q \times \mathbb{R}^2 \to \mathbb{R}$, where
\[ \mathcal{L}(q, c) = L \circ \lambda_D(q, c) = \tfrac{1}{2} m c_1^2 + \tfrac{1}{2} (I + m\ell^2) c_2^2. \tag{18} \]

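Given a smooth curve $\gamma : \mathbb{R} \to Q$ let $c : \mathbb{R} \to \mathbb{R}^2 : t \mapsto c(t)$ be the smooth curve defined by the condition $\lambda_D(\gamma(t), c(t)) = \dot{\gamma}(t)$. From proposition 1.4.6 of chapter 1, it follows that the Lagrange derivative of $\mathcal{L}$ along the curve $\mathbb{R} \to Q \times \mathbb{R}^2 : t \mapsto (\gamma(t), c(t))$ is
\[ \langle \delta\mathcal{L}(\gamma(t), c(t)) \mid a \rangle = \Big\langle \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial c}(\gamma(t), c) \Big|_{c = c(t)} \,\Big|\, a \Big\rangle - \Big\langle \frac{\partial \mathcal{L}}{\partial q}(q, c(t)) \Big|_{q = \gamma(t)} \,\Big|\, a^Q(\gamma(t)) \Big\rangle - \Big\langle \frac{\partial (\mathcal{L} \circ \lambda_D^{-1})}{\partial \dot{q}}(\gamma(t), \dot{q}) \Big|_{\dot{q} = \dot{\gamma}(t)} \,\Big|\, [c^Q(t), a^Q](\gamma(t)) \Big\rangle, \tag{19} \]
for every $a \in \mathbb{R}^2$. But
\[ \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial c} = \Big( \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial c_1}, \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial c_2} \Big) = \big( m\dot{c}_1, (I + m\ell^2)\dot{c}_2 \big), \qquad \frac{\partial (\mathcal{L} \circ \lambda_D^{-1})}{\partial \dot{q}} = \Big( \frac{\partial L}{\partial \dot{x}_1}, \frac{\partial L}{\partial \dot{x}_2} \Big) = (m\dot{x}_1, m\dot{x}_2), \qquad \frac{\partial \mathcal{L}}{\partial q} = 0, \]
and
\[ [a^Q, b^Q] = (a_1 b_2 - a_2 b_1) \Big( \sin\theta \frac{\partial}{\partial x_1} - \cos\theta \frac{\partial}{\partial x_2} \Big) \tag{20} \]
for any $a, b \in \mathbb{R}^2$, see (25) below. Therefore for every $a = (a_1, a_2) \in \mathbb{R}^2$ we have
\[ \begin{aligned} \langle \delta\mathcal{L} \mid a \rangle &= \Big\langle \frac{d}{dt} \frac{\partial \mathcal{L}}{\partial c} \,\Big|\, a \Big\rangle - \Big\langle \frac{\partial L}{\partial \dot{x}} \,\Big|\, [c^Q, a^Q] \Big\rangle \\ &= \big\langle (m\dot{c}_1, (I + m\ell^2)\dot{c}_2) \mid (a_1, a_2) \big\rangle - \big\langle (m\dot{x}_1, m\dot{x}_2) \mid (c_1 a_2 - c_2 a_1)(\sin\theta, -\cos\theta) \big\rangle \\ &= \big\langle \big( m\dot{c}_1 - m\ell c_2^2,\ (I + m\ell^2)\dot{c}_2 + m\ell c_1 c_2 \big) \mid (a_1, a_2) \big\rangle, \end{aligned} \tag{21} \]
using (16). The Lagrange-d'Alembert principle in a trivialization states that $\delta\mathcal{L} = 0$. Consequently, the equations of motion of Carathéodory's sleigh in the trivialization $\lambda_D^{-1}$ are
\[ \dot{c}_1 = \ell c_2^2, \qquad \dot{c}_2 = -\frac{m\ell}{I + m\ell^2}\, c_1 c_2. \tag{22} \]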

An explanation why c˙ only depends on c is given in the text after equation (73) below.
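The reduced equations (22) are easy to explore numerically. The following is a minimal sketch, assuming a classical fourth-order Runge-Kutta step and arbitrary sample parameters; it integrates (22) and monitors the energy (18), which should stay constant along solutions. The helper names are illustrative, not from the text.

```python
def rhs(c, m, I, ell):
    """Right-hand side of the reduced equations (22)."""
    c1, c2 = c
    return (ell * c2 * c2, -m * ell / (I + m * ell**2) * c1 * c2)

def energy(c, m, I, ell):
    """Lagrangian (18) in the trivialization: the kinetic energy on D."""
    return 0.5 * m * c[0]**2 + 0.5 * (I + m * ell**2) * c[1]**2

def rk4_step(c, dt, m, I, ell):
    """One classical Runge-Kutta step for (22)."""
    def shift(base, k, h):
        return (base[0] + h * k[0], base[1] + h * k[1])
    k1 = rhs(c, m, I, ell)
    k2 = rhs(shift(c, k1, dt / 2), m, I, ell)
    k3 = rhs(shift(c, k2, dt / 2), m, I, ell)
    k4 = rhs(shift(c, k3, dt), m, I, ell)
    return (c[0] + dt / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
            c[1] + dt / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))

def integrate(c0, t_end, dt, m, I, ell):
    """Integrate (22) from c0 over [0, t_end] with fixed step dt."""
    c = c0
    for _ in range(round(t_end / dt)):
        c = rk4_step(c, dt, m, I, ell)
    return c

if __name__ == "__main__":
    m, I, ell = 2.0, 1.5, 0.7      # sample parameters (assumption)
    c0 = (0.3, 1.0)
    c = integrate(c0, 5.0, 1e-3, m, I, ell)
    print(c, energy(c0, m, I, ell), energy(c, m, I, ell))
```

For $\ell > 0$ the first equation in (22) forces $c_1$ to be nondecreasing, so in this sample run the forward speed $c_1$ grows while the angular velocity $c_2$ decays, with the energy unchanged up to integration error.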

5.2.4 Almost Poisson bracket form

In this subsection we find the equation of motion of Carathéodory's sleigh using almost Poisson brackets on $C^\infty(D)$, which we compute using the trivialization $\lambda_D^{-1}$, see (15).

5.2.4.1 $H$ and $\varpi$ in a trivialization

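First we determine the symplectic distribution $(H, \varpi)$ for the sleigh in the trivialization $\lambda_D^{-1}$, see §6.2 of chapter 1. In the trivialization $\lambda_D^{-1}$ the distribution $H$ is given by
\[ H_u = T_{(q,c)} \lambda_D (D_q \times \mathbb{R}^2), \tag{23} \]
where $u = \lambda_D(q, c) \in D$, see (39) of §6.2 of chapter 1.

We now find an expression for the symplectic form $\varpi$ in the trivialization $\lambda_D^{-1}$. Let $X_i(q) = \lambda_D(q, e_i)$ be a vector field on $Q$ with values in $D$. For $a \in \mathbb{R}^2$, $a^Q(q) = \sum_{i=1}^2 a_i X_i(q)$ is a vector field on $Q$ with values in $D$. From (23) it follows that every vector field on $D$ with values in $H$ can be written as $a^Q_\to(u) + b^Q_\uparrow(u)$, where $a^Q_\to(u) = T_{(q,c)} \lambda_D(a^Q(q), 0) = a^Q(q)$ and $b^Q_\uparrow(u) = T_{(q,c)} \lambda_D(0, b)$ for some $a, b \in \mathbb{R}^2$, see (45) of §6.2 of chapter 1. We compute $a^Q_\to$ and $b^Q_\uparrow$ as follows. Using
\[ u = \lambda_D(\theta, x, c) = (\dot{\theta}, \dot{x}_1, \dot{x}_2) = (c_2,\ c_1 \cos\theta - \ell c_2 \sin\theta,\ c_1 \sin\theta + \ell c_2 \cos\theta) \]
it follows from the definition (15) of the map $\lambda_D$ that its tangent is given by
\[ \begin{aligned} T_{(\theta,x,c)} \lambda_D(\dot{\theta}, \dot{x}_1, \dot{x}_2, \dot{c}) &= (\dot{\theta}, \dot{x}_1, \dot{x}_2, \ddot{\theta}, \ddot{x}_1, \ddot{x}_2) \\ &= \big( c_2,\ c_1 \cos\theta - \ell c_2 \sin\theta,\ c_1 \sin\theta + \ell c_2 \cos\theta,\ \dot{c}_2, \\ &\qquad \dot{c}_1 \cos\theta - c_1 \dot{\theta} \sin\theta - \ell \dot{c}_2 \sin\theta - \ell c_2 \dot{\theta} \cos\theta, \\ &\qquad \dot{c}_1 \sin\theta + c_1 \dot{\theta} \cos\theta + \ell \dot{c}_2 \cos\theta - \ell c_2 \dot{\theta} \sin\theta \big). \end{aligned} \tag{24} \]
Therefore
\[ a^Q_\to(u) = T_{(\theta,x,c)} \lambda_D(a^Q(\theta, x), 0) = a^Q(\theta, x) = a_1 \Big( \cos\theta \frac{\partial}{\partial x_1} + \sin\theta \frac{\partial}{\partial x_2} \Big) + a_2 \Big( \frac{\partial}{\partial\theta} - \ell \sin\theta \frac{\partial}{\partial x_1} + \ell \cos\theta \frac{\partial}{\partial x_2} \Big) \tag{25} \]
and
\[ b^Q_\uparrow(u) = T_{(\theta,x,c)} \lambda_D(0, b) = b_1 \Big( \cos\theta \frac{\partial}{\partial \dot{x}_1} + \sin\theta \frac{\partial}{\partial \dot{x}_2} \Big) + b_2 \Big( \frac{\partial}{\partial \dot{\theta}} - \ell \sin\theta \frac{\partial}{\partial \dot{x}_1} + \ell \cos\theta \frac{\partial}{\partial \dot{x}_2} \Big). \]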

Since the Lagrangian $\mathcal{L}$ (18) in the trivialization $\lambda_D^{-1}$ is the kinetic energy
\[ \tfrac{1}{2}\, k(q)(c^Q(q), c^Q(q)) = \tfrac{1}{2} m c_1^2 + \tfrac{1}{2} (I + m\ell^2) c_2^2 \]
of the sleigh, the kinetic energy metric $k$ of the moving frames $a^Q, b^Q$ on $Q$ in the trivialization $\lambda_D^{-1}$, which are associated to $a, b \in \mathbb{R}^2$, is
\[ k(q)(a^Q(q), b^Q(q)) = m\, a_1 b_1 + (I + m\ell^2)\, a_2 b_2. \tag{26} \]

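Note that $k$ is constant on $Q$. From equation (52) of §6.2 of chapter 1, we obtain
\[ \varpi(u)\big( a^Q_\to(u) + b^Q_\uparrow(u),\ \widetilde{a}^Q_\to(u) + \widetilde{b}^Q_\uparrow(u) \big) = k(q)(a^Q(q), \widetilde{b}^Q(q)) - k(q)(\widetilde{a}^Q(q), b^Q(q)) - \sum_{i=1}^2 c_i\, d\alpha_i(q)(a^Q(q), \widetilde{a}^Q(q)), \tag{27} \]
where $\alpha_i(q) = k^\flat(q)(X_i(q))$. Using equation (51) of §6.2 of chapter 1, we obtain
\[ \begin{aligned} \sum_{i=1}^2 c_i\, d\alpha_i(q)(a^Q(q), \widetilde{a}^Q(q)) &= \sum_{n,m=1}^2 a_n \widetilde{a}_m \big( L_{X_n}(k(c^Q, X_m)) - L_{X_m}(k(c^Q, X_n)) - k(c^Q, [X_n, X_m]) \big) \\ &= -k(c^Q, [a^Q, \widetilde{a}^Q]), \quad \text{since } k(c^Q, a^Q) \text{ and } k(c^Q, \widetilde{a}^Q) \text{ are constant functions on } Q \\ &= -(a_1 \widetilde{a}_2 - a_2 \widetilde{a}_1)\, (\lambda_D^{-1})^* k(v_1, v_2), \\ &\qquad \text{where } v_1 = c_2 \tfrac{\partial}{\partial\theta} + (c_1 \cos\theta - \ell c_2 \sin\theta) \tfrac{\partial}{\partial x_1} + (c_1 \sin\theta + \ell c_2 \cos\theta) \tfrac{\partial}{\partial x_2} \\ &\qquad \text{and } v_2 = \sin\theta \tfrac{\partial}{\partial x_1} - \cos\theta \tfrac{\partial}{\partial x_2}, \text{ using (16) and (20)} \\ &= -m (a_1 \widetilde{a}_2 - a_2 \widetilde{a}_1) \big( \sin\theta (c_1 \cos\theta - \ell c_2 \sin\theta) - \cos\theta (c_1 \sin\theta + \ell c_2 \cos\theta) \big) \\ &= m\ell c_2 (a_1 \widetilde{a}_2 - a_2 \widetilde{a}_1). \end{aligned} \tag{28} \]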

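Therefore from (26), (27), and (28) we obtain
\[ \varpi(u)\big( a^Q_\to(u) + b^Q_\uparrow(u),\ \widetilde{a}^Q_\to(u) + \widetilde{b}^Q_\uparrow(u) \big) = m (a_1 \widetilde{b}_1 - b_1 \widetilde{a}_1) + (I + m\ell^2)(a_2 \widetilde{b}_2 - b_2 \widetilde{a}_2) - m\ell c_2 (a_1 \widetilde{a}_2 - a_2 \widetilde{a}_1). \tag{29} \]

5.2.4.2 Almost Poisson bracket of $c_1$ and $c_2$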

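In this subsection we compute the almost Poisson bracket of the smooth functions $c_1$ and $c_2$ (17) using the trivialization $\lambda_D^{-1}$ (15).

First we calculate the distributional Hamiltonian vector field $Y_{c_1} = a^Q_\to + b^Q_\uparrow$ associated to the function $c_1$. For every $\widetilde{a}^Q_\to + \widetilde{b}^Q_\uparrow$ with $\widetilde{a}, \widetilde{b} \in \mathbb{R}^2$ we have
\[ \langle dc_1 \mid \widetilde{a}^Q_\to + \widetilde{b}^Q_\uparrow \rangle = \varpi(Y_{c_1}, \widetilde{a}^Q_\to + \widetilde{b}^Q_\uparrow) = \varpi(a^Q_\to + b^Q_\uparrow, \widetilde{a}^Q_\to + \widetilde{b}^Q_\uparrow) \]
by definition of $Y_{c_1}$. In other words,
\[ \widetilde{b}_1 = m (a_1 \widetilde{b}_1 - b_1 \widetilde{a}_1) + (I + m\ell^2)(a_2 \widetilde{b}_2 - b_2 \widetilde{a}_2) - m\ell c_2 (a_1 \widetilde{a}_2 - a_2 \widetilde{a}_1) \tag{30} \]
for every $\widetilde{a}, \widetilde{b} \in \mathbb{R}^2$. Successively setting one of the variables $\widetilde{a}_1$, $\widetilde{a}_2$, $\widetilde{b}_1$, and $\widetilde{b}_2$ equal to 1 and the others equal to 0 we get $(a_1, a_2, b_1, b_2) = \big( \tfrac{1}{m}, 0, 0, -\tfrac{\ell c_2}{I + m\ell^2} \big)$. Therefore
\[ Y_{c_1} = \frac{1}{m} (1, 0)^Q_\to - \frac{\ell c_2}{I + m\ell^2} (0, 1)^Q_\uparrow \tag{31} \]
\[ \phantom{Y_{c_1}} = \frac{1}{m} \Big( \cos\theta \frac{\partial}{\partial x_1} + \sin\theta \frac{\partial}{\partial x_2} \Big) - \frac{\ell c_2}{I + m\ell^2} \Big( \frac{\partial}{\partial \dot{\theta}} - \ell \sin\theta \frac{\partial}{\partial \dot{x}_1} + \ell \cos\theta \frac{\partial}{\partial \dot{x}_2} \Big). \tag{32} \]
Consequently,
\[ \{c_1, c_2\} = -L_{Y_{c_1}} c_2 = -L_{Y_{c_1}} \dot{\theta} = \frac{\ell}{I + m\ell^2}\, c_2. \tag{33} \]

5.2.4.3 Equations of motion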

The equations of motion of Carathéodory's sleigh in almost Poisson bracket form using the trivialization $\lambda_D^{-1}$ of the constraint distribution $D$ are
\[ \begin{aligned} \dot{c}_1 &= \{c_1, \mathcal{L}\} = (I + m\ell^2)\, c_2 \{c_1, c_2\} = \ell c_2^2 \\ \dot{c}_2 &= \{c_2, \mathcal{L}\} = -m c_1 \{c_1, c_2\} = -\frac{m\ell}{I + m\ell^2}\, c_1 c_2. \end{aligned} \tag{34} \]
Here $\mathcal{L} = \tfrac{1}{2} m c_1^2 + \tfrac{1}{2} (I + m\ell^2) c_2^2$, the Lagrangian in the trivialization $\lambda_D^{-1}$, is thought of as a function on the constraint distribution $D$.

5.2.5 Distributional Hamiltonian system

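In this subsection we derive the equations of motion of Carathéodory's sleigh from its distributional Hamiltonian formulation as a nonholonomically constrained system on the Lie group $E(2)$ with left invariant constraint 1-form and left invariant kinetic energy.

5.2.5.1 Lie group model

The configuration space of Carathéodory's sleigh is the Lie group $E(2)$ of proper Euclidean motions on $\mathbb{R}^2$ with Euclidean inner product $\langle\ ,\ \rangle$. The kinetic energy of the unconstrained sleigh $S$ is
\[ T = T_{rot} + T_{trans}, \tag{35} \]
where the rotational kinetic energy about the center of mass of the sleigh is
\[ \begin{aligned} T_{rot} &= \tfrac{1}{2} \int_S \langle \dot{A}x, \dot{A}x \rangle\, dm = \tfrac{1}{2} \int_S \langle A^{-1}\dot{A}x, A^{-1}\dot{A}x \rangle\, dm \\ &= \tfrac{1}{2} \int_S \langle \Omega x, \Omega x \rangle\, dm, \quad \text{where } \Omega = A^{-1}\dot{A} \in so(2) \\ &= \tfrac{1}{2} \Big( \int_S \langle x, x \rangle\, dm \Big)\, i(\Omega)^2 = \tfrac{1}{2} I\, ((\, i(\Omega), i(\Omega)\, )) = \tfrac{1}{2} \langle\langle \mathbb{I}(\Omega), \Omega \rangle\rangle. \end{aligned} \tag{36} \]
Here $I = \int_S \langle x, x \rangle\, dm$ is the moment of inertia of the sleigh about the vertical axis through its center of mass, and $\Omega = i(\Omega) E$, where $\Big\{ E = \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \Big\}$ is a basis of $so(2)$. The mapping $i : so(2) \to \mathbb{R} : X = xE \mapsto x$ is an isometry from $(so(2), \langle\langle\ ,\ \rangle\rangle)$, where $\langle\langle X, Y \rangle\rangle = -\tfrac{1}{2} \operatorname{tr} XY$, to $(\mathbb{R}, ((\ ,\ )))$, where $((x, y)) = xy$. The translational kinetic energy of the center of mass of the sleigh is
\[ T_{trans} = \tfrac{1}{2} \int_S \langle \dot{a}, \dot{a} \rangle\, dm = \tfrac{1}{2} \int_S \langle A^{-1}\dot{a}, A^{-1}\dot{a} \rangle\, dm = \tfrac{1}{2} \Big( \int_S dm \Big) \langle b, b \rangle = \tfrac{1}{2} m \langle b, b \rangle, \quad \text{where } b = A^{-1}\dot{a}. \tag{37} \]
Thus the kinetic energy of the sleigh is a homogeneous quadratic function on $e(2)$ with coordinates $(\Omega, b)$ given by
\[ T = \tfrac{1}{2} \langle\langle \mathbb{I}(\Omega), \Omega \rangle\rangle + \tfrac{1}{2} m \langle b, b \rangle. \tag{38} \]
Here $\mathbb{I} : so(2) \to so(2)$ is the generalized moment of inertia map $i^{-1} \circ I \circ i$, where $I$ acts on $\mathbb{R}$ by multiplication. Hence the kinetic energy $T$ determines an inner product on $e(2)$ given by
\[ k : e(2) \times e(2) \to \mathbb{R} : \big( (\Omega, b), (\Omega', b') \big) \mapsto \langle\langle \mathbb{I}(\Omega), \Omega' \rangle\rangle + m \langle b, b' \rangle, \tag{39} \]
which extends to a left invariant Riemannian metric $k$ on $E(2)$ called the kinetic energy metric.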

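We now impose the nonholonomic constraint that the velocity $(\dot{A}, \dot{a})$ of the sleigh lies in the kernel of the constraint 1-form $\phi$ on $E(2)$ defined by
\[ \phi(A, a) = \ell \langle\langle dA, AE \rangle\rangle - \langle da, Ae_2 \rangle. \tag{40} \]
Here $\{e_1, e_2\}$ is the standard basis of $\mathbb{R}^2$.

Lemma 5.2.5.1.1. The 1-form $\phi$ (40) is the same as the 1-form $\varphi$ (3).

Proof. To see this let $A = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \in SO(2)$, where $\theta \in \mathbb{R}/2\pi\mathbb{Z}$, and let $a = \begin{pmatrix} x \\ y \end{pmatrix} \in \mathbb{R}^2$. Then $A^{-1} dA = E\, d\theta$ and
\[ \langle A^{-1} da, e_2 \rangle = \Big\langle d\begin{pmatrix} x \\ y \end{pmatrix}, Ae_2 \Big\rangle = \Big\langle \begin{pmatrix} dx \\ dy \end{pmatrix}, \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \Big\rangle. \]
So
\[ \phi = \ell \langle\langle E, E \rangle\rangle\, d\theta - \Big\langle \begin{pmatrix} dx \\ dy \end{pmatrix}, \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \Big\rangle = \ell\, d\theta + \sin\theta\, dx - \cos\theta\, dy = \varphi. \]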
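The identity in the lemma can be spot-checked numerically. The sketch below evaluates (40) on a velocity $(\dot{A}, \dot{a})$ with $\dot{A} = \dot{\theta} A E$ and compares it with the coordinate form (3). It assumes that $\langle\langle\ ,\ \rangle\rangle$ is extended to $2 \times 2$ matrices as $\tfrac{1}{2} \operatorname{tr}(X^T Y)$, which agrees with $-\tfrac{1}{2} \operatorname{tr} XY$ on $so(2)$; that extension and all function names are assumptions made for illustration.

```python
import math

E = [[0.0, -1.0], [1.0, 0.0]]  # basis element of so(2)

def rot(theta):
    """Rotation matrix A in SO(2)."""
    return [[math.cos(theta), -math.sin(theta)],
            [math.sin(theta), math.cos(theta)]]

def matmul(X, Y):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(X[i][k] * Y[k][j] for k in range(2)) for j in range(2)]
            for i in range(2)]

def pairing(X, Y):
    """Assumed extension of <<X, Y>> to 2x2 matrices: (1/2) tr(X^T Y)."""
    return 0.5 * sum(X[i][j] * Y[i][j] for i in range(2) for j in range(2))

def phi_group(theta, dtheta, da, ell):
    """Form (40) evaluated on the velocity (dA, da), with dA = dtheta * A E."""
    A = rot(theta)
    AE = matmul(A, E)
    dA = [[dtheta * entry for entry in row] for row in AE]
    Ae2 = (A[0][1], A[1][1])  # second column of A
    return ell * pairing(dA, AE) - (da[0] * Ae2[0] + da[1] * Ae2[1])

def phi_coords(theta, dtheta, da, ell):
    """Form (3) evaluated on (dtheta, dx, dy)."""
    return ell * dtheta + math.sin(theta) * da[0] - math.cos(theta) * da[1]

if __name__ == "__main__":
    print(phi_group(0.9, 1.2, (0.3, -0.5), 0.7),
          phi_coords(0.9, 1.2, (0.3, -0.5), 0.7))
```

The two printed values should coincide for any choice of $\theta$, $\dot{\theta}$, $\dot{a}$ and $\ell$.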

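Let
\[ \pi : E(2) \times e(2) \to E(2) : (A, a, \Omega, b) \mapsto (A, a). \tag{41} \]
As a submanifold of $E(2) \times e(2)$ with coordinates $(A, a, \Omega, b)$ the constraint manifold $D = \ker \pi^*\phi$ is defined by
\[ 0 = \ell \langle\langle A^{-1}\dot{A}, E \rangle\rangle - \langle A^{-1}\dot{a}, e_2 \rangle = \ell \langle\langle \Omega, E \rangle\rangle - \langle b, e_2 \rangle = i(\Omega)\ell - b_2, \]
that is,
\[ D = \{ (A, a, \ell^{-1} b_2 E, b) \in E(2) \times e(2) \mid (A, a) \in E(2),\ b \in \mathbb{R}^2 \}. \tag{42} \]

5.2.5.2 The distribution $H$ and its symplectic form $\varpi_H$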

In this subsection we determine the distribution $H = TD \cap \ker \pi^*\phi$ on the constraint manifold $D$. Here $\pi$ is the mapping (41), which is the pull back of the bundle projection $TE(2) \to E(2) : (A, a, \dot{A}, \dot{a}) \mapsto (A, a)$ by the inverse of the left trivialization
\[ \lambda : TE(2) \to E(2) \times e(2) : (A, a, \dot{A}, \dot{a}) \mapsto (A, a, \Omega = A^{-1}\dot{A}, b = A^{-1}\dot{a}). \tag{43} \]
Until further mention we assume that $\ell \ne 0$. Let $u = (A, a, \ell^{-1} b_2 E, b) \in D$ and let $(Y, y), (Z, z) \in e(2)$. The vector $v_u = (AY, Ay, Z, z) \in T_u(E(2) \times e(2))$ lies in $T_u D$ if and only if
\[ 0 = \ell \langle\langle \dot{\Omega}, E \rangle\rangle - \langle \dot{b}, e_2 \rangle = \ell \langle\langle Z, E \rangle\rangle - \langle z, e_2 \rangle = \ell\, i(Z) - z_2, \]
that is, $Z = \ell^{-1} z_2 E$. The vector $v_u$ lies in $\ker \pi^*\phi$ if and only if $0 = \ell \langle\langle Y, E \rangle\rangle - \langle y, e_2 \rangle = \ell\, i(Y) - y_2$, that is, $Y = \ell^{-1} y_2 E$. Therefore
\[ H_u = \{ (A(\ell^{-1} y_2 E), Ay, \ell^{-1} z_2 E, z) \in T_u(E(2) \times e(2)) \mid y, z \in \mathbb{R}^2 \}, \tag{44} \]
which is 4-dimensional. Thus $H$ is a distribution on $D$.

Next we find the symplectic form $\varpi_H$ on $H$. The pull back of the canonical 1-form $\theta_{E(2)}$ on $T^*E(2)$ by the kinetic energy metric $k$ is the 1-form $\theta$ on $TE(2)$. Pulling $\theta$ back by $\lambda^{-1}$ (43) gives the 1-form $\widetilde{\theta}$ on $E(2) \times e(2)$ defined by
\[ \widetilde{\theta}(A, a, \Omega, b)(AY, Ay, Z, z) = k\big( (\Omega, b), (Y, y) \big). \]
Thus $\widetilde{\omega} = -d\widetilde{\theta}$ is a symplectic form on $E(2) \times e(2)$. Explicitly,
\[ \begin{aligned} \widetilde{\omega}(A, a, \Omega, b)\big( (AY, Ay, Z, z), (AY', Ay', Z', z') \big) &= k((Z', z'), (Y, y)) - k((Z, z), (Y', y')) + k((\Omega, b), [(Y, y), (Y', y')]), \quad \text{see appendix A of [30]} \\ &= ((\, I(i(Z')), i(Y)\, )) + m \langle z', y \rangle - ((\, I(i(Z)), i(Y')\, )) - m \langle z, y' \rangle + m \langle b, Yy' - Y'y \rangle, \end{aligned} \tag{45} \]
since the Lie bracket in $e(2)$ is $[(Y, y), (Y', y')] = ([Y, Y'], Yy' - Y'y) = (0, Yy' - Y'y)$. Here we have used the fact that the Lie bracket $[Y, Y']$ in $so(2)$ is 0. Restricting $\widetilde{\omega}(u)$ to $H_u \times H_u$ gives the symplectic form $\varpi_H(u)$ on $H_u$ for every $u \in D$. In other words, for $u = (A, a, \ell^{-1} b_2 E, b) \in D$ and $v_u = (A(\ell^{-1} y_2 E), Ay, \ell^{-1} z_2 E, z)$ and $v_u' = (A(\ell^{-1} y_2' E), Ay', \ell^{-1} z_2' E, z') \in H_u$ we have
\[ \begin{aligned} \varpi_H(u)(v_u, v_u') &= ((\, I(i(\ell^{-1} z_2' E)), i(\ell^{-1} y_2 E)\, )) - ((\, I(i(\ell^{-1} z_2 E)), i(\ell^{-1} y_2' E)\, )) \\ &\quad + m (z_1' y_1 - z_1 y_1') - m (z_2 y_2' - y_2 z_2') + m \langle b, \ell^{-1} y_2 E y' - \ell^{-1} y_2' E y \rangle \\ &= (I\ell^{-2} + m)(y_2 z_2' - z_2 y_2') + m (y_1 z_1' - z_1 y_1') - m\ell^{-1} b_2 (y_1 y_2' - y_2 y_1'). \end{aligned} \tag{46} \]

5.2.5.3 Equations of motion

For the distributional Hamiltonian system $(D, H, \varpi_H, T)$ corresponding to Carathéodory's sleigh the distributional Hamiltonian vector field $Y_T$ on $D$, which governs the motion of the sleigh, satisfies
\[ (Y_T \,\lrcorner\, \varpi_H)(u) = dT(u)|H_u \tag{47} \]
for every $u \in D$. Because $Y_T(u) \in H_u$ we may write
\[ Y_T(u) = (\ell^{-1} z_2 AE, Az, \ell^{-1} w_2 E, w) \tag{48} \]
for some $z, w \in \mathbb{R}^2$. At the point $u = (A, a, \ell^{-1} b_2 E, b)$ in the constraint distribution $D$, the kinetic energy of the sleigh is
\[ T = T_u(\ell^{-1} b_2 E, b) = \tfrac{1}{2} \langle\langle \mathbb{I}(\ell^{-1} b_2 E), \ell^{-1} b_2 E \rangle\rangle + \tfrac{1}{2} m \langle b, b \rangle = \tfrac{1}{2} m b_1^2 + \tfrac{1}{2} (m + I\ell^{-2}) b_2^2. \tag{49} \]


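Therefore

$$dT = m b_1\, db_1 + (m + I\ell^{-2})\, b_2\, db_2. \tag{50}$$

Evaluating both sides of (47) at $(\ell^{-1} z_2' AE,\, Az',\, \ell^{-1} w_2' E,\, w') \in H_u$ and using (46) gives

$$m b_1 w_1' + (m + I\ell^{-2}) b_2 w_2' = (m + I\ell^{-2})(z_2 w_2' - w_2 z_2') + m(z_1 w_1' - w_1 z_1') - m\ell^{-1} b_2 (z_1 z_2' - z_2 z_1'), \tag{51}$$

for every $z', w' \in \mathbb{R}^2$. Successively setting all but one of the variables $z_1', z_2', w_1', w_2'$ in (51) equal to 0 and the remaining variable equal to 1 gives

$$(z_1, z_2, w_1, w_2) = \Big( b_1,\; b_2,\; \ell^{-1} b_2^2,\; -\frac{m\ell^{-1}}{m + I\ell^{-2}}\, b_1 b_2 \Big).$$

Therefore

$$Y_T(A, a, \ell^{-1} b_2 E, b) = \left( \ell^{-1} b_2 AE,\; Ab,\; -\frac{m}{I + m\ell^2}\, b_1 b_2 E,\; \begin{pmatrix} \ell^{-1} b_2^2 \\ -\frac{m\ell}{I + m\ell^2}\, b_1 b_2 \end{pmatrix} \right).$$

Thus the integral curves of $Y_T$ satisfy

$$\begin{aligned}
\dot{A} &= A(\ell^{-1} b_2 E) \\
\dot{a} &= Ab \\
\begin{pmatrix} \dot{b}_1 \\ \dot{b}_2 \end{pmatrix} &= \begin{pmatrix} \ell^{-1} b_2^2 \\ -\frac{m\ell}{I + m\ell^2}\, b_1 b_2 \end{pmatrix}.
\end{aligned} \tag{52}$$

We have omitted the equation $\dot{\Omega} = \ell^{-1} \dot{b}_2 E = -\frac{m}{I + m\ell^2}\, b_1 b_2 E$ from (52) as it follows from the equation for $\dot{b}_2$ in (52). From (17) we find that

$$\begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix} = A^{-1}\dot{a} = b. \tag{53}$$

Thus the third equation in (52) becomes

$$\begin{aligned}
\dot{c}_1 &= \ell c_2^2 \\
\dot{c}_2 &= -\frac{m\ell}{I + m\ell^2}\, c_1 c_2,
\end{aligned} \tag{54}$$

which is just equation (22). Note that equation (54) holds when $\ell = 0$. This is the reason for introducing the variables $c_1$ and $c_2$ in (53).

5.3 Reduction of the E(2) symmetry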

In this section we show that Carathéodory's sleigh has an E(2) symmetry. We then calculate the equations of motion of the sleigh after reducing this symmetry in three ways. First, we use the momentum equations of §8 of chapter 1; second, we use almost Poisson brackets of §7 of chapter 1; and third, we use the distributional Hamiltonian approach of §3 of chapter 3.

5.3.1 The E(2) symmetry

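Physically it is obvious that the sleigh is invariant under horizontal translations and rotations about a vertical axis through its center of mass. Mathematically, there is an action of $E(2)$ on configuration space $Q = S^1 \times \mathbb{R}^2$ given by

$$\phi : E(2) \times Q \to Q : \big( (R_\vartheta, y), (\theta, x) \big) \mapsto (\theta + \vartheta,\, R_\vartheta x + y), \tag{55}$$

where $R_\vartheta = \begin{pmatrix} \cos\vartheta & -\sin\vartheta \\ \sin\vartheta & \cos\vartheta \end{pmatrix} \in SO(2, \mathbb{R})$ and $\vartheta \in S^1 = \mathbb{R}/2\pi\mathbb{Z}$. This action lifts to an action of $E(2)$ on $T^{*}Q$ defined by

$$\tilde{\phi} : E(2) \times T^{*}Q \to T^{*}Q : \big( (R_\vartheta, y), (\theta, x, p_x, p_\theta) \big) \mapsto (R_\vartheta, y) \cdot (\theta, x, p_x, p_\theta) = (\theta + \vartheta,\, R_\vartheta x + y,\, R_\vartheta p_x,\, p_\theta). \tag{56}$$

Since the constraint distribution $D \subseteq T^{*}Q$, which is defined by $p_\theta = \frac{I}{m\ell}(-p_{x_1}\sin\theta + p_{x_2}\cos\theta)$, is invariant under $\tilde{\phi}$, we have an induced $E(2)$-action

$$\Phi : E(2) \times D \to D : \big( (R_\vartheta, y), (\theta, x, p_x) \big) \mapsto (\theta + \vartheta,\, R_\vartheta x + y,\, R_\vartheta p_x), \tag{57}$$

using $(\theta, x, p_x)$ as coordinates on $D$.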

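Proposition 5.3.1.2. The $E(2)$ action $\Phi$ (57) is a symmetry of Carathéodory's sleigh.

Proof. Clearly the Hamiltonian $h$ (10) is invariant under $\Phi$ (57), as is the constraint 1-form $\varphi$ (3). Since the constraint distribution $D = \ker\varphi$, it is also invariant. It remains to show that the equations of motion (14) are invariant under $\Phi$. This follows directly from lemma 3.2.7 of chapter 3. We can also deduce it from (14) as follows. First note that the functions

$$f_1(\theta, x, p_x) = -p_{x_1}\sin\theta + p_{x_2}\cos\theta \quad\text{and}\quad f_2(\theta, x, p_x) = p_{x_1}\cos\theta + p_{x_2}\sin\theta$$

are invariant under $\Phi$. Now

$$\begin{aligned}
Y_h\big((R_\vartheta, y) \cdot (\theta, x, p_x)\big) &= Y_h(\theta + \vartheta,\, R_\vartheta x + y,\, R_\vartheta p_x) \\
&= \Big( \tfrac{1}{m\ell} f_1(\theta + \vartheta, R_\vartheta x + y, R_\vartheta p_x),\; \tfrac{1}{m} R_\vartheta p_x,\; \tfrac{I}{m\ell(I + m\ell^2)} (f_1 \cdot f_2)(\theta + \vartheta, R_\vartheta x + y, R_\vartheta p_x) \begin{pmatrix} -\sin(\theta + \vartheta) \\ \cos(\theta + \vartheta) \end{pmatrix} \Big) \\
&= \Big( \tfrac{1}{m\ell} f_1(\theta, x, p_x),\; R_\vartheta\big(\tfrac{1}{m} p_x\big),\; R_\vartheta\Big( \tfrac{I}{m\ell(I + m\ell^2)} (f_1 \cdot f_2)(\theta, x, p_x) \begin{pmatrix} -\sin\theta \\ \cos\theta \end{pmatrix} \Big) \Big) \\
&= \Big( \frac{d\theta}{dt},\; R_\vartheta\Big(\frac{dx}{dt}\Big),\; R_\vartheta\Big(\frac{dp_x}{dt}\Big) \Big) = \frac{d}{dt}\big( (R_\vartheta, y) \cdot (\theta, x, p_x) \big).
\end{aligned}$$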
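The invariance of $f_1$ and $f_2$ under $\Phi$, which the proof above relies on, can be checked numerically. The following is a minimal sketch, not part of the original text: the helper names `rot`, `act`, `f1`, `f2` and the sample values are assumptions chosen for illustration only.

```python
import math

def rot(angle, v):
    """Rotate the planar vector v counterclockwise by angle (the matrix R_angle)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def act(vartheta, y, point):
    """E(2)-action (57): (theta, x, p_x) -> (theta + vartheta, R x + y, R p_x)."""
    theta, x, px = point
    rx = rot(vartheta, x)
    return (theta + vartheta, (rx[0] + y[0], rx[1] + y[1]), rot(vartheta, px))

def f1(point):
    theta, _, px = point
    return -px[0] * math.sin(theta) + px[1] * math.cos(theta)

def f2(point):
    theta, _, px = point
    return px[0] * math.cos(theta) + px[1] * math.sin(theta)

# Arbitrary sample point of D and arbitrary group element (hypothetical values).
point = (0.7, (1.0, -2.0), (0.3, 1.1))
moved = act(1.9, (4.0, 5.0), point)

print(abs(f1(moved) - f1(point)) < 1e-12)
print(abs(f2(moved) - f2(point)) < 1e-12)
```

Since $\pi = (f_2, f_1)$ for the map $\pi$ of lemma 5.3.2.4, the same check also illustrates that $\pi$ is constant on $E(2)$-orbits.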

5.3.2 The momentum equation

In this subsection we use the momentum equation of §8 of chapter 1 to reduce the $E(2)$ symmetry from the equations of motion of the sleigh. Before we can apply the momentum equation, we need to verify

Lemma 5.3.2.3. The $E(2)$-action (57) on the constraint distribution $D$ is free and proper.

Proof. The action is free, for if $(\theta + \vartheta,\, R_\vartheta x + y,\, R_\vartheta p_x) = (\theta, x, p_x)$, then $\vartheta = 0 \bmod 2\pi$, which implies $x = R_\vartheta x + y = x + y$, that is, $y = 0$. The action $\Phi$ (57) is proper, for if the sequence $\{(\theta_n, x_n, (p_x)_n,\, \theta_n + \vartheta_n,\, R_{\vartheta_n} x_n + y_n,\, R_{\vartheta_n}(p_x)_n)\}_{n=1}^{\infty}$ converges to $(\theta', x', p_x', \varphi'', x'', p_x'')$, then the sequence $\{\vartheta_n\}$ converges to $\vartheta' = \varphi'' - \theta'$, the sequence $\{y_n\}$ converges to $y' = x'' - R_{\vartheta'} x'$, and the sequence $\{R_{\vartheta_n}(p_x)_n\}$ converges to $R_{\vartheta'} p_x'$. In other words,

$$(\varphi'', x'', p_x'') = (\theta' + \vartheta',\, R_{\vartheta'} x' + y',\, R_{\vartheta'} p_x') = \Phi_{(R_{\vartheta'}, y')}(\theta', x', p_x').$$

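From lemma 5.3.2.3 it follows that the reduced space $\overline{D} = D/E(2)$ of orbits of the action $\Phi$ is a smooth manifold.

Lemma 5.3.2.4. The reduced space $\overline{D}$ is diffeomorphic to $\mathbb{R}^2$.

Proof. Using the coordinates $(\theta, x, p_x) = (\theta, x_1, x_2, p_{x_1}, p_{x_2})$ for $D$, consider the map

$$\pi : D \to \mathbb{R}^2 : (\theta, x, p_x) \mapsto R_{-\theta}\, p_x = \begin{pmatrix} p_{x_1}\cos\theta + p_{x_2}\sin\theta \\ -p_{x_1}\sin\theta + p_{x_2}\cos\theta \end{pmatrix}.$$

Thus the derivative of $\pi$ at $(\theta, x, p_x) \in D$ is the $2 \times 5$ matrix

$$\begin{pmatrix} -p_{x_1}\sin\theta + p_{x_2}\cos\theta & 0 & 0 & \cos\theta & \sin\theta \\ -p_{x_1}\cos\theta - p_{x_2}\sin\theta & 0 & 0 & -\sin\theta & \cos\theta \end{pmatrix},$$

which has rank 2, because its 1, 2; 4, 5 minor is 1. Consequently, the tangent of $\pi$ is surjective at every point of $D$. In addition, $\pi$ is constant on $E(2)$-orbits, because

$$\pi\big((R_\vartheta, y) \cdot (\theta, x, p_x)\big) = \pi(\theta + \vartheta,\, R_\vartheta x + y,\, R_\vartheta p_x) = R_{-(\theta + \vartheta)}(R_\vartheta p_x) = R_{-\theta}\, p_x = \pi(\theta, x, p_x).$$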

To finish the proof it suffices to show that a fiber of $\pi$ is a single $E(2)$-orbit on $D$. For then $\pi$ induces a mapping $\tilde{\pi} : D/E(2) \to \mathbb{R}^2$, which is injective and smooth. Because $\pi$ is a surjective submersion, $\tilde{\pi}$ is a diffeomorphism. Now suppose that $\pi(\theta, x, p_x) = \pi(\theta', x', p_x')$. Then $R_{-\theta'}\, p_x' = R_{-\theta}\, p_x$. So $p_x' = R_{\theta' - \theta}\, p_x$. From

$$(R_{\theta' - \theta},\, x' - R_{\theta' - \theta} x) \cdot (\theta, x, p_x) = (\theta + \theta' - \theta,\; R_{\theta' - \theta} x + x' - R_{\theta' - \theta} x,\; R_{\theta' - \theta}\, p_x) = (\theta', x', p_x')$$

it follows that $(\theta, x, p_x)$ and $(\theta', x', p_x')$ lie on the same $E(2)$ orbit on $D$. Thus a fiber of $\pi$ is a single $E(2)$-orbit.

We now use the momentum equation of §7.2 of chapter 3 to compute the $E(2)$-reduced equation of motion of Carathéodory's sleigh. First we find a basis of $E(2)$-invariant vector fields on the constraint distribution $D$. Fix a point $q = (\theta, x)$ in the configuration space $Q = S^1 \times \mathbb{R}^2$. Differentiating the orbit map $\phi_q : E(2) \to Q : (R_\vartheta, y) \mapsto \phi_{(R_\vartheta, y)}(q)$ associated to the $E(2)$-action $\phi$ (55) at the identity element $e = (I, 0)$ of $E(2)$ gives

$$T_e\phi_q : e(2) \to T_qQ : \Big( \dot{\vartheta} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix},\, \dot{y} \Big) \mapsto \dot{\vartheta} \Big( \frac{\partial}{\partial\theta} - x_2 \frac{\partial}{\partial x_1} + x_1 \frac{\partial}{\partial x_2} \Big) + \dot{y}_1 \frac{\partial}{\partial x_1} + \dot{y}_2 \frac{\partial}{\partial x_2}.$$

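Thus the vector fields

$$Z_\theta = \frac{\partial}{\partial\theta} - x_2 \frac{\partial}{\partial x_1} + x_1 \frac{\partial}{\partial x_2}, \quad Z_{x_1} = \frac{\partial}{\partial x_1}, \quad\text{and}\quad Z_{x_2} = \frac{\partial}{\partial x_2}$$

generate the module of $E(2)$-invariant vector fields on $Q$ over the ring of smooth $E(2)$-invariant functions on $Q$. But none of the vector fields $Z_\theta$, $Z_{x_1}$, or $Z_{x_2}$ leave the constraint distribution $D = \ker\varphi$ (3) invariant. However, a short calculation shows that the vector fields

$$Z_1 = \cos\theta\, Z_{x_1} + \sin\theta\, Z_{x_2} = \cos\theta \frac{\partial}{\partial x_1} + \sin\theta \frac{\partial}{\partial x_2}$$

and

$$Z_2 = Z_\theta + (x_2 - \ell\sin\theta) Z_{x_1} - (x_1 - \ell\cos\theta) Z_{x_2} = \frac{\partial}{\partial\theta} - \ell\sin\theta \frac{\partial}{\partial x_1} + \ell\cos\theta \frac{\partial}{\partial x_2}$$

lie in $\ker\varphi$, are linearly independent on $D$, and are $E(2)$-invariant. Recall that the canonical 1-form on $T^{*}Q$ is $\theta_Q = p_\theta\, d\theta + p_{x_1}\, dx_1 + p_{x_2}\, dx_2$. Therefore the momenta corresponding to the infinitesimal symmetries $Z_1$ and $Z_2$ are

$$P_{Z_1} = Z_1 \,\lrcorner\, \theta_Q = p_{x_1}\cos\theta + p_{x_2}\sin\theta \tag{58a}$$

and

$$P_{Z_2} = Z_2 \,\lrcorner\, \theta_Q = p_\theta - \ell p_{x_1}\sin\theta + \ell p_{x_2}\cos\theta = \frac{I + m\ell^2}{m\ell} (-p_{x_1}\sin\theta + p_{x_2}\cos\theta), \tag{58b}$$

since $p_\theta = \frac{I}{m\ell}(-p_{x_1}\sin\theta + p_{x_2}\cos\theta)$. On $(T^{*}Q,\; \omega_Q = d\theta \wedge dp_\theta + dx_1 \wedge dp_{x_1} + dx_2 \wedge dp_{x_2})$ the Hamiltonian vector fields corresponding to the momenta $P_{Z_1}$ and $P_{Z_2}$ are

$$X_{P_{Z_1}} = \cos\theta \frac{\partial}{\partial x_1} + \sin\theta \frac{\partial}{\partial x_2} - (-p_{x_1}\sin\theta + p_{x_2}\cos\theta) \frac{\partial}{\partial p_\theta}$$

and

$$X_{P_{Z_2}} = \frac{I + m\ell^2}{m\ell} \Big( -\sin\theta \frac{\partial}{\partial x_1} + \cos\theta \frac{\partial}{\partial x_2} + (p_{x_1}\cos\theta + p_{x_2}\sin\theta) \frac{\partial}{\partial p_\theta} \Big),$$

respectively. Therefore the momentum equations on $\overline{D} = \mathbb{R}^2$ are

$$\begin{aligned}
\dot{P}_{Z_1} = -L_{X_{P_{Z_1}}} h &= (-p_{x_1}\sin\theta + p_{x_2}\cos\theta)\, \frac{1}{I}\, p_\theta \\
&= \frac{1}{m\ell} (-p_{x_1}\sin\theta + p_{x_2}\cos\theta)^2, \quad \text{since } p_\theta = \tfrac{I}{m\ell}(-p_{x_1}\sin\theta + p_{x_2}\cos\theta) \\
&= \frac{m\ell}{(I + m\ell^2)^2}\, P_{Z_2}^2, \quad \text{using the definition of } P_{Z_2}
\end{aligned}$$

and

$$\begin{aligned}
\dot{P}_{Z_2} = -L_{X_{P_{Z_2}}} h &= -\frac{I + m\ell^2}{m\ell} (p_{x_1}\cos\theta + p_{x_2}\sin\theta)\, \frac{1}{m\ell} (-p_{x_1}\sin\theta + p_{x_2}\cos\theta) \\
&= -\frac{1}{m\ell}\, P_{Z_1} P_{Z_2}, \quad \text{using the definition of } P_{Z_1} \text{ and } P_{Z_2}.
\end{aligned}$$

Lemma 5.3.2.5. Rescaling the momentum equations gives the $E(2)$-reduced equations of motion of Carathéodory's sleigh.

Proof. Let $c_1 = \alpha P_{Z_1}$ and $c_2 = \beta P_{Z_2}$ for some $\alpha, \beta \in \mathbb{R}$. Then

$$\begin{aligned}
\dot{c}_1 &= \alpha \dot{P}_{Z_1} = \frac{\alpha m\ell}{(I + m\ell^2)^2}\, P_{Z_2}^2 = \frac{\alpha m\ell}{\beta^2 (I + m\ell^2)^2}\, c_2^2 \\
\dot{c}_2 &= \beta \dot{P}_{Z_2} = -\frac{\beta}{m\ell}\, P_{Z_1} P_{Z_2} = -\frac{1}{\alpha m\ell}\, c_1 c_2.
\end{aligned} \tag{59}$$

But (22) reads

$$\begin{aligned}
\dot{c}_1 &= \ell c_2^2 \\
\dot{c}_2 &= -\frac{m\ell}{I + m\ell^2}\, c_1 c_2.
\end{aligned} \tag{60}$$

Comparing (59) with (60) we obtain

$$\frac{\alpha m\ell}{\beta^2 (I + m\ell^2)^2} = \ell \quad\text{and}\quad \frac{1}{\alpha m\ell} = \frac{m\ell}{I + m\ell^2}.$$

Therefore $\alpha = \dfrac{I + m\ell^2}{(m\ell)^2}$ and $\beta = \dfrac{1}{\ell\sqrt{m(I + m\ell^2)}}$ is the desired rescaling.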
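As a quick sanity check of the algebra in the proof of lemma 5.3.2.5, the two matching conditions obtained by comparing (59) with (60) can be evaluated numerically. This is only an illustrative sketch: the function name `rescaling` and the sample parameter values are assumptions, and the check covers the arithmetic of the matching conditions alone, not the derivation of the momentum equations themselves.

```python
import math

def rescaling(m, I, ell):
    """Constants alpha, beta stated at the end of the proof of lemma 5.3.2.5."""
    alpha = (I + m * ell**2) / (m * ell) ** 2
    beta = 1.0 / (ell * math.sqrt(m * (I + m * ell**2)))
    return alpha, beta

# Hypothetical sample parameters: mass m, moment of inertia I, offset ell.
m, I, ell = 2.0, 0.5, 0.8
alpha, beta = rescaling(m, I, ell)

# Coefficient of c_2^2 in (59) should equal ell, the coefficient in (60).
coeff_c1_dot = alpha * m * ell / (beta**2 * (I + m * ell**2) ** 2)
# Coefficient of c_1 c_2 in (59) should equal m*ell/(I + m*ell^2), up to the common sign.
coeff_c2_dot = 1.0 / (alpha * m * ell)

print(abs(coeff_c1_dot - ell) < 1e-12)
print(abs(coeff_c2_dot - m * ell / (I + m * ell**2)) < 1e-12)
```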

5.3.3 E(2)-reduced equations of motion

To remove the $E(2)$ symmetry (55) of Carathéodory's sleigh we use the Lie group model of §2.5 and follow the pattern of reduction of symmetries for distributional Hamiltonian systems described in §3 of chapter 3. This involves constructing certain distributions $V$, $V \cap H$, and $U$ on the constraint manifold $D$, whose definitions we recall below. Until further mention in this subsection we assume that $\ell \neq 0$. The $E(2)$-action on $E(2)$ of left multiplication lifts to an action on $TE(2)$, which using the trivialization $\lambda^{-1}$ (43) is given by

$$\psi : E(2) \times (E(2) \times e(2)) \to E(2) \times e(2) : \big( (A', a'), (A, a, \Omega, b) \big) \mapsto (A'A,\, A'a + a',\, \Omega,\, b).$$

This action induces an $E(2)$-action on the constraint distribution $D$ (42) given by

$$\Psi : E(2) \times D \to D : \big( (A', a'),\, u = (A, a, \ell^{-1} b_2 E, b) \big) \mapsto (A'A,\, A'a + a',\, \ell^{-1} b_2 E,\, b). \tag{61}$$

The reduced space $\overline{D} = D/E(2)$ of $E(2)$ orbits on $D$ is clearly $\mathbb{R}^2$ and the associated orbit map $\rho : D \subseteq E(2) \times e(2) \to \overline{D} : u \mapsto b$ is a surjective submersion. In other words, the maps $\rho$ and

$$T_u\rho : T_uD \to T_{\rho(u)}\overline{D} : (AY, Ay, \ell^{-1} z_2 E, z) \mapsto z, \tag{62}$$

for every $u \in D$ are both surjective. Fix $u \in D$. Differentiating the $E(2)$-orbit map

$$\Psi_u : E(2) \to D : (A', a') \mapsto (A'A,\, A'a + a',\, \ell^{-1} b_2 E,\, b)$$

at the identity element $e = (I, 0)$ of $E(2)$ gives

$$T_e\Psi_u : e(2) = T_eE(2) \to T_uD : (\dot{A}', \dot{a}') \mapsto (\dot{A}'A,\, \dot{A}'a + \dot{a}',\, 0,\, 0).$$

Let $O_u$ be the $E(2)$ orbit of the action $\Psi$ through $u \in D$. Then $T_uO_u = T_e\Psi_u\, e(2)$. This defines the distribution $V : D \to TD : u \mapsto V_u$, where

$$V_u = T_uO_u = \{ (\lambda AE,\, \lambda Ea + \dot{a}',\, 0,\, 0) \in T_uD \mid (\lambda E, \dot{a}') \in e(2),\ \lambda \in \mathbb{R} \}.$$

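We now find the distribution $V \cap H$, where $u \mapsto H_u$ is the distribution $H$ on $D$ with $H_u$ determined by (44). The vector $w_u = (AY, Ay, Z, z)$ lies in $V_u$ if and only if

$$(\lambda AE,\, \lambda Ea + \dot{a}',\, 0,\, 0) = (AY, Ay, Z, z) \tag{63}$$

for some $(Y, y), (Z, z) \in e(2)$. From (63) and the fact that $SO(2)$ is abelian it follows that $(Z, z) = (0, 0)$, $Y = \lambda E$, and $y = A^{-1}(\lambda Ea + \dot{a}')$. For $(A(\lambda E),\, \lambda Ea + \dot{a}',\, 0,\, 0)$ to lie in $H_u$ we must have $Y = \ell^{-1} y_2 E$, see (44). In other words,

$$\begin{aligned}
\lambda = \ell^{-1} y_2 = \ell^{-1} \langle y, e_2 \rangle &= \ell^{-1} \big( \lambda \langle EA^{-1}a, e_2 \rangle + \langle A^{-1}\dot{a}', e_2 \rangle \big) \\
&= \ell^{-1} \big( -\lambda \langle A^{-1}a, Ee_2 \rangle + \langle A^{-1}\dot{a}', e_2 \rangle \big) \\
&= \ell^{-1} \big( \lambda \langle A^{-1}a, e_1 \rangle + \langle A^{-1}\dot{a}', e_2 \rangle \big),
\end{aligned}$$

because $E$ is skew symmetric and $Ee_2 = -e_1$. Therefore $\langle \dot{a}', Ae_2 \rangle = \lambda(\ell - \langle a, Ae_1 \rangle)$, which determines a 2-dimensional subspace $V_u \cap H_u$ of $H_u$, which is spanned by the linearly independent vectors $(AE, Aw, 0, 0)$ and $(0, Ae_1, 0, 0)$. Here

$$w = A^{-1}Ea + (\ell - \langle a, Ae_1 \rangle) e_2 = \begin{pmatrix} \langle A^{-1}Ea, e_1 \rangle \\ \langle A^{-1}Ea, e_2 \rangle \end{pmatrix} + \begin{pmatrix} 0 \\ \ell - \langle a, Ae_1 \rangle \end{pmatrix} = \begin{pmatrix} -\langle a, Ae_2 \rangle \\ \ell \end{pmatrix}. \tag{64}$$

Thus $V \cap H$ is a distribution on $D$. We now prove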

Lemma 5.3.3.6. Let $u = (A, a, \ell^{-1} b_2 E, b) \in D$ with $b_2 \neq 0$. Then $V_u \cap H_u$ is a symplectic subspace of $(H_u, \varpi_H(u))$. If $b_2 = 0$, then $V_u \cap H_u$ is a Lagrangian subspace of $(H_u, \varpi_H(u))$.

Proof. Let $u = (A, a, \ell^{-1} b_2 E, b) \in D$. Let $v_u = \lambda(AE, Aw, 0, 0) + \mu(0, Ae_1, 0, 0)$ and $v_u' = \lambda'(AE, Aw, 0, 0) + \mu'(0, Ae_1, 0, 0)$ be linearly independent vectors in $V_u \cap H_u$, that is, $\lambda\mu' - \lambda'\mu \neq 0$. Then using (45), we get

$$\varpi_H(u)(v_u, v_u') = m\langle b,\, \lambda E(\lambda' w + \mu' e_1) - \lambda' E(\lambda w + \mu e_1) \rangle = m b_2 (\lambda\mu' - \lambda'\mu), \tag{65}$$

since $Ee_1 = e_2$.

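The conclusions of the lemma follow immediately.

By definition the distribution $U$ on $D$ is given by $u \mapsto U_u = H_u \cap (V_u \cap H_u)^{\varpi_H(u)}$. The following calculation determines $U_u$. The vector $v_u' = (A(\ell^{-1} y_2' E),\, Ay',\, \ell^{-1} z_2' E,\, z')$, where $y', z' \in \mathbb{R}^2$, lies in $H_u$. It lies in $U_u$ if and only if for every $\lambda, \mu \in \mathbb{R}$

$$\begin{aligned}
0 &= \varpi_H(u)\big( (A(\lambda E),\, A(\lambda w + \mu e_1),\, 0,\, 0),\; (A(\ell^{-1} y_2' E),\, Ay',\, \ell^{-1} z_2' E,\, z') \big) \\
&= k\big( (\ell^{-1} z_2' E, z'),\, (\lambda E, \lambda w + \mu e_1) \big) + k\big( (\ell^{-1} b_2 E, b),\, [(\lambda E, \lambda w + \mu e_1), (\ell^{-1} y_2' E, y')] \big) \\
&= \langle\!\langle I(\ell^{-1} z_2' E),\, \lambda E \rangle\!\rangle + m\langle z',\, \lambda w + \mu e_1 \rangle + m\langle b,\, \lambda E y' - \ell^{-1} y_2' E(\lambda w + \mu e_1) \rangle \\
&= I\lambda\ell^{-1} z_2' + m z_1' (\lambda w_1 + \mu) + m z_2' \lambda w_2 + m \Big\langle \begin{pmatrix} b_1 \\ b_2 \end{pmatrix},\, \begin{pmatrix} -\lambda y_2' + \lambda\ell^{-1} y_2' w_2 \\ \lambda y_1' - \lambda\ell^{-1} y_2' w_1 - \mu\ell^{-1} y_2' \end{pmatrix} \Big\rangle \\
&= \lambda\ell^{-1} I z_2' + m z_1' (\lambda w_1 + \mu) + m\lambda\ell z_2' + m b_2 (\lambda y_1' - \ell^{-1} y_2' \mu - \lambda\ell^{-1} y_2' w_1),
\end{aligned} \tag{66}$$

since $w_2 = \ell$ by (64). Successively setting one of the variables $\lambda$ or $\mu$ in (66) equal to 1 and the other equal to 0 gives $(z_1', z_2') = \big( \ell^{-1} b_2 y_2',\; -\frac{m\ell}{I + m\ell^2}\, b_2 y_1' \big)$. Therefore

$$U_u = \Big\{ \Big( A(\ell^{-1} y_2' E),\; Ay',\; -\frac{m}{I + m\ell^2}\, b_2 y_1' E,\; \begin{pmatrix} \ell^{-1} b_2 y_2' \\ -\frac{m\ell}{I + m\ell^2}\, b_2 y_1' \end{pmatrix} \Big) \in H_u \;\Big|\; y' \in \mathbb{R}^2 \Big\}, \tag{67}$$

which shows that $U$ is a distribution on $D$. Next we calculate the reduced subspace $\overline{H}_b$. By definition

$$\overline{H}_{b = \rho(u)} = T_u\rho(U_u) = \Big\{ b_2 \begin{pmatrix} \ell^{-1} y_2' \\ -\frac{m\ell}{I + m\ell^2}\, y_1' \end{pmatrix} \in \mathbb{R}^2 \;\Big|\; y' \in \mathbb{R}^2 \Big\}.$$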

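The last equality follows from (62) and (67). When $b_2 = 0$, it follows that $\overline{H}_b = \{0\}$. This occurs when $u = \big(A, a, 0, \binom{b_1}{0}\big) \in D$, for then $U_u = \{ (A(\ell^{-1} y_2' E),\, Ay',\, 0,\, 0) \in H_u \mid y' \in \mathbb{R}^2 \}$ and therefore $U_u \subseteq \ker T_u\rho$. Because $\dim \overline{H}_b$ changes, the mapping $\overline{H}$ given by $b \mapsto \overline{H}_b$ does not define a distribution on the $E(2)$-reduced phase space $\overline{D} = \mathbb{R}^2$, but does define a reduced generalized distribution on $\overline{D}$.

Finally, we compute the reduced symplectic form $\varpi_{\overline{H}}$ on the reduced generalized distribution $\overline{H}$. Fix $u = (A, a, \ell^{-1} b_2 E, b) \in D$. The 2-form $\varpi_{\overline{H}}$ is defined by

$$\varpi_{\overline{H}}(\rho(u))\big( T_u\rho(v_u),\, T_u\rho(v_u') \big) = \varpi_H(u)(v_u, v_u'), \tag{68}$$

where $v_u, v_u' \in U_u$. Suppose that $b_2 \neq 0$. Let

$$v_u = \Big( A(\ell^{-1} y_2 E),\; Ay,\; -\frac{m}{I + m\ell^2}\, (b_2 y_1 E),\; \begin{pmatrix} \ell^{-1} b_2 y_2 \\ -\frac{m\ell}{I + m\ell^2}\, b_2 y_1 \end{pmatrix} \Big)$$

and

$$v_u' = \Big( A(\ell^{-1} y_2' E),\; Ay',\; -\frac{m}{I + m\ell^2}\, (b_2 y_1' E),\; \begin{pmatrix} \ell^{-1} b_2 y_2' \\ -\frac{m\ell}{I + m\ell^2}\, b_2 y_1' \end{pmatrix} \Big)$$

be vectors in $U_u$. Then

$$z = T_u\rho(v_u) = \begin{pmatrix} \ell^{-1} b_2 y_2 \\ -\frac{m\ell}{I + m\ell^2}\, b_2 y_1 \end{pmatrix} \quad\text{and}\quad z' = T_u\rho(v_u') = \begin{pmatrix} \ell^{-1} b_2 y_2' \\ -\frac{m\ell}{I + m\ell^2}\, b_2 y_1' \end{pmatrix}.$$

Using (68) and (46) we get

$$\begin{aligned}
\varpi_{\overline{H}}(b)(z, z') = \varpi_H(u)(v_u, v_u') &= (\ell^{-2} I + m)(y_2 z_2' - z_2 y_2') + m(y_1 z_1' - z_1 y_1') - m\ell^{-1} b_2 (y_1 y_2' - y_2 y_1') \\
&= (\ell^{-2} I + m)\, b_2^{-1} \ell\, (z_1 z_2' - z_2 z_1') + \frac{I + m\ell^2}{\ell}\, b_2^{-1} (z_1 z_2' - z_2 z_1') - \frac{I + m\ell^2}{\ell}\, b_2^{-1} (z_1 z_2' - z_2 z_1') \\
&= \frac{I + m\ell^2}{\ell}\, b_2^{-1} (z_1 z_2' - z_2 z_1').
\end{aligned} \tag{69}$$

Hence $\varpi_{\overline{H}}$ is a symplectic form on $\mathbb{R}^2$ with coordinates $(b_1, b_2)$ when $b_2 \neq 0$. If $b_2 = 0$, then $\overline{H}_{(b_1, 0)} = \{0\}$. Thus the statement that $\varpi_{\overline{H}}(b_1, 0)$ is nondegenerate is vacuously satisfied.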


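We are now in position to derive the $E(2)$-reduced equations of motion for Carathéodory's sleigh. Since the kinetic energy $T$ (49) on $D$ (42) is invariant under the $E(2)$-action $\Psi$ (61), it induces a function

$$\overline{T}(b) = \tfrac{1}{2} m b_1^2 + \tfrac{1}{2} (m + I\ell^{-2}) b_2^2 \tag{70}$$

on the reduced phase space $\overline{D} = \mathbb{R}^2$. Because the vector field $Y_T$ (52) on $D$ is invariant under the $E(2)$-action $\Psi$, it induces a vector field

$$\overline{Y}_T(b = \rho(u)) = T_u\rho\, Y_T(u) = \begin{pmatrix} \ell^{-1} b_2^2 \\ -\frac{m\ell}{I + m\ell^2}\, b_1 b_2 \end{pmatrix}. \tag{71}$$

In other words, on $\mathbb{R}^2$ with coordinates $(b_1, b_2)$ the integral curves of the $E(2)$-reduced vector field $\overline{Y}_T$ satisfy

$$\begin{aligned}
\dot{b}_1 &= \ell^{-1} b_2^2 \\
\dot{b}_2 &= -\frac{m\ell}{I + m\ell^2}\, b_1 b_2.
\end{aligned} \tag{72}$$

In terms of the variables $c = (c_1, c_2) = (b_1, \ell^{-1} b_2)$ equation (72) may be written as

$$\begin{aligned}
\dot{c}_1 &= \ell c_2^2 \\
\dot{c}_2 &= -\frac{m\ell}{I + m\ell^2}\, c_1 c_2.
\end{aligned} \tag{73}$$

Note that (73) holds when $\ell = 0$. This explains why the equations of motion of Carathéodory's sleigh (22) in the trivialization $\lambda^{-1}D$ depend only on $c$, namely, they are the $E(2)$-reduced equations of motion.

The distributional form of the $E(2)$-reduced Hamiltonian vector field on the $E(2)$-reduced phase space $\mathbb{R}^2$ with coordinates $(b_1, b_2)$ corresponding to the $E(2)$-reduced Hamiltonian function $\overline{T}$ (70) is

$$d\overline{T}(b) = \overline{Y}_T(b) \,\lrcorner\, \varpi_{\overline{H}}(b), \tag{74}$$

that is, for every vector field $Z$ on $\mathbb{R}^2$

$$d\overline{T}\, Z(b) = \varpi_{\overline{H}}(b)\big( \overline{Y}_T(b),\, Z(b) \big). \tag{75}$$

When $b_2 \neq 0$, using (69) we see that equation (75) is equivalent to

$$m b_1 Z^1 + (m + I\ell^{-2})\, b_2 Z^2 = \frac{I + m\ell^2}{\ell}\, b_2^{-1} \big( \overline{Y}_T^{\,1} Z^2 - \overline{Y}_T^{\,2} Z^1 \big), \tag{76}$$

for every $Z = (Z^1, Z^2) \in \mathbb{R}^2$. Successively setting one of the variables $Z^1, Z^2$ equal to 1 and the other equal to 0 gives $(\overline{Y}_T^{\,1}(b), \overline{Y}_T^{\,2}(b)) = \big( \ell^{-1} b_2^2,\; -\frac{m\ell}{I + m\ell^2}\, b_1 b_2 \big)$. When $b_2 = 0$ we obtain $\overline{Y}_T(b_1, 0) = (0, 0)$, because $\overline{Y}_T(b_1, 0) \in \overline{H}_{(b_1, 0)} = \{0\}$. Therefore

$$\overline{Y}_T(b) = \ell^{-1} b_2^2\, \frac{\partial}{\partial b_1} - \frac{m\ell}{I + m\ell^2}\, b_1 b_2\, \frac{\partial}{\partial b_2}, \tag{77}$$

or in terms of the variables $c_1$ and $c_2$

$$\overline{Y}_T(c) = \ell c_2^2\, \frac{\partial}{\partial c_1} - \frac{m\ell}{I + m\ell^2}\, c_1 c_2\, \frac{\partial}{\partial c_2}, \tag{78}$$

as in (22) and (73).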
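The step from (75) to (77) can be verified numerically: the reduced vector field (77), inserted into the reduced form (69), should reproduce the differential of the reduced energy (70) on the two basis vectors. The sketch below is an illustration only; the function names and the sample values of $m$, $I$, $\ell$, $b_1$, $b_2$ are assumptions, and $b_2 \neq 0$ is required.

```python
def reduced_field(b1, b2, m, I, ell):
    """Components of the reduced vector field (77) at b = (b1, b2)."""
    return (b2 * b2 / ell, -m * ell / (I + m * ell**2) * b1 * b2)

def reduced_form(b2, m, I, ell, Y, Z):
    """Reduced 2-form (69) evaluated on the pair (Y, Z); needs b2 != 0."""
    return (I + m * ell**2) / (ell * b2) * (Y[0] * Z[1] - Y[1] * Z[0])

def d_energy(b1, b2, m, I, ell, Z):
    """Differential of the reduced kinetic energy (70) applied to Z."""
    return m * b1 * Z[0] + (m + I / ell**2) * b2 * Z[1]

# Hypothetical sample parameters and base point.
m, I, ell = 2.0, 0.5, 0.8
b1, b2 = 1.3, -0.4

Y = reduced_field(b1, b2, m, I, ell)
residuals = [
    d_energy(b1, b2, m, I, ell, Z) - reduced_form(b2, m, I, ell, Y, Z)
    for Z in ((1.0, 0.0), (0.0, 1.0))
]
print(all(abs(r) < 1e-12 for r in residuals))
```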

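5.3.3.1 The E(2)-reduced equations using almost Poisson brackets

On $\mathbb{R}^2$ with coordinates $(c_1, c_2)$ we have an almost Poisson bracket $\{\ ,\ \}$ whose structure matrix is

$$\begin{pmatrix} \{c_1, c_1\} & \{c_1, c_2\} \\ \{c_2, c_1\} & \{c_2, c_2\} \end{pmatrix} = \begin{pmatrix} 0 & \frac{\ell c_2}{I + m\ell^2} \\ -\frac{\ell c_2}{I + m\ell^2} & 0 \end{pmatrix},$$

see (33). Since the $E(2)$ reduced kinetic energy is $\overline{T} = \tfrac{1}{2} m c_1^2 + \tfrac{1}{2} (I + m\ell^2) c_2^2$, the $E(2)$-reduced equations of motion of Carathéodory's sleigh are

$$\begin{aligned}
\dot{c}_1 &= \{c_1, \overline{T}\} = \ell c_2^2 \\
\dot{c}_2 &= \{c_2, \overline{T}\} = -\frac{m\ell}{I + m\ell^2}\, c_1 c_2.
\end{aligned} \tag{79}$$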
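The bracket computation leading to (79) amounts to multiplying the structure matrix by the gradient of the reduced energy. A minimal sketch of that computation follows; the function names and sample values are assumptions made for illustration.

```python
def bracket_c1_c2(c2, m, I, ell):
    """Structure function {c1, c2} of the almost Poisson bracket."""
    return ell * c2 / (I + m * ell**2)

def grad_energy(c1, c2, m, I, ell):
    """Gradient of the reduced energy (1/2) m c1^2 + (1/2)(I + m ell^2) c2^2."""
    return (m * c1, (I + m * ell**2) * c2)

def poisson_rhs(c1, c2, m, I, ell):
    """Right-hand sides {c1, T} and {c2, T} computed from the structure matrix."""
    p = bracket_c1_c2(c2, m, I, ell)
    dT1, dT2 = grad_energy(c1, c2, m, I, ell)
    return (p * dT2, -p * dT1)

def expected_rhs(c1, c2, m, I, ell):
    """Right-hand sides of (79) written out directly."""
    return (ell * c2 * c2, -m * ell / (I + m * ell**2) * c1 * c2)

# Hypothetical sample values.
m, I, ell = 2.0, 0.5, 0.8
c1, c2 = -0.6, 1.7
lhs = poisson_rhs(c1, c2, m, I, ell)
rhs = expected_rhs(c1, c2, m, I, ell)
print(abs(lhs[0] - rhs[0]) < 1e-12 and abs(lhs[1] - rhs[1]) < 1e-12)
```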

Fig. 5.2 The integral curves of the E(2)-reduced vector field $\overline{Y}_T$ on $\mathbb{R}^2$ (horizontal axis $c_1$, vertical axis $c_2$).

5.4 Motion on the E(2) reduced phase space

Here we discuss the qualitative properties of the $E(2)$-reduced motion of Carathéodory's sleigh, which is governed by the vector field

$$\overline{Y}_T(c_1, c_2) = \ell c_2^2\, \frac{\partial}{\partial c_1} - \frac{m\ell}{I + m\ell^2}\, c_1 c_2\, \frac{\partial}{\partial c_2}. \tag{80}$$

Theorem 5.4.7. The $E(2)$-reduced vector field $\overline{Y}_T$ (80) on $\mathbb{R}^2$ has the following properties.

A. When $\ell = 0$, every point of the reduced phase space $\mathbb{R}^2$ is an equilibrium point of $\overline{Y}_T$.

B. When $\ell \neq 0$,

1. the $E(2)$-reduced kinetic energy $\overline{T} = \tfrac{1}{2} m c_1^2 + \tfrac{1}{2} (I + m\ell^2) c_2^2$ is an integral of $\overline{Y}_T$;
2. the line $\{c_2 = 0\}$ consists of equilibrium points of $\overline{Y}_T$;
3. every integral curve of $\overline{Y}_T$ is asymptotic to an equilibrium point on $\{c_2 = 0\}$ as $t \to \pm\infty$;
4. on $\mathbb{R}^2 \setminus \{c_2 = 0\}$ with symplectic form $dc_1 \wedge dc_2$, the $E(2)$ reduced vector field is Hamiltonian, after rescaling time.

Proof. A. This assertion is clear.

B. We now verify the statements.

1. The calculation $L_{\overline{Y}_T} \overline{T} = m c_1 \dot{c}_1 + (I + m\ell^2) c_2 \dot{c}_2 = m\ell c_1 c_2^2 - m\ell c_1 c_2^2 = 0$ proves the assertion.

2. This follows from the definition (80) of $\overline{Y}_T$.

3. The $c_2$-axis is transverse to every positive valued level set of $\overline{T}$. Since $\overline{Y}_T(c_1, -c_2) = \overline{Y}_T(c_1, c_2)$, it suffices to look at integral curves $t \mapsto c(t) = (c_1(t), c_2(t))$ of $\overline{Y}_T$ of energy $\tfrac{1}{2} m h^2$, $h > 0$, which cross the positive $c_2$-axis at time $t_0$, that is, $(c_1(t_0), c_2(t_0)) = \big(0, \tfrac{hk}{\sqrt{\ell}}\big)$, where $k^2 = \frac{m\ell}{I + m\ell^2}$. We find $c(t)$ as follows. Solving

$$\tfrac{1}{2} m c_1^2 + \tfrac{1}{2} (I + m\ell^2) c_2^2 = \tfrac{1}{2} m h^2 \tag{81}$$

for $c_2$ gives $\ell c_2^2 = k^2 (h^2 - c_1^2)$. Separating variables in $\frac{dc_1}{dt} = \ell c_2^2 = k^2 (h^2 - c_1^2)$ and integrating yields

$$c_1(t) = h \tanh\big( hk^2 (t - t_0) \big) \tag{82}$$

and therefore

$$c_2(t) = \frac{hk}{\sqrt{\ell}}\, \operatorname{sech}\big( hk^2 (t - t_0) \big), \tag{83}$$

using (81). From (82) and (83) it follows that $\lim_{t \to \pm\infty} c(t) = (\pm h, 0)$.

4. Define a new time scale $s$ by $\frac{ds}{dt} = \ell c_2$. Using the time scale $s$ the integral curves of $\overline{Y}_T$ on $\mathbb{R}^2 \setminus \{c_2 = 0\}$ satisfy

$$\begin{aligned}
\frac{dc_1}{ds} &= \frac{dc_1}{dt} \frac{dt}{ds} = \ell c_2^2\, \frac{1}{\ell c_2} = c_2 \\
\frac{dc_2}{ds} &= \frac{dc_2}{dt} \frac{dt}{ds} = -\frac{m\ell}{I + m\ell^2}\, c_1 c_2\, \frac{1}{\ell c_2} = -\lambda^2 c_1,
\end{aligned}$$

where $\lambda^2 = \frac{m}{I + m\ell^2}$, which is Hamiltonian on $(\mathbb{R}^2 \setminus \{c_2 = 0\},\, dc_1 \wedge dc_2)$ with Hamiltonian $H(c_1, c_2) = \tfrac{1}{2} \lambda^2 c_1^2 + \tfrac{1}{2} c_2^2$.

The Hamiltonian vector field $X_H$ on $\mathbb{R}^2 \setminus \{c_2 = 0\}$ is incomplete. To see this consider an integral curve $s \mapsto \gamma(s)$, which starts at $(0, \lambda h)$. Then $\gamma(s) = (c_1(s), c_2(s)) = (h \sin\lambda s,\, \lambda h \cos\lambda s)$. When $s = \pi/2\lambda$, we have $c_2(\pi/2\lambda) = 0$. Thus $\gamma$ runs off $\mathbb{R}^2 \setminus \{c_2 = 0\}$ in finite time.
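The closed-form solution (82), (83) and the conservation of the reduced energy can be compared against a direct numerical integration of (80). The following sketch uses a hand-written classical fourth-order Runge-Kutta step; the step size, time horizon, parameter values, and all function names are assumptions chosen for illustration, with $t_0 = 0$.

```python
import math

def rhs(c, m, I, ell):
    """Right-hand side of the reduced equations of motion (80)."""
    c1, c2 = c
    return (ell * c2 * c2, -m * ell / (I + m * ell**2) * c1 * c2)

def rk4_step(c, dt, m, I, ell):
    """One classical Runge-Kutta step of size dt."""
    k1 = rhs(c, m, I, ell)
    k2 = rhs((c[0] + 0.5 * dt * k1[0], c[1] + 0.5 * dt * k1[1]), m, I, ell)
    k3 = rhs((c[0] + 0.5 * dt * k2[0], c[1] + 0.5 * dt * k2[1]), m, I, ell)
    k4 = rhs((c[0] + dt * k3[0], c[1] + dt * k3[1]), m, I, ell)
    return (
        c[0] + dt / 6.0 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
        c[1] + dt / 6.0 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]),
    )

def closed_form(t, h, m, I, ell):
    """Solution (82), (83) with t0 = 0, where k^2 = m ell / (I + m ell^2)."""
    k_sq = m * ell / (I + m * ell**2)
    return (h * math.tanh(h * k_sq * t),
            h * math.sqrt(k_sq / ell) / math.cosh(h * k_sq * t))

def energy(c, m, I, ell):
    return 0.5 * m * c[0] ** 2 + 0.5 * (I + m * ell**2) * c[1] ** 2

# Hypothetical sample parameters.
m, I, ell, h = 2.0, 0.5, 0.8, 1.5
dt, steps = 1.0e-3, 5000

c = closed_form(0.0, h, m, I, ell)        # start on the positive c2-axis
for _ in range(steps):
    c = rk4_step(c, dt, m, I, ell)

exact = closed_form(dt * steps, h, m, I, ell)
error = max(abs(c[0] - exact[0]), abs(c[1] - exact[1]))
energy_drift = abs(energy(c, m, I, ell) - 0.5 * m * h * h)
print(error < 1e-8, energy_drift < 1e-8)
```

At the final time the numerical solution is already close to the equilibrium $(h, 0)$, in line with statement B.3 of the theorem.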

5.5 Reconstruction

In this section we reconstruct the motion of Carathéodory's sleigh from the E(2)-reduced motion, which we have described in §4.

5.5.1 Relative equilibria

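First we assume that $\ell = 0$. Then every point $(c_1, c_2)$ of the reduced space $\mathbb{R}^2$ is an equilibrium point of the $E(2)$-reduced vector field $\overline{Y}_T(c)$ (78). Therefore the reconstructed motion of the sleigh is given by the action $\Phi$ (57) on the constraint distribution $D$ restricted to a one parameter subgroup of $E(2)$. Every one parameter subgroup of $E(2)$ is either a translation of $\mathbb{R}^2$ with constant velocity or a rotation about some fixed point $p$ in $\mathbb{R}^2$ with constant rotational velocity $\dot{\theta} = c_2 \neq 0$. Suppose that $\dot{\theta} = c_2 = 0$. Then $\theta$ is constant and equation (17) reads

$$\frac{d}{dt} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ 0 \end{pmatrix} = c_1 \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}.$$

Therefore

$$\begin{pmatrix} x_1(t) \\ x_2(t) \end{pmatrix} = t c_1 \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} + \begin{pmatrix} x_1(0) \\ x_2(0) \end{pmatrix}.$$

We determine the point $p$ as follows. Let $\xi$ be a fixed point of the reference sleigh. The horizontal projection of the velocity $v = \dot{A}\xi + \dot{a}$ of the moving sleigh is

$$\begin{aligned}
v_{\mathrm{hor}} &= \frac{d}{dt} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \frac{d}{dt} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} \\
&= \dot{\theta} \begin{pmatrix} -\sin\theta & -\cos\theta \\ \cos\theta & -\sin\theta \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \begin{pmatrix} \dot{x}_1 \\ \dot{x}_2 \end{pmatrix} \\
&= \dot{\theta} \begin{pmatrix} -\sin\theta & -\cos\theta \\ \cos\theta & -\sin\theta \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix},
\end{aligned} \tag{84}$$

using (17). If $\dot{\theta} \neq 0$, then $v = (v_{\mathrm{hor}}, 0) = 0$ if and only if

$$\begin{aligned}
p = \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} &= \dot{\theta}^{-1} \begin{pmatrix} \sin\theta & -\cos\theta \\ \cos\theta & \sin\theta \end{pmatrix} \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix} \\
&= \dot{\theta}^{-1} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix} = \begin{pmatrix} -\ell \\ c_2^{-1} c_1 \end{pmatrix}.
\end{aligned} \tag{85}$$

This means that the sleigh either 1) rigidly translates with constant velocity along a straight line through the horizontal projection of its center of mass or 2) rotates rigidly with constant angular velocity about the point $p$, which lies on the line through the point of contact $P$ of the sleigh and perpendicular to the chisel edge. Case 1) occurs when $\dot{\theta} = c_2 = 0$ and case 2) when $\dot{\theta} = c_2 \neq 0$.

Now assume that $\ell \neq 0$ and suppose that $(c_1, c_2)$ lies on the $c_1$-axis, which consists of equilibrium points of the $E(2)$-reduced vector field. From $\dot{\theta} = c_2 = 0$ it follows that $\theta$ is constant. Then $\dot{c}_1 = c_2 = 0$, which implies that $c_1$ is constant. Hence

$$\frac{d}{dt} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ 0 \end{pmatrix} = c_1 \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}.$$

Therefore the reconstructed motion of the center of mass of the sleigh corresponding to the relative equilibrium $(c_1, 0)$ is rectilinear with constant velocity if $c_1 \neq 0$ or is stationary if $c_1 = 0$.

5.5.2 General motions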

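Now suppose that $c_2 \neq 0$ in addition to $\ell \neq 0$. From (82) and (83) we find that $t \mapsto (c_1(t), c_2(t))$, where

$$\begin{aligned}
c_1(t) &= \ell\kappa\, c_2(0) \tanh rt \\
c_2(t) &= c_2(0) \operatorname{sech} rt,
\end{aligned} \tag{86}$$

is an integral curve of the $E(2)$-reduced vector field with starting point $(0, c_2(0))$. Here $\kappa^2 = 1 + \frac{I}{m\ell^2}$, $r = \pm\frac{c_2(0)}{\kappa}$, and $\pm$ is the sign of $c_2(0)$.$^2$

$^2$ Compared to (82) and (83) we have $r = hk^2$, $\kappa^2 = \frac{1}{\ell k^2}$, and $c_2(0) = \frac{hk}{\sqrt{\ell}}$.

Integrating $\dot{\theta} = c_2$, using the second equation in (86), gives

$$\begin{aligned}
\theta(t) - \theta(0) &= \pm\kappa r \int_0^t \operatorname{sech} rs\; ds = \pm 2\kappa r \int_0^t \frac{1}{e^{rs} + e^{-rs}}\, ds \\
&= \pm 2\kappa r \int_1^{e^{rt}} \frac{1}{\eta + \eta^{-1}}\, \frac{1}{r\eta}\, d\eta, \quad \text{where } \eta = e^{rs} \\
&= \pm 2\kappa \int_1^{e^{rt}} \frac{1}{1 + \eta^2}\, d\eta = \pm 2\kappa \big[ \tan^{-1}(e^{rt}) - \pi/4 \big].
\end{aligned} \tag{87}$$

From (87) we obtain

$$\theta(+\infty) - \theta(-\infty) = \pm 2\kappa \big[ (\pi/2 - \pi/4) - (0 - \pi/4) \big] = \pm\kappa\pi. \tag{88}$$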
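The total turning angle (88) can be checked by integrating $\dot{\theta} = c_2(t) = c_2(0) \operatorname{sech} rt$ numerically over a long time window. The sketch below uses the trapezoidal rule; the window half-width, the number of subintervals, the function name, and the sample parameters are assumptions, and $c_2(0) > 0$ is taken so that the sign in (88) is $+$.

```python
import math

def turning_angle(c2_0, m, I, ell, half_width=60.0, n=20000):
    """Trapezoidal approximation of the integral of c2(0) sech(r t) over
    [-half_width, half_width], together with kappa."""
    kappa = math.sqrt(1.0 + I / (m * ell**2))
    r = abs(c2_0) / kappa
    dt = 2.0 * half_width / n
    total = 0.0
    for i in range(n + 1):
        t = -half_width + i * dt
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * c2_0 / math.cosh(r * t)
    return total * dt, kappa

# Hypothetical sample parameters with c2(0) > 0.
angle, kappa = turning_angle(1.2, 2.0, 0.5, 0.8)
print(abs(angle - kappa * math.pi) < 1e-8)
```

The truncation to a finite window is harmless here because the integrand decays exponentially, which is also why $\dot{x}(t)$ converges exponentially fast as $t \to \pm\infty$.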

Substituting (86) and (87) into

$$\dot{x} = \frac{d}{dt} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix} \tag{89}$$

we get an explicit formula for $\dot{x}$ as a function of $t$. Integrating gives $x$ as a function of $t$. For $t \to \pm\infty$, we see that $\dot{x}(t)$ converges exponentially fast to $\dot{x}(\pm\infty) = \pm\ell\kappa\, |\dot{\theta}(0)|\, (\cos\theta(\pm\infty), \sin\theta(\pm\infty))$, because $c_1(\pm\infty) = \pm\ell\kappa\, |c_2(0)|$ by (86). Thus asymptotically the center of mass moves along a straight line with constant velocity. These asymptotic motions are relative equilibria.

We retain the hypothesis $c_2 \neq 0$. For a given kinetic energy, all the $E(2)$-reduced motions can be mapped onto each other by a time translation together with a time reversal $(c_1, c_2, t) \mapsto (c_1, -c_2, -t)$ if the signs of $c_2$ differ. This implies that the reconstructed motions are mapped to each other by a time translation (and possibly a time reversal) together with an application of the $E(2)$ symmetry group on the constraint distribution. Because the $E(2)$-reduced solutions of different kinetic energy are mapped onto each other by a constant time rescaling $t \mapsto \tau t$, we obtain

Proposition 5.5.2.8. For given values of $\ell$, $m$, and $I$, every trajectory of the center of mass and the point of contact of the sleigh is congruent to another such trajectory, except for the relative equilibria corresponding to $(c_1, 0)$. When $c_1 \neq 0$, the motion is asymptotic to trajectories with $c_2 \neq 0$; while when $c_1 = 0$ the sleigh is stationary.

5.5.3 Motion of a material point on the sleigh

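In this subsection we describe the motion of a material point on the sleigh. Throughout we assume that $\ell \neq 0$ and $c_2 \neq 0$. Let $\xi$ be a material point on the reference sleigh. The curve

$$\gamma : \mathbb{R} \to \mathbb{R}^3 : t \mapsto \gamma(t) = A(t)\xi + a(t) \tag{90}$$

gives the motion of $\xi$ on the sleigh with respect to a spatially fixed coordinate system. Since the center of mass of the moving sleigh has a constant height above the horizontal plane, and the matrix $A(t) = \begin{pmatrix} R_{\theta(t)} & 0 \\ 0 & 1 \end{pmatrix}$ is a rotation around the vertical axis, $(A(t)\xi + a(t))_3$ is constant. Therefore we need only consider the curve

$$\gamma_{\mathrm{hor}} : \mathbb{R} \to \mathbb{R}^2 : t \mapsto \gamma(t)_{\mathrm{hor}},$$

where $\gamma(t)_{\mathrm{hor}}$ is the horizontal projection of $\gamma(t)$. Recall that in §5.1 we have shown that the horizontal component of the velocity of the moving material point $\xi$ is

$$v_{\mathrm{hor}} = \dot{\gamma}_{\mathrm{hor}} = \dot{R}_{\theta(t)}\, \xi_{\mathrm{hor}} + \dot{x}(t) = -\dot{\theta} \begin{pmatrix} \sin\theta & \cos\theta \\ -\cos\theta & \sin\theta \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix}. \tag{91}$$

Hence the velocity of $\xi$ is 0 if and only if $v_{\mathrm{hor}} = 0$, that is, if and only if $\xi_1 = -\ell$ and $c_2 \xi_2 - c_1 = 0$ with $c_2 = \dot{\theta} \neq 0$. Now every solution of the $E(2)$-reduced equations of motion

$$\begin{aligned}
\dot{c}_1 &= \ell c_2^2 \\
\dot{c}_2 &= -\frac{m\ell}{I + m\ell^2}\, c_1 c_2
\end{aligned} \tag{92}$$

intersects the line $c_2 \xi_2 - c_1 = 0$ at a unique time $t_0$. Therefore the curve $\gamma_{\mathrm{hor}}$ is a smooth immersion of $\mathbb{R} \setminus \{t_0\}$ into $\mathbb{R}^2$ with a singularity at $t = t_0$. Let $\mu = \begin{pmatrix} \cos\theta(t) \\ \sin\theta(t) \end{pmatrix}$, $\nu = \begin{pmatrix} -\sin\theta(t) \\ \cos\theta(t) \end{pmatrix}$ be a time dependent basis of $\mathbb{R}^2$. Then

$$\dot{\mu} = \dot{\theta}\, \nu = c_2(t)\, \nu \quad\text{and}\quad \dot{\nu} = -\dot{\theta}\, \mu = -c_2(t)\, \mu. \tag{93}$$

To determine the nature of the singularity of $\gamma_{\mathrm{hor}}$ at $t = t_0$ we take the first and second derivatives of

$$\langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle = -c_2(t)\xi_2 + c_1(t) \quad\text{and}\quad \langle \dot{\gamma}_{\mathrm{hor}}, \nu \rangle = 0 \tag{94}$$

and then evaluate at $t = t_0$, when $\gamma_{\mathrm{hor}}$ crosses the line $c_2 \xi_2 - c_1 = 0$. First we verify (94). From (91) and the definition of $\mu$ we get

$$\begin{aligned}
\langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle &= \Big\langle -\dot{\theta} \begin{pmatrix} \sin\theta & \cos\theta \\ -\cos\theta & \sin\theta \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix},\; \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \Big\rangle \\
&= \Big\langle -\dot{\theta} \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \sin\theta & \cos\theta \\ -\cos\theta & \sin\theta \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix},\; \begin{pmatrix} \cos\theta & \sin\theta \\ -\sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix} \Big\rangle \\
&= \Big\langle \dot{\theta} \begin{pmatrix} 0 & -1 \\ 1 & 0 \end{pmatrix} \begin{pmatrix} \xi_1 \\ \xi_2 \end{pmatrix} + \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix},\; \begin{pmatrix} 1 \\ 0 \end{pmatrix} \Big\rangle = -\dot{\theta}\xi_2 + c_1(t) = -c_2(t)\xi_2 + c_1(t)
\end{aligned}$$

and similarly

$$\langle \dot{\gamma}_{\mathrm{hor}}, \nu \rangle = \dot{\theta}\xi_1 + \ell c_2(t) = c_2(t)(\xi_1 + \ell) = 0.$$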

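Thus (94) holds. Now

$$\frac{d}{dt} \langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle = -\dot{c}_2 \xi_2 + \dot{c}_1 = \frac{m\ell}{I + m\ell^2}\, \xi_2\, c_1(t) c_2(t) + \ell c_2^2(t),$$

using (92), which evaluated at $t = t_0$ gives

$$\begin{aligned}
\frac{d}{dt} \langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle(t_0) &= \frac{m\ell}{I + m\ell^2}\, c_1(t_0) c_2^{-1}(t_0)\, c_1(t_0) c_2(t_0) + \ell c_2^2(t_0), \quad \text{since } c_2(t_0)\xi_2 - c_1(t_0) = 0 \\
&= \frac{m\ell}{I + m\ell^2}\, c_1^2(t_0) + \ell c_2^2(t_0) > 0.
\end{aligned}$$

But

$$\frac{d}{dt} \langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle = \langle \ddot{\gamma}_{\mathrm{hor}}, \mu \rangle + \langle \dot{\gamma}_{\mathrm{hor}}, \dot{\mu} \rangle = \langle \ddot{\gamma}_{\mathrm{hor}}, \mu \rangle + \langle \dot{\gamma}_{\mathrm{hor}}, c_2(t)\, \nu \rangle = \langle \ddot{\gamma}_{\mathrm{hor}}, \mu \rangle,$$

using the second equation in (94). Therefore

$$\langle \ddot{\gamma}_{\mathrm{hor}}(t_0), \mu(t_0) \rangle > 0. \tag{95}$$

Differentiating $\langle \dot{\gamma}_{\mathrm{hor}}, \nu \rangle = 0$, we get

$$\begin{aligned}
0 = \frac{d}{dt} \langle \dot{\gamma}_{\mathrm{hor}}, \nu \rangle &= \langle \ddot{\gamma}_{\mathrm{hor}}, \nu \rangle + \langle \dot{\gamma}_{\mathrm{hor}}, \dot{\nu} \rangle = \langle \ddot{\gamma}_{\mathrm{hor}}, \nu \rangle - c_2(t) \langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle \\
&= \langle \ddot{\gamma}_{\mathrm{hor}}, \nu \rangle - c_2(t) \big( c_1(t) - \xi_2 c_2(t) \big),
\end{aligned} \tag{96}$$

which evaluated at $t = t_0$ gives

$$\langle \ddot{\gamma}_{\mathrm{hor}}(t_0), \nu(t_0) \rangle = 0. \tag{97}$$

In other words, $\ddot{\gamma}_{\mathrm{hor}}(t_0)$ is a multiple of $\mu(t_0)$. Differentiating (96) once again gives

$$\begin{aligned}
0 &= \langle \dddot{\gamma}_{\mathrm{hor}}, \nu \rangle + \langle \ddot{\gamma}_{\mathrm{hor}}, \dot{\nu} \rangle - \dot{c}_2 \langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle - c_2(t) \langle \ddot{\gamma}_{\mathrm{hor}}, \mu \rangle - c_2(t) \langle \dot{\gamma}_{\mathrm{hor}}, \dot{\mu} \rangle \\
&= \langle \dddot{\gamma}_{\mathrm{hor}}, \nu \rangle - 2c_2(t) \langle \ddot{\gamma}_{\mathrm{hor}}, \mu \rangle + \frac{m\ell}{I + m\ell^2}\, c_1(t) c_2(t) \langle \dot{\gamma}_{\mathrm{hor}}, \mu \rangle - c_2^2(t) \langle \dot{\gamma}_{\mathrm{hor}}, \nu \rangle,
\end{aligned}$$

which, because $\langle \dot{\gamma}_{\mathrm{hor}}(t_0), \nu(t_0) \rangle = 0 = \langle \dot{\gamma}_{\mathrm{hor}}(t_0), \mu(t_0) \rangle$, becomes

$$\langle \dddot{\gamma}_{\mathrm{hor}}(t_0), \nu(t_0) \rangle = 2c_2(t_0) \langle \ddot{\gamma}_{\mathrm{hor}}(t_0), \mu(t_0) \rangle \neq 0,$$

using (95). Therefore $\dddot{\gamma}_{\mathrm{hor}}(t_0)$ and $\ddot{\gamma}_{\mathrm{hor}}(t_0)$ are linearly independent. This together with $\dot{\gamma}_{\mathrm{hor}}(t_0) = 0$ proves

Proposition 5.5.3.9. The planar curve $\gamma_{\mathrm{hor}} : \mathbb{R} \to \mathbb{R}^2 : t \mapsto \gamma_{\mathrm{hor}}(t)$ has an ordinary cusp singularity at $t = t_0$ when the integral curve of the $E(2)$-reduced vector field (92) crosses the line $c_2 \xi_2 - c_1 = 0$.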
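For the explicit solution (86) the cusp time $t_0$ can be written in closed form, since $c_1(t)/c_2(t) = \ell\kappa \sinh rt$ when $c_2(0) > 0$. The sketch below computes $t_0$ and checks that both body-frame velocity components from (94) vanish there for a material point with $\xi_1 = -\ell$. The function names, the restriction to $c_2(0) > 0$, and the sample values are assumptions made for illustration.

```python
import math

def reduced_solution(t, c2_0, m, I, ell):
    """Integral curve (86) through (0, c2_0) at t = 0, assuming c2_0 > 0."""
    kappa = math.sqrt(1.0 + I / (m * ell**2))
    r = c2_0 / kappa
    return (ell * kappa * c2_0 * math.tanh(r * t), c2_0 / math.cosh(r * t))

def cusp_time(xi2, c2_0, m, I, ell):
    """Time t0 at which c2*xi2 - c1 = 0, from c1/c2 = ell*kappa*sinh(r t)."""
    kappa = math.sqrt(1.0 + I / (m * ell**2))
    r = c2_0 / kappa
    return math.asinh(xi2 / (ell * kappa)) / r

def body_frame_velocity(xi, c, ell):
    """Components of the horizontal velocity along mu and nu, as in (94)."""
    c1, c2 = c
    return (c1 - c2 * xi[1], c2 * (xi[0] + ell))

# Hypothetical sample parameters; the material point has xi_1 = -ell.
m, I, ell, c2_0 = 2.0, 0.5, 0.8, 1.2
xi = (-ell, 0.9)

t0 = cusp_time(xi[1], c2_0, m, I, ell)
v_mu, v_nu = body_frame_velocity(xi, reduced_solution(t0, c2_0, m, I, ell), ell)
print(abs(v_mu) < 1e-12, abs(v_nu) < 1e-12)
```

Because $\sinh$ is strictly increasing, the equation for $t_0$ has exactly one solution, in line with the claim that the line $c_2\xi_2 - c_1 = 0$ is crossed at a unique time.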


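Proposition 5.5.3.10. The curve $\mathbb{R} \to \mathbb{R}^2 : t \mapsto P(t)$ traced by the point of contact of the sleigh with the horizontal plane has self intersections if $\theta(+\infty) - \theta(-\infty)$ is larger than $2\pi$, that is, by (88), if $\kappa^2 = 1 + \frac{I}{m\ell^2} > 4$.

Proof. This follows because asymptotically the center of mass of the sleigh moves along straight lines, which subtend an angle $\theta(+\infty) - \theta(-\infty)$. When this angle is larger than $2\pi$ the pursuit curve traced by the point of contact, which tracks the center of mass, must have self intersections.

When $\kappa = 1$, then $I = 0$, that is, the mass of the sleigh is concentrated along a vertical straight line through its center of mass. Differentiating

$$\frac{d}{dt} \begin{pmatrix} x_1 \\ x_2 \end{pmatrix} = \begin{pmatrix} \cos\theta & -\sin\theta \\ \sin\theta & \cos\theta \end{pmatrix} \begin{pmatrix} c_1 \\ \ell c_2 \end{pmatrix}$$

and using (92) gives

$$\begin{aligned}
\ddot{x}_1 &= -\dot{\theta} c_1 \sin\theta + \dot{c}_1 \cos\theta - \dot{\theta}\ell c_2 \cos\theta - \ell\dot{c}_2 \sin\theta \\
&= -c_1 c_2 \sin\theta + \ell c_2^2 \cos\theta - \ell c_2^2 \cos\theta + \kappa^{-2} c_1 c_2 \sin\theta = 0
\end{aligned}$$

and

$$\begin{aligned}
\ddot{x}_2 &= \dot{\theta} c_1 \cos\theta + \dot{c}_1 \sin\theta - \dot{\theta}\ell c_2 \sin\theta + \ell\dot{c}_2 \cos\theta \\
&= c_1 c_2 \cos\theta + \ell c_2^2 \sin\theta - \ell c_2^2 \sin\theta - \kappa^{-2} c_1 c_2 \cos\theta = 0,
\end{aligned}$$

since $\ell\dot{c}_2 = -\kappa^{-2} c_1 c_2$ and $\kappa = 1$. In other words, the center of mass of the sleigh moves in a straight line with constant velocity. Because the point of contact follows a pursuit curve relative to the center of mass, we deduce that the point of contact curve is a tractrix.

5.6 Notes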

In §6 of [24] Chaplygin treated the skate for the first time. Chaplygin's skate is a massless wheel whose horizontal axle is rigidly attached to a rigid body like a pizza knife. It is assumed that the wheel has a sharp edge which prevents it from slipping sideways in a direction perpendicular to the plane of the edge. In [21] Carathéodory published a detailed investigation of a special case of Chaplygin's skate, called Carathéodory's sleigh. Here he assumed that the center of mass of the sleigh lies in the plane of the chisel edge. He did not mention Chaplygin.

We now show that the dynamics of Chaplygin's skate and Carathéodory's sleigh are the same. Assume that in the reference position the center of mass of the skate lies over the origin of $\mathbb{R}^2$, that is, $x = (0, 0)$. Moreover, assume that in the reference position the sharp edge of the wheel of the skate is parallel to the $e_1$-$e_3$ plane. Write $-(\ell, d)$ for the coordinates of the point of contact $Q$ of the sharp edge of the skate in the reference position with the horizontal plane. Because

$$Q = -R_\theta \begin{pmatrix} \ell \\ d \end{pmatrix} + x = \begin{pmatrix} -\ell\cos\theta + d\sin\theta + x_1 \\ -\ell\sin\theta - d\cos\theta + x_2 \end{pmatrix},$$

the nonholonomic constraint for the skate is:

$$\dot{Q} = \begin{pmatrix} (\ell\sin\theta + d\cos\theta)\dot{\theta} + \dot{x}_1 \\ (-\ell\cos\theta + d\sin\theta)\dot{\theta} + \dot{x}_2 \end{pmatrix} \text{ is a multiple of } u_\theta = \begin{pmatrix} \cos\theta \\ \sin\theta \end{pmatrix}.$$

This condition is equivalent to

$$0 = \det(\dot{Q}, u_\theta) = \ell\dot{\theta} + \dot{x}_1 \sin\theta - \dot{x}_2 \cos\theta, \tag{98}$$

which is the same as the nonholonomic constraint (2) for Carathéodory's sleigh. Because the equations of motion of the skate are given in terms of its kinetic energy and the nonholonomic constraint (98), the equations of motion of the skate depend only on the parameters $\ell$, $m$, and $I$. Thus the dynamics of the skate and the sleigh are the same, where the point of contact of the chisel edge of the sleigh lies on a line perpendicular to the sharp edge of the skate and through the center of rotation $p$ of the skate, see (85), when $\dot{\theta} \neq 0$.

Chaplygin's skate is discussed in chapter II §4 of Neimark and Fufaev [82] who in a footnote on page 71 noted that "Chaplygin stated and solved in quadratures the problem [Chaplygin's skate] in this section in 1911". Several classic sources treat Chaplygin's skate without mentioning Chaplygin, for instance, pages 465–470 of Hamel [51] and page 334 of Rosenberg [95]. For a modern treatment see Bates [10] and page 126 in Koiller [59].

Our treatment of removing the $E(2)$-symmetry of Carathéodory's sleigh, which is a special case of regular nonholonomic reduction treated in chapter 2, is new, see also [34]. The relative equilibria with $\ell = 0$ in §5.1 are described in Chaplygin [24]. Equation (88) was obtained by Carathéodory [21]. Integrating (89) completes the integration by quadratures of Chaplygin's skate, and thereby Carathéodory's sleigh, as observed by Chaplygin [24]. Proposition 5.5.2.8 is due to Carathéodory [21]. In [21] Carathéodory illustrated the conclusion of proposition 5.5.3.10 by giving pictures of the curve traced out by the point of contact of the sleigh when $\kappa = 1$, $1.5$, $2$, and $3$. The fact that when $\kappa = 1$ the point of contact of the sleigh traces out a tractrix was known to Carathéodory, see [21].

Chaplygin's skate is often called a dynamic hatchet planimeter. See Foote [45] for an explanation of how one can use a hatchet planimeter to measure the area enclosed by a simple planar curve.

Chapter 6

Smooth strongly convex rolling rigid body

In this chapter we treat the example of a smooth strongly convex rigid body rolling without slipping on a horizontal plane under the influence of a constant vertical gravitational force. We will use traditional notation, which sometimes clashes with that of the preceding chapters.

6.1 Basic set up

Let $B$ be a reference body in a fixed reference frame in $\mathbb{R}^3$. Assume that its mass distribution is given by a nonnegative finite measure $dm$ such that $m = \int_B dm > 0$ is the mass of $B$. Suppose that the center of mass of $B$ is the origin in $\mathbb{R}^3$, that is,
\[ \int_B x \, dm = 0. \tag{1} \]

The position $x(t)$ of the point $x$ on the moving body at time $t$ is given by applying a Euclidean motion $x \mapsto A(t)x + a(t) = x(t)$ to $x \in B$. Here $A(t) \in SO(3)$ is a rotation and $a(t) \in \mathbb{R}^3$ is a translation vector. In other words, the configuration space of the moving body is identified with the 6-dimensional group $E(3)$ of Euclidean motions
\[ (A, a) : \mathbb{R}^3 \to \mathbb{R}^3 : x \mapsto Ax + a. \tag{2} \]
In particular, as a set $E(3)$ equals $SO(3) \times \mathbb{R}^3$, but as a group $E(3)$ has multiplication defined by
\[ (A, a) \cdot (A', a') = (AA', Aa' + a), \qquad (A, a), (A', a') \in SO(3) \times \mathbb{R}^3. \tag{3} \]


Convex rolling rigid body

$E(3)$ is a Lie group with Lie algebra $e(3) = \{(X, b) \in so(3) \times \mathbb{R}^3\}$, whose Lie bracket is (see footnote 1)
\[ [(X, b), (X', b')] = ([X, X'], \; Xb' - X'b). \tag{4} \]
The bracket $[\,,\,]$ on the right hand side of (4) is the Lie bracket in $so(3)$, that is, $[X, X'] = XX' - X'X$. Denoting differentiation with respect to time by a dot, we see that $\dot{x} = \dot{A}x + \dot{a}$ is the velocity of a point on the moving body with respect to the fixed reference frame. Thus the kinetic energy of the moving body is
\[ T = \tfrac{1}{2} \int_B \langle \dot{A}x + \dot{a}, \dot{A}x + \dot{a} \rangle \, dm. \tag{5} \]

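Here $\langle\,,\,\rangle$ is the standard Euclidean inner product on $\mathbb{R}^3$. Because the center of mass of the reference body is the origin, we find that $T = T_{\mathrm{rot}} + T_{\mathrm{trans}}$, where the rotational kinetic energy is
\[ T_{\mathrm{rot}} = \tfrac{1}{2} \int_B \langle \dot{A}x, \dot{A}x \rangle \, dm = \tfrac{1}{2} \int_B \langle A^{-1}\dot{A}x, A^{-1}\dot{A}x \rangle \, dm \tag{6} \]
and the translational kinetic energy is
\[ T_{\mathrm{trans}} = \tfrac{1}{2} \int_B \langle \dot{a}, \dot{a} \rangle \, dm = \tfrac{1}{2} \int_B \langle A^{-1}\dot{a}, A^{-1}\dot{a} \rangle \, dm = \tfrac{1}{2}\, m \, \langle b, b \rangle. \tag{7} \]
Here $b = A^{-1}\dot{a} \in \mathbb{R}^3$ and $m$ is the mass of the reference body.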

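We identify the $3 \times 3$ skew symmetric matrix $\omega^\times = A^{-1}\dot{A}$ with the vector $\omega \in \mathbb{R}^3$ using the mapping
\[ i : so(3) \to \mathbb{R}^3 : \omega^\times = \begin{pmatrix} 0 & -\omega_3 & \omega_2 \\ \omega_3 & 0 & -\omega_1 \\ -\omega_2 & \omega_1 & 0 \end{pmatrix} \mapsto \omega = \begin{pmatrix} \omega_1 \\ \omega_2 \\ \omega_3 \end{pmatrix}, \tag{8} \]
whose inverse is
\[ \times : \mathbb{R}^3 \to so(3) : \omega \mapsto \omega^\times. \tag{9} \]

Footnote 1. Equation (4) is obtained by first differentiating
\[ (A, a) \cdot (\exp tX', tb') \cdot (A, a)^{-1} = \big(A \exp tX' A^{-1}, \; -(A \exp tX' A^{-1})a + tAb' + a\big) \]
with respect to $t$ and then evaluating at $t = 0$. This gives
\[ \operatorname{Ad}_{(A,a)}(X', b') = \big(\operatorname{Ad}_A X', \; -(\operatorname{Ad}_A X')\,a + Ab'\big). \]
Replacing $(A, a)$ in the above equation by $(\exp tX, tb)$ and then differentiating the result with respect to $t$ and finally evaluating at $t = 0$ gives
\[ [(X, b), (X', b')] = \operatorname{ad}_{(X,b)}(X', b') = ([X, X'], \; Xb' - X'b). \]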
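The multiplication (3) and the bracket (4) can be checked numerically. The following is a minimal sketch, assuming the standard representation of $(A, a)$ as the $4 \times 4$ block matrix with blocks $A$, $a$, $0$, $1$, which is not used in the text; the helper names `hat`, `vee`, `as_matrix`, `group_element`, and `rotation` are ours.

```python
import numpy as np

def hat(w):
    """Map a vector w in R^3 to the skew symmetric matrix w^x, as in (9)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def vee(W):
    """Inverse of hat, that is, the map i of (8)."""
    return np.array([W[2, 1], W[0, 2], W[1, 0]])

def as_matrix(X, b):
    """Embed a pair (X, b) as a 4x4 block matrix with last row zero."""
    M = np.zeros((4, 4))
    M[:3, :3] = X
    M[:3, 3] = b
    return M

def group_element(A, a):
    """Homogeneous matrix of the Euclidean motion x -> A x + a."""
    M = as_matrix(A, a)
    M[3, 3] = 1.0
    return M

def rotation(w):
    """Rotation exp(w^x), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * (W @ W))

rng = np.random.default_rng(0)
A, A2 = rotation(rng.normal(size=3)), rotation(rng.normal(size=3))
a, a2 = rng.normal(size=3), rng.normal(size=3)

# Multiplication (3): (A, a) . (A', a') = (A A', A a' + a).
product = group_element(A, a) @ group_element(A2, a2)
mult_ok = np.allclose(product, group_element(A @ A2, A @ a2 + a))

# Bracket (4): [(X, b), (X', b')] = ([X, X'], X b' - X' b).
X, X2 = hat(rng.normal(size=3)), hat(rng.normal(size=3))
b, b2 = rng.normal(size=3), rng.normal(size=3)
P, Q = as_matrix(X, b), as_matrix(X2, b2)
bracket_ok = np.allclose(P @ Q - Q @ P,
                         as_matrix(X @ X2 - X2 @ X, X @ b2 - X2 @ b))

# The maps (8) and (9) are inverse to each other.
hat_ok = np.allclose(vee(hat(b)), b)
```

In this representation the group product is matrix multiplication and the Lie bracket is the matrix commutator, so both checks reduce to block matrix arithmetic.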


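Fact 6.1.1. The maps $i$ and $\times$ have the following properties.
1. $i : (so(3), \langle\langle\,,\,\rangle\rangle) \to (\mathbb{R}^3, \langle\,,\,\rangle)$ is an isometry. Here $\langle\langle\,,\,\rangle\rangle$ is the Killing metric on $so(3)$ defined by $\langle\langle X, Y \rangle\rangle = -\tfrac{1}{2} \operatorname{tr}(XY)$, for every $X, Y \in so(3)$. Equivalently, $\langle x, y \rangle = \langle\langle x^\times, y^\times \rangle\rangle$.
2. For every $(\omega^\times, x) \in so(3) \times \mathbb{R}^3$, we have $\omega^\times x = \omega \times x$. Here $\times$ is the vector product on $\mathbb{R}^3$.
3. For every $(A, \omega^\times) \in SO(3) \times so(3)$, we have $\operatorname{Ad}_A \omega^\times = A \omega^\times A^{-1} = (A\omega)^\times$.
4. For every $x, y \in \mathbb{R}^3$, we have $(x \times y)^\times = [x^\times, y^\times]$.

Using the Grassmann identity $\xi \times (\eta \times \zeta) = \langle \xi, \zeta \rangle\, \eta - \langle \xi, \eta \rangle\, \zeta$ for vectors $\xi, \eta, \zeta \in \mathbb{R}^3$, we see that equation (6) can be written as
\[ T_{\mathrm{rot}} = \tfrac{1}{2} \int_B \langle \omega \times x, \omega \times x \rangle \, dm = \tfrac{1}{2} \int_B \big( \langle \omega, \omega \rangle \langle x, x \rangle - \langle \omega, x \rangle^2 \big) \, dm, \tag{10} \]
because
\[ \langle \omega \times x, \omega \times x \rangle = \langle \omega, x \times (\omega \times x) \rangle = \langle \omega, \langle x, x \rangle \omega - \langle x, \omega \rangle x \rangle = \langle \omega, \omega \rangle \langle x, x \rangle - \langle \omega, x \rangle^2. \tag{11} \]
Let $M = \int_B x \otimes x \, dm$ be the tensor of second moments of the mass distribution $dm$. As a $3 \times 3$ positive semidefinite symmetric matrix, $M$ has entries $M_{ij} = \int_B x_i x_j \, dm$. Let $I = \operatorname{tr}(M)\, \mathrm{id} - M$. Then
\[ \langle I(\omega), \omega \rangle = \operatorname{tr}(M) \langle \omega, \omega \rangle - \langle M\omega, \omega \rangle = \int_B \big( \langle x, x \rangle \langle \omega, \omega \rangle - \langle (x \otimes x)\omega, \omega \rangle \big) \, dm = \int_B \big( \langle \omega, \omega \rangle \langle x, x \rangle - \langle \omega, x \rangle^2 \big) \, dm = 2\, T_{\mathrm{rot}}. \tag{12} \]
$I$ is the moment of inertia tensor of $B$. In a suitable orthonormal basis, which defines the principal axis frame, $M = \operatorname{diag}(M_1, M_2, M_3)$, where $M_j \geq 0$ for $j = 1, 2, 3$. Hence $I = \operatorname{diag}(I_1, I_2, I_3)$, where $I_1 = M_2 + M_3$, $I_2 = M_1 + M_3$, and $I_3 = M_1 + M_2$. The $I_j$, $j = 1, 2, 3$, are the principal moments of inertia. They satisfy the inequalities
\[ 0 \leq I_1 \leq I_2 + I_3, \qquad 0 \leq I_2 \leq I_1 + I_3, \qquad \text{and} \qquad 0 \leq I_3 \leq I_1 + I_2. \tag{13} \]
To avoid $I$ being singular we will assume that $I_j > 0$ for $j = 1, 2, 3$. This means that the mass of $B$ is not concentrated along a straight line through its center of mass.

The potential energy of the moving body in a constant vertical gravitational field of strength $g$ is
\[ V(A, a) = \int_B g \, \langle Ax + a, e_3 \rangle \, dm = mg \, \langle a, e_3 \rangle, \tag{14} \]
where $e_3$ is the third standard basis vector of $\mathbb{R}^3$.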
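The identities of this section are easy to test on a finite mass distribution. The sketch below assumes a hypothetical body made of eight point masses, so that the integrals over $B$ become finite sums; it checks properties 1, 2, and 4 of fact 6.1.1, the identity (12), and the inequalities (13). The names `hat`, `masses`, and `points` are ours.

```python
import numpy as np

def hat(w):
    """Skew symmetric matrix w^x of the vector w, see (9)."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

rng = np.random.default_rng(1)

# Hypothetical discrete mass distribution: point masses m_k at x_k,
# shifted so that the center of mass is the origin, as in (1).
masses = rng.uniform(0.5, 2.0, size=8)
points = rng.normal(size=(8, 3))
points -= (masses[:, None] * points).sum(axis=0) / masses.sum()

# Tensor of second moments M and inertia tensor I = tr(M) id - M.
M = np.einsum("k,ki,kj->ij", masses, points, points)
I = np.trace(M) * np.eye(3) - M

omega = rng.normal(size=3)
# Rotational kinetic energy (10), summed over the point masses.
T_rot = 0.5 * np.sum(masses * np.sum(np.cross(omega, points) ** 2, axis=1))
energy_ok = np.isclose(omega @ I @ omega, 2.0 * T_rot)      # identity (12)

# Properties 1, 2 and 4 of fact 6.1.1.
x, y = rng.normal(size=3), rng.normal(size=3)
killing_ok = np.isclose(-0.5 * np.trace(hat(x) @ hat(y)), x @ y)
cross_ok = np.allclose(hat(x) @ y, np.cross(x, y))
bracket_ok = np.allclose(hat(np.cross(x, y)),
                         hat(x) @ hat(y) - hat(y) @ hat(x))

# Inequalities (13) for the principal moments of inertia.
I1, I2, I3 = np.linalg.eigvalsh(I)
triangle_ok = bool(I1 <= I2 + I3 and I2 <= I1 + I3 and I3 <= I1 + I2)
```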

6.2 Unconstrained motion

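To describe the unconstrained rolling rigid body, we specify its Lagrangian $\ell$. As usual $\ell$ is a function on the velocity space $TE(3)$ of the Euclidean group $E(3)$ and, as expected, it is the difference of the kinetic and potential energies. Letting $(A, a)$ be coordinates on $E(3)$ and $\big((A, a), (\dot{A}, \dot{a})\big)$ coordinates on $TE(3)$, the Lagrangian is
\[ \ell\big((A, a), (\dot{A}, \dot{a})\big) = \tfrac{1}{2} \langle\langle \mathbf{I}(\omega^\times), \omega^\times \rangle\rangle + \tfrac{1}{2}\, m \, \langle \dot{a}, \dot{a} \rangle - mg \, \langle a, e_3 \rangle, \tag{15} \]
where $\omega^\times = A^{-1}\dot{A}$. Here $\mathbf{I} = i^{-1} \circ I \circ i : so(3) \to so(3)$ is the generalized moment of inertia tensor on $so(3)$. The trivialization of $TE(3)$ we use is the mapping
\[ TE(3) \to E(3) \times (\mathbb{R}^3 \times \mathbb{R}^3) : \big((A, a), (\dot{A}, \dot{a})\big) \mapsto \big((A, a), (\omega, b)\big), \]
where $\omega^\times = A^{-1}\dot{A}$ and $b = A^{-1}\dot{a}$. The inverse of this trivialization is the mapping
\[ \lambda : E(3) \times \mathbb{R}^3 \times \mathbb{R}^3 \to TE(3) : \big((A, a), (\omega, b)\big) \mapsto \big((A, a), (\dot{A} = A\omega^\times, \; \dot{a} = Ab)\big). \tag{16} \]
Pulling back $\ell$ by the map $\lambda$ we obtain the Lagrangian in the trivialization $\lambda^{-1}$
\[ L = \ell \circ \lambda = T - V : E(3) \times (\mathbb{R}^3 \times \mathbb{R}^3) \to \mathbb{R}, \tag{17} \]
where
\[ T = T\big((A, a), (\omega, b)\big) = \tfrac{1}{2} \langle I(\omega), \omega \rangle + \tfrac{1}{2}\, m \, \langle b, b \rangle \tag{18} \]
is the kinetic energy and
\[ V = V\big((A, a), (\omega, b)\big) = mg \, \langle a, e_3 \rangle \tag{19} \]
is the potential energy.

Lemma 6.2.2. At $(\omega, b) \in T_{(A,a)}E(3)$ the Lagrange derivative $\delta L$ of the Lagrangian $L$ (17) is the 1-form
\[ \delta L\big((A, a), (\omega, b)\big) : \mathbb{R}^3 \times \mathbb{R}^3 \to \mathbb{R} : (\eta, \beta) \mapsto \langle I\dot{\omega} - I\omega \times \omega \mid \eta \rangle + \langle m\,\dot{b} - m\, b \times \omega + mg\, u \mid \beta \rangle, \tag{20} \]
where $u = A^{-1}e_3$.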

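Proof. We use (17) of proposition 1.4.6 in chapter 1, namely,
\[ \langle \delta L(\gamma(t), c(t)) \mid e \rangle = \Big\langle \frac{d}{dt} \frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c = c(t)} \,\Big|\, e \Big\rangle - \Big\langle \frac{\partial L}{\partial q}(q, c(t))\Big|_{q = \gamma(t)} \,\Big|\, e^Q_{\gamma(t)} \Big\rangle - \Big\langle \frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c = c(t)} \,\Big|\, \lambda^{-1}_{\gamma(t)}\big([c(t)^Q, e^Q]_{\gamma(t)}\big) \Big\rangle, \tag{21} \]
where $q \in Q = E(3)$, $\gamma(t) = (A(t), a(t)) \in E(3)$, $c(t) = (\omega(t), b(t))$ and $e = (\eta, \beta) \in \mathbb{R}^3 \times \mathbb{R}^3$. The first term in (21) is
\[ \Big\langle \frac{d}{dt} \frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c = c(t)} \,\Big|\, e \Big\rangle = \frac{d}{dt}\big( \langle I\omega(t), \eta \rangle + m \, \langle b(t), \beta \rangle \big), \tag{22} \]
which corresponds to the linear form $(I\dot{\omega}, m\,\dot{b})$. The vector field on $E(3)$, which corresponds to $(\eta, \beta) \in \mathbb{R}^3 \times \mathbb{R}^3$ under the inverse of the trivialization $\lambda$ (16), is
\[ e^Q : E(3) \to \mathbb{R}^3 \times \mathbb{R}^3 : (A, a) \mapsto (\eta, \beta). \tag{23} \]
Therefore the second term in (21) is
\[ -\Big\langle \frac{\partial L}{\partial q}(q, c(t))\Big|_{q = \gamma(t)} \,\Big|\, e^Q_{\gamma(t)} \Big\rangle = \frac{d}{dt}\Big|_{t=0} V(\gamma(t)) = mg \, \langle \dot{a}, e_3 \rangle = mg \, \langle A\beta, e_3 \rangle = mg \, \langle u \mid \beta \rangle, \tag{24} \]
which corresponds to the linear form $(0, mg\, u)$. The vector field on $E(3)$ corresponding to $(\omega(t), b(t))$ is
\[ c(t)^Q : E(3) \to \mathbb{R}^3 \times \mathbb{R}^3 : (A, a) \mapsto (\omega(t), b(t)). \tag{25} \]
The Lie bracket of (25) with (23) is the vector field
\[ E(3) \to \mathbb{R}^3 \times \mathbb{R}^3 : (A, a) \mapsto \big( \omega(t) \times \eta, \; \omega(t)^\times \beta - \eta^\times b(t) \big), \]
see (4). Therefore the third term in (21) is
\[ -\Big\langle \frac{\partial L}{\partial c}(\gamma(t), c)\Big|_{c = c(t)} \,\Big|\, \lambda^{-1}_{\gamma(t)}\big([c(t)^Q, e^Q]_{\gamma(t)}\big) \Big\rangle = -\langle (I\omega, m\, b) \mid [(\omega, b), (\eta, \beta)] \rangle = -\langle (I\omega, m\, b) \mid (\omega \times \eta, \; \omega \times \beta - \eta \times b) \rangle = -\langle (I\omega \times \omega, \; m\, b \times \omega) \mid (\eta, \beta) \rangle. \tag{26} \]
Adding (22), (24), and (26) together gives (20).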
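Setting both components of the Lagrange derivative (20) to zero gives Euler's equations $I\dot{\omega} = I\omega \times \omega$ together with $\dot{b} = b \times \omega - g\, u$. As a sanity check, the sketch below verifies numerically that the second equation is free fall, $\ddot{a} = -g\, e_3$, written in body coordinates, and that the rotational kinetic energy has zero time derivative. The inertia values, the random state, and the helper names are assumptions made for illustration.

```python
import numpy as np

def hat(w):
    """Skew symmetric matrix w^x of the vector w."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def rotation(w):
    """Rotation exp(w^x), computed with Rodrigues' formula."""
    theta = np.linalg.norm(w)
    W = hat(w)
    if theta < 1e-12:
        return np.eye(3) + W
    return (np.eye(3) + np.sin(theta) / theta * W
            + (1.0 - np.cos(theta)) / theta**2 * (W @ W))

def unconstrained_rhs(A, omega, b, inertia, g):
    """Accelerations obtained by setting both components of (20) to zero:
    I omega_dot = I omega x omega and b_dot = b x omega - g u, u = A^{-1} e3."""
    u = A.T @ np.array([0.0, 0.0, 1.0])
    omega_dot = np.linalg.solve(inertia, np.cross(inertia @ omega, omega))
    b_dot = np.cross(b, omega) - g * u
    return omega_dot, b_dot

rng = np.random.default_rng(2)
A = rotation(rng.normal(size=3))
omega, b = rng.normal(size=3), rng.normal(size=3)
inertia = np.diag([1.0, 2.0, 2.5])      # hypothetical principal moments
g = 9.81

omega_dot, b_dot = unconstrained_rhs(A, omega, b, inertia, g)

# Since a_dot = A b and A_dot = A omega^x, we get a_ddot = A (omega x b + b_dot).
a_ddot = A @ (np.cross(omega, b) + b_dot)
free_fall_ok = np.allclose(a_ddot, [0.0, 0.0, -g])

# d/dt (1/2 <I omega, omega>) = <I omega_dot, omega> vanishes.
euler_ok = np.isclose(omega @ (inertia @ omega_dot), 0.0)
```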

6.3 Constraint distribution

In this section we describe the constraint distribution of a rigid body which rolls without slipping on a horizontal plane. We also find its accessible sets. Let $S$ be the boundary of the reference rigid body $B$. By hypothesis $S$ is a smooth compact strongly convex surface in $\mathbb{R}^3$ (see footnote 2). Let $n(s)$ be the inward unit normal to $S$ at the point $s$. The map $n : S \to S^2 : s \mapsto n(s)$ is the Gauss map of the surface $S$. Because $S$ is smooth, compact, and strongly convex, the Gauss map is a diffeomorphism.

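Fig. 6.1 The reference and moving strongly convex rigid body. Left: the reference convex body with the frame $e_1, e_2, e_3$, the point $s$ and the normal $u$. Right: the rolling convex body with the frame $Ae_1, Ae_2, Ae_3$, the translation $a$, the point $As$, the contact point $P$, and $Au = e_3$.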

The condition that the body lies on a horizontal plane means that if $s$ is the point on the reference surface such that the inward normal to the surface at the position $As$ is pointing vertically upward, that is, $A\, n(s) = e_3$, then the vector $P = As + a$ lies in the horizontal plane. Because $A\, n(s) = e_3$ is equivalent to $n(s) = A^{-1}e_3 = u$, we see that $s : S^2 \to S : u \mapsto s(u)$ is the inverse of the Gauss map. Therefore the condition that the body lies on a horizontal plane can be written as
\[ f(A, a) = \langle A\, s(u) + a, e_3 \rangle = 0. \tag{27} \]
This constraint on the position coordinates of the moving body is a holonomic constraint. The zero set $N$ of the function $f : E(3) \to \mathbb{R}$, called the position space of the rolling body, is a smooth submanifold of $E(3)$ which is diffeomorphic to $SO(3) \times \mathbb{R}^2$.

Footnote 2. That is, the tangent space at a point of $S$ meets $S$ only at that point.
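To make the inverse Gauss map concrete, here is a minimal sketch for one particular surface, an ellipsoid with semi-axes $a_1, a_2, a_3$, which is our choice of example and is not singled out in the text. For this surface one can check by hand that $s(u) = -D^2 u / \|Du\|$ with $D = \operatorname{diag}(a_1, a_2, a_3)$; the code verifies that $n(s(u)) = u$ and that the holonomic constraint (27) fixes the height $\langle a, e_3 \rangle = -\langle s(u), u \rangle$. The function names and numerical values are hypothetical.

```python
import numpy as np

SEMI_AXES = np.array([1.0, 0.8, 0.6])    # hypothetical ellipsoid semi-axes

def inward_normal(s, semi_axes=SEMI_AXES):
    """Inward unit normal n(s) at a point s of the ellipsoid sum (s_i/a_i)^2 = 1."""
    grad = s / semi_axes**2               # proportional to the outward normal
    return -grad / np.linalg.norm(grad)

def inverse_gauss_map(u, semi_axes=SEMI_AXES):
    """Point s(u) of the ellipsoid whose inward unit normal is u."""
    return -(semi_axes**2) * u / np.linalg.norm(semi_axes * u)

rng = np.random.default_rng(3)
Q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
A = Q * np.sign(np.linalg.det(Q))         # random rotation with det A = +1

e3 = np.array([0.0, 0.0, 1.0])
u = A.T @ e3                              # u = A^{-1} e3
s = inverse_gauss_map(u)

on_surface_ok = np.isclose(np.sum((s / SEMI_AXES) ** 2), 1.0)
normal_ok = np.allclose(inward_normal(s), u)

# The constraint (27) determines the third coordinate of a.
height = -s @ u
a = np.array([0.4, -1.2, height])         # horizontal part chosen arbitrarily
constraint_ok = np.isclose((A @ s + a) @ e3, 0.0)
height_ok = bool(SEMI_AXES.min() <= height <= SEMI_AXES.max())
```

For the ellipsoid the height of the center of mass is $\|Du\|$, which lies between the smallest and the largest semi-axis, in agreement with the picture of a body resting on the plane.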

In order that the moving body rolls without slipping, the nonholonomic constraint
\[ \dot{P} = \dot{A}\, s(u) + \dot{a} = 0 \tag{28} \]
must hold. In words, equation (28) states that the velocity (with respect to the fixed reference frame) of the point of contact of the moving body with the horizontal plane vanishes. Equation (28) determines the constraint distribution $D$ on $E(3)$ defined by
\[ D_{(A,a)} = \big\{ (\dot{A}, \dot{a}) \in T_{(A,a)}E(3) \;\big|\; \dot{A}\, s(u) + \dot{a} = 0 \big\}. \tag{29} \]
Using the inverse of the trivialization (16), the constraint distribution becomes
\[ \big\{ (A, a, \omega, b) \in E(3) \times \mathbb{R}^6 \;\big|\; \omega \times s(u) + b = 0 \big\}, \tag{30} \]
where $u = A^{-1}e_3$. In what follows when we refer only to the manifold structure of $D$ we shall use the term constraint manifold.

We now give an alternative description of $D$. Let $(\dot{A}, \dot{a}) \in T_{(A,a)}E(3)$. Then for some $(Y, y) \in e(3)$ we have $(\dot{A}, \dot{a}) = (AY, Ay)$. Define an $\mathbb{R}^3$-valued 1-form $\varphi$ on $E(3)$ by
\[ \varphi(A, a)(AY, Ay) = (dA \otimes s(u) + da)(AY, Ay) = (A^{-1} dA \otimes s(u))\,Y + (A^{-1} da)\,y = Y(s(u)) + y. \tag{31} \]

Then the constraint distribution $D$ is $\ker \varphi$. To see this apply $A^{-1}$ to the equation $\dot{A}\, s(u) + \dot{a} = 0$ defining the rolling constraint. Note that $\varphi$ is invariant under the action of $SO(3)$ on $E(3)$ given by $(B, (A, a)) \mapsto (BA, a)$. Having identified the constraint distribution, we find its accessible sets.

Lemma 6.3.3. We have $D \subseteq \ker df$. At every point of $E(3)$ the kernel of $df$ is spanned by $D$ and the Lie brackets of smooth vector fields which take values in $D$.

Proof. Differentiating $f$ (27) along a curve in $D$ gives
\[ \dot{f} = \Big\langle \dot{A}\, s(u) + A\, \frac{ds(u)}{dt} + \dot{a}, \; e_3 \Big\rangle = \Big\langle A\, \frac{ds(u)}{dt}, \; e_3 \Big\rangle = \Big\langle \frac{ds(u)}{dt}, \; u \Big\rangle = 0. \]
Here the second equality follows from (29). The third equality follows from the orthogonality of $A$ and the fact that $A^{-1}e_3 = u$. To prove the fourth equality we observe that $\frac{ds(u)}{dt}$ belongs to the tangent space of the surface $S$ at the point $s = s(u)$, and that $s = s(u)$ is equivalent to the condition that $u = n(s)$ is the interior normal to $S$ at the point $s$. This shows that $D \subseteq \ker df$.

To prove the second statement in the lemma we choose $X_j \in so(3)$, for $j = 1, 2$, that is, the $X_j$ are antisymmetric $3 \times 3$ matrices. Define the vector fields $Y_j$ on $E(3) = SO(3) \times \mathbb{R}^3$ by
\[ Y_j(A, a) = (A X_j, \; -A X_j\, s(u)), \]
where $u = A^{-1}e_3$ and $(A, a) \in SO(3) \times \mathbb{R}^3$. Then the vector fields $Y_j$ are tangent to $D$ and
\[ [Y_1, Y_2] = \big(A [X_1, X_2], \; -A [X_1, X_2]\, s(u)\big) + \big(0, \; A (X_2\, Ds(u)\, X_1 u - X_1\, Ds(u)\, X_2 u)\big). \]
The first term belongs to $D$. Therefore it is sufficient to show that the vectors
\[ X_2\, Ds(u)\, X_1 u - X_1\, Ds(u)\, X_2 u \tag{32} \]
span a two-dimensional linear subspace of $\mathbb{R}^3$. The tangent at $u$ of the mapping $s : S^2 \to S$ is a linear mapping from $T_u S^2$ to $T_{s(u)} S$. Because $u = n(s(u))$ is equal to the interior normal of $S$ at $s(u)$, both tangent spaces can be identified with the orthogonal complement $u^\perp$ of $u$ in $\mathbb{R}^3$. Moreover, $Ds(u)$ is equal to the tangent mapping of $s$, viewed as a linear transformation in $u^\perp$. Because $s$ is equal to the inverse of the Gauss mapping $n$, $Ds(u)$ is equal to the inverse of the derivative of the Gauss mapping, which is a positive definite self-adjoint transformation, since $S$ is a strongly convex surface. From this it follows that $Ds(u)$ is a positive definite self-adjoint transformation of $u^\perp$ as well. Note that the antisymmetry of $X_j$ implies that $\langle X_j u, u \rangle = 0$, which means that $X_j u \in u^\perp$. Now define $X_2$ by $u^\times$. Then (32) is equal to $X_2\, Ds(u)\, X_1 u$. The space of all $X_1 u$ with $X_1$ ranging over $so(3)$ is equal to $u^\perp$. In addition, the mappings $Ds(u)$ and $X_2$ are bijective linear transformations of $u^\perp$. Therefore the space of vectors (32) where $X_1$ ranges over $so(3)$ contains the two-dimensional linear space $A(u^\perp)$. This proves the lemma.

The fact $D \subseteq \ker df$ implies that for each $n \in N = f^{-1}(0)$ we have $D_n \subseteq T_n N$. In other words, our phase space, which consists of all $(q, \dot{q}) \in TE(3)$ such that $q \in N$ and $\dot{q} \in D_q$, is equal to the smooth vector subbundle
\[ D_N = \bigcup_{n \in N} D_n \tag{33} \]
of $TN$.
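The spanning argument in the proof of lemma 6.3.3 can be illustrated numerically. The sketch below again assumes the ellipsoid $s(u) = -D^2 u / \|Du\|$, which is our example and not part of the text, and computes the vectors (32) for $X_2 = u^\times$ and $X_1$ ranging over a basis of $so(3)$. It only tests that $Ds(u)$ restricted to $u^\perp$ is self-adjoint and definite; the sign of its eigenvalues depends on the orientation convention for the normal and does not affect the rank computation. The helper names are ours.

```python
import numpy as np

SEMI_AXES = np.array([1.0, 0.8, 0.6])    # hypothetical ellipsoid semi-axes

def hat(w):
    """Skew symmetric matrix w^x of the vector w."""
    return np.array([[0.0, -w[2], w[1]],
                     [w[2], 0.0, -w[0]],
                     [-w[1], w[0], 0.0]])

def ds_of_u(u):
    """Jacobian of s(u) = -D^2 u / |D u| with D = diag(SEMI_AXES)."""
    d2u = SEMI_AXES**2 * u
    r = np.linalg.norm(SEMI_AXES * u)
    return -np.diag(SEMI_AXES**2) / r + np.outer(d2u, d2u) / r**3

u = np.array([0.3, -0.5, 0.8])
u /= np.linalg.norm(u)
Ds = ds_of_u(u)

# Orthonormal basis of u-perp (rows) and the restriction of Ds(u) to it.
basis = np.linalg.svd(u.reshape(1, 3))[2][1:]
restricted = basis @ Ds @ basis.T
eigenvalues = np.linalg.eigvalsh(restricted)
selfadjoint_ok = np.allclose(restricted, restricted.T)
definite_ok = bool(np.all(eigenvalues > 0) or np.all(eigenvalues < 0))

# Vectors (32) with X2 = u^x and X1 ranging over a basis of so(3).
X2 = hat(u)
vectors = np.array([X2 @ Ds @ (hat(e) @ u) - hat(e) @ Ds @ (X2 @ u)
                    for e in np.eye(3)])
rank = np.linalg.matrix_rank(vectors, tol=1e-9)
orthogonal_ok = np.allclose(vectors @ u, 0.0)
```

In this example the three vectors are orthogonal to $u$ and have rank two, so they span $u^\perp$.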

Corollary 6.3.4. If for each $n \in N$ the tangent space $T_n N$ is equal to the span of $D_n$ and the Lie brackets of smooth vector fields on $N$ which take values in $D$, then $N$ is the accessible set of the distribution $D_N$ on $N$. Moreover, if we view $D_N$ as a submanifold of $TN$ and let $\tau : TN \to N$ be the tangent bundle projection map, then $D_N$ is the unique accessible set of the distribution $H = (T\tau)^{-1}(D_N) \cap T(D_N)$ in $D_N$.

Proof. The first statement in the corollary follows immediately from lemma 6.3.3. To prove the second statement we use the classical result that if the Lie brackets of vector fields on a connected manifold with values in a distribution span the tangent space at every point, then an accessible set of the distribution is the entire manifold. This follows from Chow's theorem [26], because our assumption implies that the tangent space at every point is spanned by the vector fields with values in the distribution and the transports of $X$ by means of the flow of $Y$, where $X$ and $Y$ are two such vector fields. The second statement implies the third one, see proposition 1.10.45 in chapter 1.

From corollary 6.3.4 it follows that every position of the body lying on the horizontal plane can be reached from any other such position by means of a rolling motion without slipping. Thus the nonholonomic constraint does not lead to any holonomic ones.

Alternatively, we can prove lemma 6.3.3 as follows. The group $E(3)$ has the vector subgroup $\mathbb{R}^3$ as a closed normal subgroup. The quotient is canonically isomorphic to $SO(3)$. In other words, we have the principal $\mathbb{R}^3$-fibration $E(3) \to SO(3)$. The $\mathbb{R}^3$-valued 1-form $\varphi$ (31) is a connection 1-form for this principal fibration for which $D = \ker \varphi$. Because $\mathbb{R}^3$ is abelian, the curvature of $\varphi$ is equal to its exterior derivative
\[ d\varphi = d((dA)\, s + da) = dA \wedge ds, \qquad \text{where } s = s(u). \]

In turn, $d\varphi$ is equal to the pull-back under the bundle projection map from $E(3)$ onto $SO(3)$ of a closed 2-form on $SO(3)$. Evaluated at $(A X_1, A X_2)$, this 2-form is equal to the $A$-image of (32).

For any distribution $D$ on a smooth manifold $M$, a smooth submanifold $I$ of $M$ is called an integral manifold of $D$ if $T_m I \subseteq D_m$ for every $m \in I$. A distribution $D$ in $M$ is called integrable if for each $m \in M$ there exists an integral manifold $I$ of $D$ through $m$ such that $T_m I = D_m$. The following proposition shows that the distribution $D$ (29) on $E(3)$ is non-integrable in a very strong sense.

Proposition 6.3.5. If $I$ is an integral manifold of $D$, then $\dim I \leq 1$.

Proof. For any distribution $D$ on a smooth manifold $M$ and each $m \in M$ there is an antisymmetric bilinear mapping $\mu_m : D_m \times D_m \to T_m M / D_m$ such that for each pair of smooth vector fields $Y_1, Y_2$ on $M$ which take values in $D$, one has
\[ [Y_1, Y_2](m) + D_m = \mu_m(Y_1(m), Y_2(m)), \qquad m \in M. \]

If $I$ is an integral manifold of $D$, then $[Y_1, Y_2](m) \in T_m I$ for any pair of smooth vector fields $Y_1, Y_2$ in $I$ and any $m \in I$. Using local extensions of these vector fields to smooth vector fields on $M$ with values in $D$, we find that $\mu_m(u, v) = 0$ for every pair of vectors $u, v \in T_m I$ and every $m \in I$.

In the case of the distribution $D$ (29) on $E(3) = SO(3) \times \mathbb{R}^3$, the mapping $v \mapsto (0, Av) + D_{(A,a)}$ leads to an isomorphism $\widetilde{\mu}$ from $\mathbb{R}^3$ onto $T_{(A,a)}E(3)/D_{(A,a)}$. We use $\widetilde{\mu}$ to identify $T_{(A,a)}E(3)/D_{(A,a)}$ with $\mathbb{R}^3$. Also we use the mapping $X \mapsto (AX, -AX\, s(u))$ to identify $D_{(A,a)}$ with $so(3)$. With these identifications, the proof of lemma 6.3.3 shows that $\mu$ is equal to the antisymmetric bilinear mapping which assigns to $(X_1, X_2) \in so(3) \times so(3)$ the vector in $\mathbb{R}^3$ given by (32). Choose an orthonormal basis $f_1, f_2, f_3$ in $\mathbb{R}^3$ such that $f_3 = u$ and $f_1, f_2$ are eigenvectors of $Ds(u)$ with eigenvalues $s_1, s_2$, respectively. Let $X_j^{kl}$ denote the antisymmetric matrix of $X_j$ with respect to this basis. Then the vector given by (32) is equal to
\[ \big( X_2^{12} s_2 X_1^{23} - X_1^{12} s_2 X_2^{23}, \;\; X_2^{21} s_1 X_1^{13} - X_1^{21} s_1 X_2^{13}, \;\; 0 \big). \tag{34} \]
The third component of the vector above is zero because
\[ \langle X_2\, Ds(u)\, X_1 u - X_1\, Ds(u)\, X_2 u, \; u \rangle = -\langle Ds(u)\, X_1 u, X_2 u \rangle + \langle Ds(u)\, X_2 u, X_1 u \rangle = 0. \]
To obtain the first equality we have used the antisymmetry of the $X_j$'s, while the second follows from the symmetry of $Ds(u)$. Because $s_1 > 0$ and $s_2 > 0$, it follows that the vector (34) is equal to zero if and only if
\[ X_2^{12} X_1^{23} = X_1^{12} X_2^{23} \qquad \text{and} \qquad X_2^{12} X_1^{13} = X_1^{12} X_2^{13}. \tag{35} \]
If $X_2^{12} \neq 0$, then (35) implies that $X_1 = (X_1^{12}/X_2^{12})\, X_2$; whereas if $X_1^{12} \neq 0$, then $X_2 = (X_2^{12}/X_1^{12})\, X_1$. On the other hand, if $X_1^{12} = X_2^{12} = 0$, then the vector (34) is equal to zero. Therefore for every $(A, a) \in E(3)$ there is exactly one linear subspace of $D_{(A,a)}$ of dimension $> 1$ on which $\mu_{(A,a)}$ is equal to zero, namely, the two-dimensional space of all vectors of the form $(AX, -AX\, s(u))$ such that $X = A^{-1}\dot{A} \in so(3)$ and
\[ \alpha_A(\dot{A}) = \operatorname{tr}\big((u^\times) \circ (A^{-1}\dot{A})\big) = 0. \tag{36} \]
Here $u = A^{-1}e_3$ and $u^\times \in so(3)$. Equation (36) defines a smooth 1-form $\alpha$ on $SO(3)$. The argument of the preceding paragraph implies that for any integral manifold $I$ of $D$ the restriction to $I$ of the bundle projection map from $E(3)$ to $SO(3)$ is a local diffeomorphism onto an integral manifold of the distribution $\ker \alpha$ on $SO(3)$. Now for every $x \in \mathbb{R}^3$ we have $(A^{-1}e_3) \times (A^{-1}\dot{A}\, x) = A^{-1}(e_3 \times (\dot{A}\, x))$, which shows that
\[ \operatorname{tr}\big((u^\times) \circ (A^{-1}\dot{A})\big) = \operatorname{tr}\big(A^{-1} \circ (e_3^\times) \circ \dot{A}\big) = \operatorname{tr}\big(e_3^\times \circ (\dot{A} A^{-1})\big). \]
Thus the 1-form $\alpha$ is invariant under right multiplication by $SO(3)$. Therefore the distribution $\ker \alpha$ is invariant under right multiplication. Because the mapping $X \mapsto X(e)$ is an isomorphism from the Lie algebra of all right invariant vector fields on $SO(3)$ onto $so(3)$, this implies the condition that $I$, and thus $\ker \alpha$, is integrable at any point is equivalent to the condition that $\ker \alpha(e)$ is a Lie subalgebra of $so(3)$. However, $so(3)$ has no two-dimensional Lie subalgebras. This proves the proposition.

6.4 Constrained equations of motion

In this section we derive the Lagrange-d'Alembert equations of motion for a smooth compact strongly convex rigid body which rolls without slipping on a horizontal plane. We also give the distributional Hamiltonian form of these equations.

6.4.1 Vector field on D

The motion of the rolling rigid body is governed by the Lagrange-d'Alembert principle, theorem 1.3.4 §3.2 of chapter 1, which states that at each point $p$ of the constraint manifold $D_N$, the Lagrange derivative (20) of the unconstrained Lagrangian (17) annihilates the linear subspace $D_n$ of $T_n N$. The subspace $D_{(A,a)}$ (29) is the image under the map $\lambda_{(A,a)}$ (16) of $\{(\omega, b) \in \mathbb{R}^3 \times \mathbb{R}^3 \mid 0 = A(\omega \times s + b)\}$ or equivalently $\{(\widetilde{\omega}, -\widetilde{\omega} \times s) \in \mathbb{R}^3 \times \mathbb{R}^3 \mid \widetilde{\omega} \in \mathbb{R}^3\}$. Using $b = -\omega \times s$ and formula (20) for the Lagrange derivative, it follows that the Lagrange-d'Alembert principle is equivalent to
\[ 0 = \Big\langle I \frac{d\omega}{dt} - I(\omega) \times \omega, \; \widetilde{\omega} \Big\rangle + \Big\langle -m \frac{d}{dt}(\omega \times s) + m\, ((\omega \times s) \times \omega) + mg\, u, \; s \times \widetilde{\omega} \Big\rangle, \tag{37} \]
for every $\widetilde{\omega} \in \mathbb{R}^3$. Here $s = s(u)$. Since Grassmann's identity implies
\[ ((\omega \times s) \times \omega) \times s = -(s \times (\omega \times s)) \times \omega, \]
and since $\langle x, s \times \widetilde{\omega} \rangle = \langle x \times s, \widetilde{\omega} \rangle$ for every $x \in \mathbb{R}^3$, equation (37) is equivalent to
\[ 0 = \Big\langle I \frac{d\omega}{dt} - I(\omega) \times \omega - m \frac{d}{dt}(\omega \times s) \times s - m\, ((s \times \omega) \times s) \times \omega + mg\, u \times s, \; \widetilde{\omega} \Big\rangle \]
for every $\widetilde{\omega} \in \mathbb{R}^3$, that is,
\[ I \frac{d\omega}{dt} = I\omega \times \omega + m \frac{d}{dt}(\omega \times s) \times s + m\, (s \times (\omega \times s)) \times \omega - mg\, u \times s. \tag{38} \]
Using dot to denote derivative with respect to $t$ and differentiating the second term on the right hand side, equation (38) may be written as
\[ Y(s)(\dot{\omega}) = (Y(s)\omega) \times \omega + m\, (\omega \times \dot{s}) \times s - mg\, u \times s. \tag{39} \]
Here $Y(s) : \mathbb{R}^3 \to \mathbb{R}^3$ is the linear map defined by
\[ Y(s)v = I(v) + m\, (s \times (v \times s)). \tag{40} \]
Physically, $Y(s)$ is the moment of inertia of the body about its point of contact with the horizontal plane.

Fact 6.4.1.6. $Y(s)$ is invertible and symmetric.

Proof. From the definition of $Y(s)$, it follows that for every $v \in \mathbb{R}^3$
\[ \langle Y(s)v, v \rangle = \langle Iv, v \rangle + m \, \langle s \times (v \times s), v \rangle = \langle Iv, v \rangle + m \, \langle v \times s, v \times s \rangle. \tag{41} \]
Hence $Y(s)$ is symmetric. Since the moment of inertia tensor $I$ is positive definite, it follows that $\langle Y(s)v, v \rangle \geq 0$. If $\langle Y(s)v, v \rangle = 0$, then $\langle Iv, v \rangle = 0$, which implies that $v = 0$. Thus $Y(s)$ is positive definite and hence is invertible.

Using (28) we obtain
\[ b = A^{-1}\dot{a} = -A^{-1}\dot{A}\, s(u) = -\omega \times s, \]
which substituted into (18) gives $T = \tfrac{1}{2} \langle Y(s)\omega, \omega \rangle$. In other words, $Y(s)$ is a position dependent symmetric tensor which defines the kinetic energy as a quadratic function of the angular velocity in body coordinates.
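The algebra leading from (38) to (39) and the properties of $Y(s)$ can be checked with random data. The sketch below writes $Y(s)$ as the matrix $I + m(\langle s, s \rangle\, \mathrm{id} - s \otimes s)$, which follows from (40) and the Grassmann identity; the inertia values, the random vectors, and the function name `contact_inertia` are assumptions made for illustration, and `s_dot` is an arbitrary test vector rather than an actual $\dot{s}$.

```python
import numpy as np

def contact_inertia(inertia, mass, s):
    """Matrix of Y(s) v = I v + m s x (v x s), see (40)."""
    return inertia + mass * ((s @ s) * np.eye(3) - np.outer(s, s))

rng = np.random.default_rng(4)
inertia = np.diag([0.20, 0.27, 0.33])    # hypothetical principal moments
mass, g = 1.0, 9.81
s, omega, u = rng.normal(size=3), rng.normal(size=3), rng.normal(size=3)
s_dot = rng.normal(size=3)               # arbitrary test value for ds/dt
Y = contact_inertia(inertia, mass, s)

definition_ok = np.allclose(
    Y @ omega, inertia @ omega + mass * np.cross(s, np.cross(omega, s)))
symmetric_ok = np.allclose(Y, Y.T)
positive_ok = bool(np.all(np.linalg.eigvalsh(Y) > 0))

# Kinetic energy (18) with b = -omega x s equals (1/2) <Y(s) omega, omega>.
b = -np.cross(omega, s)
kinetic_ok = np.isclose(0.5 * omega @ inertia @ omega + 0.5 * mass * (b @ b),
                        0.5 * omega @ Y @ omega)

# Solve (39) for omega_dot and check that (38) holds with
# d/dt (omega x s) = omega_dot x s + omega x s_dot.
omega_dot = np.linalg.solve(
    Y, np.cross(Y @ omega, omega)
       + mass * np.cross(np.cross(omega, s_dot), s)
       - mass * g * np.cross(u, s))
d_omega_cross_s = np.cross(omega_dot, s) + np.cross(omega, s_dot)
rhs_38 = (np.cross(inertia @ omega, omega)
          + mass * np.cross(d_omega_cross_s, s)
          + mass * np.cross(np.cross(s, np.cross(omega, s)), omega)
          - mass * g * np.cross(u, s))
equivalent_ok = np.allclose(inertia @ omega_dot, rhs_38)
```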

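From the above discussion we see that the motion of the rolling rigid body is governed by a vector field $V$ on the constraint manifold $D$, whose integral curves satisfy
\[ \begin{aligned} \dot{A} &= A\omega^\times \\ \dot{a} &= -A\omega^\times s = -A(\omega \times s) \\ \dot{\omega} &= Y(s)^{-1}\big[ (Y(s)\omega) \times \omega + m\, ((\omega \times \dot{s}) \times s) - mg\, u \times s \big]. \end{aligned} \tag{42} \]
Here $s = s(u)$. Moreover,
\[ \dot{u} = -A^{-1}\dot{A}A^{-1}e_3 = -\omega \times u = u \times \omega \]
and
\[ \dot{s} = Ds(u)\,\dot{u} = Ds(u)(u \times \omega). \]
The total energy of the rolling rigid body is
\[ E : D \to \mathbb{R} : c \mapsto (T + V)(c) = \tfrac{1}{2} \langle I(\omega), \omega \rangle + \tfrac{1}{2}\, m \, \langle b, b \rangle + mg \, \langle a, e_3 \rangle. \tag{43} \]

Fact 6.4.1.7. $E$ is an integral of the vector field $V$.

Proof. This follows from corollary 1.3.5 of chapter 1. We give a direct computation of the Lie derivative of $E$ with respect to $V$. At $(A, a, \omega, b) \in D$ we have
\[ \begin{aligned} \dot{E} = L_V E &= \langle I(\dot{\omega}), \omega \rangle + m \, \langle \omega \times s, \dot{\omega} \times s \rangle + m \, \langle \omega \times s, \omega \times \dot{s} \rangle + mg \, \langle \dot{a}, e_3 \rangle \\ &= \langle Y(s)\dot{\omega}, \omega \rangle + m \, \langle \omega \times s, \omega \times \dot{s} \rangle + mg \, \langle \dot{a}, e_3 \rangle \\ &= \langle Y(s)(\omega) \times \omega, \omega \rangle + m \, \langle (\omega \times \dot{s}) \times s, \omega \rangle - mg \, \langle u \times s, \omega \rangle + m \, \langle \omega \times s, \omega \times \dot{s} \rangle - mg \, \langle A(\omega \times s), e_3 \rangle \\ &= 0. \end{aligned} \]
To obtain the right hand side of the first equality above, we have used $b = -\omega \times s$; to obtain the second, the definition of $Y(s)$; to obtain the third, the third equation in (42); and to obtain the fourth, the fact that $\langle \dot{a}, e_3 \rangle = \langle b, u \rangle = -\langle \omega \times s, u \rangle$.

Proposition 6.4.1.8. The integral curves of the vector field $V$ on the constraint manifold $D$ are defined for all time.

Proof. Let
\[ \pi : D \to \Pi = \mathbb{R}^2 : (A, a, \dot{A}, \dot{a}) \mapsto As + a \]
be the projection from the constraint manifold to the horizontal plane $\Pi$. Suppose that $\gamma$ is an integral curve of $V$ (42) which does not exist for all time. Since the energy $E$ (43) is conserved, the image of $\gamma$ lies in $E^{-1}(e)$ for some $e \in \mathbb{R}$. Because $\gamma$ does not exist for all time, it runs out of every compact subset of $E^{-1}(e)$ in finite time and hence out of every compact subset of $\Pi$ in finite time. However, this cannot happen because $\frac{d}{dt}\pi(\gamma(t))$ remains bounded. To see this, note that
\[ \begin{aligned} e &= \tfrac{1}{2} \langle I\omega, \omega \rangle + \tfrac{1}{2}\, m \, \langle \omega \times s, \omega \times s \rangle + mg \, \langle a, e_3 \rangle \\ &\geq \tfrac{1}{2} \langle I\omega, \omega \rangle, \quad \text{since } \langle a, e_3 \rangle > 0 \\ &\geq c \, \langle \omega, \omega \rangle, \quad \text{for some } c > 0, \text{ since } I \text{ is positive definite.} \end{aligned} \]
Because the body rolls without slipping,
\[ \frac{d}{dt}\big(\pi(\gamma(t))\big) = \frac{d}{dt}(As + a) = \dot{A}s + A\dot{s} + \dot{a} = A\, Ds(u)(u \times \omega). \]
Consequently,
\[ \Big\| \frac{d}{dt}\big(\pi(\gamma(t))\big) \Big\| \leq \|A\| \, \|Ds(u)\| \, \|\omega\|, \]
which is bounded on $E^{-1}(e)$ since $u \in S^2$, $Ds(u)$ is continuous, and $\|\omega\| \leq \sqrt{e/c}$.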
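Conservation of energy along the integral curves of (42) can also be observed numerically. The following is a sketch under several assumptions that are ours and not the text's: the body is a homogeneous solid ellipsoid with semi-axes $(1, 0.8, 0.6)$, so that $s(u) = -D^2 u / \|Du\|$; only the $(u, \omega)$ part of the system is integrated, using $\dot{u} = u \times \omega$; the integrator is the classical fourth order Runge-Kutta scheme with a fixed step; and the energy is written as $\tfrac{1}{2} \langle Y(s)\omega, \omega \rangle - mg \langle s, u \rangle$, using $b = -\omega \times s$ and (27).

```python
import numpy as np

SEMI_AXES = np.array([1.0, 0.8, 0.6])    # hypothetical homogeneous ellipsoid
MASS, GRAVITY = 1.0, 9.81
# Principal moments of inertia of a homogeneous solid ellipsoid.
INERTIA = MASS / 5.0 * np.diag([SEMI_AXES[1]**2 + SEMI_AXES[2]**2,
                                SEMI_AXES[0]**2 + SEMI_AXES[2]**2,
                                SEMI_AXES[0]**2 + SEMI_AXES[1]**2])

def s_of_u(u):
    """Inverse Gauss map of the ellipsoid for the inward normal u."""
    return -(SEMI_AXES**2) * u / np.linalg.norm(SEMI_AXES * u)

def ds_of_u(u):
    """Jacobian of s_of_u, applied below only to vectors orthogonal to u."""
    d2u = SEMI_AXES**2 * u
    r = np.linalg.norm(SEMI_AXES * u)
    return -np.diag(SEMI_AXES**2) / r + np.outer(d2u, d2u) / r**3

def contact_inertia(s):
    """Matrix of Y(s), see (40)."""
    return INERTIA + MASS * ((s @ s) * np.eye(3) - np.outer(s, s))

def rhs(state):
    """Right hand side of u_dot = u x omega and the third equation of (42)."""
    u, omega = state[:3], state[3:]
    s = s_of_u(u)
    u_dot = np.cross(u, omega)
    s_dot = ds_of_u(u) @ u_dot
    Y = contact_inertia(s)
    torque = (np.cross(Y @ omega, omega)
              + MASS * np.cross(np.cross(omega, s_dot), s)
              - MASS * GRAVITY * np.cross(u, s))
    return np.concatenate([u_dot, np.linalg.solve(Y, torque)])

def energy(state):
    """Energy (43) restricted to the constraint manifold."""
    u, omega = state[:3], state[3:]
    s = s_of_u(u)
    return 0.5 * omega @ contact_inertia(s) @ omega - MASS * GRAVITY * (s @ u)

def rk4_step(state, dt):
    k1 = rhs(state)
    k2 = rhs(state + 0.5 * dt * k1)
    k3 = rhs(state + 0.5 * dt * k2)
    k4 = rhs(state + dt * k3)
    return state + dt / 6.0 * (k1 + 2.0 * k2 + 2.0 * k3 + k4)

u0 = np.array([0.1, 0.2, 1.0])
state = np.concatenate([u0 / np.linalg.norm(u0), [0.3, -0.2, 0.5]])
energies = [energy(state)]
for _ in range(2000):                     # integrate up to t = 2
    state = rk4_step(state, 1e-3)
    energies.append(energy(state))

drift = float(np.max(np.abs(np.array(energies) - energies[0])))
norm_error = abs(np.linalg.norm(state[:3]) - 1.0)
```

The quantity `drift` measures how far the computed energy wanders from its initial value; it is limited only by the truncation error of the integrator, since the exact flow conserves $E$ by fact 6.4.1.7.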

Note that the vector field $V$ (42) governing the motion of the rolling rigid body also can be derived by applying the Hamilton-d'Alembert principle 1.5.12 of chapter 1 to the Hamiltonian $E$ (43) on the symplectic manifold $(E(3) \times e(3), \widetilde{\omega})$. Here $\widetilde{\omega}$ is the pull back of the symplectic form $\omega$ on $TE(3)$ by the map $\lambda$ (16).

6.4.2 Computation of $H$ and $\varpi$ in a trivialization

In this subsection we calculate the distribution $H$ and its symplectic form $\varpi_H$ in a trivialization, see §6.2 of chapter 1. We start by finding an expression for the 2-form $\varpi_{D_N}$ on the constraint distribution $D_N$. We will use the inverse of the trivialization
\[ \lambda_{D_N} : E(3) \times \mathbb{R}^3 \to D_N \subseteq TE(3) : \big((A, a), \omega\big) \mapsto \big((A, a), (A\omega^\times, -A(\omega \times s(u)))\big). \tag{44} \]
Since $\lambda_{D_N}((A, a), e_i)$ is the vector field $X_i$ on $E(3)$ given by $X_i(A, a) = (A e_i^\times, -A(e_i \times s))$, it follows that for every $\eta = \sum_{i=1}^3 \eta_i e_i$ in $\mathbb{R}^3$ the vector field $\eta^{E(3)}$ on $E(3)$ is given by
\[ \eta^{E(3)}(A, a) = \sum_{i=1}^3 \eta_i X_i(A, a) = \big(A\eta^\times, -A(\eta \times s(u))\big). \tag{45} \]

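We now calculate $\varpi_{D_N}$ in the trivialization (44) by evaluating
\[ \varpi_{Q \times \mathbb{R}^d}\big(\eta^Q_\rightarrow + \zeta^Q_\uparrow, \; \widetilde{\eta}^Q_\rightarrow + \widetilde{\zeta}^Q_\uparrow\big) = k(q)\big(\eta^Q(q), \widetilde{\zeta}^Q(q)\big) - k(q)\big(\widetilde{\eta}^Q(q), \zeta^Q(q)\big) - d\,k^\flat(c^Q)(q)\big(\eta^Q(q), \widetilde{\eta}^Q(q)\big), \tag{46} \]
where $Q = E(3)$, $c = \omega$, and $d = 3$, see (52) of §6.2 in chapter 1. The kinetic energy metric $k$ on $E(3)$ of the strongly convex rolling rigid body is
\[ k(A, a)\big((A\omega^\times, \eta), (A\mu^\times, \zeta)\big) = \langle I\omega, \mu \rangle + m \, \langle \eta, \zeta \rangle. \tag{47} \]
Therefore
\[ \begin{aligned} k(A, a)\big(\eta^{E(3)}(A, a), \zeta^{E(3)}(A, a)\big) &= k(A, a)\big((A\eta^\times, -A(\eta \times s(u))), (A\zeta^\times, -A(\zeta \times s(u)))\big) \\ &= \langle I\eta, \zeta \rangle + m \, \langle A(\eta \times s), A(\zeta \times s) \rangle = \langle I\eta, \zeta \rangle + m \, \langle \eta \times s, \zeta \times s \rangle \\ &= \langle I\eta, \zeta \rangle + m \, \langle s \times (\eta \times s), \zeta \rangle = \langle Y(s)\eta, \zeta \rangle. \end{aligned} \tag{48} \]
Now
\[ \begin{aligned} L_{\eta^{E(3)}}\big(\zeta^{E(3)} \,\lrcorner\, k^\flat(\omega^{E(3)})\big)(A, a) &= D\langle Y(s(u))\omega, \zeta \rangle\, \eta^{E(3)}(A, a), \quad \text{using (48)} \\ &= D_1 \langle Y(s(u))\omega, \zeta \rangle\, A\eta^\times, \quad \text{because } u = A^{-1}e_3 \text{ and therefore } \langle Y(s(u))\omega, \zeta \rangle \text{ depends only on } A \\ &= \frac{d}{dt}\Big|_{t=0} \langle Y(s(e^{-t\eta^\times}u))\omega, \zeta \rangle, \quad \text{because the flow of the vector field } A\eta^\times \text{ on } SO(3) \text{ is } t \mapsto Ae^{t\eta^\times} \\ &\qquad \text{and } (Ae^{t\eta^\times})^{-1}e_3 = e^{-t\eta^\times}A^{-1}e_3 = e^{-t\eta^\times}u \\ &= m \, \frac{d}{dt}\Big|_{t=0} \langle \omega \times s(e^{-t\eta^\times}u), \; \zeta \times s(e^{-t\eta^\times}u) \rangle, \quad \text{using (40)} \\ &= m \, \langle \omega \times Ds(u)(u \times \eta), \; \zeta \times s(u) \rangle + m \, \langle \omega \times s(u), \; \zeta \times Ds(u)(u \times \eta) \rangle. \end{aligned} \tag{49} \]
In local coordinates we have
\[ [\eta^{E(3)}, \zeta^{E(3)}](A, a) = D\zeta^{E(3)}(A, a)\, \eta^{E(3)}(A, a) - D\eta^{E(3)}(A, a)\, \zeta^{E(3)}(A, a). \]
But $\eta^{E(3)}$ and $\zeta^{E(3)}$ depend only on $A$, see (45). So
\[ \begin{aligned} D\zeta^{E(3)}(A, a)\, \eta^{E(3)}(A, a) &= D_1 \zeta^{E(3)}(A, a)\, A\eta^\times \\ &= \frac{d}{dt}\Big|_{t=0} \big( (Ae^{t\eta^\times})\zeta^\times, \; -(Ae^{t\eta^\times})\zeta^\times s(e^{-t\eta^\times}u) \big) \\ &= \big( A\eta^\times\zeta^\times, \; -A\eta^\times\zeta^\times s(u) - A\zeta^\times Ds(u)(u \times \eta) \big). \end{aligned} \]
Consequently,
\[ \begin{aligned} [\eta^{E(3)}, \zeta^{E(3)}](A, a) &= \big( A(\eta^\times\zeta^\times - \zeta^\times\eta^\times), \; -A(\eta^\times\zeta^\times - \zeta^\times\eta^\times)s(u) \big) + \big( 0, \; A(\eta^\times Ds(u)(u \times \zeta) - \zeta^\times Ds(u)(u \times \eta)) \big) \\ &= (\eta \times \zeta)^{E(3)}(A, a) + \big( 0, \; A(\eta \times Ds(u)(u \times \zeta) - \zeta \times Ds(u)(u \times \eta)) \big), \end{aligned} \tag{50} \]
because $[\eta^\times, \zeta^\times] = \eta^\times\zeta^\times - \zeta^\times\eta^\times = (\eta \times \zeta)^\times$.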

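Using (48) and (50) we get
\[ \big( [\eta^{E(3)}, \widetilde{\eta}^{E(3)}] \,\lrcorner\, k^\flat(\omega^{E(3)}) \big)(A, a) = \langle Y(s)\omega, \eta \times \widetilde{\eta} \rangle + m \, \langle \omega \times s, \; \widetilde{\eta} \times Ds(u)(u \times \eta) - \eta \times Ds(u)(u \times \widetilde{\eta}) \rangle. \tag{51} \]
Therefore
\[ \begin{aligned} d(k^\flat(\omega^{E(3)}))(A, a)\big(\eta^{E(3)}(A, a), \widetilde{\eta}^{E(3)}(A, a)\big) &= L_{\eta^{E(3)}}\big(\widetilde{\eta}^{E(3)} \,\lrcorner\, k^\flat(\omega^{E(3)})\big) - L_{\widetilde{\eta}^{E(3)}}\big(\eta^{E(3)} \,\lrcorner\, k^\flat(\omega^{E(3)})\big) - [\eta^{E(3)}, \widetilde{\eta}^{E(3)}] \,\lrcorner\, k^\flat(\omega^{E(3)}), \quad \text{using (51) of chapter 1} \\ &= m \, \langle \omega \times Ds(u)(u \times \eta), \; \widetilde{\eta} \times s \rangle - m \, \langle \omega \times Ds(u)(u \times \widetilde{\eta}), \; \eta \times s \rangle - \langle Y(s)\omega, \eta \times \widetilde{\eta} \rangle. \end{aligned} \tag{52} \]
Using (46) and the abbreviations $\eta_\rightarrow$ and $\zeta_\uparrow$ for $(\eta^{E(3)}(A, a), 0)$ and $(0, \zeta)$, respectively, we obtain
\[ \begin{aligned} \varpi_{E(3) \times \mathbb{R}^3}((A, a), \omega)&(\eta_\rightarrow + \zeta_\uparrow, \; \widetilde{\eta}_\rightarrow + \widetilde{\zeta}_\uparrow) \\ &= k(A, a)\big(\eta^{E(3)}(A, a), \widetilde{\zeta}^{E(3)}(A, a)\big) - k(A, a)\big(\widetilde{\eta}^{E(3)}(A, a), \zeta^{E(3)}(A, a)\big) - d(k^\flat(\omega^{E(3)}))(A, a)\big(\eta^{E(3)}(A, a), \widetilde{\eta}^{E(3)}(A, a)\big) \\ &= \langle Y(s)\eta, \widetilde{\zeta} \rangle - \langle Y(s)\widetilde{\eta}, \zeta \rangle + \langle Y(s)\omega, \eta \times \widetilde{\eta} \rangle + m \, \langle \omega \times Ds(u)(u \times \widetilde{\eta}), \; \eta \times s \rangle - m \, \langle \omega \times Ds(u)(u \times \eta), \; \widetilde{\eta} \times s \rangle \\ &= \langle Y(s)\eta, \widetilde{\zeta} \rangle - \langle Y(s)\widetilde{\eta}, \zeta \rangle + \langle Y(s)\omega, \eta \times \widetilde{\eta} \rangle - m \, \langle u \times \widetilde{\eta}, \; Ds(u)(\omega \times (\eta \times s)) \rangle - m \, \langle s \times (\omega \times Ds(u)(u \times \eta)), \; \widetilde{\eta} \rangle, \\ &\qquad \text{since } Ds(u) \text{ is self adjoint on } \operatorname{span}\{u\}^\perp \\ &= \langle Y(s)\eta, \widetilde{\zeta} \rangle - \langle Y(s)\widetilde{\eta}, \zeta \rangle + \langle Y(s)\omega, \eta \times \widetilde{\eta} \rangle + m \, \langle u \times Ds(u)(\omega \times (\eta \times s)), \; \widetilde{\eta} \rangle - m \, \langle s \times (\omega \times Ds(u)(u \times \eta)), \; \widetilde{\eta} \rangle. \end{aligned} \]
Thus we have proved

Proposition 6.4.2.9. In the trivialization $\lambda^{-1}_{D_N}$ the distribution $H$ is $H_{E(3) \times \mathbb{R}^3}$ where
\[ (H_{E(3) \times \mathbb{R}^3})_{(A,a)} = (D_N)_{(A,a)} \times \mathbb{R}^3, \tag{53} \]
see (42) of §6 of chapter 1. Using the abbreviations $\eta_\rightarrow$ for $\eta^{E(3)}_\rightarrow(A, a) = (A\eta^\times, -A(\eta \times s))$ and $\zeta_\uparrow$ for $\zeta^{E(3)}_\uparrow(A, a) = (0, \zeta)$, see (45) of chapter 1, the nondegenerate 2-form $\varpi_{E(3) \times \mathbb{R}^3}$ in the trivialization $\lambda^{-1}_{D_N}$ is given by
\[ \varpi_{E(3) \times \mathbb{R}^3}((A, a), \omega)(\eta_\rightarrow + \zeta_\uparrow, \; \widetilde{\eta}_\rightarrow + \widetilde{\zeta}_\uparrow) = \langle Y(s)\eta, \widetilde{\zeta} \rangle - \langle Y(s)\widetilde{\eta}, \zeta \rangle + \langle Y(s)\omega, \eta \times \widetilde{\eta} \rangle + m \, \langle u \times Ds(u)(\omega \times (\eta \times s)), \; \widetilde{\eta} \rangle - m \, \langle s \times (\omega \times Ds(u)(u \times \eta)), \; \widetilde{\eta} \rangle. \tag{54} \]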

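6.4.3 Distributional vector field in a trivialization

In the trivialization $\lambda^{-1}_{D_N}$ (44), $\eta_\rightarrow = \eta^{E(3)}_\rightarrow(A, a)$ and $\zeta_\uparrow = \zeta^{E(3)}_\uparrow(A, a)$ are vector fields on $E(3) \times \mathbb{R}^3$ with values in $(H_{E(3) \times \mathbb{R}^3})_{(A,a)}$. By definition $Y_{E_{E(3) \times \mathbb{R}^3}}((A, a), \omega) = \eta_\rightarrow + \zeta_\uparrow$ is the distributional Hamiltonian vector field, which corresponds to the energy function
\[ \begin{aligned} E_{E(3) \times \mathbb{R}^3}((A, a), \omega) &= E(\lambda_{D_N}((A, a), \omega)) = E(A, a)(A\omega^\times, -A(\omega \times s)), \quad \text{where } E \text{ is the energy function (43) on } D_N \\ &= \tfrac{1}{2} \langle Y(s(u))\omega, \omega \rangle - mg \, \langle s(u), u \rangle, \end{aligned} \tag{55} \]
since $\langle a, e_3 \rangle = -\langle A s(u), e_3 \rangle = -\langle s(u), A^{-1}e_3 \rangle = -\langle s(u), u \rangle$ by (27). Using (55) we find that for any $\widetilde{\eta}, \widetilde{\zeta} \in \mathbb{R}^3$
\[ \begin{aligned} dE_{E(3) \times \mathbb{R}^3}((A, a), \omega)(\widetilde{\eta}_\rightarrow + \widetilde{\zeta}_\uparrow) &= D_1 E_{E(3) \times \mathbb{R}^3}((A, a), \omega)\, A\widetilde{\eta}^\times + \langle Y(s)\omega, \widetilde{\zeta} \rangle, \quad \text{since } E_{E(3) \times \mathbb{R}^3} \text{ does not depend on } a \\ &= \frac{d}{dt}\Big|_{t=0} \Big( \tfrac{1}{2} \langle Y(s(e^{-t\widetilde{\eta}^\times}u))\omega, \omega \rangle - mg \, \langle s(e^{-t\widetilde{\eta}^\times}u), \; e^{-t\widetilde{\eta}^\times}u \rangle \Big) + \langle Y(s)\omega, \widetilde{\zeta} \rangle \\ &= m \, \langle \omega \times Ds(u)(u \times \widetilde{\eta}), \; \omega \times s(u) \rangle - mg \, \langle Ds(u)(u \times \widetilde{\eta}), u \rangle - mg \, \langle s(u), u \times \widetilde{\eta} \rangle + \langle Y(s)\omega, \widetilde{\zeta} \rangle, \\ &\qquad \text{since } Y(s)\omega = I\omega + m\, s \times (\omega \times s) \\ &= m \, \langle \omega \times Ds(u)(u \times \widetilde{\eta}), \; \omega \times s \rangle - mg \, \langle s, u \times \widetilde{\eta} \rangle + \langle Y(s)\omega, \widetilde{\zeta} \rangle. \end{aligned} \tag{56} \]
The last equality above follows because $Ds(u)(u \times \widetilde{\eta}) \in T_{s(u)}S$ is orthogonal to $u$. The distributional Hamiltonian vector field $Y_{E_{E(3) \times \mathbb{R}^3}}$ on $E(3) \times \mathbb{R}^3$ satisfies
\[ Y_{E_{E(3) \times \mathbb{R}^3}} \,\lrcorner\, \varpi_{E(3) \times \mathbb{R}^3}((A, a), \omega) = dE_{E(3) \times \mathbb{R}^3}((A, a), \omega), \tag{57} \]
which evaluated at $\widetilde{\eta}_\rightarrow + \widetilde{\zeta}_\uparrow$ gives
\[ \varpi_{E(3) \times \mathbb{R}^3}((A, a), \omega)(\eta_\rightarrow + \zeta_\uparrow, \; \widetilde{\eta}_\rightarrow + \widetilde{\zeta}_\uparrow) = dE_{E(3) \times \mathbb{R}^3}((A, a), \omega)(\widetilde{\eta}_\rightarrow + \widetilde{\zeta}_\uparrow). \tag{58} \]
Using (54) and (56), we see that (58) is equivalent to
\[ \begin{aligned} \langle Y(s)\eta, \widetilde{\zeta} \rangle - \langle Y(s)\widetilde{\eta}, \zeta \rangle &+ \langle Y(s)\omega, \eta \times \widetilde{\eta} \rangle + m \, \langle \eta \times s, \; \omega \times Ds(u)(u \times \widetilde{\eta}) \rangle - m \, \langle \widetilde{\eta} \times s, \; \omega \times Ds(u)(u \times \eta) \rangle \\ &= \langle Y(s)\omega, \widetilde{\zeta} \rangle + m \, \langle \omega \times Ds(u)(u \times \widetilde{\eta}), \; \omega \times s \rangle - mg \, \langle s, u \times \widetilde{\eta} \rangle, \end{aligned} \tag{59} \]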

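that is, for every $\widetilde{\eta}, \widetilde{\zeta} \in \mathbb{R}^3$ we have
\[ \begin{aligned} \langle Y(s)\eta, \widetilde{\zeta} \rangle - \langle Y(s)\zeta, \widetilde{\eta} \rangle &+ \langle (Y(s)\omega) \times \eta, \widetilde{\eta} \rangle + m \, \langle u \times Ds(u)(\omega \times (\eta \times s)), \; \widetilde{\eta} \rangle - m \, \langle s \times (\omega \times Ds(u)(u \times \eta)), \; \widetilde{\eta} \rangle \\ &= \langle Y(s)\omega, \widetilde{\zeta} \rangle + m \, \langle u \times Ds(u)(\omega \times (\omega \times s)), \; \widetilde{\eta} \rangle + mg \, \langle u \times s, \widetilde{\eta} \rangle. \end{aligned} \tag{60} \]
Setting $\widetilde{\eta} = 0$, equation (60) becomes
\[ \langle Y(s)\eta, \widetilde{\zeta} \rangle = \langle Y(s)\omega, \widetilde{\zeta} \rangle, \qquad \text{for every } \widetilde{\zeta} \in \mathbb{R}^3. \]
Hence $Y(s)\eta = Y(s)\omega$, that is,
\[ \eta = \omega, \tag{61} \]
since $Y(s)$ is invertible. Setting $\widetilde{\zeta} = 0$ and using (61), equation (60) reads
\[ -\langle Y(s)\zeta, \widetilde{\eta} \rangle + \langle (Y(s)\omega) \times \omega, \widetilde{\eta} \rangle - m \, \langle s \times (\omega \times Ds(u)(u \times \omega)), \; \widetilde{\eta} \rangle = mg \, \langle u \times s, \widetilde{\eta} \rangle, \]
for every $\widetilde{\eta} \in \mathbb{R}^3$. In other words,
\[ -Y(s)\zeta + (Y(s)\omega) \times \omega - m\, s \times (\omega \times Ds(u)(u \times \omega)) = mg\, u \times s. \]
So
\[ \zeta = Y(s)^{-1} \big[ (Y(s)\omega) \times \omega + m\, (\omega \times Ds(u)(u \times \omega)) \times s - mg\, u \times s \big]. \tag{62} \]
Therefore
\[ Y_{E_{E(3) \times \mathbb{R}^3}}((A, a), \omega) = \big(A\omega^\times, -A(\omega \times s)\big) + (0, \zeta), \tag{63} \]
where $\zeta$ is given by (62). In other words, the integral curves of $Y_{E_{E(3) \times \mathbb{R}^3}}$ on $D_N$ satisfy
\[ \begin{aligned} \dot{A} &= A\omega^\times \\ \dot{a} &= -A(\omega \times s) \\ \dot{\omega} &= Y(s)^{-1} \big[ (Y(s)\omega) \times \omega + m\, (\omega \times Ds(u)(u \times \omega)) \times s - mg\, u \times s \big]. \end{aligned} \tag{64} \]

6.5 Reduction of the translational $\mathbb{R}^2$ symmetry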

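In this section we reduce the translational $\mathbb{R}^2$ symmetry of the strongly convex rolling rigid body. The subgroup $\mathbb{R}^2 = \mathbb{R}^2 \times \{0\} \subseteq \mathbb{R}^3$ of $E(3)$ formed by translations of the horizontal plane $\mathbb{R}^2 \times \{0\}$ acts on $TE(3)$ by
\[ \widehat{\Phi} : \mathbb{R}^2 \times TE(3) \to TE(3) : \big(b, ((A, a), (\dot{A}, \dot{a}))\big) \mapsto \big((A, a + b), (\dot{A}, \dot{a})\big). \tag{65} \]
The holonomic constraint $\langle A s(u) + a, e_3 \rangle = 0$, the nonholonomic constraint $\dot{A}\, s(u) + \dot{a} = 0$, the kinetic energy $T = \tfrac{1}{2} \langle Y(s)\omega, \omega \rangle$, and the potential energy $V = -mg \, \langle s(u), u \rangle$ with $u = A^{-1}e_3$ are all invariant under the $\mathbb{R}^2$-action $\widehat{\Phi}$. Therefore the phase space $D_N$ given by
\[ \big\{ ((A, a), (\dot{A}, \dot{a})) \in TE(3) \;\big|\; \langle A s(u) + a, e_3 \rangle = 0 \text{ and } \dot{A}\, s(u) + \dot{a} = 0 \big\}, \tag{66} \]
which is diffeomorphic to $TSO(3) \times \mathbb{R}^2$, and the equations of motion
\[ \begin{aligned} \dot{A} &= A\omega^\times \\ \dot{a} &= -A(\omega \times s) \\ \dot{\omega} &= Y(s)^{-1} \big[ (Y(s)\omega) \times \omega + m\, ((\omega \times Ds(u)(u \times \omega)) \times s) - mg\, u \times s \big] \end{aligned} \tag{67} \]
are also invariant under $\widehat{\Phi}$. Consequently, the distributional Hamiltonian system $(D_N, H_{D_N}, \varpi_{D_N}, h_{D_N})$, where $D_N = TSO(3) \times \mathbb{R}^2 \subseteq TE(3) \times \mathbb{R}^3$, $H_{D_N} = \{ (H_{E(3) \times \mathbb{R}^3})_u \mid u = (A, \omega) \in SO(3) \times \mathbb{R}^3 \subseteq E(3) \times \mathbb{R}^3 \}$, see (53), $\varpi_{D_N} = \varpi_{E(3) \times \mathbb{R}^3} | (H_{D_N} \times H_{D_N})$, see (54), and $h_{D_N} = (T + V)|D_N$, has an $\mathbb{R}^2$-symmetry.

6.5.1 The $\mathbb{R}^2$-reduced equations of motion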

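Consider the action of $\mathbb{R}^2$ on $T\mathrm{SO}(3)\times\mathbb{R}^2$ defined by

\[ \widehat\varphi : \mathbb{R}^2\times(T\mathrm{SO}(3)\times\mathbb{R}^2) \to T\mathrm{SO}(3)\times\mathbb{R}^2 : \bigl(b, ((A,\dot A),c)\bigr) \mapsto ((A,\dot A), c+b). \tag{68} \]

This action is free and proper, and it is transitive on each fiber $\{(A,\dot A)\}\times\mathbb{R}^2$. Its orbit space $\overline{T\mathrm{SO}(3)\times\mathbb{R}^2} = (T\mathrm{SO}(3)\times\mathbb{R}^2)/\mathbb{R}^2$ may be identified with $T\mathrm{SO}(3)$, since every $\mathbb{R}^2$-orbit passes through exactly one point of $T\mathrm{SO}(3)\times\{0\}$. The orbit map of this $\mathbb{R}^2$-action is

\[ \pi : T\mathrm{SO}(3)\times\mathbb{R}^2 \to T\mathrm{SO}(3) : ((A,\dot A),c) \mapsto (A,\dot A). \tag{69} \]

Now

\[ \widehat\psi : T\mathrm{SO}(3)\times\mathbb{R}^2 \to D_N : ((A,\dot A),c) \mapsto \Bigl( \bigl(A,\; a = \begin{pmatrix} c \\ -(As)_3 \end{pmatrix}\bigr),\; (\dot A,\; \dot a = -\dot A s) \Bigr) \]

is a diffeomorphism, which intertwines the $\mathbb{R}^2$-action $\widehat\varphi$ (68) with the $\mathbb{R}^2$-action $\widehat\Phi|(\mathbb{R}^2\times D_N)$ (65). In other words, for every $b \in \mathbb{R}^2$ we have $\widehat\psi\circ\widehat\varphi_b = \widehat\Phi_b\circ\widehat\psi$ on $T\mathrm{SO}(3)\times\mathbb{R}^2$. Consequently, $\widehat\psi$ induces a smooth mapping

\[ \overline\psi : T\mathrm{SO}(3) = \overline{T\mathrm{SO}(3)\times\mathbb{R}^2} \to \overline{D}_N = D_N/\mathbb{R}^2 : (A,\dot A) = \overline{((A,\dot A),c)} \mapsto \overline{\widehat\psi((A,\dot A),c)}, \tag{70} \]

where $\overline{((A,\dot A),c)} = \{\widehat\varphi_b((A,\dot A),c) \in T\mathrm{SO}(3)\times\mathbb{R}^2 \mid b \in \mathbb{R}^2\}$ is the orbit of the $\mathbb{R}^2$-action $\widehat\varphi$ on $T\mathrm{SO}(3)\times\mathbb{R}^2$ through $((A,\dot A),c)$ and $\overline{\widehat\psi((A,\dot A),c)} = \{\widehat\Phi_b(\widehat\psi((A,\dot A),c)) \in D_N \mid b \in \mathbb{R}^2\}$ is the orbit of the $\mathbb{R}^2$-action $\widehat\Phi|(\mathbb{R}^2\times D_N)$ on $D_N$ through $\widehat\psi((A,\dot A),c)$.

To verify that $\overline\psi$ is a diffeomorphism it suffices to show that it is injective, because then the map $\overline{(\widehat\psi)^{-1}}$ induced by $\widehat\psi^{-1}$ is equal to $(\overline\psi)^{-1}$. Suppose that $\overline{\widehat\psi((A,\dot A),c)} = \overline{\widehat\psi((B,\dot B),d)}$ for some $((A,\dot A),c)$ and $((B,\dot B),d)$ in $T\mathrm{SO}(3)\times\mathbb{R}^2$. Then $\widehat\psi((A,\dot A),c)$ and $\widehat\psi((B,\dot B),d)$ lie in the same $\mathbb{R}^2$-orbit on $D_N$, that is, there is a $b \in \mathbb{R}^2$ such that

\[ \widehat\psi((B,\dot B),d) = \widehat\Phi_b(\widehat\psi((A,\dot A),c)) = \widehat\psi(\widehat\varphi_b((A,\dot A),c)) = \widehat\psi((A,\dot A), b+c). \]

But $\widehat\psi$ is injective, so $((B,\dot B),d) = ((A,\dot A), b+c)$. Consequently, $(B,\dot B) = (A,\dot A)$, that is, the map $\overline\psi$ is injective.

Let

\[ \lambda : \mathrm{SO}(3)\times\mathbb{R}^3 \to T\mathrm{SO}(3) : (A,\omega) \mapsto (A, A\omega^{\times}) \tag{71} \]

be the inverse of a trivialization of $T\mathrm{SO}(3)$. Since $\pi$ (69) is the orbit map of the $\mathbb{R}^2$ action on $T\mathrm{SO}(3)\times\mathbb{R}^2$, the $\mathbb{R}^2$-reduced vector field $\overline X$ on the $\mathbb{R}^2$-orbit space $T\mathrm{SO}(3)$ is given by

\[ T_{((A,\dot A),c)}\pi\; Y_{E_{E(3)\times\mathbb{R}^3}}|(T\mathrm{SO}(3)\times\mathbb{R}^2)\bigl((A, \dot A = A\omega^{\times}), c\bigr) = \overline X(\pi((A, A\omega^{\times}), c)) = \overline X(A,\omega), \]

whose integral curves satisfy

\[ \begin{aligned} \dot A &= A\omega^{\times} \\ Y(s)\dot\omega &= (Y(s)\omega)\times\omega + m\,((\omega\times Ds(u)(u\times\omega))\times s) - mg\, u\times s. \end{aligned} \tag{72} \]

Thus we have shown

Proposition 6.5.1.10. The distributional Hamiltonian system $(D_N, H_{D_N}, \varpi_{D_N}, h_{D_N})$ is a Chaplygin system $(D, \ell, G)$, where $D = D_N$, $G = \mathbb{R}^2$, and $\ell = (T-V)|D_N$.³

6.5.2 Comparison with the Euler-Lagrange equations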

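It is tempting to view the kinetic $T = \tfrac12\langle Y(s)\omega,\omega\rangle$ and potential $V = -mg\,\langle s(u),u\rangle$ energies as functions on $T\mathrm{SO}(3)$ with coordinates $(A, \dot A = A\omega^{\times})$ and $u = A^{-1}e_3$. Using the trivialization $\lambda^{-1}$ (71), we compute the Euler-Lagrange equations of motion $\delta L = 0$, when $L = \ell\circ\lambda$ and $\ell = T - V$ are thought of as functions on $T\mathrm{SO}(3)$. In other words,

\[ L : \mathrm{SO}(3)\times\mathbb{R}^3 \to \mathbb{R} : (A,\omega) \mapsto \tfrac12\langle Y(s)\omega,\omega\rangle + mg\,\langle s(u),u\rangle. \tag{73} \]

³ See §4 of chapter 3. The proof of proposition 6.5.1.10 shows that $Q = N = \{(A,a) \in \mathrm{SO}(3)\times\mathbb{R}^3 \mid \langle As(u)+a, e_3\rangle = 0\}$ and $R = \overline Q = Q/\mathbb{R}^2 = \mathrm{SO}(3)$.

Using equation (17) of chapter 1, for every $a \in \mathbb{R}^3$ the Lagrange derivative $\langle \delta L(A(t),\omega(t)) \mid a\rangle$ is equal to

\[ \Bigl\langle \frac{d}{dt}\Bigl( \frac{\partial L}{\partial\omega}(A(t),\omega)\Big|_{\omega=\omega(t)} \Bigr) \,\Big|\, a \Bigr\rangle - \Bigl\langle \frac{\partial L}{\partial\dot A}(A(t),\dot A)\Big|_{\dot A = \frac{dA(t)}{dt}} \,\Big|\, [\omega(t)_{\mathrm{SO}(3)}, a_{\mathrm{SO}(3)}] \Bigr\rangle - \Bigl\langle \frac{\partial L}{\partial A}(A,\omega(t))\Big|_{A=A(t)} \,\Big|\, a_{\mathrm{SO}(3)} \Bigr\rangle. \tag{74} \]

But

\[ \Bigl\langle \frac{d}{dt}\frac{\partial L}{\partial\omega} \,\Big|\, a\Bigr\rangle = \Bigl\langle \frac{d(Y(s)\omega)}{dt}, a\Bigr\rangle = \bigl\langle Y(s)\dot\omega + m\, Ds(u)(u\times\omega)\times(\omega\times s) + m\, s\times(\omega\times Ds(u)(u\times\omega)),\, a\bigr\rangle. \tag{75} \]

Also

\[ \begin{aligned} \Bigl\langle \frac{\partial L}{\partial A}(A,\omega(t)) \,\Big|\, Aa^{\times}\Bigr\rangle &= \frac{d}{dt}\Big|_{t=0} L(Ae^{ta^{\times}},\omega) \\ &= \frac{d}{dt}\Big|_{t=0}\Bigl[ \tfrac12\langle I\omega,\omega\rangle + \tfrac12 m\,\langle \omega\times s(e^{-ta^{\times}}u),\, \omega\times s(e^{-ta^{\times}}u)\rangle + mg\,\langle s(e^{-ta^{\times}}u),\, e^{-ta^{\times}}u\rangle \Bigr] \\ &= m\,\langle \omega\times Ds(u)(u\times a),\, \omega\times s\rangle + mg\,\langle s(u),\, u\times a\rangle, \\ &\qquad \text{since } Ds(u)(u\times a) \in T_{s(u)}S \text{ is perpendicular to } u \\ &= m\,\langle u\times Ds(u)(\omega\times(\omega\times s)),\, a\rangle - mg\,\langle u\times s,\, a\rangle, \end{aligned} \tag{76} \]

because $Ds(u)$ is self adjoint. From $L(A, \dot A = A\omega^{\times}) = \tfrac12\langle\langle (Y(s)\omega)^{\times}, \omega^{\times}\rangle\rangle + mg\,\langle s(A^{-1}e_3), A^{-1}e_3\rangle$, where $\langle\langle X, Y\rangle\rangle = \tfrac12\operatorname{tr}(X^tY)$ for $X, Y \in \mathrm{so}(3)$, we find that

\[ \Bigl\langle \frac{\partial L}{\partial\dot A} \,\Big|\, [\omega_{\mathrm{SO}(3)}, a_{\mathrm{SO}(3)}]\Bigr\rangle = \langle\langle (Y(s)\omega)^{\times}, (\omega\times a)^{\times}\rangle\rangle = \langle Y(s)\omega,\, \omega\times a\rangle = \langle (Y(s)\omega)\times\omega,\, a\rangle. \tag{77} \]

Substituting (75), (76), and (77) into (74) we get

\[ \begin{aligned} \langle \delta L \mid a\rangle = \bigl\langle\, & Y(s)\dot\omega + m\, Ds(u)(u\times\omega)\times(\omega\times s) + m\, s\times(\omega\times Ds(u)(u\times\omega)) \\ & - (Y(s)\omega)\times\omega - m\, u\times Ds(u)(\omega\times(\omega\times s)) + mg\, u\times s,\; a \bigr\rangle. \end{aligned} \tag{78} \]

Consequently, the Euler-Lagrange equations of motion $\delta L = 0$ are

\[ \begin{aligned} \dot A &= A\omega^{\times} \\ \dot\omega &= Y(s)^{-1}\bigl[(Y(s)\omega)\times\omega - m\, Ds(u)(u\times\omega)\times(\omega\times s) - m\, s\times(\omega\times Ds(u)(u\times\omega)) \\ &\qquad\qquad + m\, u\times Ds(u)(\omega\times(\omega\times s)) - mg\, u\times s\bigr]. \end{aligned} \tag{79} \]

Apart from the common factor $Y(s)^{-1}$, the second equation in (79) differs from the second equation in (72) by

\[ m\,(\omega\times s)\times Ds(u)(u\times\omega) + m\, u\times Ds(u)(\omega\times(\omega\times s)). \tag{80} \]

Taking the inner product with $\eta$, the expression in (80) becomes

\[ m\,\langle \omega\times s,\; \omega\times Ds(u)(u\times\eta) - \eta\times Ds(u)(u\times\omega)\rangle, \tag{81} \]

which is the correction term for a Chaplygin system, see (20) of chapter 3.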
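The passage from (80) to (81) uses only the triple product identity and the self adjointness of $Ds(u)$, so it can be spot-checked numerically. The following is a minimal sketch, not part of the original text: the function names are hypothetical, and a random symmetric matrix `D` stands in for $Ds(u)$.

```python
import numpy as np

def correction_vector(m, omega, s, u, D):
    """Vector in (80): m (w x s) x D(u x w) + m u x D(w x (w x s))."""
    return (m * np.cross(np.cross(omega, s), D @ np.cross(u, omega))
            + m * np.cross(u, D @ np.cross(omega, np.cross(omega, s))))

def correction_pairing(m, omega, s, u, D, eta):
    """Scalar in (81): m <w x s, w x D(u x eta) - eta x D(u x w)>."""
    return m * np.dot(np.cross(omega, s),
                      np.cross(omega, D @ np.cross(u, eta))
                      - np.cross(eta, D @ np.cross(u, omega)))

rng = np.random.default_rng(0)
m = 1.7                                   # hypothetical mass
omega, s, eta = rng.normal(size=(3, 3))   # three random vectors in R^3
u = rng.normal(size=3)
u /= np.linalg.norm(u)                    # unit normal
B = rng.normal(size=(3, 3))
D = B + B.T                               # symmetric stand-in for Ds(u)

lhs = np.dot(correction_vector(m, omega, s, u, D), eta)
rhs = correction_pairing(m, omega, s, u, D, eta)
print(abs(lhs - rhs))
```

The printed difference should be at the level of floating point rounding for any choice of the random inputs, because the identity is purely algebraic.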

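6.5.3 The $\mathbb{R}^2$-reduced distribution $\overline H_{D_N}$ and the 2-form $\overline\varpi_{D_N}$

In this subsection we prove

Proposition 6.5.3.11. On the orbit space $\overline D_N = T\mathrm{SO}(3) = \mathrm{SO}(3)\times\mathbb{R}^3$ of the $\mathbb{R}^2$-action $\widehat\Phi|(\mathbb{R}^2\times D_N)$ (65) the $\mathbb{R}^2$-reduced distribution is

\[ \overline H_{D_N} = \overline H_{\mathrm{SO}(3)\times\mathbb{R}^3} = T\mathrm{SO}(3)\times\mathbb{R}^3. \tag{82} \]

The $\mathbb{R}^2$-reduced 2-form $\overline\varpi_{D_N}$ on $\overline H_{D_N}$ is the 2-form $\varpi_{E(3)\times\mathbb{R}^3}$ (54) restricted to $\mathrm{SO}(3)\times\mathbb{R}^3$. More explicitly, for every $(A,\omega) \in \mathrm{SO}(3)\times\mathbb{R}^3$ and every $\eta, \zeta, \tilde\eta, \tilde\zeta \in \mathbb{R}^3$ we have

\[ \begin{aligned} \overline\varpi_{D_N}(A,\omega)\bigl(\eta_{\rightarrow}+\zeta_{\uparrow},\, \tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}\bigr) = {} & \langle Y(s)\eta, \tilde\zeta\rangle - \langle Y(s)\zeta, \tilde\eta\rangle + \langle Y(s)\omega, \eta\times\tilde\eta\rangle \\ & - m\,\langle s\times(\omega\times Ds(u)(u\times\eta)), \tilde\eta\rangle + m\,\langle u\times Ds(u)(\omega\times(\eta\times s)), \tilde\eta\rangle. \end{aligned} \tag{83} \]

Here $\eta_{\rightarrow} = \eta^{\mathrm{SO}(3)}_{\rightarrow}$ and $\zeta_{\uparrow} = \zeta^{\mathrm{SO}(3)}_{\uparrow}$ are vector fields on $\mathrm{SO}(3) \subseteq E(3)$ with values in $T\mathrm{SO}(3)\times\mathbb{R}^3$ given by

\[ \eta^{\mathrm{SO}(3)}_{\rightarrow}(A) = (A\eta^{\times}, 0) \quad\text{and}\quad \zeta^{\mathrm{SO}(3)}_{\uparrow}(A) = (0,\zeta). \tag{84} \]

Proof. Recall that $N = \{(A,a) \in \mathrm{SO}(3)\times\mathbb{R}^3 \mid \langle As+a, e_3\rangle = 0\}$ and

\[ D_N = \bigl\{ (n,v) \in TN = T(\mathrm{SO}(3)\times\mathbb{R}^3) \;\big|\; n \in N \text{ and } v = (\dot A,\dot a) \in T_n(\mathrm{SO}(3)\times\mathbb{R}^3) \text{ with } \dot A s + \dot a = 0 \bigr\}. \]

Fix $p = (n,v) \in D_N$ and let $O_p = \{\widehat\Phi_c(p) \in D_N \text{ for all } c \in \mathbb{R}^2\}$ be the orbit through $p$ of the $\mathbb{R}^2$-action $\widehat\Phi|(\mathbb{R}^2\times D_N)$ (65). Then

\[ \begin{aligned} T_pO_p &= \{ T_e\widehat\Phi^p d \in T_pD_N \text{ for every } d \in \mathbb{R}^2 \} \\ &= \Bigl\{ \frac{d}{dt}\Big|_{t=0} \widehat\Phi_{c+td}(p) \in T_pD_N \text{ for every } d \in \mathbb{R}^2 \Bigr\} \\ &= \Bigl\{ \frac{d}{dt}\Big|_{t=0} \Bigl( \bigl(A, \begin{pmatrix} c+td \\ -(As)_3 \end{pmatrix}\bigr), (\dot A, -\dot A s) \Bigr) \in T_pD_N \text{ for every } d \in \mathbb{R}^2 \Bigr\} \\ &= \Bigl\{ \Bigl( \bigl(0, \begin{pmatrix} d \\ 0 \end{pmatrix}\bigr), (0,0) \Bigr) \in T_pD_N \text{ for every } d \in \mathbb{R}^2 \Bigr\}. \end{aligned} \tag{85} \]

From equation (41) of chapter 1 we know that

\[ (H_{D_N})_p = T_{(n,v)}\tilde\lambda(D_n\times\mathbb{R}^3), \tag{86} \]

where $\tilde\lambda$ is the map given by

\[ \tilde\lambda : N\times\mathbb{R}^3 \to D_N : (n,c) \mapsto \bigl(n, (\dot A = Ac^{\times},\, \dot a = -Ac^{\times}s)\bigr), \tag{87} \]

which is the inverse of a trivialization of $D_N$. For $\bigl((Ac^{\times}, -Ac^{\times}s(u)), f\bigr) \in D_n\times\mathbb{R}^3$ we have

\[ \begin{aligned} T_{(n,c)}\tilde\lambda\bigl((Ac^{\times}, -Ac^{\times}s(u)), f\bigr) &= \frac{d}{dt}\Big|_{t=0} \tilde\lambda\Bigl( \bigl(Ae^{tc^{\times}}, \begin{pmatrix} b \\ -(Ae^{tc^{\times}}s(u))_3 \end{pmatrix}\bigr),\, c+tf \Bigr) \\ &= \frac{d}{dt}\Big|_{t=0} \Bigl( \bigl(Ae^{tc^{\times}}, \begin{pmatrix} b \\ -(Ae^{tc^{\times}}s(u))_3 \end{pmatrix}\bigr),\, \bigl(A(c+tf)^{\times}, -A(c+tf)^{\times}s(u)\bigr) \Bigr) \\ &= \Bigl( \bigl(Ac^{\times}, \begin{pmatrix} 0 \\ -(Ac^{\times}s(u))_3 \end{pmatrix}\bigr),\, \bigl(Af^{\times}, -Af^{\times}s(u)\bigr) \Bigr) \in (H_{D_N})_p, \end{aligned} \tag{88} \]

where $p = \tilde\lambda(n,c)$. From (86) and (88) we see that $T_pO_p \cap (H_{D_N})_p = \{0\}$ and $T_pD_N = (H_{D_N})_p \oplus T_pO_p$. Let

\[ \rho : D_N \to \overline D_N = T\mathrm{SO}(3) : \bigl((A,a),(\dot A,\dot a)\bigr) \mapsto (A,\dot A) \]

be the orbit map of the $\mathbb{R}^2$-action $\widehat\Phi|(\mathbb{R}^2\times D_N)$ (65) on $D_N$. On $D_N$ the distribution $U : D_N \to TD_N : u \mapsto U_u$ defined by

\[ U_p = \bigl\{ v_p \in (H_{D_N})_p \;\big|\; \varpi_{D_N}(p)(v_p, w_p) = 0 \text{ for every } w_p \in \ker T_p\rho \cap (H_{D_N})_p \bigr\}, \]

see proposition 3.3.25 of chapter 3, equals $(H_{D_N})_p$, because $\ker T_p\rho = T_pO_p$ and $T_pO_p \cap (H_{D_N})_p = \{0\}$. By definition the $\mathbb{R}^2$-reduced distribution is $(\overline H_{D_N})_{\overline p} = T_p\rho\,(H_{D_N})_p = T_{\overline p = (A,\dot A)}(T\mathrm{SO}(3))$, since $\ker T_p\rho = T_pO_p$, $T_pD_N = T_pO_p \oplus (H_{D_N})_p$, and the map $\rho$ is a surjective submersion. This proves (82). To prove (83) we note (84). So (83) follows immediately from (54).

Let $\mathrm{SO}(2) = \{R \in \mathrm{SO}(3) \mid Re_3 = e_3\}$. We have an $\mathrm{SO}(2)$-action on $T\mathrm{SO}(3)$ defined by

\[ \varphi : \mathrm{SO}(2)\times T\mathrm{SO}(3) \to T\mathrm{SO}(3) : (R, (A,\dot A)) \mapsto (RA, R\dot A). \tag{89} \]

Identifying $T\mathrm{SO}(3)$ with $\mathrm{SO}(3)\times\mathbb{R}^3$ using the map

\[ \lambda : \mathrm{SO}(3)\times\mathbb{R}^3 \to T\mathrm{SO}(3) : (A,\omega) \mapsto (A, A\omega^{\times}), \]

the $\mathrm{SO}(2)$-action (89) becomes the action

\[ \tilde\varphi : \mathrm{SO}(2)\times(\mathrm{SO}(3)\times\mathbb{R}^3) \to \mathrm{SO}(3)\times\mathbb{R}^3 : (R, (A,\omega)) \mapsto (RA, \omega). \tag{90} \]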

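We now prove

Lemma 6.5.3.12. The $\mathbb{R}^2$-reduced 2-form $\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}$ (83) on the $\mathbb{R}^2$-reduced space $T\mathrm{SO}(3) = \mathrm{SO}(3)\times\mathbb{R}^3$ is invariant under the $\mathrm{SO}(2)$-action (90).

Proof. Because $(RA)^{-1}e_3 = A^{-1}R^{-1}e_3 = A^{-1}e_3 = u$, the point $u = A^{-1}e_3$ on the 2-sphere $S^2$ is fixed by the $\mathrm{SO}(2)$-action (90). From (83) we get $\tilde\varphi_R^{*}\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3} = \varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}$ for every $R \in \mathrm{SO}(2)$.

Corresponding to $e_3 \in \mathbb{R}^3$ there is the vector field $e_3^{\mathrm{SO}(3)}$ on $\mathrm{SO}(3)$ defined by $e_3^{\mathrm{SO}(3)}(A) = e_3^{\times}A$. Because

\[ e_3^{\times}A = A(A^{-1}e_3^{\times}A) = A(A^{-1}e_3)^{\times} = Au^{\times}, \]

we see that $e_3^{\mathrm{SO}(3)} = u^{\mathrm{SO}(3)}_{\rightarrow} = u_{\rightarrow}$. From the invariance of $\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}$ under the $\mathrm{SO}(2)$-action (90) it follows that

\[ 0 = L_{u_{\rightarrow}}\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3} = u_{\rightarrow}\,\lrcorner\,(d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}) + d(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}), \]

that is,

\[ u_{\rightarrow}\,\lrcorner\,(d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}) = -\,d(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}). \tag{91} \]

We now prove

Proposition 6.5.3.13. At each point of $\mathrm{SO}(3)\times\mathbb{R}^3$ where $\langle s(u),u\rangle \neq 0$ the kernel of the 2-form $u_{\rightarrow}\,\lrcorner\,(d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3})$ is a 2-dimensional vector subspace of the tangent space of $T\mathrm{SO}(3)$, which contains the vector $u_{\rightarrow}$; otherwise it is at least 4-dimensional.

Proof. Consider the kinetic energy metric $k$ on $\mathrm{SO}(3)$ defined by $k(A)(A\eta^{\times}, A\zeta^{\times}) = \langle Y(s)\eta, \zeta\rangle$. Clearly $k$ is invariant under the $\mathrm{SO}(2)$-action (89) on $T\mathrm{SO}(3)$. Let $\omega_{\mathrm{SO}(3)}$ be the canonical symplectic form on $T^{*}\mathrm{SO}(3)$. The 2-form $\omega = (\lambda\circ k^{\flat})^{*}\omega_{\mathrm{SO}(3)}$ on $\mathrm{SO}(3)\times\mathbb{R}^3$ is invariant under the $\mathrm{SO}(2)$-action (90) because $\lambda$, $k$, and $\omega_{\mathrm{SO}(3)}$ are. Therefore

\[ 0 = L_{u_{\rightarrow}}\omega = d(u_{\rightarrow}\,\lrcorner\,\omega), \]

because $u_{\rightarrow}\,\lrcorner\,d\omega = u_{\rightarrow}\,\lrcorner\,(\lambda\circ k^{\flat})^{*}d\omega_{\mathrm{SO}(3)} = 0$. So

\[ d(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}) = d\bigl(u_{\rightarrow}\,\lrcorner\,(\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3} - \omega)\bigr). \]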


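Using (54) we obtain

\[ \begin{aligned} \omega(A,\omega)\bigl(\eta_{\rightarrow}+\zeta_{\uparrow},\, \tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}\bigr) = {} & \langle Y(s)\eta, \tilde\zeta\rangle - \langle Y(s)\zeta, \tilde\eta\rangle + \langle Y(s)\omega, \eta\times\tilde\eta\rangle \\ & - m\,\langle \omega\times Ds(u)(u\times\eta),\, \tilde\eta\times s\rangle - m\,\langle \omega\times s,\, \tilde\eta\times Ds(u)(u\times\eta)\rangle \\ & + m\,\langle \omega\times Ds(u)(u\times\tilde\eta),\, \eta\times s\rangle + m\,\langle \omega\times s,\, \eta\times Ds(u)(u\times\tilde\eta)\rangle. \end{aligned} \]

Subtracting this from (83), the terms involving $\zeta$ and $\tilde\zeta$ and the term $\langle Y(s)\omega, \eta\times\tilde\eta\rangle$ cancel. Using the triple product identity and the self adjointness of $Ds(u)$ on the remaining terms, we get

\[ (\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3} - \omega)(A,\omega)\bigl(\eta_{\rightarrow}+\zeta_{\uparrow},\, \tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}\bigr) = m\,\langle Ds(u)(u\times\eta)\times(\omega\times s),\, \tilde\eta\rangle - m\,\langle u\times Ds(u)(\eta\times(\omega\times s)),\, \tilde\eta\rangle, \tag{92} \]

which does not depend on either $\zeta$ or $\tilde\zeta$. Therefore

\[ \Omega = \bigl(u_{\rightarrow}\,\lrcorner\,(\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3} - \omega)\bigr)(A,\omega)\bigl(\tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}\bigr) = -m\,\langle \omega\times s,\; u\times Ds(u)(u\times\tilde\eta)\rangle, \tag{93} \]

setting $\eta = u$ in (92) and using $u\times u = 0$. Consequently, the matrix $M$ of $d(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3})$ with respect to the basis $\{(e_i,0),(0,e_i)\}_{i=1}^{3}$ of the space $T_{(A,\omega)}(\mathrm{SO}(3)\times\mathbb{R}^3) = \mathbb{R}^3\times\mathbb{R}^3$ equals the matrix of $d\Omega$, which is given by

\[ M = m\begin{pmatrix} B & -C^{t} \\ C & 0 \end{pmatrix}. \]

Here $C$ is the matrix of the linear transformation

\[ \mathbb{R}^3 \to \mathbb{R}^3 : \tilde\eta \mapsto s\times\bigl(u\times Ds(u)(u\times\tilde\eta)\bigr) = \langle s, Ds(u)(u\times\tilde\eta)\rangle\, u - \langle s,u\rangle\, Ds(u)(u\times\tilde\eta). \tag{94} \]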
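The linear map $C$ of (94) and the rank statements used in the rest of the proof can be illustrated numerically. The sketch below is an illustration under stated assumptions, not the authors' construction: `D` is a hypothetical stand-in for $Ds(u)$, built to be symmetric, to map into $u^{\perp}$, and to be positive definite there; `hat` is the usual map $v \mapsto v^{\times}$.

```python
import numpy as np

def hat(v):
    """Skew matrix v^x with hat(v) @ w == np.cross(v, w)."""
    return np.array([[0.0, -v[2], v[1]],
                     [v[2], 0.0, -v[0]],
                     [-v[1], v[0], 0.0]])

def C_matrix(s, u, D):
    """Matrix of eta -> s x (u x D(u x eta)), left side of (94)."""
    return hat(s) @ hat(u) @ D @ hat(u)

def C_expanded(s, u, D):
    """Matrix of eta -> <s, D(u x eta)> u - <s, u> D(u x eta), right side of (94)."""
    return np.outer(u, s) @ D @ hat(u) - np.dot(s, u) * (D @ hat(u))

rng = np.random.default_rng(1)
u = rng.normal(size=3)
u /= np.linalg.norm(u)
P = np.eye(3) - np.outer(u, u)                 # orthogonal projector onto u-perp
G = rng.normal(size=(3, 3))
D = P @ (G @ G.T + np.eye(3)) @ P              # hypothetical stand-in for Ds(u)

s_generic = rng.normal(size=3)                 # <s, u> != 0 generically
s_tangent = P @ s_generic                      # <s, u> = 0

rank_generic = np.linalg.matrix_rank(C_matrix(s_generic, u, D), tol=1e-9)
rank_tangent = np.linalg.matrix_rank(C_matrix(s_tangent, u, D), tol=1e-9)
print(rank_generic, rank_tangent)
```

With $\langle s,u\rangle \neq 0$ the kernel of $C$ is spanned by $u$, so the rank is 2; with $\langle s,u\rangle = 0$ the map factors through a linear form and the kernel is at least 2-dimensional.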

If $C\tilde\eta = 0$ and $\langle s,u\rangle \neq 0$, then (94) implies that $Ds(u)(u\times\tilde\eta)$ is a multiple $\mu$ of $u$. Because $Ds(u)(u\times\tilde\eta) \in T_{s(u)}S$, which is perpendicular to $u$, it follows that $\mu = 0$. Hence $u\times\tilde\eta = 0$, since $Ds(u)$ is invertible, being symmetric and positive definite. Therefore $\tilde\eta = \tilde\mu\, u$. Consequently, $\ker C$ is 1-dimensional, which implies that $\ker C^{t}$ is 1-dimensional also. Therefore for any $(A,\omega) \in \mathrm{SO}(3)\times\mathbb{R}^3$ the kernel of $-d(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3})(A,\omega)$ is at most 2-dimensional. Because $d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}(A,\omega)$ is skew symmetric, it follows that $u_{\rightarrow}\,\lrcorner\,(u_{\rightarrow}\,\lrcorner\,d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3})(A,\omega) = 0$. In other words, $u_{\rightarrow} \in \ker\bigl(u_{\rightarrow}\,\lrcorner\,d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}\bigr)(A,\omega)$. Since the kernel of a 2-form is even dimensional, we find that the kernel of $u_{\rightarrow}\,\lrcorner\,d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}$ is 2-dimensional at any $(A,\omega) \in \mathrm{SO}(3)\times\mathbb{R}^3$ where $\langle s(u),u\rangle \neq 0$. This proves the first conclusion of the proposition.

If $\langle s(u),u\rangle = 0$, then $C$ is the linear map

\[ C : \mathbb{R}^3 \to \mathbb{R}^3 : \tilde\eta \mapsto \langle s, Ds(u)(u\times\tilde\eta)\rangle\, u. \]

Hence $\ker C$ is the kernel of the linear form

\[ \mathbb{R}^3 \to \mathbb{R} : \tilde\eta \mapsto \langle s, Ds(u)(u\times\tilde\eta)\rangle, \]

which is at least 2-dimensional. So $\dim\ker C^{t} \geq 2$. Because $u_{\rightarrow}$ lies in $\ker\bigl(u_{\rightarrow}\,\lrcorner\,d\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}\bigr)$, it follows that the kernel of the matrix $M$ at $(A,\omega)$ contains the vector space $\operatorname{span}\{u_{\rightarrow}, \ker C\}$, whose dimension is at least 3. Because $d(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3})$ is skew symmetric, its kernel is even dimensional and hence has dimension greater than or equal to 4. This proves the second assertion and thus the proposition.

If the center of mass of the body $B$, which is assumed to be at the origin, is in the interior of its bounding surface $S$, then $\langle s(u),u\rangle < 0$ at every point of $S$, because we assumed that $u$ is an inward pointing normal to $S$. If the center of mass of $B$ is on $S$, then the locus of points on $S$ where $\langle s(u),u\rangle = 0$ is a codimension 2 submanifold defined by $s = s(u) = 0$. If the center of mass of $B$ is outside of $S$, then the locus of points on $S$ where $\langle s(u),u\rangle = 0$ is a codimension 1 submanifold traced out by all normals to $S$ which pass through the origin, which is the center of mass of $B$.

6.6 Reduction of E(2) symmetry

In this section we remove the symmetry group $E(2)$ of Euclidean motions of the plane from the smooth strongly convex rigid body rolling without slipping on a horizontal plane by applying the reduction process described in chapter 3. $E(2)$ is the subgroup of $E(3)$ consisting of all $(R,x) \in \mathrm{SO}(3)\times\mathbb{R}^3 = E(3)$ such that $R$ is a rotation about the $e_3$-axis in $\mathbb{R}^3$ and $x \in \mathbb{R}^2\times\{0\} \subseteq \mathbb{R}^3$. $E(2)$ acts on $\mathbb{R}^3$ by sending $((R,x),y)$ to $Ry+x$. We let $E(2)$ act on $E(3)$ by

\[ E(2)\times E(3) \to E(3) : \bigl((R,x),(A,a)\bigr) \mapsto (RA, Ra+x). \]

This induces an $E(2)$ action on $TE(3)$ given by

\[ E(2)\times TE(3) \to TE(3) : \bigl((R,x), ((A,a),(\dot A,\dot a))\bigr) \mapsto \bigl((RA, Ra+x), (R\dot A, R\dot a)\bigr). \]

Finally, using the trivialization

\[ TE(3) \to (\mathrm{SO}(3)\times\mathbb{R}^3)\times\mathbb{R}^3\times\mathbb{R}^3 = E(3)\times\mathbb{R}^6 : \bigl((A,a),(\dot A,\dot a)\bigr) \mapsto (A,a,\omega,b), \]

where $\dot A = A\omega^{\times}$ and $\dot a = Ab$, leads to the action

\[ \Phi : E(2)\times(E(3)\times\mathbb{R}^6) \to E(3)\times\mathbb{R}^6 : \bigl((R,x),(A,a,\omega,b)\bigr) \mapsto (RA, Ra+x, \omega, b). \tag{95} \]

6.6.1 E(2) symmetry

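We now explain what we mean by $E(2)$ being a symmetry group of the strongly convex rolling rigid body. The smooth strongly convex body rolling on a horizontal plane is a distributional Hamiltonian system $(D_N, H_{D_N}, \varpi_{D_N}, E_{D_N})$, where

1. $D_N$ is the accessible set of the constraint distribution $D$ on $E(3)$ defined by

\[ D_N = \{(A,a,\omega,b) \in E(3)\times e(3) \mid b = -\omega\times s(u),\ \langle a,e_3\rangle = -\langle s(u),u\rangle\}, \tag{96} \]

see lemma 6.3.3;

2. $H_{D_N}$ is the $\varpi_{D_N}$-symplectic distribution on $D_N$, which at $v \in D_N$ is

\[ (H_{D_N})_v = \bigl\{ \bigl(Ay^{\times},\, -A(y\times s(u)),\, z^{\times},\, -z\times s(u) + \omega\times Ds(u)(u\times y)\bigr) \in T_vD_N \;\big|\; y,z \in \mathbb{R}^3 \bigr\}; \tag{97} \]

3. $\varpi_{D_N}$ is the symplectic form

\[ \begin{aligned} \varpi_{D_N}(v)\Bigl( & \bigl(Ay^{\times}, -A(y\times s), z^{\times}, -z\times s + \omega\times Ds(u)(u\times y)\bigr),\ \bigl(A\tilde y^{\times}, -A(\tilde y\times s), \tilde z^{\times}, -\tilde z\times s + \omega\times Ds(u)(u\times\tilde y)\bigr) \Bigr) \\ = {} & -\langle Y(s)z, \tilde y\rangle + \langle Y(s)\tilde z, y\rangle + \langle Y(s)\omega, y\times\tilde y\rangle \\ & - m\,\langle (\omega\times Ds(u)(u\times y))\times s,\, \tilde y\rangle + m\,\langle (\omega\times Ds(u)(u\times\tilde y))\times s,\, y\rangle; \end{aligned} \tag{98} \]

4. $E_{D_N}$ is the Hamiltonian

\[ E_{D_N} : D_N \subseteq D \to \mathbb{R} : v \mapsto \tfrac12\langle I(\omega),\omega\rangle + \tfrac12 m\,\langle b,b\rangle + mg\,\langle a,e_3\rangle. \tag{99} \]

We now prove

Proposition 6.6.1.14. $E(2)$ is a symmetry of the distributional Hamiltonian system $(D_N, H_{D_N}, \varpi_{D_N}, E_{D_N})$.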


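Proof.

1. From the definition of $\Phi$ (95) it follows that $\Phi$ leaves $b$ and $\omega$ unchanged. Moreover, the $\mathrm{SO}(2)$ action $R\cdot u$ on $u$ leaves $u$ fixed, since $(RA)^{-1}e_3 = A^{-1}R^{-1}e_3 = A^{-1}e_3$. Because the action $\Phi$ maps $\dot A$ to $R\dot A$ and $\dot a$ to $R\dot a$, it maps the defining equation $\dot A s(u) + \dot a = 0$ of $D_N$ to $R\dot A s(u) + R\dot a = 0$. Therefore $D_N$ is invariant under $\Phi$.

2. Differentiating the map $\Phi_{(R,x)}$ gives

\[ T_v\Phi_{(R,x)}(\dot A,\dot a,\dot\omega,\dot b) = (R\dot A, R\dot a, \dot\omega, \dot b), \tag{100} \]

for every $v \in D_N$, every $(R,x) \in E(2)$, and every $(\dot A,\dot a,\dot\omega,\dot b) \in T_vD_N$. Applying $T_v\Phi_{(R,x)}$ to $w_v = \bigl(Ay^{\times}, -A(y\times s(u)), z^{\times}, -z\times s(u) + \omega\times Ds(u)(u\times y)\bigr) \in H_v$ gives

\[ T_v\Phi_{(R,x)}w_v = \bigl(RAy^{\times},\, -RA(y\times s(R\cdot u)),\, z^{\times},\, -z\times s(R\cdot u) + \omega\times Ds(R\cdot u)(R\cdot u\times y)\bigr). \tag{101} \]

From the definition of the distribution $H_{D_N}$ (97) it follows that $T_v\Phi_{(R,x)}w_v \in H_{D_N}$. Thus $H_{D_N}$ is $\Phi$-invariant.

3. From the definition of the 2-form $\varpi_{D_N}(v)$ (98) we see that

\[ \begin{aligned} \varpi_{D_N}(\Phi_{(R,x)}(v))\bigl(T_v\Phi_{(R,x)}w_v,\, T_v\Phi_{(R,x)}w'_v\bigr) = {} & -\langle Y(s(R\cdot u))z, \tilde y\rangle + \langle Y(s(R\cdot u))\tilde z, y\rangle + \langle Y(s(R\cdot u))\omega, y\times\tilde y\rangle \\ & - m\,\langle (\omega\times Ds(R\cdot u)(R\cdot u\times y))\times s(R\cdot u),\, \tilde y\rangle \\ & + m\,\langle (\omega\times Ds(R\cdot u)(R\cdot u\times\tilde y))\times s(R\cdot u),\, y\rangle \\ = {} & \varpi_{D_N}(v)(w_v, w'_v), \end{aligned} \]

since $R\cdot u = u$. Thus $\varpi_{D_N}$ is $\Phi$-invariant.

4. We compute

\[ E_{D_N}(RA, Ra+x, \omega, b) = \tfrac12\langle I(\omega),\omega\rangle + \tfrac12 m\,\langle b,b\rangle + mg\,\langle Ra+x, e_3\rangle = E_{D_N}(A,a,\omega,b), \tag{102} \]

because $\langle Ra, e_3\rangle = \langle a, R^{-1}e_3\rangle = \langle a, e_3\rangle$ and $\langle x, e_3\rangle = 0$ for $x \in \mathbb{R}^2\times\{0\}$. Thus the Hamiltonian $E_{D_N}$ is $\Phi$-invariant.

Consequently, $E(2)$ is a symmetry of the distributional Hamiltonian system $(D_N, H_{D_N}, \varpi_{D_N}, E_{D_N})$ which describes the strongly convex rolling rigid body.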
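Two of the invariance statements in this proof, namely that $u = A^{-1}e_3$ is fixed and that the Hamiltonian (99) is unchanged under the action (95), are easy to test numerically. The following is a minimal sketch; the helper names, the inertia tensor, and the values of $m$ and $g$ are assumptions made only for the illustration.

```python
import numpy as np

E3 = np.array([0.0, 0.0, 1.0])

def rot_z(theta):
    """Rotation about the e3-axis, i.e. an element of SO(2) inside SO(3)."""
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def random_rotation(rng):
    """Random element of SO(3) from a QR factorization."""
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

def e2_action(R, x, point):
    """Action (95): ((R, x), (A, a, w, b)) -> (RA, Ra + x, w, b)."""
    A, a, omega, b = point
    return R @ A, R @ a + x, omega, b

def hamiltonian(point, inertia, m, g):
    """Hamiltonian (99): 1/2 <I w, w> + 1/2 m <b, b> + m g <a, e3>."""
    _, a, omega, b = point
    return 0.5 * omega @ inertia @ omega + 0.5 * m * (b @ b) + m * g * (a @ E3)

def u_of(point):
    """u = A^{-1} e3 = A^T e3 for a rotation matrix A."""
    return point[0].T @ E3

rng = np.random.default_rng(2)
inertia = np.diag([1.0, 2.0, 3.0])               # hypothetical inertia tensor
point = (random_rotation(rng), rng.normal(size=3),
         rng.normal(size=3), rng.normal(size=3))
R, x = rot_z(0.7), np.array([1.5, -0.4, 0.0])    # x lies in R^2 x {0}
moved = e2_action(R, x, point)

print(hamiltonian(point, inertia, 1.0, 9.81) - hamiltonian(moved, inertia, 1.0, 9.81))
print(u_of(point) - u_of(moved))
```

Both printed quantities should vanish up to rounding. Replacing `x` by a vector with a nonzero third component breaks the invariance of the energy, which shows why only horizontal translations are allowed.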

6.6.2 E(2)-orbit space

In this subsection we reduce the $E(2)$-symmetry of the rolling rigid body. The first step is to construct the space of $E(2)$-orbits $\overline D_N = D_N/E(2)$ of the $E(2)$-action $\Phi$ restricted to $D_N$. Towards this goal we prove

Lemma 6.6.2.15. The $E(2)$-action

\[ \tilde\Phi : E(2)\times D_N \to D_N : \bigl((R,x),(A,a,\omega,b)\bigr) \mapsto (RA, Ra+x, \omega, b) \]

is free and proper.

Proof. Suppose that $(RA, Ra+x, \omega, b) = (A,a,\omega,b)$. Then $R = e$ and $x = 0$. Thus the action $\tilde\Phi$ is free.

Next we need to show that the $E(2)$-action $\tilde\Phi$ on $D_N$ is proper. We will use the criterion given in paragraph 4 of §2 of chapter 2. Suppose that the sequences $\{(A_n,a_n,\omega_n,b_n)\}$ and $\{(R_nA_n, R_na_n+x_n, \omega_n, b_n)\}$ in $D_N$ converge. We must show that there is a subsequence of the sequence $\{(R_n,x_n)\}$ in $E(2)$ which converges. Towards this goal, we observe that the sequence $\{R_n = (R_nA_n)A_n^{-1}\}$ in $\mathrm{SO}(3,\mathbb{R})$ converges, since by hypothesis the sequence $\{R_nA_n\}$ in $\mathrm{SO}(3,\mathbb{R})$ converges and the sequence $\{A_n^{-1}\}$ in $\mathrm{SO}(3,\mathbb{R})$ converges because by hypothesis the sequence $\{A_n\}$ in $\mathrm{SO}(3,\mathbb{R})$ converges. Since $\{R_n\}$ and $\{a_n\}$ converge, the sequence $\{R_na_n\}$ in $\mathbb{R}^3$ converges. Hence the sequence $\{x_n = (R_na_n+x_n) - R_na_n\}$ in $\mathbb{R}^3$ converges, since the sequence $\{R_na_n+x_n\}$ converges in $\mathbb{R}^3$ by hypothesis. Therefore the sequence $\{(R_n,x_n)\}$ in $E(2)$ converges. This shows that the $E(2)$-action $\tilde\Phi$ is proper.

From lemma 6.6.2.15 we see that the orbit space $\overline D_N$ is a smooth manifold such that the canonical projection map $\rho : D_N \to D_N/E(2) = \overline D_N$ defines an $E(2)$ principal bundle. More precisely, we have

Lemma 6.6.2.16. Let

\[ \pi : D_N \to S^2\times\mathbb{R}^3 : (A,a,\omega,b) \mapsto (u,\omega), \tag{103} \]

where $u = A^{-1}e_3$, and let $\rho : D_N \to D_N/E(2) = \overline D_N$ be the orbit map of the $E(2)$-action $\Phi$ (95). Then there is a unique map $\psi : \overline D_N \to S^2\times\mathbb{R}^3$ such that $\pi = \psi\circ\rho$. Moreover, $\psi$ is a diffeomorphism.

Proof. The map $\pi$ is constant on $E(2)$-orbits because

\[ \pi(RA, Ra+x, \omega, b) = ((RA)^{-1}e_3, \omega) = (A^{-1}e_3, \omega) = (u,\omega) = \pi(A,a,\omega,b). \]


This proves the existence of the map $\psi$. Also $\psi$ is unique because $\rho$ is surjective. Next we show that each fiber of $\pi$ is a single $E(2)$-orbit on $D_N$, which implies that $\psi$ is injective. Suppose that $(A,a,\omega,b), (A',a',\omega',b') \in D_N$ and that

\[ (u,\omega) = (A^{-1}e_3, \omega) = \pi(A,a,\omega,b) = \pi(A',a',\omega',b') = ((A')^{-1}e_3, \omega') = (u',\omega'). \]

Then $\omega = \omega'$ and $u = u'$. Written out the second equation reads $A^{-1}e_3 = u = u' = (A')^{-1}e_3$, which implies that $Re_3 = e_3$, where $R = A'A^{-1}$. Because $A$ and $A'$ lie in $\mathrm{SO}(3,\mathbb{R})$, the matrix of $R$ with respect to the orthonormal basis $\{e_1,e_2,e_3\}$ of $\mathbb{R}^3$ is $\begin{pmatrix} \tilde R & 0 \\ 0 & 1 \end{pmatrix}$, where $\tilde R \in \mathrm{SO}(2,\mathbb{R})$. Since $(A,a,\omega,b)$ and $(A',a',\omega',b')$ lie in $D_N$, it follows that

\[ b = -\omega\times s(u) = -\omega'\times s(u') = b' \]

and

\[ \langle a,e_3\rangle + \langle s(u),u\rangle = 0 = \langle a',e_3\rangle + \langle s(u'),u'\rangle = \langle a',e_3\rangle + \langle s(u),u\rangle, \]

since $u = u'$. Therefore $\langle a,e_3\rangle = \langle a',e_3\rangle$. Now

\[ \langle Ra,e_3\rangle = \langle a,R^{-1}e_3\rangle = \langle a,e_3\rangle = \langle a',e_3\rangle, \]

which implies $\langle a' - Ra, e_3\rangle = 0$. So there is a vector $x \in \operatorname{span}\{e_1,e_2\}$ such that $a' = Ra + x$. The above argument shows that there is a rotation $R = \begin{pmatrix} \tilde R & 0 \\ 0 & 1 \end{pmatrix}$, where $\tilde R \in \mathrm{SO}(2,\mathbb{R})$, and a vector $x \in \operatorname{span}\{e_1,e_2\}$ such that $(A',a',\omega',b') = (RA, Ra+x, \omega, b)$. In other words, $(A,a,\omega,b)$ and $(A',a',\omega',b')$ lie in the same $E(2)$-orbit on $D_N$.

Because the mapping

\[ \mathrm{SO}(3) \to S^2 \subseteq \mathbb{R}^3 : A \mapsto A^{-1}e_3 \]

is a submersion, the smooth map $\pi$ (103) is a submersion. Let $\sigma : U \subseteq \overline D_N \to D_N$ be a smooth local section of $\rho$, defined on an open neighborhood $U$ of a point $\overline v \in \overline D_N$. Then on $U$ we have $\psi = \psi\circ\rho\circ\sigma = \pi\circ\sigma$, which shows that the mapping $\psi$ is smooth. Using a local section of $\pi$, a similar argument shows that $\psi^{-1}$ is smooth as well. Consequently, $\psi$ is a diffeomorphism.
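The injectivity argument in this proof is constructive: two points of $D_N$ in the same fiber of $\pi$ are related by the explicit element $(R,x) = (A'A^{-1},\, a' - Ra)$ of $E(2)$. The sketch below illustrates this under stated assumptions. The surface is taken to be an ellipsoid with hypothetical semi-axes `AXES`, for which the inverse Gauss map with inward normal $u$ is $s(u) = -D^2u/\sqrt{\langle u, D^2u\rangle}$ with $D = \operatorname{diag}(\texttt{AXES})$; all function names are illustrative.

```python
import numpy as np

E3 = np.array([0.0, 0.0, 1.0])
AXES = np.array([1.0, 1.3, 1.7])     # hypothetical semi-axes of an ellipsoid

def s_of_u(u):
    """Point of the ellipsoid whose inward unit normal is u (assumed model for s)."""
    d2u = AXES**2 * u
    return -d2u / np.sqrt(u @ d2u)

def rot_z(theta):
    c, s = np.cos(theta), np.sin(theta)
    return np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])

def random_rotation(rng):
    q, _ = np.linalg.qr(rng.normal(size=(3, 3)))
    if np.linalg.det(q) < 0:
        q[:, 0] *= -1.0
    return q

def point_in_DN(A, c, omega):
    """Point (A, a, w, b) satisfying (96): b = -w x s(u), <a, e3> = -<s(u), u>."""
    u = A.T @ E3
    s = s_of_u(u)
    a = np.array([c[0], c[1], -(s @ u)])
    return A, a, omega, -np.cross(omega, s)

def connecting_element(p, q):
    """(R, x) with q = (R A, R a + x, w, b), assuming pi(p) == pi(q)."""
    R = q[0] @ p[0].T
    x = q[1] - R @ p[1]
    return R, x

rng = np.random.default_rng(3)
A = random_rotation(rng)
omega = rng.normal(size=3)
p = point_in_DN(A, np.array([0.2, -1.0]), omega)
q = point_in_DN(rot_z(1.1) @ A, np.array([3.0, 0.5]), omega)   # same u and omega
R, x = connecting_element(p, q)
print(R @ E3, x)
```

The recovered `R` fixes $e_3$ and the recovered `x` has vanishing third component, so $(R,x)$ lies in $E(2)$ as the proof asserts.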


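Another way to find the $E(2)$-reduced space $D_N/E(2)$ is first to note that $\mathbb{R}^2$ is a normal subgroup of $E(2)$. Then the $E(2)$-action on $D_N$ induces an action of the quotient group $E(2)/\mathbb{R}^2$, which is isomorphic to $\mathrm{SO}(2)$, on the $\mathbb{R}^2$-orbit space $D_N/\mathbb{R}^2 = T\mathrm{SO}(3)$ (70). In particular, the induced $\mathrm{SO}(2)$-action on $T\mathrm{SO}(3)$ is given by

\[ \mathrm{SO}(2)\times T\mathrm{SO}(3) \to T\mathrm{SO}(3) : (R, (A,\dot A)) \mapsto (RA, R\dot A), \]

which using

\[ \mathrm{SO}(3)\times\mathbb{R}^3 \to T\mathrm{SO}(3) : (A,\omega) \mapsto (A, A\omega^{\times}) = (A,\dot A) \]

becomes the $\mathrm{SO}(2)$ action

\[ \mathrm{SO}(2)\times(\mathrm{SO}(3)\times\mathbb{R}^3) \to \mathrm{SO}(3)\times\mathbb{R}^3 : (R, (A,\omega)) \mapsto (RA, \omega). \tag{104} \]

Therefore the orbit space $(\mathrm{SO}(3)\times\mathbb{R}^3)/\mathrm{SO}(2)$ of the $\mathrm{SO}(2)$-action (104) is $S^2\times\mathbb{R}^3$, since

\[ \tilde\rho : \mathrm{SO}(3)\times\mathbb{R}^3 \to S^2\times\mathbb{R}^3 : (A,\omega) \mapsto (A^{-1}e_3 = u, \omega) \tag{105} \]

is the orbit map of the action (104). Hence the $E(2)$-orbit space $D_N/E(2)$ is $(D_N/\mathbb{R}^2)/\mathrm{SO}(2)$, which in turn is $(\mathrm{SO}(3)\times\mathbb{R}^3)/\mathrm{SO}(2) = S^2\times\mathbb{R}^3$.

6.6.3 E(2)-reduced distribution and 2-form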

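In this subsection we determine the $E(2)$-reduced distribution $\overline H_{S^2\times\mathbb{R}^3}$ and 2-form $\overline\varpi_{S^2\times\mathbb{R}^3}$. We show that the accessible set of $\overline H_{S^2\times\mathbb{R}^3}$ is $S^2\times\mathbb{R}^3$ and give conditions when $S^2\times\mathbb{R}^3$ has a contact structure. Because each point $u = A^{-1}e_3$ of the 2-sphere is left fixed by the $\mathrm{SO}(2)$ action (104), the total energy

\[ E = \tfrac12\langle Y(s(u))\omega,\omega\rangle - mg\,\langle s(u),u\rangle \tag{106} \]

on $\mathrm{SO}(3)\times\mathbb{R}^3$, the 2-form

\[ \begin{aligned} \varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}(A,\omega)\bigl(\eta_{\rightarrow}+\zeta_{\uparrow},\, \tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}\bigr) = {} & \langle Y(s(u))\eta, \tilde\zeta\rangle - \langle Y(s(u))\zeta, \tilde\eta\rangle + \langle Y(s(u))\omega, \eta\times\tilde\eta\rangle \\ & - m\,\langle s\times(\omega\times Ds(u)(u\times\eta)), \tilde\eta\rangle + m\,\langle u\times Ds(u)(\omega\times(\eta\times s)), \tilde\eta\rangle, \end{aligned} \tag{107} \]

see (83), and the distribution $H_{\mathrm{SO}(3)\times\mathbb{R}^3}$ (82) on $\mathrm{SO}(3)\times\mathbb{R}^3$ are invariant under the $\mathrm{SO}(2)$ action. Here the sign of the potential term in (106) agrees with $V = -mg\,\langle s(u),u\rangle$ and with the reduced Hamiltonian (120). Thus we obtain the $E(2)$-reduced distributional Hamiltonian system $(\overline H_{S^2\times\mathbb{R}^3}, T(S^2\times\mathbb{R}^3), \overline\varpi_{S^2\times\mathbb{R}^3}, E_{S^2\times\mathbb{R}^3} = \overline E)$ by reducing the $\mathrm{SO}(2)$-symmetry (104) on $\mathrm{SO}(3)\times\mathbb{R}^3$.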


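We start by finding the $\mathrm{SO}(2)$-reduced distribution $\overline H_{S^2\times\mathbb{R}^3}$. Let $\gamma$ be the 1-form on $\mathbb{R}^9$ with coordinates $(u,s,\omega)$ given by

\[ \gamma = \langle Y(s)u, d\omega\rangle + \langle Y(s)\omega, du\rangle + m\,\langle u,\omega\rangle\,\langle s, ds\rangle. \tag{108} \]

Lemma 6.6.3.17. Consider the mapping $\mu : S^2\times\mathbb{R}^3 \to \mathbb{R}^9 : (u,\omega) \mapsto (u, s(u), \omega)$. Let $\alpha$ be the 1-form on $S^2\times\mathbb{R}^3$ given by $\mu^{*}\gamma$. Then

\[ \tilde\rho^{*}\alpha = u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}, \tag{109} \]

where $\tilde\rho$ is the $\mathrm{SO}(2)$ orbit map (105) on $\mathrm{SO}(3)\times\mathbb{R}^3$.

Proof. Using (107) we find that

\[ \begin{aligned} \bigl(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}\bigr)(A,\omega)\bigl(\tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}\bigr) &= \langle Y(s)u, \tilde\zeta\rangle + \langle Y(s)\omega, u\times\tilde\eta\rangle + m\,\langle u\times Ds(u)(\omega\times(u\times s)), \tilde\eta\rangle, \\ &\qquad \text{since } u_{\rightarrow} = (Au^{\times}, 0), \text{ which implies } \eta = u \text{ and } \zeta = 0 \text{ in (107)} \\ &= \langle Y(s)u, \tilde\zeta\rangle + \langle Y(s)\omega, u\times\tilde\eta\rangle - m\,\langle \omega\times(u\times s),\, Ds(u)(u\times\tilde\eta)\rangle \\ &= \langle Y(s)u, \tilde\zeta\rangle + \langle Y(s)\omega, u\times\tilde\eta\rangle - m\,\langle\omega,s\rangle\,\langle u, Ds(u)(u\times\tilde\eta)\rangle + m\,\langle u,\omega\rangle\,\langle s, Ds(u)(u\times\tilde\eta)\rangle \\ &= \langle Y(s)u, \tilde\zeta\rangle + \langle Y(s)\omega, u\times\tilde\eta\rangle + m\,\langle u,\omega\rangle\,\langle s, Ds(u)(u\times\tilde\eta)\rangle, \end{aligned} \tag{110} \]

since $Ds(u)(u\times\tilde\eta) \in T_{s(u)}S$ and $u \perp T_{s(u)}S$.

Under the tangent $T\tilde\rho$ of the $\mathrm{SO}(2)$-orbit map $\tilde\rho$ (105), the image of the vector field $\tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}$ given by $(A,\omega) \mapsto (A\tilde\eta^{\times}, \tilde\zeta)$ is the vector field

\[ (u,\omega) \mapsto \frac{d}{dt}\Big|_{t=0}\tilde\rho\bigl(Ae^{t\tilde\eta^{\times}}, \omega+t\tilde\zeta\bigr) = \frac{d}{dt}\Big|_{t=0}\bigl((Ae^{t\tilde\eta^{\times}})^{-1}e_3,\, \omega+t\tilde\zeta\bigr) = (-\tilde\eta^{\times}u, \tilde\zeta) = (u\times\tilde\eta, \tilde\zeta) \tag{111} \]

on $S^2\times\mathbb{R}^3$. Note that $\tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow} \in \ker T\tilde\rho$ if and only if $u\times\tilde\eta = 0 = \tilde\zeta$, that is, $\ker T\tilde\rho = \operatorname{span}\{u_{\rightarrow}\}$. Evaluated on $\tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}$, the pull backs $(\mu\circ\tilde\rho)^{*}d\omega$, $(\mu\circ\tilde\rho)^{*}du$, and $(\mu\circ\tilde\rho)^{*}ds$ give $\tilde\zeta$, $u\times\tilde\eta$, and $Ds(u)(u\times\tilde\eta)$, respectively. Then

\[ \tilde\rho^{*}\alpha = (\mu\circ\tilde\rho)^{*}\gamma = \langle Y(s)u, \tilde\zeta\rangle + \langle Y(s)\omega, u\times\tilde\eta\rangle + m\,\langle u,\omega\rangle\,\langle s, Ds(u)(u\times\tilde\eta)\rangle. \tag{112} \]

Since the right hand sides of (110) and (112) are equal, we have proved (109).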


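Corollary 6.6.3.18. The $E(2)$-reduced distribution $\overline H_{S^2\times\mathbb{R}^3}$ is $\ker\alpha$.

Proof. Since $\ker T_{(A,\omega)}\tilde\rho = \operatorname{span}\{u_{\rightarrow}(A,\omega)\}$, the distribution $U$ on $\mathrm{SO}(3)\times\mathbb{R}^3$ is defined by

\[ \begin{aligned} U_{(A,\omega)} &= \bigl\{ v_{(A,\omega)} \in (H_{\mathrm{SO}(3)\times\mathbb{R}^3})_{(A,\omega)} \;\big|\; \varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}(A,\omega)\bigl(v_{(A,\omega)}, w_{(A,\omega)}\bigr) = 0 \\ &\qquad\quad \text{for all } w_{(A,\omega)} \in \ker T_{(A,\omega)}\tilde\rho \cap (H_{\mathrm{SO}(3)\times\mathbb{R}^3})_{(A,\omega)} = \operatorname{span}\{u_{\rightarrow}(A,\omega)\} \bigr\} \\ &= \ker\bigl(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}\bigr)(A,\omega). \end{aligned} \]

From (109) it follows that

\[ \alpha(u,\omega)\, T_{(A,\omega)}\tilde\rho\; U_{(A,\omega)} = \tilde\rho^{*}\alpha|U_{(A,\omega)} = \bigl(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}\bigr)(A,\omega)|U_{(A,\omega)} = 0. \tag{113} \]

Using proposition 3.3.25 of chapter 3, we see that the $\mathrm{SO}(2)$-reduced distribution $(\overline H_{S^2\times\mathbb{R}^3})_{(u,\omega)}$ equals $T_{(A,\omega)}\tilde\rho\; U_{(A,\omega)}$. From (113) we deduce that $\overline H_{S^2\times\mathbb{R}^3} = \ker\alpha$, which is a codimension 1 subbundle of $T(S^2\times\mathbb{R}^3)$.

By proposition 3.3.26 of chapter 3 the $\mathrm{SO}(2)$-reduced 2-form $\overline\varpi_{S^2\times\mathbb{R}^3}$ on $\overline H_{S^2\times\mathbb{R}^3} = \ker\alpha$ is

\[ \overline\varpi_{S^2\times\mathbb{R}^3}(u,\omega)\bigl((v,w),(\tilde v,\tilde w)\bigr) = \varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}(A,\omega)\bigl(\eta_{\rightarrow}+\zeta_{\uparrow},\, \tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow}\bigr), \]

where $\tilde\rho(A,\omega) = (u,\omega)$. In addition, $\eta_{\rightarrow}+\zeta_{\uparrow} = (A\eta^{\times}, \zeta)$ and $\tilde\eta_{\rightarrow}+\tilde\zeta_{\uparrow} = (A\tilde\eta^{\times}, \tilde\zeta)$ lie in $U_{(A,\omega)} = \ker(u_{\rightarrow}\,\lrcorner\,\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3})$. Also $(v,w) = T_{(A,\omega)}\tilde\rho(A\eta^{\times}, \zeta) = (u\times\eta, \zeta)$ and $(\tilde v,\tilde w) = T_{(A,\omega)}\tilde\rho(A\tilde\eta^{\times}, \tilde\zeta) = (u\times\tilde\eta, \tilde\zeta)$ lie in $(\overline H_{S^2\times\mathbb{R}^3})_{(u,\omega)}$. Using (107) and (110) we have proved

Lemma 6.6.3.19. The $\mathrm{SO}(2)$-reduced 2-form $\overline\varpi_{S^2\times\mathbb{R}^3}$ on $\overline H_{S^2\times\mathbb{R}^3}$ is given by

\[ \begin{aligned} \overline\varpi_{S^2\times\mathbb{R}^3}(u,\omega)\bigl((u\times\eta, \zeta),\, (u\times\tilde\eta, \tilde\zeta)\bigr) = {} & \langle Y(s)\eta, \tilde\zeta\rangle - \langle Y(s)\zeta, \tilde\eta\rangle + \langle Y(s)\omega, \eta\times\tilde\eta\rangle \\ & - m\,\langle s\times(\omega\times Ds(u)(u\times\eta)), \tilde\eta\rangle + m\,\langle u\times Ds(u)(\omega\times(\eta\times s)), \tilde\eta\rangle, \end{aligned} \tag{114} \]

where

\[ 0 = \langle Y(s(u))u, \zeta\rangle + \langle Y(s(u))\omega, u\times\eta\rangle + m\,\langle u,\omega\rangle\,\langle s(u), Ds(u)(u\times\eta)\rangle \]

and

\[ 0 = \langle Y(s(u))u, \tilde\zeta\rangle + \langle Y(s(u))\omega, u\times\tilde\eta\rangle + m\,\langle u,\omega\rangle\,\langle s(u), Ds(u)(u\times\tilde\eta)\rangle. \]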


Until the end of this subsection we will simplify notation by writing $H$ for $\overline H_{S^2\times\mathbb{R}^3}$ and $\varpi$ for $\varpi_{\mathrm{SO}(3)\times\mathbb{R}^3}$. Next we prove

Proposition 6.6.3.20. The accessible set of the distribution $H$ is all of $S^2\times\mathbb{R}^3$.

Proof. First assume that we are at a point $\tilde\rho(A,\omega) = (u,\omega) \in S^2\times\mathbb{R}^3$ where $\langle s(u),u\rangle \neq 0$. Then the kernel of

\[ d(u_{\rightarrow}\,\lrcorner\,\varpi) = d(\tilde\rho^{*}\alpha) = \tilde\rho^{*}(d\alpha) \]

is a 2-dimensional subspace of $T_{(A,\omega)}(\mathrm{SO}(3)\times\mathbb{R}^3)$, which contains the subspace $\operatorname{span}\{u_{\rightarrow}(A,\omega)\}$. Let $\beta = d\alpha$. Because $\ker T\tilde\rho = \operatorname{span}\{u_{\rightarrow}\}$, it follows that $Z = \ker\beta$ is 1-dimensional. Let $K \subseteq H$ be the kernel of the 2-form $d\alpha|(H\times H)$. If $Z\cap H = \{0\}$ and $v \in K$, then $v$ is $\beta$-perpendicular to $H + Z = T(S^2\times\mathbb{R}^3)$. Therefore $v \in Z\cap H = \{0\}$, that is, $v = 0$. Hence $K = \{0\}$. In other words, $\beta$ is nondegenerate on $H$. If $Z \subseteq H$, then choose a vector $w \notin H$. This implies $w \notin Z$. Consequently, $L = \operatorname{span}\{w\}^{\beta}\cap H$ is a codimension 1 subspace of $H$ which contains $Z$. Because $K\cap L \subseteq Z$, we conclude that $\dim K \leq 2$. Since $Z \subseteq K$ and $\dim K$ is even, we find that $\dim K = 2$. But $\dim H = 4$. Thus the 2-form $\beta|(H\times H)$ is nonzero. Since

\[ d\alpha(X,Y) = X(Y\,\lrcorner\,\alpha) - Y(X\,\lrcorner\,\alpha) - \alpha([X,Y]), \]

we see that at points of $S^2\times\mathbb{R}^3$ where $\langle s(u),u\rangle \neq 0$, the tangent space $T(S^2\times\mathbb{R}^3)$ is spanned by $H$ and Lie brackets of vector fields on $S^2\times\mathbb{R}^3$ which take values in $H$. By Chow's theorem it follows that each connected component of the subset of points of $S^2\times\mathbb{R}^3$, where $\langle s(u),u\rangle \neq 0$, is contained in an accessible set of $H$.

If the center of mass of the strongly convex body $B$ is in the interior of its boundary $S$, then $\langle s(u),u\rangle < 0$ on $S$. Consequently, the accessible set of the distribution $H$ is all of $S^2\times\mathbb{R}^3$. If the center of mass of $B$ is on $S$, then $M = \{s \in S \mid \langle s,u\rangle = 0\}$ is a smooth codimension 2 submanifold of $S$ defined by $s = 0$. Its complement $S\setminus M$ is connected, and thus is contained in an accessible set of $H$ in view of Chow's theorem. At every point $m \in M$ the codimension 1 subspace $H_m$ is not contained in the codimension 2 subspace $T_mM$ of $T_m(S^2\times\mathbb{R}^3)$. Therefore we can move out of $M$ at $m$ by using a motion which is tangent to $H_m$. Thus the accessible set of $H$ is all of $S^2\times\mathbb{R}^3$. Finally, if the center of mass of $B$ is in the exterior of $S$,


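then $C = \{s \in S \mid \langle s,u\rangle = 0\}$ is a smooth codimension 1 submanifold of $S$. By Chow's theorem each connected component of $S\setminus C$ is contained in an accessible set. For each $c \in C$ we see that both $T_cC$ and $H_c$ are codimension 1 subspaces of $T_c(S^2\times\mathbb{R}^3)$. So $\dim H_c = \dim T_cC$. If $H_c \subseteq T_cC$, then $H_c = T_cC$. From (109) it follows that belonging to $H = \ker\alpha$ is a nontrivial restriction on the $\mathbb{R}^3$ component of a tangent vector to $S^2\times\mathbb{R}^3$; whereas belonging to $T_cC$ is a restriction only on the $S^2$ component. Therefore for every $c \in C$, we have $H_c \neq T_cC$. So $H_c$ is not contained in $T_cC$. Consequently, from $c \in C$ we can move off $C$ into each of the connected components of $S\setminus C$ by a motion which is tangent to $H_c$. Thus the accessible set of $H$ is all of $S^2\times\mathbb{R}^3$.

One may wonder if the codimension 1 distribution $H = \ker\alpha$ is a contact structure on $S^2\times\mathbb{R}^3$, that is, if the restriction to $H\times H$ of $d\alpha$ is nondegenerate. From lemma 6.6.3.17 it follows that

\[ \tilde\rho^{*}(d\alpha) = d(\tilde\rho^{*}\alpha) = d(u_{\rightarrow}\,\lrcorner\,\varpi). \]

From proposition 6.5.3.13 and (91) we know that if $\langle s,u\rangle = 0$ then $\dim\ker d(u_{\rightarrow}\,\lrcorner\,\varpi) \geq 4$ and this kernel contains the vector $u_{\rightarrow}$. Because $\ker T\tilde\rho = \operatorname{span}\{u_{\rightarrow}\}$, we see that $\dim\ker d\alpha \geq 3$ and $\dim(H\cap\ker d\alpha) \geq 2$. Consequently, $H$ is not a contact structure at any point of $S^2\times\mathbb{R}^3$ where $\langle s,u\rangle = 0$.

Now suppose that $\langle s,u\rangle \neq 0$. The fact that $u_{\rightarrow} \in \ker d(u_{\rightarrow}\,\lrcorner\,\varpi)$ implies that $(u,0) \in \ker d(u_{\rightarrow}\,\lrcorner\,\varpi)$, that is, $Cu = 0 = Bu$, where $C$ is the linear map defined in (94) and $B$ is the block appearing in the matrix $M$ above that equation. Conversely, $\ker d(u_{\rightarrow}\,\lrcorner\,\varpi) = \operatorname{span}\{u\}\times\ker C^{t}$. Now $\ker C^{t} = (\operatorname{im} C)^{\perp}$. So $C^{t}\tilde\zeta = 0$ if and only if

\[ \begin{aligned} 0 &= \bigl\langle \langle s, Ds(u)(u\times\tilde\eta)\rangle\, u - \langle s,u\rangle\, Ds(u)(u\times\tilde\eta),\; \tilde\zeta \bigr\rangle \\ &= \bigl\langle Ds(u)(u\times\tilde\eta),\; \langle u,\tilde\zeta\rangle\, s - \langle s,u\rangle\,\tilde\zeta \bigr\rangle \end{aligned} \tag{115} \]

for every $\tilde\eta \in \mathbb{R}^3$. Because $Ds(u)$ is a positive definite symmetric linear transformation of $\operatorname{span}\{u\}^{\perp}$ into itself, condition (115) is equivalent to

\[ \langle u,\tilde\zeta\rangle\, s - \langle s,u\rangle\,\tilde\zeta = \mu\, u \tag{116} \]

for some $\mu \in \mathbb{R}$. Taking the inner product on both sides of (116) with $u$ gives $\mu = 0$. Because $\langle s,u\rangle \neq 0$, we deduce that $\tilde\zeta$ is a multiple of $s$. Therefore $\ker d\alpha = \operatorname{span}\{(0,s)\}$. From the proof of proposition 6.6.3.20 it follows that $H$ is not a contact structure at a given point of $S^2\times\mathbb{R}^3$ if at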


that point $\ker d\alpha \subseteq H = \ker\alpha$. In view of the definition of $\alpha$ (109) this condition amounts to

\[ 0 = \langle Y(s)u, s\rangle = \langle u, Y(s)s\rangle = \langle u, Is + m\, s\times(s\times s)\rangle = \langle u, Is\rangle. \]

Thus we have proved

Proposition 6.6.3.21. $H$ is a contact structure on all of $S^2\times\mathbb{R}^3$ if and only if $\langle s,u\rangle \neq 0$ and $\langle u, Is\rangle \neq 0$ at every point of $S^2\times\mathbb{R}^3$.

Assume that the center of mass of $B$ is in the interior of $S$. Then $\langle s,u\rangle < 0$ on $S^2\times\mathbb{R}^3$. The condition $\langle u, Is\rangle \neq 0$ means that the function $u \mapsto \langle s(u), Iu\rangle$ on $S^2$ has no zeroes. For $u = e_j$ we have $Iu = I_je_j = I_ju$. Therefore $\langle s(u), Iu\rangle = I_j\,\langle s(u),u\rangle < 0$, when $u = e_j$. Thus the condition $\langle u, Is\rangle \neq 0$ on $S^2\times\mathbb{R}^3$ means that $\langle Is,u\rangle < 0$ everywhere on $S^2\times\mathbb{R}^3$.
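The two sign conditions of proposition 6.6.3.21 can be sampled numerically for concrete bodies. The sketch below is an illustration under assumptions that are not in the original text: a sphere of radius `r` with $s(u) = -ru$, and an ellipsoid whose principal axes of inertia are assumed to be aligned with its geometric axes, with $s(u) = -D^2u/\sqrt{\langle u, D^2u\rangle}$. The numerical values of the semi-axes and moments of inertia are hypothetical.

```python
import numpy as np

def s_sphere(u, r):
    """Inverse Gauss map of a sphere centered at the center of mass: s(u) = -r u."""
    return -r * u

def s_ellipsoid(u, axes):
    """Inverse Gauss map of an ellipsoid with inward normal u (assumed model)."""
    d2u = axes**2 * u
    return -d2u / np.sqrt(u @ d2u)

def contact_quantities(u, s, inertia):
    """Return (<s, u>, <u, I s>), the two quantities in proposition 6.6.3.21."""
    return s @ u, u @ (inertia @ s)

rng = np.random.default_rng(4)
inertia = np.diag([0.8, 1.0, 1.5])      # hypothetical principal moments
axes = np.array([1.0, 1.2, 1.6])        # hypothetical semi-axes

samples = rng.normal(size=(1000, 3))
samples /= np.linalg.norm(samples, axis=1, keepdims=True)   # points u on S^2

sphere_vals = np.array([contact_quantities(u, s_sphere(u, 0.5), inertia)
                        for u in samples])
ellipsoid_vals = np.array([contact_quantities(u, s_ellipsoid(u, axes), inertia)
                           for u in samples])

print(sphere_vals.max(axis=0), ellipsoid_vals.max(axis=0))
```

In both cases the sampled maxima of $\langle s,u\rangle$ and $\langle u, Is\rangle$ are negative, which is consistent with $H$ being a contact structure for these bodies. A finite sample is of course only evidence and not a proof; for the aligned ellipsoid the sign follows directly from $\langle u, Is\rangle = -\sum_j I_jd_j^2u_j^2/\sqrt{\langle u, D^2u\rangle}$.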

The hypotheses of proposition 6.6.3.21 are satisfied for the Chaplygin sphere, where $S$ is a sphere with its center of mass at its geometric center. In this case $s(u) = -ru$, where $r$ is the radius of the sphere. Therefore $\langle Is,u\rangle = -r\,\langle Iu,u\rangle < 0$. This situation also occurs when the moment of inertia tensor of the body is a multiple of the identity and the center of mass is in the interior of $S$. The collection of strongly convex rigid bodies rolling without slipping on a horizontal plane for which $H$ is a contact structure on all of $S^2\times\mathbb{R}^3$ is an open set in the sense that a small perturbation of the surface $S$ and its moment of inertia tensor $I$ does not destroy this property.

6.6.4 Reduced distributional system

In this section we complete the reduction of the $E(2)$ symmetry of the rigid body rolling on a horizontal plane, by computing the $E(2)$-reduced distributional Hamiltonian system on the $E(2)$-orbit space $S^2\times\mathbb{R}^3$. First we prove

Fact 6.6.4.22. On the accessible set $D_N$ the vector field $V$ (42) induces the vector field $\overline V$ on $S^2\times\mathbb{R}^3$ whose integral curves satisfy

\[ \dot u = u\times\omega \tag{117a} \]

\[ Y(s)\dot\omega = Y(s)\omega\times\omega + m\,(\omega\times Ds(u)(u\times\omega))\times s - mg\, u\times s. \tag{117b} \]


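Proof. The $E(2)$ action

\[ \Phi : E(2)\times(E(3)\times e(3)) \to E(3)\times e(3) : \bigl((R,x),(A,a,\omega,b)\bigr) \mapsto (RA, Ra+x, \omega, b) \tag{118} \]

given by (95) leaves the constraint manifold $D_N$ (96) invariant. Thus $\Phi$ restricts to an $E(2)$-action on $D_N$. The vector field $V$ on $D_N$, which governs the motion of the rolling rigid body, has integral curves which satisfy (42)

\[ \begin{aligned} \dot A &= A\omega^{\times} \\ \dot a &= -A\omega^{\times}s = -A(\omega\times s) \\ \dot\omega &= Y(s)^{-1}\bigl[(Y(s)\omega)\times\omega + m\,((\omega\times\dot s)\times s) - mg\, u\times s\bigr]. \end{aligned} \tag{119} \]

Note that $V$ preserves $D_N$ because $b = A^{-1}\dot a = -\omega\times s$, using the second equation in (119); while

\[ \begin{aligned} \frac{d}{dt}\bigl(\langle a,e_3\rangle + \langle s(u),u\rangle\bigr) &= \langle \dot a, e_3\rangle + \langle Ds(u)(u\times\omega), u\rangle + \langle s(u), u\times\omega\rangle, \\ &\qquad \text{since } \dot u = \tfrac{d}{dt}(A^{-1})e_3 = -A^{-1}\dot AA^{-1}e_3 = -\omega\times u = u\times\omega \\ &= \langle A^{-1}\dot a, A^{-1}e_3\rangle + \langle s(u), u\times\omega\rangle, \\ &\qquad \text{since } Ds(u)(u\times\omega) \in T_{s(u)}S, \text{ which is perpendicular to } u \\ &= \langle b,u\rangle + \langle \omega\times s, u\rangle = 0, \end{aligned} \]

since $b = -\omega\times s$. The vector field $V$ on $D_N$ is invariant under the $E(2)$-action $\Phi|(E(2)\times D_N)$, because for $((A,a),\omega) \in E(3)\times\mathbb{R}^3$, which parametrizes $D_N$, we have

\[ T_{(A,a,\omega)}\Phi_{(R,x)}V(A,a,\omega) = (R\dot A, R\dot a, \dot\omega) = V(RA, Ra+x, \omega) = V(\Phi_{(R,x)}(A,a,\omega)), \]

since $s(R\cdot u) = s(u)$. Now

\[ \rho : D_N \to \overline D_N = S^2\times\mathbb{R}^3 : (A,a,\omega) \mapsto (u = A^{-1}e_3, \omega) \]

is the orbit map of the $E(2)$-action $\Phi|(E(2)\times D_N)$. So $T_{(A,a,\omega)}\rho(\dot A,\dot a,\dot\omega) = (\dot u,\dot\omega)$. Thus the $E(2)$-reduced vector field $\overline V$ on $\overline D_N = S^2\times\mathbb{R}^3$ is given by

\[ \overline V(u,\omega) = \overline V(\rho(A,a,\omega)) = T_{(A,a,\omega)}\rho\, V(A,a,\omega), \]

whose integral curves satisfy (117a) and (117b).

Proposition 6.6.4.23. The $E(2)$ reduced vector field $\overline V$ on $S^2\times\mathbb{R}^3$ is the distributional Hamiltonian vector field $Y_{\overline E}$ associated to the $E(2)$ reduced Hamiltonian

\[ \overline E = \overline E_{S^2\times\mathbb{R}^3} : S^2\times\mathbb{R}^3 \to \mathbb{R} : (u,\omega) \mapsto \tfrac12\langle Y(s)\omega,\omega\rangle - mg\,\langle s(u),u\rangle. \tag{120} \]

In other words,

\[ \overline V\,\lrcorner\,\varpi_{\overline H} = \partial_{\overline H}\overline E, \tag{121} \]

where $\overline H = \overline H_{S^2\times\mathbb{R}^3}$.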


Convex rolling rigid body

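Proof. Instead of appealing to proposition 3.3.27 of chapter 3, we will verify (121) directly. Differentiating

\[ \overline{E} : S^2 \times \mathbb{R}^3 \to \mathbb{R} : (u, \omega) \mapsto \tfrac{1}{2} \langle I\omega, \omega \rangle + \tfrac{1}{2} m\, \langle \omega \times s, \omega \times s \rangle - mg\, \langle s, u \rangle \tag{122} \]

in the direction $(u \times w, v) \in \overline{H}_{(u,\omega)}$ gives

\[ \partial_{\overline{H}} \overline{E}(u, \omega)(u \times w, v) = \langle I(\omega), v \rangle + m\, \langle s \times (\omega \times s), v \rangle - mg\, \langle s \times u, w \rangle - m\, \langle u \times Ds(u)((\omega \times s) \times \omega), w \rangle, \tag{123} \]

using (117b). Since $(u \times w, v) \in \overline{H}_{(u,\omega)}$, from lemma 6.6.3.19 it follows that

\[ 0 = \langle u, Y(s)v \rangle + \langle u \times w,\; Y(s)\omega + m\, u \times Ds(u)(\omega \times (s \times u)) \rangle. \tag{124} \]

Let

\[ \begin{aligned} Y(s)v &= Y(s)\omega \times \omega + m\,(\omega \times Ds(u)(u \times \omega)) \times s - mg\, u \times s \\ w &= \omega. \end{aligned} \tag{125} \]

A straightforward calculation shows that if $v$ and $w$ are defined by (125), then equation (124) holds. Thus $\overline{V}(u, \omega) = (u \times w, v) \in \overline{H}_{(u,\omega)}$. From lemma 6.6.3.19 for every $(u \times w', v') \in \overline{H}_{(u,\omega)}$ we get

\[ \begin{aligned} (\overline{V}(u, \omega) \mathbin{\lrcorner} \varpi_{\overline{H}})(u, \omega)(u \times w', v') &= \langle Y(s)v', w \rangle - \langle Y(s)v, w' \rangle + \langle Y(s)\omega, w \times w' \rangle \\ &\quad - m\, \langle (\omega \times Ds(u)(w \times u)) \times s, w' \rangle + m\, \langle (\omega \times Ds(u)(w' \times u)) \times s, w \rangle \\ &= \langle Y(s)v', \omega \rangle - \langle (Y(s)\omega) \times \omega, w' \rangle - m\, \langle (\omega \times Ds(u)(u \times \omega)) \times s, w' \rangle \\ &\quad + mg\, \langle u \times s, w' \rangle + \langle (Y(s)\omega) \times \omega, w' \rangle - m\, \langle (\omega \times Ds(u)(\omega \times u)) \times s, w' \rangle \\ &\quad + m\, \langle (\omega \times Ds(u)(w' \times u)) \times s, \omega \rangle, \qquad \text{using (125)} \\ &= \langle Y(s)\omega, v' \rangle - m\, \langle u \times Ds(u)((\omega \times s) \times \omega), w' \rangle - mg\, \langle s \times u, w' \rangle \\ &= \partial_{\overline{H}} \overline{E}(u, \omega)(u \times w', v'), \end{aligned} \]

using (123).

After removal of the E(2) symmetry, the rolling strongly convex rigid body is the distributional Hamiltonian system $\big( S^2 \times \mathbb{R}^3,\; H_{S^2 \times \mathbb{R}^3},\; \varpi_{H_{S^2 \times \mathbb{R}^3}},\; E_{S^2 \times \mathbb{R}^3} \big)$.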

6.7 Body of revolution

In this section we discuss the motion of a strongly convex body of revolution which rolls without slipping on a horizontal plane under the influence of a constant vertical gravitational force.

In body coordinates a rotation $R \in SO(3)$ sends the reference position of the strongly convex rigid body $B$ to its actual position acted on by the rotation $R^{-1}$. Under this action of $R$ the vector $u = A^{-1}e_3$ is sent to $(AR^{-1})^{-1}e_3 = RA^{-1}e_3 = Ru$. Also the vectors $A\,s(u)$ and $\dot{A}\,s(u)$ are sent to $AR^{-1}s(Ru)$ and $\dot{A}R^{-1}s(Ru)$, respectively. It follows that the nonholonomic constraint (28) is invariant if and only if $s(Ru) = R\,s(u)$ for every $u \in S^2$. This means that the surface $S$ of $B$ is invariant under $R$. If this is the case, then the holonomic constraint (27) is invariant as well and we will say the strongly convex rigid body is geometrically symmetric under $R$.

Since the action of $R$ maps the rotation $A$ to $AR^{-1}$, it follows that $\dot{A}$ is mapped to $\dot{A}R^{-1}$. Therefore $\omega^{\times} = A^{-1}\dot{A}$ is mapped to $RA^{-1}\dot{A}R^{-1} = R(\omega^{\times})R^{-1} = (R\omega)^{\times}$. This implies that $\omega$ is mapped to $R\omega$. Therefore from the geometric invariance of the body it follows that the first two equations defining the vector field $V$ (42), which governs its motion, are invariant. If $IR = RI$, then from the definition of $Y(s)$ (40) it follows that

\[ \begin{aligned} Y(Rs)\,Rv &= IRv + m\, Rs \times (Rv \times Rs) = RIv + m\, Rs \times R(v \times s) \\ &= RIv + m\, R(s \times (v \times s)) = R(Iv + m\, s \times (v \times s)) = RY(s)v \end{aligned} \]

for every $v \in \mathbb{R}^3$, that is, $Y(Rs) = RY(s)R^{-1}$. Therefore the third equation defining the vector field $V$ (42) is invariant under the substitutions $\omega \mapsto R\omega$, $s \mapsto Rs$, and $u \mapsto Ru$.

The body $B$ is dynamically symmetric under $R$ if $IR = RI$. The body $B$ is symmetric under $R$ if it is both geometrically and dynamically symmetric under $R$. When this is the case, the holonomic constraint (27), the nonholonomic constraint (28), and the equations of motion (42) are invariant under the action of $R$. The body $B$ is a body of revolution if it is both geometrically and dynamically symmetric under all rotations about a given axis. We may choose this axis to be the $e_3$-axis. Then dynamic symmetry means that in the principal axis frame its moment of inertia tensor $I$ is diagonal, with eigenvalues $I_1$, $I_2$, $I_3$, and $I_1 = I_2$. Note that the inequalities (13) satisfied by the principal moments of inertia imply that $I_3 \le 2I_1$.

6.7.1 Geometric and dynamic symmetry

Let $B$ be the reference body with reference frame $\{e_1, e_2, e_3\}$ having its origin at the center of mass. Since $S$, the boundary of $B$, is a smooth strictly convex surface of revolution, it can be parametrized by

\[ x : S^1 \times [-1, 1] \to \mathbb{R}^3 : (\theta, x_3) \mapsto (f(x_3)\cos\theta,\; f(x_3)\sin\theta,\; x_3), \tag{126} \]

where $f : [-1, 1] \to \mathbb{R}_{\ge 0}$ is continuous with $f(\pm 1) = 0$ and is smooth on $(-1, 1)$ with $f'' < 0$. Let $g_\pm$ be a local inverse to $f$ with $g_\pm(0) = \pm 1$, which in a neighborhood of $0$ extends to a smooth even function. Then $\lim_{x_3 \to \pm 1^{\mp}} f'(x_3) = \mp\infty$. Let $n(x)$ be the inward pointing unit normal to $S$ at $x = x(\theta, x_3)$. A short calculation shows that

\[ n(x(\theta, x_3)) = \frac{1}{\sqrt{1 + (f'(x_3))^2}}\, (\cos\theta,\; \sin\theta,\; -f'(x_3)). \tag{127} \]

Thus the Gauss map of $S$ is

\[ n : S \subseteq \mathbb{R}^3 \to S^2 \subseteq \mathbb{R}^3 : x \mapsto n(x) = (u_1, u_2, u_3) = u. \tag{128} \]

Its inverse is the map

\[ s : S^2 \to S : u \mapsto (\rho(u_3)\, u_1,\; \rho(u_3)\, u_2,\; \zeta(u_3)), \tag{129} \]

where $\rho : (-1, 1) \to \mathbb{R}$ and $\zeta : (-1, 1) \to \mathbb{R}$ are smooth functions defined by

\[ f'(\zeta(u_3)) = -u_3\, (1 - u_3^2)^{-1/2} \tag{130a} \]

\[ \rho(u_3) = f(\zeta(u_3))\, (1 - u_3^2)^{-1/2}. \tag{130b} \]

Proof. Using the fact that $x = s(n(x))$, we see that the components of (126) and (129) are equal. This gives $f^2(x_3) = \rho^2(u_3)(1 - u_3^2)$, since $u_1^2 + u_2^2 = 1 - u_3^2$. Because $f \ge 0$ and $x_3 = \zeta(u_3)$, we obtain (130b). Since $u = n(x)$ we get

\[ u_3 = -\frac{f'(x_3)}{\sqrt{1 + (f'(x_3))^2}}, \tag{131} \]

from which (130a) follows.
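The definitions (130a) and (130b) can be made concrete on an example. The following is a minimal numerical sketch, assuming the profile $f(x_3) = a\sqrt{1 - x_3^2}$ of an ellipsoid of revolution with polar semi-axis 1; the semi-axis value `A_SEMI` and the closed forms for `zeta` and `rho` are assumptions worked out for this illustration and are not taken from the text. The script only checks that these closed forms satisfy (130a) and (130b) and that the point $s(u)$ of (129) lies on the ellipsoid.

```python
import numpy as np

A_SEMI = 1.5  # hypothetical equatorial semi-axis (the polar semi-axis is 1)

def f(x3, a=A_SEMI):
    """Profile curve of the ellipsoid of revolution, as in (126)."""
    return a * np.sqrt(1.0 - x3**2)

def f_prime(x3, a=A_SEMI):
    """Derivative of the profile curve."""
    return -a * x3 / np.sqrt(1.0 - x3**2)

def zeta(u3, a=A_SEMI):
    """Assumed closed-form solution of (130a) for this profile."""
    return u3 / np.sqrt(a**2 * (1.0 - u3**2) + u3**2)

def rho(u3, a=A_SEMI):
    """Assumed closed-form expression of (130b) for this profile."""
    return a**2 / np.sqrt(a**2 * (1.0 - u3**2) + u3**2)

def s(u, a=A_SEMI):
    """The map (129) from the unit sphere to the surface S."""
    u = np.asarray(u, dtype=float)
    return np.array([rho(u[2], a) * u[0], rho(u[2], a) * u[1], zeta(u[2], a)])

u3 = np.linspace(-0.95, 0.95, 39)
res_130a = f_prime(zeta(u3)) + u3 / np.sqrt(1.0 - u3**2)  # residual of (130a)
res_130b = rho(u3) - f(zeta(u3)) / np.sqrt(1.0 - u3**2)   # residual of (130b)

u = np.array([0.6, 0.0, 0.8])                             # a unit vector
x = s(u)
on_surface = (x[0]**2 + x[1]**2) / A_SEMI**2 + x[2]**2    # ellipsoid equation, should be 1
```

For `A_SEMI = 1` the body is the unit sphere and the formulas reduce to $\rho = 1$, $\zeta(u_3) = u_3$.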


Because $f'' < 0$ on $(-1, 1)$, the function $f'$ is strictly decreasing on $(-1, 1)$. Thus the function $\zeta$ in (130a) is well defined and smooth on $(-1, 1)$. Consequently, the function $\rho$ in (130b) is smooth on $(-1, 1)$. Moreover, differentiating (130a) and using the fact that $f'' < 0$ shows that $\zeta' > 0$ on $(-1, 1)$. Thus $\zeta$ is monotonically increasing on $(-1, 1)$. Although the right hand sides of (130a) and (130b) are not defined at $u_3 = \pm 1$, the functions $\zeta$ and $\rho$ extend to smooth functions in a neighborhood of $\pm 1$.

Proof. Because $y \mapsto g_\pm(y) = x_3$ is a smooth local inverse of $f$ with $g_\pm(0) = \pm 1$, using the chain rule we get $g_\pm'(y) = \frac{1}{f'(x_3)}$. From (131) it follows that

\[ u_3 = -\operatorname{sgn}(g_\pm'(y))\, \big( 1 + (g_\pm'(y))^2 \big)^{-1/2}, \]

which is equivalent to

\[ g_\pm'(y) = -\frac{(1 - u_3^2)^{1/2}}{u_3}. \tag{132} \]

The inverse of $g_\pm'$ extends to a smooth odd function in a neighborhood of $0$. Since $y = g_\pm^{-1}(x_3) = g_\pm^{-1}(\zeta(u_3))$, equation (132) reads

\[ -\frac{(1 - u_3^2)^{1/2}}{u_3} = \frac{1}{f'(\zeta(u_3))} = (g_\pm' \circ g_\pm^{-1})(\zeta(u_3)), \]

that is,

\[ \zeta(u_3) = \big( g_\pm \circ (g_\pm')^{-1} \big)\Big( -\frac{(1 - u_3^2)^{1/2}}{u_3} \Big). \tag{133} \]

Thus $\zeta$ is a smooth function near $u_3 = \pm 1$. Composing both sides of (133) with $f$ and using (130b), we obtain

\[ \rho(u_3) = f(\zeta(u_3))\,(1 - u_3^2)^{-1/2} = (g_\pm')^{-1}\Big( -\frac{(1 - u_3^2)^{1/2}}{u_3} \Big)\, (1 - u_3^2)^{-1/2}. \]

Because $(g_\pm')^{-1}$ is a smooth odd function near $\pm 1$, it follows that $\rho$ is smooth near $\pm 1$.

The following argument shows that the functions $\rho$ and $\zeta$ satisfy

\[ (1 - u_3^2)\, \rho'(u_3) - u_3\, \rho(u_3) = -u_3\, \zeta'(u_3). \tag{134} \]

Proof. Multiplying both sides of (130a) by $\zeta'(u_3)$ gives

\[ (f \circ \zeta)'(u_3) = -u_3\, (1 - u_3^2)^{-1/2}\, \zeta'(u_3). \]

Differentiating (130b) we obtain

\[ \begin{aligned} \rho'(u_3) &= (f \circ \zeta)'(u_3)\, (1 - u_3^2)^{-1/2} + (f \circ \zeta)(u_3)\, u_3\, (1 - u_3^2)^{-3/2} \\ &= (-\zeta'(u_3) + \rho(u_3))\, u_3\, (1 - u_3^2)^{-1}. \end{aligned} \]
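Identity (134) can be checked numerically on a concrete profile. The sketch below assumes an ellipsoid of revolution with polar semi-axis 1 and a hypothetical equatorial semi-axis `A_SEMI`; the closed forms for `rho` and `zeta` are assumptions of this illustration (they solve (130a) and (130b) for that profile), and the derivatives are approximated by central differences rather than computed symbolically.

```python
import numpy as np

A_SEMI = 1.5  # hypothetical equatorial semi-axis (the polar semi-axis is 1)

def rho(u3, a=A_SEMI):
    """Assumed closed form of rho for the ellipsoid profile f(x3) = a*sqrt(1 - x3^2)."""
    return a**2 / np.sqrt(a**2 * (1.0 - u3**2) + u3**2)

def zeta(u3, a=A_SEMI):
    """Assumed closed form of zeta for the same profile."""
    return u3 / np.sqrt(a**2 * (1.0 - u3**2) + u3**2)

def central_diff(fun, x, h=1e-6):
    """Second-order central difference approximation of fun'(x)."""
    return (fun(x + h) - fun(x - h)) / (2.0 * h)

u3 = np.linspace(-0.9, 0.9, 37)
lhs = (1.0 - u3**2) * central_diff(rho, u3) - u3 * rho(u3)   # left side of (134)
rhs = -u3 * central_diff(zeta, u3)                           # right side of (134)
residual_134 = np.max(np.abs(lhs - rhs))
```

The residual is limited only by the finite-difference error, which is consistent with (134) holding exactly for this profile.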

Since $S$ is a surface of revolution, it is invariant under the axial symmetry

\[ S^1 \times \mathbb{R}^3 \to \mathbb{R}^3 : (\psi, x) \mapsto Rx = \begin{pmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix} x \tag{135} \]

of rotation about the $e_3$-axis. From (127) it is easy to see that the Gauss map $n$ is equivariant under this axial symmetry, that is, $n(R(x)) = R(n(x))$, for every rotation $R$ about $e_3$. Consequently, the inverse $s$ of the Gauss map of $S$ is also equivariant, that is, $s(R(u)) = R(s(u))$ for every $u \in S^2$ and every $\psi \in S^1$. Thus a body of revolution is geometrically symmetric.

Next, to ensure the dynamic symmetry of the rolling body of revolution $B$, we assume that in the principal axis frame $\{e_1, e_2, e_3\}$ the moment of inertia tensor $I$ of $B$ is $\mathrm{diag}(I_1, I_1, I_3)$. Here $0 < I_3 \le 2I_1$. Thus, for every $R \in S^1$ we have

\[ R \circ I = I \circ R. \tag{136} \]

From equation (96) of §6 we know that the motion of $B$ takes place on the constraint manifold

\[ D_N = \{ (A, a, \omega, b) \in E(3) \times e(3) \mid b = -\omega \times s(u),\; \langle a, e_3 \rangle = -\langle s(u), u \rangle \}. \tag{137} \]

In §6.2 we have seen that E(2) acts on $D_N$ by

\[ \widetilde{\Phi} : E(2) \times D_N \to D_N : \big( (A', a'), (A, a, \omega, b) \big) \mapsto (A'A, A'a + a', \omega, b). \tag{138} \]

On $D_N$ define an axial action of $S^1$ by

\[ \Phi : D_N \times S^1 \to D_N : \big( (A, a, \omega, b), R \big) \mapsto (AR^{-1}, Ra, R\omega, Rb). \tag{139} \]

From

\[ Rb = -(R\omega) \times R\,s(u) = -(R\omega) \times s(Ru) \]

and $\langle Ra, Re_3 \rangle = \langle a, e_3 \rangle = -\langle s(u), u \rangle = -\langle R\,s(u), Ru \rangle$, it follows that the axial $S^1$-action $\Phi$ preserves $D_N$. Because E(2) acts on $D_N$ on the left, while $S^1$ acts on $D_N$ on the right, the actions commute. Thus $\Phi$ induces an $S^1$-action on the E(2)-orbit space $D_N / E(2) = S^2 \times \mathbb{R}^3$ (see §6.2) defined by

\[ \Psi : S^1 \times (S^2 \times \mathbb{R}^3) \to S^2 \times \mathbb{R}^3 : (\psi, u, \omega) \mapsto (Ru, R\omega). \tag{140} \]

We call the action $\Psi$ the induced axial symmetry. $\Psi$ is the restriction to $S^2 \times \mathbb{R}^3$ of the lift of the axial symmetry (135) to $T\mathbb{R}^3$. From §6.4 we know that the motion of the body $B$ on the horizontal plane after removing its E(2)-symmetry is governed by a vector field $\overline{V}$ on $S^2 \times \mathbb{R}^3$, whose integral curves satisfy

\[ \begin{aligned} \dot{u} &= u \times \omega \\ Y(s)\dot{\omega} &= Y(s)\omega \times \omega + m\,(\omega \times Ds(u)(u \times \omega)) \times s - mg\, u \times s. \end{aligned} \tag{141} \]

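Here $Y(s)v = Iv + m\,(s(u) \times (v \times s(u)))$ for every $v \in \mathbb{R}^3$.

Claim 6.7.1.24. The E(2)-reduced vector field $\overline{V}$ on $S^2 \times \mathbb{R}^3$ is invariant under the induced axial symmetry $\Psi$ (140).

Proof. This follows because the axial symmetry of $B$ implies the invariance of (42) under $\Psi$.

From proposition 6.6.4.23 we know that the function

\[ E : S^2 \times \mathbb{R}^3 \to \mathbb{R} : (u, \omega) \mapsto \tfrac{1}{2} \langle Y(s)\omega, \omega \rangle - mg\, \langle s(u), u \rangle \tag{142} \]

is an integral of the vector field $\overline{V}$.

Claim 6.7.1.25. The E(2)-reduced energy $E$ is invariant under the induced axial symmetry (140).

Proof. We compute

\[ \begin{aligned} E(Ru, R\omega) &= \tfrac{1}{2} \langle Y(s(Ru))R\omega, R\omega \rangle - mg\, \langle s(Ru), Ru \rangle \\ &= \tfrac{1}{2} \langle R(Y(s)\omega), R\omega \rangle - mg\, \langle R(s(u)), Ru \rangle \\ &= E(u, \omega). \end{aligned} \]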
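The equivariance $Y(Rs)R = RY(s)$ and the invariance of the energy (142) under the induced axial symmetry (140) can be illustrated numerically. The sketch below is only an illustration under stated assumptions: the body is taken to be an ellipsoid of revolution with polar semi-axis 1, and the values of `M`, `G`, `I1`, `I3`, `A_SEMI`, the closed form used in `s`, and the test vectors are all hypothetical choices.

```python
import numpy as np

M, G = 1.0, 9.81               # hypothetical mass and gravitational acceleration
I1, I3 = 0.4, 0.3              # hypothetical principal moments with I1 = I2 and I3 <= 2*I1
INERTIA = np.diag([I1, I1, I3])
A_SEMI = 1.5                   # hypothetical equatorial semi-axis of the ellipsoid

def s(u):
    """Map (129) for the ellipsoid: s(u) = (rho*u1, rho*u2, zeta)."""
    d = np.sqrt(A_SEMI**2 * (1.0 - u[2]**2) + u[2]**2)
    return np.array([A_SEMI**2 * u[0], A_SEMI**2 * u[1], u[2]]) / d

def Y(s_vec, v):
    """Y(s)v = Iv + m s x (v x s)."""
    return INERTIA @ v + M * np.cross(s_vec, np.cross(v, s_vec))

def energy(u, w):
    """E(2)-reduced energy (142)."""
    sv = s(u)
    return 0.5 * (Y(sv, w) @ w) - M * G * (sv @ u)

def rot_e3(psi):
    """Rotation (135) about the e3-axis through the angle psi."""
    c, sn = np.cos(psi), np.sin(psi)
    return np.array([[c, -sn, 0.0], [sn, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(0)
u = rng.normal(size=3)
u /= np.linalg.norm(u)         # a point of S^2
w = rng.normal(size=3)         # an angular velocity
v = rng.normal(size=3)         # an arbitrary test vector
R = rot_e3(0.7)

equivariance_gap = np.linalg.norm(Y(s(R @ u), R @ v) - R @ Y(s(u), v))
energy_gap = abs(energy(R @ u, R @ w) - energy(u, w))
```

Both gaps vanish up to rounding because $\rho$ and $\zeta$ depend only on $u_3$, which rotations about $e_3$ leave fixed.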

6.7.2 Reduction of the induced axial symmetry

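Here we reduce the induced axial symmetry (140) on $S^2 \times \mathbb{R}^3$. To carry out this reduction we will use invariant theory. The algebra of polynomials on $\mathbb{R}^3 \times \mathbb{R}^3$, which are invariant under the $S^1 = \{ R \in SO(3) \mid Re_3 = e_3 \}$-action

\[ \widehat{\Psi} : S^1 \times (\mathbb{R}^3 \times \mathbb{R}^3) \to \mathbb{R}^3 \times \mathbb{R}^3 : (R, (u, \omega)) \mapsto (Ru, R\omega), \tag{143} \]

is generated by

\[ \begin{aligned} \sigma_1 &= u_3 & \sigma_2 &= u_1\omega_2 - u_2\omega_1 & \sigma_3 &= u_1\omega_1 + u_2\omega_2 \\ \sigma_4 &= \omega_3 & \sigma_5 &= \omega_1^2 + \omega_2^2 & \sigma_6 &= u_1^2 + u_2^2 \end{aligned} \tag{144} \]

subject to the relation

\[ \sigma_2^2 + \sigma_3^2 = \sigma_5\, \sigma_6, \qquad \sigma_5 \ge 0, \quad \sigma_6 \ge 0. \tag{145} \]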
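As a quick sanity check, the generators (144) and the relation (145) can be evaluated numerically. The sketch below uses randomly chosen vectors and a hypothetical rotation angle; it verifies that the six polynomials do not change under a rotation about $e_3$ and that (145) holds.

```python
import numpy as np

def invariants(u, w):
    """The generators sigma_1, ..., sigma_6 of (144)."""
    return np.array([
        u[2],                       # sigma_1
        u[0] * w[1] - u[1] * w[0],  # sigma_2
        u[0] * w[0] + u[1] * w[1],  # sigma_3
        w[2],                       # sigma_4
        w[0]**2 + w[1]**2,          # sigma_5
        u[0]**2 + u[1]**2,          # sigma_6
    ])

def rot_e3(psi):
    """Rotation about the e3-axis through the angle psi."""
    c, sn = np.cos(psi), np.sin(psi)
    return np.array([[c, -sn, 0.0], [sn, c, 0.0], [0.0, 0.0, 1.0]])

rng = np.random.default_rng(1)
u = rng.normal(size=3)
w = rng.normal(size=3)
R = rot_e3(1.1)   # hypothetical rotation angle

sig = invariants(u, w)
invariance_gap = np.max(np.abs(invariants(R @ u, R @ w) - sig))
relation_gap = abs(sig[1]**2 + sig[2]**2 - sig[4] * sig[5])   # relation (145)
```

The relation is the planar Lagrange identity applied to the projections of $u$ and $\omega$ onto the $e_1 e_2$-plane.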

Restricting the action $\widehat{\Psi}$ to $S^2 \times \mathbb{R}^3$ gives the induced axial action $\Psi$ (140). Its algebra of invariant polynomials is generated by restricting the polynomials $\sigma_1, \ldots, \sigma_6$ to $S^2 \times \mathbb{R}^3$ (which we again denote by $\sigma_1, \ldots, \sigma_6$). In addition to (145), these polynomials satisfy the relation

\[ \sigma_6 + \sigma_1^2 = 1, \qquad \sigma_6 \ge 0. \tag{146} \]

Equation (146) is just the defining equation of $S^2 \times \mathbb{R}^3$ expressed in terms of invariants. Eliminating $\sigma_6$ from (145) and (146) shows that the space $(S^2 \times \mathbb{R}^3)/S^1$ of orbits of the induced axial action is the semialgebraic variety $M$ in $\mathbb{R}^5$ (with coordinates $(\sigma_1, \ldots, \sigma_5)$) defined by

\[ \sigma_2^2 + \sigma_3^2 = (1 - \sigma_1^2)\, \sigma_5, \qquad |\sigma_1| \le 1, \quad \sigma_5 \ge 0. \tag{147} \]

$M$ is singular with

\[ M_{\mathrm{sing}} = \{ (\pm 1, 0, 0, \sigma_4, 0) \in M \mid \sigma_4 \in \mathbb{R} \} \tag{148} \]

being its subvariety of singular points. $M_{\mathrm{sing}}$ is the image of the submanifold

\[ (S^2 \times \mathbb{R}^3)_{S^1} = \{ (0, 0, \pm 1, 0, 0, \omega_3) \in S^2 \times \mathbb{R}^3 \mid \omega_3 \in \mathbb{R} \} \]

of points of $S^2 \times \mathbb{R}^3$ of symmetry type (= isotropy type, see §3 of chapter 2) $S^1$ of the action $\Psi$ under its $S^1$-orbit map

\[ \pi : S^2 \times \mathbb{R}^3 \to M \subseteq \mathbb{R}^5 : (u, \omega) \mapsto \big( \sigma_1(u, \omega), \ldots, \sigma_5(u, \omega) \big). \tag{149} \]

If we disregard the dummy variable $\sigma_4$, then $M$ has two singular points, near each of which $M$ looks like a quadratic cone. Since $(S^2 \times \mathbb{R}^3) \setminus (S^2 \times \mathbb{R}^3)_{S^1}$ is the submanifold of $S^2 \times \mathbb{R}^3$ of symmetry type $\{e\}$, its image under $\pi$ is the smooth manifold $M_{\mathrm{reg}} = M \setminus M_{\mathrm{sing}}$. In the domain where $-1 < \sigma_1 < 1$, $M_{\mathrm{reg}}$ is equal to the graph of the smooth function

\[ \sigma_5 = \frac{\sigma_2^2 + \sigma_3^2}{1 - \sigma_1^2}, \qquad -1 < \sigma_1 < 1. \tag{150} \]

The invariants $\sigma_j$, $1 \le j \le 4$, form a system of coordinates in this domain. However, $M_{\mathrm{reg}}$ contains the points where $\sigma_5 > 0$ and $\sigma_1 = \pm 1$. The graphs of the smooth functions

\[ \sigma_1 = \pm \big( 1 - (\sigma_2^2 + \sigma_3^2)/\sigma_5 \big)^{1/2}, \qquad \sigma_5 > \sigma_2^2 + \sigma_3^2, \tag{151} \]

give an open neighborhood of each of these additional points. The invariants $\sigma_j$, $2 \le j \le 5$, form a regular system of coordinates on each of these two open subsets.

6.7.3 Axially reduced equations of motion

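In this subsection we study the motion of the axially reduced system in terms of the invariants $\sigma_j$ in (144).

6.7.3.1 Chaplygin's equations

The next lemma leads to Chaplygin's equations for a surface of revolution.

Lemma 6.7.3.1.26. Using dot to denote derivative with respect to time, the following equations hold

\[ \dot{\sigma}_1 = \sigma_2 \tag{152a} \]

\[ \rho\, I_1\, \dot{\sigma}_3 + \zeta\, I_3\, \dot{\sigma}_4 + \rho\, I_3\, \sigma_4\, \dot{\sigma}_1 = 0. \tag{152b} \]

Here $\rho = \rho(u_3) = \rho(\sigma_1)$ and $\zeta = \zeta(u_3) = \zeta(\sigma_1)$, given by (130b) and (130a), respectively, are viewed as functions of $\sigma_1$.

Proof. The third component of $\dot{u} = u \times \omega$ yields (152a). From (130a) and (130b) it follows that

\[ s = \rho\, u - \rho\, u_3\, e_3 + \zeta\, e_3 = \rho\, u + \tau\, e_3, \qquad \text{where } \tau = \zeta - \rho\, \sigma_1. \]

Therefore

\[ \langle I\dot{\omega} - (I\omega) \times \omega, s \rangle = \rho\, \langle I\dot{\omega} - (I\omega) \times \omega, u \rangle + \tau\, \langle I\dot{\omega} - (I\omega) \times \omega, e_3 \rangle. \]

Now $-\langle (I\omega) \times \omega, u \rangle = \langle I\omega, u \times \omega \rangle = \langle I\omega, \dot{u} \rangle$ implies that

\[ \langle I\dot{\omega} - (I\omega) \times \omega, u \rangle = \langle I\dot{\omega}, u \rangle + \langle I\omega, \dot{u} \rangle = (\langle I\omega, u \rangle)^{\cdot}, \]

where we have used $(\ )^{\cdot}$ to denote $\frac{d}{dt}(\ )$. In view of (144) and $I_1 = I_2$ we see that $\langle I\omega, u \rangle = I_1\sigma_3 + I_3\sigma_4\sigma_1$. But

\[ \langle (I\omega) \times \omega, e_3 \rangle = I_1\omega_1\omega_2 - I_2\omega_2\omega_1 = 0. \tag{153} \]

So we arrive at

\[ \langle I\dot{\omega} - (I\omega) \times \omega, s \rangle = \rho\, (I_1\sigma_3 + I_3\sigma_4\sigma_1)^{\cdot} + (\zeta - \rho\,\sigma_1)\, I_3\, \dot{\sigma}_4 = \rho\, I_1\, \dot{\sigma}_3 + \zeta\, I_3\, \dot{\sigma}_4 + \rho\, I_3\, \sigma_4\, \dot{\sigma}_1. \]

Therefore taking the inner product of (38) with $s$ gives (152b).

Lemma 6.7.3.1.27. We have

\[ -m\rho\zeta\, \dot{\sigma}_3 + \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \dot{\sigma}_4 - m\rho\, \big( (\zeta' - \rho)\, \sigma_3 + \zeta'\, \sigma_1\sigma_4 \big)\, \dot{\sigma}_1 = 0. \tag{154} \]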

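Here $\rho'$ and $\zeta'$ are the derivatives of $\rho$ and $\zeta$ with respect to $\sigma_1$.

Proof. Let $i_k$ be the inner product of $e_3$ with the $k$-th term of

\[ I\dot{\omega} - I(\omega) \times \omega - m\, (\omega \times s)^{\cdot} \times s - m\, \langle s, \omega \rangle\, \omega \times s + mg\, u \times s. \tag{155} \]

Then $i_2 = \langle I(\omega) \times \omega, e_3 \rangle = 0$ because of (153), and $i_5 = \langle u \times s, e_3 \rangle = u_1 s_2 - u_2 s_1 = u_1\rho\, u_2 - u_2\rho\, u_1 = 0$. For $i_3 = \langle (\omega \times s)^{\cdot} \times s, e_3 \rangle$ we write

\[ \langle (\omega \times s)^{\cdot} \times s, e_3 \rangle = (\omega \times s)^{\cdot}_1\, s_2 - (\omega \times s)^{\cdot}_2\, s_1. \]

Substituting $s_1 = \rho\, u_1$ and $s_2 = \rho\, u_2$ gives $(\omega \times s)_1 = \omega_2 s_3 - \omega_3 s_2 = \omega_2\zeta - \sigma_4\rho\, u_2$, and $(\omega \times s)_2 = \omega_3 s_1 - \omega_1 s_3 = \sigma_4\rho\, u_1 - \omega_1\zeta$. Therefore $i_3$ is equal to $m\rho$ times

\[ \begin{aligned} -(\omega_2\zeta - \sigma_4\rho\, u_2)^{\cdot}\, u_2 + (\sigma_4\rho\, u_1 - \omega_1\zeta)^{\cdot}\, u_1 &= -\zeta\, (\dot{\omega}_2 u_2 + \dot{\omega}_1 u_1) - \dot{\zeta}\, (\omega_2 u_2 + \omega_1 u_1) \\ &\quad + \sigma_4\rho\, (\dot{u}_2 u_2 + \dot{u}_1 u_1) + (\sigma_4\rho)^{\cdot}\, (u_2^2 + u_1^2) \\ &= -\zeta\, (\dot{\omega}_2 u_2 + \dot{\omega}_1 u_1) - \dot{\zeta}\, \sigma_3 + \tfrac{1}{2}\sigma_4\rho\, \dot{\sigma}_6 + (\sigma_4\rho)^{\cdot}\, \sigma_6. \end{aligned} \]

Because

\[ \begin{aligned} \dot{\omega}_2 u_2 + \dot{\omega}_1 u_1 &= \langle \dot{\omega}, u \rangle - \dot{\omega}_3 u_3 = (\langle \omega, u \rangle)^{\cdot} - \langle \omega, \dot{u} \rangle - \dot{\sigma}_4\sigma_1 \\ &= (\sigma_3 + \sigma_4\sigma_1)^{\cdot} - \langle \omega, u \times \omega \rangle - \dot{\sigma}_4\sigma_1 = \dot{\sigma}_3 + \sigma_4\dot{\sigma}_1, \end{aligned} \]

we see that $i_3 = \langle (\omega \times s)^{\cdot} \times s, e_3 \rangle$ is equal to $m\rho$ times

\[ -\zeta\, (\dot{\sigma}_3 + \sigma_4\dot{\sigma}_1) - \dot{\zeta}\, \sigma_3 + \tfrac{1}{2}\sigma_4\rho\, \dot{\sigma}_6 + (\sigma_4\rho)^{\cdot}\, \sigma_6. \]

For $i_4 = \langle s, \omega \rangle\, \langle \omega \times s, e_3 \rangle$ we write

\[ \langle \omega, s \rangle = \omega_1\rho\, u_1 + \omega_2\rho\, u_2 + \omega_3\zeta = \rho\, \sigma_3 + \zeta\, \sigma_4, \tag{156} \]

and $(\omega \times s)_3 = \omega_1 s_2 - \omega_2 s_1 = -\rho\,\sigma_2$, which implies that $i_4 = m\rho\, (\rho\,\sigma_3 + \zeta\,\sigma_4)\, \sigma_2$. Because $\dot{\sigma}_1 = \sigma_2$, the term $m\rho\zeta\, \sigma_4\dot{\sigma}_1$ in $i_3$ cancels the term $m\rho\zeta\, \sigma_4\sigma_2$ in $i_4$. Collecting all the terms, we see that the inner product of (155) with $e_3$ is

\[ I_3\, \dot{\sigma}_4 + m\rho\, \big( -\zeta\, \dot{\sigma}_3 - \dot{\zeta}\, \sigma_3 + \tfrac{1}{2}\sigma_4\rho\, \dot{\sigma}_6 + (\sigma_4\rho)^{\cdot}\, \sigma_6 + \rho\, \sigma_3\dot{\sigma}_1 \big), \]

which equals $0$, because of (38). We conclude the proof of (154) by observing that $\rho = \rho(u_3) = \rho(\sigma_1)$, $\zeta = \zeta(u_3) = \zeta(\sigma_1)$. Therefore $\dot{\rho} = \rho'(\sigma_1)\, \dot{\sigma}_1$ and $\dot{\zeta} = \zeta'(\sigma_1)\, \dot{\sigma}_1$. Furthermore $\sigma_6 = 1 - \sigma_1^2$, see (146). Hence $\dot{\sigma}_6 = -2\sigma_1\dot{\sigma}_1$. Consequently,

\[ \begin{aligned} \tfrac{1}{2}\sigma_4\rho\, \dot{\sigma}_6 + (\sigma_4\rho)^{\cdot}\, \sigma_6 &= -\sigma_4\rho\, \sigma_1\dot{\sigma}_1 + \sigma_4\rho'\, \dot{\sigma}_1 (1 - \sigma_1^2) + \dot{\sigma}_4\rho\, (1 - \sigma_1^2) \\ &= -\sigma_4\zeta'\, \sigma_1\dot{\sigma}_1 + \dot{\sigma}_4\rho\, (1 - \sigma_1^2). \end{aligned} \]

To obtain the last equation above we used $-\rho\,\sigma_1 + \rho'(1 - \sigma_1^2) = -\zeta'\sigma_1$, see (134). This proves (154).

Writing $\dfrac{\dot{\sigma}_j}{\dot{\sigma}_1} = \dfrac{d\sigma_j}{dt} \Big/ \dfrac{d\sigma_1}{dt} = \dfrac{d\sigma_j}{d\sigma_1}$, equations (152b) and (154) imply

\[ \begin{aligned} 0 &= \rho\, I_1\, \frac{d\sigma_3}{d\sigma_1} + \zeta\, I_3\, \frac{d\sigma_4}{d\sigma_1} + \rho\, I_3\, \sigma_4 \\ 0 &= -m\rho\zeta\, \frac{d\sigma_3}{d\sigma_1} + \big( I_3 + m\rho^2(1 - \sigma_1^2) \big) \frac{d\sigma_4}{d\sigma_1} - m\rho\, \big( (\zeta' - \rho)\, \sigma_3 + \zeta'\, \sigma_1\sigma_4 \big). \end{aligned} \tag{157} \]

Solving for $\frac{d\sigma_3}{d\sigma_1}$ and $\frac{d\sigma_4}{d\sigma_1}$ from (157), we arrive at the following system of ordinary differential equations, which we call Chaplygin's equations,

\[ \begin{aligned} \frac{d\sigma_3}{d\sigma_1} &= a(\sigma_1)\, \sigma_3 + b(\sigma_1)\, \sigma_4, \\ \frac{d\sigma_4}{d\sigma_1} &= c(\sigma_1)\, \sigma_3 + d(\sigma_1)\, \sigma_4, \end{aligned} \tag{158} \]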

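where

\[ \begin{aligned} a(\sigma_1)\, \Delta &= -m I_3\, \zeta\, (\zeta' - \rho), \\ b(\sigma_1)\, \Delta &= -I_3\, \big( I_3 + m\rho^2(1 - \sigma_1^2) + m\,\zeta\zeta'\sigma_1 \big), \\ c(\sigma_1)\, \Delta &= m I_1\, \rho\, (\zeta' - \rho), \\ d(\sigma_1)\, \Delta &= m\rho\, (I_1\zeta'\sigma_1 - I_3\zeta), \end{aligned} \tag{159} \]

and

\[ \Delta = I_1 I_3 + m I_1\, \rho^2 (1 - \sigma_1^2) + m I_3\, \zeta^2. \tag{160} \]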
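The coefficients (159) and (160) are easy to evaluate and the system (158) is easy to integrate numerically. The following is a minimal sketch, assuming an ellipsoid of revolution with polar semi-axis 1; the parameter values `M`, `I1`, `I3`, `A_SEMI`, the closed forms in `shape`, the fixed-step Runge-Kutta integrator `solve`, and the helper `residual_157` are all assumptions introduced for this illustration. The integration starts at $\sigma_1 = 0$.

```python
import numpy as np

M, I1, I3 = 1.0, 0.4, 0.3   # hypothetical mass and principal moments of inertia
A_SEMI = 1.5                # hypothetical equatorial semi-axis (polar semi-axis is 1)

def shape(s1, a=A_SEMI):
    """rho, zeta and zeta' as functions of sigma_1 for the ellipsoid (assumed closed forms)."""
    d = a**2 * (1.0 - s1**2) + s1**2
    return a**2 / np.sqrt(d), s1 / np.sqrt(d), a**2 / d**1.5

def coeffs(s1, a=A_SEMI):
    """Coefficient matrix [[a, b], [c, d]] of Chaplygin's equations (158)-(160)."""
    rho, zeta, dzeta = shape(s1, a)
    delta = I1 * I3 + M * I1 * rho**2 * (1.0 - s1**2) + M * I3 * zeta**2
    ca = -M * I3 * zeta * (dzeta - rho)
    cb = -I3 * (I3 + M * rho**2 * (1.0 - s1**2) + M * zeta * dzeta * s1)
    cc = M * I1 * rho * (dzeta - rho)
    cd = M * rho * (I1 * dzeta * s1 - I3 * zeta)
    return np.array([[ca, cb], [cc, cd]]) / delta

def residual_157(s1, sig3, sig4, a=A_SEMI):
    """Residuals of the two equations (157) when (158) is substituted into them."""
    rho, zeta, dzeta = shape(s1, a)
    d3, d4 = coeffs(s1, a) @ np.array([sig3, sig4])
    r1 = rho * I1 * d3 + zeta * I3 * d4 + rho * I3 * sig4
    r2 = (-M * rho * zeta * d3 + (I3 + M * rho**2 * (1.0 - s1**2)) * d4
          - M * rho * ((dzeta - rho) * sig3 + dzeta * s1 * sig4))
    return r1, r2

def solve(init, s1_end, a=A_SEMI, steps=400):
    """Classical RK4 integration of (158) from sigma_1 = 0 to sigma_1 = s1_end."""
    y = np.array(init, dtype=float)
    h = s1_end / steps
    x = 0.0
    for _ in range(steps):
        k1 = coeffs(x, a) @ y
        k2 = coeffs(x + 0.5 * h, a) @ (y + 0.5 * h * k1)
        k3 = coeffs(x + 0.5 * h, a) @ (y + 0.5 * h * k2)
        k4 = coeffs(x + h, a) @ (y + h * k3)
        y = y + (h / 6.0) * (k1 + 2.0 * k2 + 2.0 * k3 + k4)
        x += h
    return y

basis_10 = solve((1.0, 0.0), 0.8)     # solution with initial vector (1, 0)
basis_01 = solve((0.0, 1.0), 0.8)     # solution with initial vector (0, 1)
combined = solve((2.0, -3.0), 0.8)    # equals 2*basis_10 - 3*basis_01 by linearity
```

For `A_SEMI = 1` (a sphere, where $\rho = 1$ and $\zeta = \sigma_1$) the coefficients $a$ and $c$ vanish, because $\zeta' - \rho = 0$.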

Note that Chaplygin's system is a linear system of ordinary differential equations with coefficients which smoothly depend on $\sigma_1$. Here $\sigma_1$ is treated as the independent variable; whereas $\sigma_3$ and $\sigma_4$ are treated as the dependent variables. In §7.1 we have observed that the functions $\rho$ and $\zeta$ extend to smooth functions on a neighborhood of the closed segment $[-1, 1]$. From this it follows that the coefficients $a$, $b$, $c$ and $d$ in Chaplygin's equations (158) are smooth functions on $[-1, 1]$ for the strongly convex body of revolution. This is quite remarkable, because as $\sigma_1 \to \pm 1$ we leave the part of the reduced phase space where $\sigma_1, \sigma_2, \sigma_3, \sigma_4$ is a system of coordinates.

6.7.3.2 Two additional constants of motion

From the theory of linear systems of ordinary differential equations, see chapter 3 of [27], it follows that for each $(\overline{\sigma}_3, \overline{\sigma}_4) \in \mathbb{R}^2$ there is a unique solution

\[ \sigma_1 \mapsto \big( \sigma_3(\sigma_1; \overline{\sigma}_3, \overline{\sigma}_4),\; \sigma_4(\sigma_1; \overline{\sigma}_3, \overline{\sigma}_4) \big) \tag{161} \]

of Chaplygin's system (158) on $[-1, 1]$ such that

\[ \big( \sigma_3(0; \overline{\sigma}_3, \overline{\sigma}_4),\; \sigma_4(0; \overline{\sigma}_3, \overline{\sigma}_4) \big) = (\overline{\sigma}_3, \overline{\sigma}_4). \]

Moreover, the solution (161) is a smooth function of $\sigma_1 \in [-1, 1]$ and depends linearly on the initial vector $(\overline{\sigma}_3, \overline{\sigma}_4) \in \mathbb{R}^2$. If for $j = 3, 4$ we write

\[ \sigma_j^{1,0}(\sigma_1) = \sigma_j(\sigma_1; 1, 0), \qquad \sigma_j^{0,1}(\sigma_1) = \sigma_j(\sigma_1; 0, 1), \tag{162} \]

which are smooth functions of $\sigma_1 \in [-1, 1]$, then the linear dependence on the initial vector takes the form

\[ \sigma_j(\sigma_1; \overline{\sigma}_3, \overline{\sigma}_4) = \overline{\sigma}_3\, \sigma_j^{1,0}(\sigma_1) + \overline{\sigma}_4\, \sigma_j^{0,1}(\sigma_1), \tag{163} \]

for $j = 3, 4$. Geometrically this means that the solution curves of the Chaplygin system define a smooth fibration of $(\sigma_1, \sigma_3, \sigma_4)$-space $[-1, 1] \times \mathbb{R}^2$, with the inverse of the diffeomorphism

\[ \psi : (\sigma_1, (\overline{\sigma}_3, \overline{\sigma}_4)) \mapsto \big( \sigma_1,\; \sigma_3(\sigma_1; \overline{\sigma}_3, \overline{\sigma}_4),\; \sigma_4(\sigma_1; \overline{\sigma}_3, \overline{\sigma}_4) \big) \tag{164} \]

of $[-1, 1] \times \mathbb{R}^2$ being a trivialization of this fibration. If $\pi_1 : [-1, 1] \times \mathbb{R}^2 \to \mathbb{R}^2 : (\sigma_1, (\overline{\sigma}_3, \overline{\sigma}_4)) \mapsto (\overline{\sigma}_3, \overline{\sigma}_4)$ denotes the projection map which forgets $\sigma_1$, then $\pi = \pi_1 \circ \psi^{-1} : [-1, 1] \times \mathbb{R}^2 \to \mathbb{R}^2$ is the projection map of a fibration where the fiber over $(\overline{\sigma}_3, \overline{\sigma}_4)$ is the solution curve of Chaplygin's equations with initial condition $(\overline{\sigma}_3, \overline{\sigma}_4)$.

From the fact that the equations of motion of the $E(2) \times S^1$-reduced system imply Chaplygin's equations, it follows that the image of each solution of the $E(2) \times S^1$-reduced system under the projection map

\[ P : [-1, 1] \times \mathbb{R}^3 \to [-1, 1] \times \mathbb{R}^2 : (\sigma_1, \sigma_2, \sigma_3, \sigma_4) \mapsto (\sigma_1, \sigma_3, \sigma_4) \]

is contained in a fiber of $\pi$. Equivalently, the two components of the vector valued function $\pi \circ P$ are constants of motion. With a slight abuse of notation, we shall write

\[ \pi \circ P = (\overline{\sigma}_3, \overline{\sigma}_4), \tag{165} \]

where $\overline{\sigma}_3$ and $\overline{\sigma}_4$ are viewed as smooth functions on the reduced phase space. Note that these constants of motion, although smooth, are not given by algebraic formulæ involving the shape of the body of revolution. Instead the constants of motion are obtained by solving a linear system of ordinary differential equations with quite general variable coefficients.

6.7.3.3 The total energy

We now combine the constants of motion defined by Chaplygin's equations with the $E(2) \times S^1$-reduced total energy. Recall that total energy is a constant of motion for any system which satisfies the Lagrange-d'Alembert principle in §3 of chapter 1.

Lemma 6.7.3.3.28. In terms of the invariants $\sigma_j$ (144), the $E(2) \times S^1$-reduced total energy $E$ (142) is given by

\[ E = \tfrac{1}{2}(I_1\sigma_5 + I_3\sigma_4^2) + \tfrac{1}{2} m \big( \rho^2\sigma_2^2 + \zeta^2\sigma_5 + \rho^2(1 - \sigma_1^2)\,\sigma_4^2 - 2\rho\zeta\, \sigma_3\sigma_4 \big) - mg\, \big( \rho\,(1 - \sigma_1^2) + \zeta\,\sigma_1 \big), \tag{166} \]

where $\rho = \rho(\sigma_1)$ and $\zeta = \zeta(\sigma_1)$ are smooth functions of $\sigma_1$ defined by (130a) and (130b).
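Formula (166) can be compared numerically with the definition (142) of the reduced energy. The sketch below is an illustration under stated assumptions: the body is an ellipsoid of revolution with polar semi-axis 1, and the parameter values `M`, `G`, `I1`, `I3`, `A_SEMI`, the closed forms in `rho_zeta`, and the random test point are hypothetical.

```python
import numpy as np

M, G = 1.0, 9.81             # hypothetical mass and gravitational acceleration
I1, I3 = 0.4, 0.3            # hypothetical principal moments (I1 = I2)
INERTIA = np.diag([I1, I1, I3])
A_SEMI = 1.5                 # hypothetical equatorial semi-axis (polar semi-axis is 1)

def rho_zeta(s1):
    """Assumed closed forms of rho(sigma_1) and zeta(sigma_1) for the ellipsoid."""
    d = np.sqrt(A_SEMI**2 * (1.0 - s1**2) + s1**2)
    return A_SEMI**2 / d, s1 / d

def energy_direct(u, w):
    """Energy (142): (1/2)<Y(s)w, w> - m g <s(u), u> with Y(s)w = Iw + m s x (w x s)."""
    rho, zeta = rho_zeta(u[2])
    s = np.array([rho * u[0], rho * u[1], zeta])
    y_w = INERTIA @ w + M * np.cross(s, np.cross(w, s))
    return 0.5 * (y_w @ w) - M * G * (s @ u)

def energy_invariants(u, w):
    """Energy (166) written in the invariants sigma_1, ..., sigma_5 of (144)."""
    s1 = u[2]
    s2 = u[0] * w[1] - u[1] * w[0]
    s3 = u[0] * w[0] + u[1] * w[1]
    s4 = w[2]
    s5 = w[0]**2 + w[1]**2
    rho, zeta = rho_zeta(s1)
    kinetic = 0.5 * (I1 * s5 + I3 * s4**2) + 0.5 * M * (
        rho**2 * s2**2 + zeta**2 * s5
        + rho**2 * (1.0 - s1**2) * s4**2 - 2.0 * rho * zeta * s3 * s4)
    return kinetic - M * G * (rho * (1.0 - s1**2) + zeta * s1)

rng = np.random.default_rng(2)
u = rng.normal(size=3)
u /= np.linalg.norm(u)       # a point of S^2
w = rng.normal(size=3)       # an angular velocity
energy_gap = abs(energy_direct(u, w) - energy_invariants(u, w))
```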

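Proof. We use (122) for the $E(2) \times S^1$-reduced total energy. In it, we substitute $\langle I\omega, \omega \rangle = I_1\sigma_5 + I_3\sigma_4^2$ and

\[ \begin{aligned} \langle \omega \times s, \omega \times s \rangle &= \langle \omega, \omega \rangle\, \langle s, s \rangle - \langle \omega, s \rangle^2 \\ &= (\sigma_5 + \sigma_4^2)\, (\rho^2\sigma_6 + \zeta^2) - (\rho\,\sigma_3 + \zeta\,\sigma_4)^2 \\ &= \sigma_5\rho^2\sigma_6 + \sigma_5\zeta^2 + \sigma_4^2\rho^2\sigma_6 + \sigma_4^2\zeta^2 - \rho^2\sigma_3^2 - \zeta^2\sigma_4^2 - 2\rho\zeta\, \sigma_3\sigma_4 \\ &= \rho^2\sigma_2^2 + \zeta^2\sigma_5 + \rho^2(1 - \sigma_1^2)\,\sigma_4^2 - 2\rho\zeta\, \sigma_3\sigma_4. \end{aligned} \]

To obtain the second equality above we use (156), and for the fourth equality the relation (145) together with $\sigma_6 = 1 - \sigma_1^2$. Furthermore, $\langle s, u \rangle = \rho\,\sigma_6 + \zeta\,\sigma_1 = \rho\,(1 - \sigma_1^2) + \zeta\,\sigma_1$. Collecting terms and using the definition of the invariants (144) we get (166).

The $E(2) \times S^1$-reduced total energy $E$ (166) is a smooth function of $\sigma_j$, $1 \le j \le 5$. In the domain $0 \le \sigma_1^2 < 1$, where the invariants $\sigma_j$, $1 \le j \le 4$, form a system of coordinates, substituting $\sigma_5 = (\sigma_2^2 + \sigma_3^2)(1 - \sigma_1^2)^{-1}$ gives

\[ \begin{aligned} E &= \tfrac{1}{2}\, \frac{I_1 + m\rho^2(1 - \sigma_1^2) + m\zeta^2}{1 - \sigma_1^2}\, \sigma_2^2 + \tfrac{1}{2}\, \frac{I_1 + m\zeta^2}{1 - \sigma_1^2}\, \sigma_3^2 - m\rho\zeta\, \sigma_3\sigma_4 \\ &\quad + \tfrac{1}{2}\, \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \sigma_4^2 - mg\, \big( \rho\,(1 - \sigma_1^2) + \zeta\,\sigma_1 \big). \end{aligned} \tag{167} \]

In the domain $\sigma_5 > \sigma_2^2 + \sigma_3^2$, where the invariants $\sigma_j$, $2 \le j \le 5$, form a system of coordinates, we substitute $\sigma_1 = \pm \big( 1 - (\sigma_2^2 + \sigma_3^2)/\sigma_5 \big)^{1/2}$ and obtain that $E$ is a smooth function of $\sigma_2, \ldots, \sigma_5$.

6.7.3.4 A conservative Newtonian system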

If we fix the values of the constants of motion $\overline{\sigma}_3$ and $\overline{\sigma}_4$ given by Chaplygin's equations (158), then the fiber $(\pi \circ P)^{-1}(\overline{\sigma}_3, \overline{\sigma}_4)$, which lies in the domain $-1 < \sigma_1 < 1$, is a smooth surface $N$ in the $E(2) \times S^1$-reduced phase space, which is parametrized by $(\sigma_1, \sigma_2) \in (-1, 1) \times \mathbb{R}$. The surface $N$ is invariant under the flow of the $E(2) \times S^1$-reduced vector field, for which the $E(2) \times S^1$-reduced total energy $E$ is a constant of motion. In terms of the parameters $(\sigma_1, \sigma_2)$, the restriction to $N$ of the $E(2) \times S^1$-reduced total energy (167) is

\[ E = T(\sigma_1, \sigma_2) + V_{\overline{\sigma}_3, \overline{\sigma}_4}(\sigma_1), \tag{168} \]

where

\[ T(\sigma_1, \sigma_2) = \tfrac{1}{2}\, \frac{I_1 + m\rho^2(1 - \sigma_1^2) + m\zeta^2}{1 - \sigma_1^2}\, \sigma_2^2 \tag{169} \]

plays the role of a kinetic energy and

\[ V_{\overline{\sigma}_3, \overline{\sigma}_4}(\sigma_1) = \tfrac{1}{2}\, \frac{I_1 + m\zeta^2}{1 - \sigma_1^2}\, \sigma_3^2 + \tfrac{1}{2}\, \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \sigma_4^2 - m\rho\zeta\, \sigma_3\sigma_4 - mg\, \big( \rho\,(1 - \sigma_1^2) + \zeta\,\sigma_1 \big) \tag{170} \]

is viewed as a potential energy function. Here $\rho = \rho(\sigma_1)$, $\zeta = \zeta(\sigma_1)$, and $\sigma_3 = \sigma_3(\sigma_1; \overline{\sigma}_3, \overline{\sigma}_4)$ and $\sigma_4 = \sigma_4(\sigma_1; \overline{\sigma}_3, \overline{\sigma}_4)$ are solutions of Chaplygin's equations (158) viewed as functions of $\sigma_1$, with the initial conditions $\overline{\sigma}_3$ and $\overline{\sigma}_4$ being viewed as parameters.

Because $\dot{\sigma}_1 = \sigma_2$, we have a one degree of freedom classical mechanical system of the form kinetic plus potential energy on a configuration space parametrized by $\sigma_1$. The kinetic energy is defined by the Riemannian structure which is $(I_1 + m\rho^2(1 - \sigma_1^2) + m\zeta^2)/(1 - \sigma_1^2)$ times the standard one, and the potential energy is given as a function of $\sigma_1$ by (170). The kinetic energy seems to be singular at $\sigma_1 = \pm 1$. However, it can be regularized by means of the change of variable $\sigma_1 = \sin\varphi$ with $-\pi/2 < \varphi < \pi/2$, for then

\[ (1 - \sigma_1^2)^{-1/2}\, \dot{\sigma}_1 = (\cos\varphi)^{-1} (\cos\varphi)\, \dot{\varphi} = \dot{\varphi}. \tag{171} \]

Therefore

\[ E = \tfrac{1}{2}\, k(\varphi)\, \dot{\varphi}^2 + V_{\overline{\sigma}_3, \overline{\sigma}_4}(\sin\varphi), \tag{172} \]

where

\[ k(\varphi) = I_1 + m\, \rho(\sin\varphi)^2 \cos^2\varphi + m\, \zeta(\sin\varphi)^2. \tag{173} \]

Conservation of $E(2) \times S^1$-reduced energy yields

\[ \tfrac{1}{2}\, k'(\varphi)\, \dot{\varphi}^2 + k(\varphi)\, \ddot{\varphi} + \frac{\partial}{\partial\varphi} V_{\overline{\sigma}_3, \overline{\sigma}_4}(\sin\varphi) = 0. \tag{174} \]

Equation (174) describes a conservative Newtonian system with kinetic energy $\tfrac{1}{2}\, k(\varphi)\, \dot{\varphi}^2$ subject to a force $-\frac{\partial}{\partial\varphi} V_{\overline{\sigma}_3, \overline{\sigma}_4}(\sin\varphi)$. Actually, the Lie derivative $\dot{E}$ of $E$ with respect to the $E(2) \times S^1$-reduced vector field is equal to $\dot{\varphi}$ times the left hand side of (174). Therefore (174) follows from $\dot{E} = 0$ if $\dot{\varphi} \ne 0$. However, if $\dot{\varphi} = 0$, then (174) follows by continuity, because from §1.2 of chapter 3 we know that the motion on the smooth part of the orbit space is defined by a smooth vector field. Therefore $\ddot{\varphi}$ is a smooth function of $\varphi$ and $\dot{\varphi}$ if $-\pi/2 < \varphi < \pi/2$.⁴

⁴ From $\sin\varphi = \sigma_1 = \langle u, e_3 \rangle = \langle A^{-1}e_3, e_3 \rangle = \langle e_3, A e_3 \rangle$ it follows that $\varphi$ is equal to the angle between the oriented axis of symmetry of the body of revolution in its actual position in space, and the horizontal plane. The angle is taken to be positive if $A e_3$ points upward.

Note that if $\varphi(t)$ is known, then we know $\sigma_1(t) = \sin\varphi(t)$, $\sigma_2(t) = \dot{\sigma}_1(t) = \dot{\varphi}(t) \cos\varphi(t)$, $\sigma_3(t) = \sigma_3(\sigma_1(t); \overline{\sigma}_3, \overline{\sigma}_4)$, and $\sigma_4(t) = \sigma_4(\sigma_1(t); \overline{\sigma}_3, \overline{\sigma}_4)$. In other words, we know the solution of the $E(2) \times S^1$-reduced system on the part $-1 < \sigma_1 < 1$ of the $E(2) \times S^1$-orbit space $M$.

6.7.3.5 Solutions of the one degree of freedom system

Conservative one degree of freedom classical mechanical systems are the best understood mechanical systems. We recall the classical description of their solutions. To simplify notation we write $V(\varphi) = V_{\overline{\sigma}_3, \overline{\sigma}_4}(\sin\varphi)$. Choose a value $E$ of the constant of motion given by the $E(2) \times S^1$-reduced energy (172), which is larger than the infimum of the potential function $V$. Let $I$ be a connected component of the open set of all $\varphi \in (-\pi/2, \pi/2)$ such that $V(\varphi) < E$. Then $I$ is an open interval $(\varphi_-, \varphi_+)$, where $-\pi/2 \le \varphi_- < \varphi_+ \le \pi/2$. Note that $V(\varphi_-) = E$ and $V(\varphi_+) = E$ if $-\pi/2 < \varphi_- < \pi/2$ and $-\pi/2 < \varphi_+ < \pi/2$, respectively. Depending on the sign of $\dot{\varphi}$, we have

\[ \dot{\varphi} = \pm \Big( \frac{2}{k(\varphi)}\, (E - V(\varphi)) \Big)^{1/2}, \tag{175} \]

as long as $\varphi_- < \varphi < \varphi_+$. Separating variables and integrating, (175) gives

\[ t_1 - t_0 = \pm \int_{\varphi(t_0)}^{\varphi(t_1)} \Big( \frac{2}{k(\varphi)}\, (E - V(\varphi)) \Big)^{-1/2} d\varphi \tag{176} \]

for the time needed to go from $\varphi(t_0)$ to $\varphi(t_1)$. The solution $t_1 \mapsto \varphi(t_1)$ of (175) is obtained by inverting the function $\varphi(t_1) \mapsto t_1$ given by the right hand side of (176).

Take the plus sign in (176). If $\varphi_+ < \pi/2$ and $V'(\varphi_+) \ne 0$, then $V'(\varphi_+) > 0$ and the right hand side of (176) has a finite positive limit as $\varphi(t_1) \uparrow \varphi_+$. This means that $\varphi(t)$ reaches $\varphi_+$ at a finite time $t = T$. From (174) it follows that $\ddot{\varphi} < 0$ when $t = T$, which means that the continuation of the solution for $t > T$ is determined by taking the minus sign in (176). Therefore $\varphi(t) = \varphi(2T - t)$ for all $t$. If also $\varphi_- > -\pi/2$ and $V'(\varphi_-) \ne 0$, then $V'(\varphi_-) < 0$ and for $t > T$ the solution will first reach $\varphi_-$ at the finite time $t = S > T$. Hence $\varphi(t) = \varphi(2S - t)$ for all $t$. It follows that

\[ \varphi(t) = \varphi(2T - t) = \varphi(2S - (2T - t)) = \varphi(2(S - T) + t) \]

for all $t$, that is, the solution is periodic with minimal positive period equal to $2(S - T)$. Here $\varphi(t)$ oscillates back and forth between $\varphi_-$ and $\varphi_+$, moving monotonically during each traversal with the time reversed motion during the succeeding traversal. The number $S - T$ is equal to the right hand side of (176) with the plus sign and $\varphi(t_0)$ and $\varphi(t_1)$ replaced by $\varphi_-$ and $\varphi_+$, respectively. Therefore the minimal positive period $\tau$ of the motion is given by the convergent improper integral

\[ \tau = 2 \int_{\varphi_-}^{\varphi_+} \Big( \frac{2}{k(\varphi)}\, (E - V(\varphi)) \Big)^{-1/2} d\varphi. \tag{177} \]

If $V'(\varphi_+) = 0$, then $(\varphi, \dot{\varphi}) = (\varphi_+, 0)$ is an equilibrium point in the phase plane of (174), which corresponds to a relative equilibrium of the rolling body of revolution. Because

\[ E - V(\varphi) = V(\varphi_+) - V(\varphi) = O((\varphi - \varphi_+)^2) \quad \text{as } \varphi \uparrow \varphi_+, \]

the right hand side of (176) diverges as $\varphi(t_1) \uparrow \varphi_+$. We conclude that $(\varphi(t), \dot{\varphi}(t)) \to (\varphi_+, 0)$ as $t \to \infty$. With the choice of the minus sign in (176), we get $(\varphi(t), \dot{\varphi}(t)) \to (\varphi_+, 0)$ as $t \to -\infty$. Similar conclusions hold with $\varphi_+$ replaced by $\varphi_-$.

Now assume that $\sigma_3(\pm 1; \overline{\sigma}_3, \overline{\sigma}_4) \ne 0$. From (170) it follows that $V(\varphi) \to \infty$ if $\varphi \to \pm\pi/2$. If the parameters $(\overline{\sigma}_3, \overline{\sigma}_4) \in \mathbb{R}^2$ avoid the two straight lines through the origin in $\mathbb{R}^2$ where $\sigma_3(\pm 1; \overline{\sigma}_3, \overline{\sigma}_4) = 0$, then $V(\varphi) \to \infty$ as $\varphi \downarrow -\pi/2$ and as $\varphi \uparrow \pi/2$. So $\varphi(t)$ moves in a potential well with infinitely steep walls at $\varphi = \pm\pi/2$. Therefore $-\pi/2 < \varphi_- < \varphi_+ < \pi/2$. Consequently, $\varphi(t)$ oscillates back and forth between $\varphi_-$ and $\varphi_+$, unless the solution converges to an unstable relative equilibrium.

6.7.3.6 Quasi-periodic solutions

The above analysis of the conservative one degree of freedom Newtonian system (174) shows that, except for initial values on a submanifold of codimension one, the solutions in the E(2) × S 1 -orbit space are periodic.
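The period (177) of such a periodic solution can be approximated by quadrature once $V$, $k$, $E$ and the turning points are known. The sketch below is generic and rests on assumptions: the function name `period`, the substitution $\varphi = c + h\sin\theta$ (chosen here to remove the inverse square root singularity at simple turning points), and the midpoint rule are choices made for this illustration, and the test case is a harmonic oscillator rather than the potential (170) of the rolling body.

```python
import numpy as np

def period(V, k, E, phi_minus, phi_plus, n=2000):
    """Approximate tau = 2 * int_{phi-}^{phi+} (2 (E - V) / k)^(-1/2) dphi, as in (177).

    V and k must accept NumPy arrays. The substitution phi = c + h*sin(theta)
    makes the integrand bounded at simple turning points, and the midpoint
    rule never evaluates the integrand at the turning points themselves.
    """
    c = 0.5 * (phi_plus + phi_minus)
    h = 0.5 * (phi_plus - phi_minus)
    theta = -0.5 * np.pi + (np.arange(n) + 0.5) * np.pi / n   # midpoints in (-pi/2, pi/2)
    phi = c + h * np.sin(theta)
    integrand = np.sqrt(k(phi) / (2.0 * (E - V(phi)))) * h * np.cos(theta)
    return 2.0 * np.sum(integrand) * np.pi / n

# Hypothetical test case: k = 1 and V = phi^2 / 2, a harmonic oscillator of period 2*pi.
amplitude = 0.3
tau = period(lambda p: 0.5 * p**2,
             lambda p: np.ones_like(p),
             0.5 * amplitude**2, -amplitude, amplitude)
```

To use this for the rolling body one would supply `V` and `k` from (170) and (173), with $\sigma_3$ and $\sigma_4$ obtained by solving Chaplygin's equations (158), and locate $\varphi_\pm$ as roots of $V(\varphi) = E$.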

Let $((A, a), B) \in E(2) \times S^1$ be the shift element, see lemma 4.3.1.41 of chapter 4, where $A \in SO(2)$. If $A \ne I$, then from the second item in example 4.3.6.1.53 of chapter 4 it follows that $(A, a)$ is a regular and stably elliptic element of $E(2)$ with a circle subgroup $C$ of $E(2)$ as its centralizer. Consequently, $((A, a), B)$ is a regular and stably elliptic element of $E(2) \times S^1$ with the two-dimensional torus $C \times S^1$ as its centralizer. From proposition 4.3.6.2.58 of chapter 4 we see that over the open subset of the periodic solutions in the $E(2) \times S^1$-orbit space $M$ where $A \ne I$, the motion of the strongly convex rolling body of revolution is quasi-periodic on three-dimensional tori in phase space.

The strongly convex body of revolution differs from the rolling disk because for the disk the $E(2) \times S^1$-reduced equations of motion are truly singular when $\sigma_1 \to \pm 1$. These motions correspond to the disk falling flat. In chapter 7 we will treat the motion of the rolling disk in much more detail.

6.7.3.7 Appendix. The $E(2) \times S^1$-reduced equations of motion

In the domain where $-1 < \sigma_1 < 1$ of the $E(2) \times S^1$-reduced space $M_{\mathrm{reg}}$, where the invariants $\sigma_j$, $1 \le j \le 4$, form a regular system of coordinates, the equations of motion for the $E(2) \times S^1$-reduced system expressed in terms of the invariants $\sigma_j$, $1 \le j \le 4$, can be obtained in the following way. Start with $\sigma_2 = \dot{\sigma}_1$ and

\[ \dot{\sigma}_3 = (a(\sigma_1)\, \sigma_3 + b(\sigma_1)\, \sigma_4)\, \sigma_2, \tag{178} \]

\[ \dot{\sigma}_4 = (c(\sigma_1)\, \sigma_3 + d(\sigma_1)\, \sigma_4)\, \sigma_2, \tag{179} \]

which follow from Chaplygin's equations (158) and $\sigma_2 = \dot{\sigma}_1$. To find an expression for $\dot{\sigma}_2$ in terms of the $\sigma_j$, $1 \le j \le 4$, we use the fact that the time derivative of the $E(2) \times S^1$-reduced total energy $E$ (167) is equal to zero. In working out the time derivative of (167), we use $\dot{\rho} = \rho'(\sigma_1)\, \dot{\sigma}_1 = \rho'(\sigma_1)\, \sigma_2$, and $\dot{\zeta} = \zeta'(\sigma_1)\, \dot{\sigma}_1 = \zeta'(\sigma_1)\, \sigma_2$. Because $(\sigma_2^2)^{\cdot} = 2\sigma_2\dot{\sigma}_2$, all the terms in $\frac{dE}{dt}$ have a factor $\sigma_2$. When $\sigma_2 \ne 0$, we can divide this factor out, which leads to the desired equation for $\dot{\sigma}_2$. By continuity, this equation remains valid when $\sigma_2 = 0$, because we know a priori that in the smooth part of the $E(2) \times S^1$-orbit space the motion is determined by a smooth vector field, see §1.2 of chapter 3.

The formula for $\dot{\sigma}_2$ in terms of $\sigma_j$, $1 \le j \le 4$, obtained this way is explicit, but quite unwieldy. Moreover, it does not show in an obvious way that the solutions of the $E(2) \times S^1$-reduced equations of motion are integral curves of a smooth vector field on the $E(2) \times S^1$-orbit space including those points where $\sigma_1 = \pm 1$. Below we present an alternative calculation of the smooth vector field in the reduced phase space.

We now remove the axial symmetry from the E(2)-reduced vector field $\overline{V}$ (141). Because $\overline{V}$ is invariant under the induced axial symmetry $\Psi$ (140), it induces a derivation $\widehat{V}$ on $(C^{\infty}(M), \{\ ,\ \})$. From the theory of nonholonomic reduction in §2 of chapter 3, we expect that the integral curves of $\widehat{V}$ satisfy almost Poisson bracket equations of the form

\[ \frac{d\sigma_i}{dt} = \{\sigma_i, E\} \qquad \text{for } i = 1, \ldots, 5, \tag{180} \]

where $E$ is the $E(2) \times S^1$-reduced energy given by

\[ E = \tfrac{1}{2}\, \big( I_1 + m(\rho^2(1 - \sigma_1^2) + \zeta^2) \big)\, \sigma_5 - \tfrac{1}{2}\, m\rho^2\, \sigma_3^2 + \tfrac{1}{2}\, \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \sigma_4^2 - m\rho\zeta\, \sigma_3\sigma_4 - mg\, \big( \rho\,(1 - \sigma_1^2) + \zeta\,\sigma_1 \big). \]

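A direct calculation of the nonholonomic bracket on the right hand side of (180) is quite involved. So we use the following indirect method. Consider the functions

\[ \tau_1 = \langle Y(s)\omega, e_3 \rangle \tag{181a} \]

\[ \tau_2 = \langle Y(s)\omega, s \rangle \tag{181b} \]

\[ \tau_3 = \langle Y(s)\omega, u \times e_3 \rangle \tag{181c} \]

\[ \tau_4 = \langle Y(s)\omega, \omega \rangle, \tag{181d} \]

each of which is invariant under the induced axial action on $S^2 \times \mathbb{R}^3$. Hence each $\tau_i$ is a smooth function of the invariants $\sigma_j$ (144) for $j = 1, \ldots, 5$. A straightforward calculation shows that

\[ \tau_1 = -m\rho\zeta\, \sigma_3 + \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \sigma_4 \tag{182a} \]

\[ \tau_2 = I_1\rho\, \sigma_3 + I_3\zeta\, \sigma_4 \tag{182b} \]

\[ \tau_3 = -\big( I_1 + m\rho^2(1 - \sigma_1^2) + m\zeta^2 \big)\, \sigma_2 \tag{182c} \]

\[ \tau_4 = (I_1 + m\zeta^2)\, \sigma_5 + \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \sigma_4^2 - 2m\rho\zeta\, \sigma_3\sigma_4 + m\rho^2\, \sigma_2^2. \tag{182d} \]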
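The expressions (182a) to (182d) can be compared numerically with the definitions (181a) to (181d). The sketch below is only an illustration under stated assumptions: an ellipsoid of revolution with polar semi-axis 1 is used as the body, and the parameter values `M`, `I1`, `I3`, `A_SEMI`, the closed forms for $\rho$ and $\zeta$, and the random test point are hypothetical.

```python
import numpy as np

M = 1.0                      # hypothetical mass
I1, I3 = 0.4, 0.3            # hypothetical principal moments (I1 = I2)
INERTIA = np.diag([I1, I1, I3])
A_SEMI = 1.5                 # hypothetical equatorial semi-axis (polar semi-axis is 1)
E3 = np.array([0.0, 0.0, 1.0])

def rho_zeta(s1):
    """Assumed closed forms of rho(sigma_1) and zeta(sigma_1) for the ellipsoid."""
    d = np.sqrt(A_SEMI**2 * (1.0 - s1**2) + s1**2)
    return A_SEMI**2 / d, s1 / d

def taus_direct(u, w):
    """tau_1, ..., tau_4 from the definitions (181a)-(181d)."""
    rho, zeta = rho_zeta(u[2])
    s = np.array([rho * u[0], rho * u[1], zeta])
    y_w = INERTIA @ w + M * np.cross(s, np.cross(w, s))   # Y(s) omega
    return np.array([y_w @ E3, y_w @ s, y_w @ np.cross(u, E3), y_w @ w])

def taus_invariants(u, w):
    """tau_1, ..., tau_4 from the formulas (182a)-(182d)."""
    s1 = u[2]
    s2 = u[0] * w[1] - u[1] * w[0]
    s3 = u[0] * w[0] + u[1] * w[1]
    s4 = w[2]
    s5 = w[0]**2 + w[1]**2
    rho, zeta = rho_zeta(s1)
    k3 = I3 + M * rho**2 * (1.0 - s1**2)
    t1 = -M * rho * zeta * s3 + k3 * s4
    t2 = I1 * rho * s3 + I3 * zeta * s4
    t3 = -(I1 + M * rho**2 * (1.0 - s1**2) + M * zeta**2) * s2
    t4 = ((I1 + M * zeta**2) * s5 + k3 * s4**2
          - 2.0 * M * rho * zeta * s3 * s4 + M * rho**2 * s2**2)
    return np.array([t1, t2, t3, t4])

rng = np.random.default_rng(3)
u = rng.normal(size=3)
u /= np.linalg.norm(u)       # a point of S^2
w = rng.normal(size=3)       # an angular velocity
tau_gap = np.max(np.abs(taus_direct(u, w) - taus_invariants(u, w)))
```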

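Proposition 6.7.3.7.29. The $E(2) \times S^1$-reduced equations of motion of a dynamically symmetric smooth strongly convex surface of revolution $S$ which rolls without slipping on a horizontal plane under the influence of a constant vertical gravitational force are

\[ \dot{\sigma}_1 = \sigma_2 \tag{183a} \]

\[ -m\rho\zeta\, \dot{\sigma}_3 + \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \dot{\sigma}_4 = m\rho\, (\zeta' - \rho)\, \sigma_2\sigma_3 + m\rho\zeta'\, \sigma_1\sigma_2\sigma_4 \tag{183b} \]

\[ I_1\rho\, \dot{\sigma}_3 + I_3\zeta\, \dot{\sigma}_4 = -I_3\rho\, \sigma_2\sigma_4 \tag{183c} \]

\[ \begin{aligned} \big( I_1 + m\rho^2(1 - \sigma_1^2) + m\zeta^2 \big)\, \dot{\sigma}_2 &= \big( I_3 + m\rho^2(1 - \sigma_1^2) + m\rho\zeta\, \sigma_1 \big)\, \sigma_3\sigma_4 - m\rho\zeta\, \sigma_3^2 - (I_1 + m\zeta^2)\, \sigma_1\sigma_5 \\ &\quad + mg\, (\zeta - \rho\,\sigma_1)(1 - \sigma_1^2) - m\, \big( \zeta\zeta' + \rho\rho'(1 - \sigma_1^2) \big)\, \sigma_2^2 \end{aligned} \tag{183d} \]

\[ \begin{aligned} &\tfrac{1}{2}\, (I_1 + m\zeta^2)\, \dot{\sigma}_5 + \big[ \big( I_3 + m\rho^2(1 - \sigma_1^2) \big)\, \sigma_4 - m\rho\zeta\, \sigma_3 \big]\, \dot{\sigma}_4 - m\rho\zeta\, \sigma_4\dot{\sigma}_3 + m\rho^2\, \sigma_2\dot{\sigma}_2 \\ &\qquad = \big[ m\, \big( (\rho'\zeta + \rho\zeta')\, \sigma_3\sigma_4 - \rho\rho'\, \sigma_2^2 \big) + mg\, (\zeta - \rho\,\sigma_1) - m\zeta\zeta'\, \sigma_5 - m\rho\, \big( \rho'(1 - \sigma_1^2) - \rho\,\sigma_1 \big)\, \sigma_4^2 \big]\, \sigma_2. \end{aligned} \tag{183e} \]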


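Proof. Computing the derivative of (181a) and (181b) and equating the result with the derivative of (182a) and (182b) gives (183b) and (183c), respectively. These equations can be obtained from (157) by multiplying both sides by $\frac{d\sigma_1}{dt} = \sigma_2$ (183a). We repeat this process for $\tau_3$ and $\tau_4$, but give all the details. We begin by differentiating (181c), which gives

$$\begin{aligned}\dot\tau_3 &= \Big\langle \frac{d(Y(s)\omega)}{dt}, u\times e_3\Big\rangle + \langle Y(s)\omega, \dot u\times e_3\rangle \\ &= \langle (Y(s)\omega)\times\omega + m(\dot s\times(\omega\times s)) - mg\,u\times s,\ u\times e_3\rangle + \langle Y(s)\omega, (u\times\omega)\times e_3\rangle,\end{aligned} \tag{184}$$

since

$$\begin{aligned}\frac{d(Y(s)\omega)}{dt} &= I\dot\omega + m\,\dot s\times(\omega\times s) + m\,s\times(\dot\omega\times s) + m\,s\times(\omega\times\dot s) \\ &= I\omega\times\omega + m\langle s,\omega\rangle(\omega\times s) + m\,\dot s\times(\omega\times s) - mg\,u\times s \\ &= (Y(s)\omega)\times\omega + m\,\dot s\times(\omega\times s) - mg\,u\times s.\end{aligned} \tag{185}$$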

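Now using Grassmann's identity we get

$$\omega\times(u\times e_3) = u\langle\omega, e_3\rangle - e_3\langle u,\omega\rangle$$
$$(\omega\times s)\times(u\times e_3) = u\langle\omega\times s, e_3\rangle - e_3\langle\omega\times s, u\rangle$$
$$\langle u\times s, u\times e_3\rangle = \langle u, s\times(u\times e_3)\rangle = \langle s, e_3\rangle - \langle u, e_3\rangle\langle s, u\rangle$$
$$e_3\times(u\times\omega) = u\langle e_3,\omega\rangle - \omega\langle e_3, u\rangle.$$

So

$$\dot\tau_3 = -\langle Y(s)\omega, e_3\rangle\langle u,\omega\rangle + m\langle\omega\times s, e_3\rangle\langle\dot s, u\rangle - m\langle\omega\times s, u\rangle\langle\dot s, e_3\rangle - mg\big(\langle s, e_3\rangle - \langle u, e_3\rangle\langle s, u\rangle\big) + \langle u, e_3\rangle\langle Y(s)\omega,\omega\rangle. \tag{186}$$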
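The passage from (184) to (186) uses only the four vector identities above and $\langle u, u\rangle = 1$, so it can be spot-checked numerically. The following is a minimal sketch, not part of the original proof: the helper names, the sample values of $m$ and $g$, and the use of arbitrary random vectors in place of $Y(s)\omega$, $\omega$, $s$ and $\dot s$ are all assumptions made for illustration.

```python
import random

E3 = (0.0, 0.0, 1.0)

def cross(a, b):
    """Cross product a x b of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Euclidean inner product <a, b>."""
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def combine(*terms):
    """Linear combination of (coefficient, vector) pairs."""
    return tuple(sum(c * v[i] for c, v in terms) for i in range(3))

def residual_186(p, omega, s, s_dot, u, m=1.3, g=9.81):
    """Last line of (184) minus the right hand side of (186).

    p stands for Y(s)omega; u must be a unit vector.
    The values of m and g are illustrative only.
    """
    w_s = cross(omega, s)
    bracket = combine((1.0, cross(p, omega)),
                      (m, cross(s_dot, w_s)),
                      (-m * g, cross(u, s)))
    lhs = dot(bracket, cross(u, E3)) + dot(p, cross(cross(u, omega), E3))
    rhs = (-dot(p, E3) * dot(u, omega)
           + m * dot(w_s, E3) * dot(s_dot, u)
           - m * dot(w_s, u) * dot(s_dot, E3)
           - m * g * (dot(s, E3) - dot(u, E3) * dot(s, u))
           + dot(u, E3) * dot(p, omega))
    return lhs - rhs

def max_residual(trials=200, seed=0):
    """Largest absolute residual over random samples with |u| = 1."""
    rng = random.Random(seed)

    def vec():
        return tuple(rng.uniform(-1.0, 1.0) for _ in range(3))

    worst = 0.0
    for _ in range(trials):
        u = vec()
        norm = dot(u, u) ** 0.5
        if norm < 0.1:
            continue  # skip samples too close to the origin to normalize
        u = tuple(c / norm for c in u)
        worst = max(worst, abs(residual_186(vec(), vec(), vec(), vec(), u)))
    return worst
```

The residual vanishes up to rounding for every sample, which is what one expects of a pure vector identity; the check says nothing about the later step of expressing each term through the invariants.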

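We now express each term in (186) in terms of invariants. We find that the first term is

$$\begin{aligned}-\langle u,\omega\rangle\langle Y(s)\omega, e_3\rangle &= -(u_1\omega_1 + u_2\omega_2 + u_3\omega_3)\big(I_3\omega_3 + m\,\omega_3\langle s, s\rangle - m\,s_3\langle\omega, s\rangle\big) \\ &= -(\sigma_3 + \sigma_1\sigma_4)\big[(I_3 + m\rho^2(1-\sigma_1^2))\sigma_4 - m\zeta\rho\,\sigma_3\big];\end{aligned}$$

the second term is

$$\begin{aligned}m\langle\omega\times s, e_3\rangle\langle\dot s, u\rangle &= -m\rho\,(u_1\omega_2 - u_2\omega_1)^2\big(-\rho\,u_3 + \rho'(u_1^2 + u_2^2) + \zeta' u_3\big) \\ &= -m\rho\big((\zeta' - \rho)\sigma_1 + \rho'(1-\sigma_1^2)\big)\sigma_2^2,\end{aligned}$$

since

$$\omega\times s = \big(\zeta\omega_2 - \rho u_2\omega_3,\ \rho u_1\omega_3 - \zeta\omega_1,\ -\rho(u_1\omega_2 - u_2\omega_1)\big)$$

and

$$Ds(u)(u\times\omega) = \begin{pmatrix}\rho(u_2\omega_3 - u_3\omega_2) + \rho' u_1(u_1\omega_2 - u_2\omega_1) \\ \rho(u_3\omega_1 - u_1\omega_3) + \rho' u_2(u_1\omega_2 - u_2\omega_1) \\ \zeta'(u_1\omega_2 - u_2\omega_1)\end{pmatrix};$$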


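the third term is

$$-m\langle\omega\times s, u\rangle\langle\dot s, e_3\rangle = -m\zeta'(\zeta - \rho\,u_3)(u_1\omega_2 - u_2\omega_1)^2 = -m\zeta'(\zeta - \rho\sigma_1)\sigma_2^2;$$

the fourth term is

$$-mg\big(s_3 - u_3\langle s, u\rangle\big) = -mg(\zeta - \rho\sigma_1)(1-\sigma_1^2);$$

and the fifth term is

$$\langle u, e_3\rangle\langle Y(s)\omega,\omega\rangle = \sigma_1\tau_4 = \sigma_1\big[(I_1 + m\zeta^2)\sigma_5 + (I_3 + m\rho^2(1-\sigma_1^2))\sigma_4^2 - 2m\rho\zeta\,\sigma_3\sigma_4 + m\rho^2\sigma_2^2\big].$$

Therefore

$$\begin{aligned}\dot\tau_3 ={}& -(I_3 + m\rho^2(1-\sigma_1^2))\sigma_3\sigma_4 + m\rho\zeta\,\sigma_3^2 - (I_3 + m\rho^2(1-\sigma_1^2))\sigma_1\sigma_4^2 + m\rho\zeta\,\sigma_1\sigma_3\sigma_4 \\ &- m\rho\big((\zeta' - \rho)\sigma_1 + \rho'(1-\sigma_1^2)\big)\sigma_2^2 - m\zeta'(\zeta - \rho\sigma_1)\sigma_2^2 - mg(\zeta - \rho\sigma_1)(1-\sigma_1^2) \\ &+ (I_1 + m\zeta^2)\sigma_1\sigma_5 + (I_3 + m\rho^2(1-\sigma_1^2))\sigma_1\sigma_4^2 - 2m\rho\zeta\,\sigma_1\sigma_3\sigma_4 + m\rho^2\sigma_1\sigma_2^2 \\ ={}& -\big(I_3 + m\rho\zeta\,\sigma_1 + m\rho^2(1-\sigma_1^2)\big)\sigma_3\sigma_4 + m\rho\zeta\,\sigma_3^2 + \big(2m\rho^2\sigma_1 - m\zeta\zeta' - m\rho\rho'(1-\sigma_1^2)\big)\sigma_2^2 \\ &+ (I_1 + m\zeta^2)\sigma_1\sigma_5 - mg(\zeta - \rho\sigma_1)(1-\sigma_1^2).\end{aligned} \tag{187}$$

Differentiating $\tau_3 = -\big(I_1 + m\zeta^2 + m\rho^2(1-\sigma_1^2)\big)\sigma_2$ gives

$$\dot\tau_3 = -2m\big(\zeta\zeta' + \rho\rho'(1-\sigma_1^2) - \rho^2\sigma_1\big)\sigma_2^2 - \big(I_1 + m\zeta^2 + m\rho^2(1-\sigma_1^2)\big)\dot\sigma_2. \tag{188}$$

Equating (187) and (188) gives (183d).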

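Differentiating the expression (182d) for $\tau_4$ and using $\sigma_2 = \dot\sigma_1$ gives

$$\begin{aligned}\dot\tau_4 ={}& 2m\zeta\zeta'\sigma_2\sigma_5 + (I_1 + m\zeta^2)\dot\sigma_5 + 2m\rho\big(\rho'(1-\sigma_1^2) - \rho\sigma_1\big)\sigma_2\sigma_4^2 + 2\big(I_3 + m\rho^2(1-\sigma_1^2)\big)\sigma_4\dot\sigma_4 \\ &- 2m\big[(\rho'\zeta + \rho\zeta')\sigma_2\sigma_3\sigma_4 + \rho\zeta(\dot\sigma_3\sigma_4 + \sigma_3\dot\sigma_4)\big] + 2m\rho(\rho'\sigma_2^2 + \rho\,\dot\sigma_2)\sigma_2.\end{aligned} \tag{189}$$

Next, differentiating the expression (181d) for $\tau_4$ gives

$$\dot\tau_4 = \frac{d}{dt}\langle Y(s)\omega,\omega\rangle = \frac{d}{dt}\big[\langle I\omega,\omega\rangle + m\langle\omega\times s,\omega\times s\rangle\big] = 2\langle I\dot\omega,\omega\rangle + 2m\Big\langle\frac{d}{dt}(\omega\times s),\omega\times s\Big\rangle.$$

Substituting

$$I\dot\omega = I\omega\times\omega + m\frac{d}{dt}\big((\omega\times s)\times s\big) + m\langle s,\omega\rangle(\omega\times s) - mg\,u\times s$$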


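into the above expression for $\dot\tau_4$ we get

$$\begin{aligned}\dot\tau_4 &= 2m\Big[\Big\langle\frac{d}{dt}\big((\omega\times s)\times s\big),\omega\Big\rangle - g\langle u\times s,\omega\rangle + \Big\langle\frac{d}{dt}(\omega\times s),\omega\times s\Big\rangle\Big] \\ &= 2mg\langle u,\omega\times s\rangle = 2mg(u_1\omega_2 - u_2\omega_1)(\zeta - \rho\,u_3) = 2mg(\zeta - \rho\sigma_1)\sigma_2.\end{aligned} \tag{190}$$

Equating (189) and (190) gives (183e). This proves proposition 6.7.3.7.29.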

6.8

Notes

The approach to the dynamics of a rigid body used here follows the techniques developed by Hermans [53]. The fact that every position of the body lying on the horizontal plane can be reached from any other such position by means of a rolling motion without slipping was verified for the rolling disk in §1 of chapter 1 of Neĭmark and Fufaev [82]. For a classical treatment see [95].

As was mentioned in the notes to chapter 1, taking the Euler-Lagrange equations on the tangent bundle of the orbit space of an abelian $\mathbb{R}^n$ action to find the equations of motion of a nonholonomically constrained system with an $\mathbb{R}^n$ symmetry was quite common at the end of the 19th century. Comparing the mathematical model obtained by this method with the one obtained by applying the Lagrange-d'Alembert principle in the case of a rigid body rolling without slipping on a horizontal plane, experimental evidence favors the latter, see Lewis and Murray [67]. For instance, the rattleback has spinning modes which are asymptotically stable on a level set of the energy, see Walker [115, 116]. On the other hand, this contradicts the Hamiltonian character of the Euler-Lagrange system. In general the 2-form $\varpi_{D_N}$ on $D_N = T\,SO(3)$ is not closed. For if it were, then the $\mathbb{R}^2$-reduced total energy $h_{D_N}$ on $D_N$ and $\varpi_{D_N}$ would form a Hamiltonian system on $D_N$ and therefore would preserve the volume element on $D_N$. But in the case of the rattleback there are asymptotically stable motions. This precludes volume preservation.

Equations (158) are Chaplygin's equations in [23] except that we have taken $s = 0$ in his equation (15) because we do not have a little gyroscope on the top of our body of revolution. Our analysis of the rolling body of revolution grew out of attempts to understand Chaplygin's equations. In [23] Chaplygin was focused on finding the motion of the rolling body of revolution by means of quadratures. He did not pursue the analysis of the motion of the general body of revolution any further than deriving Chaplygin's equations (158).


The potential energy function V σ 3 , σ4 (170) is not equal to the potential energy mg a3 (t) of the rolling body of revolution. However, it plays the role of a potential energy function in the description of the motion of σ1 (t). So we allow ourselves to use the term “potential energy function” for V σ3 , σ4 . The E(2) × S 1 -reduced equations of motion in proposition 6.7.3.7.29 are new.


Chapter 7

The rolling disk

Summary

In this chapter we discuss the motion of a rolling disk, which is a body of revolution with its edge rolling on a horizontal plane under the influence of a constant vertical gravitational field. The rim of the disk is a planar circle of radius $r$ with its center at the center of mass. We assume that during the motion the lowest point of the rim remains in contact with the horizontal plane, thus preventing the disk from taking off into space.

The rolling disk has a symmetry group $E(2)\times S^1$. Here $E(2)$ is the Euclidean motion group of the plane, which consists of horizontal translations and rotations about the vertical. The group of internal rotational symmetries of the disk is $S^1$. The group $E(2)\times S^1$ acts on positions of the unconstrained rolling disk in Euclidean 3-space. The fully reduced space $\overline{M}$ of $E(2)\times S^1$-orbits on the constraint manifold $C$ of the rolling disk is parametrized by four invariants $\sigma_j$, $1 \le j \le 4$. Here $\sigma_1 = \sin\varphi$, where $\varphi$ is the angle between the oriented plane of the rim of the disk and the vertical direction. When the position of the disk is flat, which is not part of the fully reduced phase space $\overline{M}$, we have $\sigma_1 = \pm 1$. In fact $\overline{M} = (-1, 1)\times\mathbb{R}^3$, see (10). Because every motion of the disk on the constraint manifold $C$ is an integral curve of a vector field $V$ (4) which commutes with the action of the symmetry group $E(2)\times S^1$ on $C$, there is an induced flow on $\overline{M}$, which is generated by the fully reduced vector field $\overline{V}$ (13). In §2 we derive the equations of motion satisfied by the integral curves of $\overline{V}$. In §3 we describe how motions on $C$ can be reconstructed from the fully reduced motions on $\overline{M}$. The simplest fully reduced motions are equilibrium points. The corresponding reconstructed motions on the constraint manifold are relative equilibria, which we describe in §4.


For any body of revolution rolling on a horizontal plane, Chaplygin discovered a system of ordinary differential equations of the form

$$\frac{d\sigma_3}{d\sigma_1} = a(\sigma_1)\,\sigma_3 + b(\sigma_1)\,\sigma_4 \quad\text{and}\quad \frac{d\sigma_4}{d\sigma_1} = c(\sigma_1)\,\sigma_3 + d(\sigma_1)\,\sigma_4,$$

called Chaplygin's equations. The image of each integral curve of the fully reduced vector field $\overline{V}$ under the projection map $(\sigma_1, \sigma_2, \sigma_3, \sigma_4) \mapsto (\sigma_1, \sigma_3, \sigma_4)$ is a solution of Chaplygin's equations, see (76). Because Chaplygin's equations are a system of linear differential equations, $(\sigma_1, \sigma_3, \sigma_4)$-space is fibered by its solution curves, each of which is parametrized by $\sigma_1 \in (-1, 1)$. Each fiber of the fibration defined by the above projection map is determined by its intersection point $(0, \overline{\sigma}_3, \overline{\sigma}_4)$ with the plane $\{\sigma_1 = 0\}$. Viewing the parameters $(\overline{\sigma}_3, \overline{\sigma}_4)$ as constant functions on the solution curves of Chaplygin's equations, they become constants of motion (or integrals) of the fully reduced vector field $\overline{V}$. Using the fully reduced energy $E$, see (19), as a third constant of motion, the fully reduced system in $(\sigma_1, \sigma_2)$-space, namely $(-1, 1)\times\mathbb{R}$, becomes a conservative one degree of freedom Newtonian system of the form kinetic plus potential energy. The potential function $\varphi \mapsto V_{\overline{\sigma}_3,\overline{\sigma}_4}(\varphi)$ depends on the parameters $(\overline{\sigma}_3, \overline{\sigma}_4)$, see §5.

For most values of $(\overline{\sigma}_3, \overline{\sigma}_4)$, the potential function $\sigma_1 \mapsto V_{\overline{\sigma}_3,\overline{\sigma}_4}(\sigma_1)$ tends to $\infty$ as $\sigma_1 \to \pm 1$. Moreover, it has either one or three nondegenerate critical points: either a single local minimum, or two local minima and a local maximum. Therefore most of the solutions of the one degree of freedom Newtonian system are periodic, where the angle $\varphi(t)$ oscillates in a well of the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$. At exceptional values of $(\overline{\sigma}_3, \overline{\sigma}_4)$, which lie on two lines $\ell_\pm$ that pass through the origin, the potential does not tend to $\infty$ at $\sigma_1 = \pm 1$. Here the disk falls flat in finite time.

There are also exceptional values of the parameters $(\overline{\sigma}_3, \overline{\sigma}_4)$ where the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ has a degenerate critical point. These values define the degeneracy locus $\Delta$. In §9 we show that $\Delta$ is a closed curve in the $(\overline{\sigma}_3, \overline{\sigma}_4)$-plane, which bounds an open set $R$ containing the origin. $\Delta$ is analytic except at four points on the coordinate axes where it has a cusp. At the cusp points all three critical points of the potential have merged to form a fourth order degenerate minimum. The region $R$ looks like a diamond, see figure 7.2. At each point in $R$ the potential has three critical points. At each point of the complement of the closure of $R$, the potential has only one critical point, which is a nondegenerate minimum, see §9 and §12. The set $\Sigma$ of all $(\overline{\sigma}_3, \overline{\sigma}_4, v)$, where $v$ is a critical value of the potential


$V_{\overline{\sigma}_3,\overline{\sigma}_4}$ is called the critical surface. It has a cusped edge over the smooth part of the degeneracy locus $\Delta$, a transversal self-intersection over each part of the coordinate axes which lie in $R$, and a cusped edge along two lines $\ell_\pm$ which pass through the origin. Otherwise it is a smooth analytic surface, see figure 7.13. The critical value surface is also the set of critical values of the integral map

$$\mathcal{I} : \overline{M} \to \mathbb{R}^3 : (\sigma_1, \sigma_2, \sigma_3, \sigma_4) \mapsto \big(\overline{\sigma}_3, \overline{\sigma}_4, E_{\overline{\sigma}_3,\overline{\sigma}_4}(\sigma_1, \sigma_2)\big).$$

The various components of $\mathbb{R}^3\setminus\Sigma$ correspond to different qualitative behaviors of the rolling disk. Local maxima of the potential, which correspond to hyperbolic relative equilibria, exist only over $R$. Because there is an upper bound to all such critical values, we obtain a global gyroscopic stabilization principle, namely, relative equilibria are stable (= elliptic) if their energy is larger than a fixed number, see §13.

When the parameters $(\overline{\sigma}_3, \overline{\sigma}_4)$ lie on one of the exceptional lines $\ell_\pm$, where the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ has a finite limit as $\sigma_1 \to \pm 1$, the disk falls flat in finite time. In §10 we give an asymptotic analysis of the situation when the disk nearly falls flat and then rises up again. A surprising result of this analysis is the existence of a universal constant $\Delta\chi = \big(1 + \frac{mr^2}{I_1}\big)^{1/2}\pi$ for the change in the angle of the point of contact. This asymptotic analysis is taken up once more in §14 to show that the critical surface has a downward pointing cusped edge over $\ell_\pm$ and again in §15 to investigate the spatial rotational shift for motions of the disk which remain nearly flat.

7.1

General set up

Consider a reference disk $D$ with a rim $\partial D$, which is a planar circle of radius $r$. Suppose that the total mass of the disk is $m$. Assume that the reference frame $\{e_1, e_2, e_3\}$ for the reference disk is the principal axis frame with origin at the center of mass $O$. In this frame the moment of inertia tensor $I$ of the disk is $\mathrm{diag}(I_1, I_2, I_3)$. We assume that $I_1 = I_2$. We make no further assumptions on the mass distribution. For example, if the mass is distributed uniformly, then

$$I_1 = I_2 = \tfrac14 mr^2 \quad\text{and}\quad I_3 = \tfrac12 mr^2; \tag{1}$$

and the disk is called the uniform disk; whereas, if the mass is distributed uniformly on the rim of the disk, then

$$I_1 = I_2 = \tfrac12 mr^2 \quad\text{and}\quad I_3 = mr^2 \tag{2}$$


and the disk is called a hoop.

Fig. 7.1 The reference disk (left) and the rolling disk (right). $(A, a) \in E(3)$. The points $O$ and $O'$ are the centers of the reference disk and moving disk, respectively. $P'$ is the point of contact of the moving disk with the horizontal plane.

Proof. Recall that in the principal axis frame $I_1 = M_2 + M_3$, $I_2 = M_3 + M_1$ and $I_3 = M_1 + M_2$. Here $M_i$ is the integral of the function $x_i^2$ with respect to the mass distribution. Note that $0 \le I_3 \le I_1 + I_2 = 2I_1$. If the mass distribution is uniform with density $\mu$, we use polar coordinates to obtain

$$M_1 = M_2 = \mu\int_0^{2\pi}\!\!\int_0^r (\rho\cos\theta)^2\,\rho\,d\rho\,d\theta = \tfrac14\mu r^4\int_0^{2\pi}(\cos\theta)^2\,d\theta = \tfrac14\mu\pi r^4;$$

whereas $M_3 = 0$, because the reference disk lies in the $e_1$-$e_2$ plane. Since the total mass of the disk is $m = \mu\pi r^2$, we get $I_1 = I_2 = \tfrac14 mr^2$ and $I_3 = \tfrac12 mr^2$, which proves (1). In the case of the hoop with angular mass density $\mu$ on the rim we have $m = 2\pi\mu$ and

$$M_1 = M_2 = \mu\int_0^{2\pi}(r\cos\theta)^2\,d\theta = \mu\pi r^2 = \tfrac12 mr^2$$

and $M_3 = 0$. This yields (2).
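The two integrals in the proof are easy to confirm by quadrature. The sketch below is only an illustration under stated assumptions: the function names and the grid size `n` are hypothetical, and the midpoint rule is simply a convenient choice, not something the text prescribes.

```python
import math

def second_moment_uniform_disk(mass, radius, n=400):
    """Midpoint-rule estimate of M1, the integral of x1^2 over a uniform disk."""
    mu = mass / (math.pi * radius ** 2)      # surface mass density
    d_rho = radius / n
    d_theta = 2.0 * math.pi / n
    total = 0.0
    for i in range(n):
        rho = (i + 0.5) * d_rho
        for j in range(n):
            theta = (j + 0.5) * d_theta
            total += (rho * math.cos(theta)) ** 2 * rho   # x1^2 times the area factor rho
    return mu * total * d_rho * d_theta

def second_moment_hoop(mass, radius, n=400):
    """Midpoint-rule estimate of M1 for mass spread uniformly along the rim."""
    mu = mass / (2.0 * math.pi)              # mass per unit angle, as in the proof
    d_theta = 2.0 * math.pi / n
    total = sum((radius * math.cos((j + 0.5) * d_theta)) ** 2 for j in range(n))
    return mu * total * d_theta

def principal_moments(m1, m2, m3):
    """I1 = M2 + M3, I2 = M3 + M1, I3 = M1 + M2."""
    return (m2 + m3, m3 + m1, m1 + m2)
```

For example, with `mass = 2.0` and `radius = 1.5`, passing `M1 = M2 = second_moment_uniform_disk(2.0, 1.5)` and `M3 = 0` to `principal_moments` gives values close to $\tfrac14 mr^2$ and $\tfrac12 mr^2$, in line with (1).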

The position of the rolling disk in Euclidean 3-space is given by an element $(A, a)$ of the 3-dimensional Euclidean group $E(3) \subseteq SO(3)\times\mathbb{R}^3$. As for any smooth convex surface rolling on a horizontal plane, the motion of the disk takes place on the constraint manifold $C$ given by

$$\big\{\big((A, a), (\omega, \dot a)\big) \in E(3)\times\mathbb{R}^6 \ \big|\ Ae_3 \neq \pm e_3,\ \dot a = -A(\omega\times s(u)),\ \langle a, e_3\rangle = -\langle s(u), u\rangle\big\}, \tag{3}$$


see (30) of chapter 6. The reason for the condition $\dot a = -A(\omega\times s(u))$ in (3) is that it is just the nonholonomic constraint $0 = \dot a + \dot A s$ that the disk rolls without slipping, see (28) of chapter 6. The holonomic constraint that the disk moves on a horizontal plane is given by $\langle a, e_3\rangle = -\langle s(u), u\rangle$. The motion of the rolling disk is governed by a vector field $V$ on $C$, whose integral curves satisfy

$$\begin{aligned}\dot A &= A\,\omega^\times, \quad\text{where } \omega^\times : \mathbb{R}^3\to\mathbb{R}^3 : x\mapsto \omega\times x. \text{ So } \omega^\times \in so(3) \\ \dot a &= -A(\omega\times s) \\ \dot\omega &= Y(s)^{-1}\big[(Y(s)\omega)\times\omega + m(\omega\times Ds(u)(u\times\omega))\times s - mg\,u\times s\big],\end{aligned} \tag{4}$$

see (42) of chapter 6. Here $s$ is the inverse of the Gauss map of the disk, which we calculate as follows. Parametrize the rim $\partial D$ of the reference disk by $x : S^1 \subseteq \mathbb{R}^2 \to \mathbb{R}^3 : \theta\mapsto r(\cos\theta, \sin\theta, 0)$. Since the inward unit normal $n(x)$ at $x\in\partial D$ is $-(\cos\theta, \sin\theta, 0)$, the inverse of the Gauss map is

$$s : (S^2\setminus\{\pm e_3\}) \subseteq \mathbb{R}^3 \to \mathbb{R}^3 : u = (u_1, u_2, u_3)\mapsto -\frac{r}{\sqrt{1-u_3^2}}\,(u_1, u_2, 0). \tag{5}$$
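As a small illustration of (5), the following sketch evaluates the inverse Gauss map of the disk and lets one confirm that $s(u)$ always lies on the rim and that the holonomic constraint in (3) gives the height $r\sqrt{1-u_3^2}$ for the center of mass. The function names and the sample angles are assumptions made for this example.

```python
import math

def inverse_gauss_map(u, r):
    """s(u) = -r (u1, u2, 0) / sqrt(1 - u3^2), see (5); undefined at u = +-e3."""
    u1, u2, u3 = u
    h_sq = 1.0 - u3 * u3
    if h_sq <= 0.0:
        raise ValueError("s(u) is not defined when the disk lies flat (u = +-e3)")
    h = math.sqrt(h_sq)
    return (-r * u1 / h, -r * u2 / h, 0.0)

def center_height(u, r):
    """Height <a, e3> = -<s(u), u> of the center of mass, from the constraint in (3)."""
    s = inverse_gauss_map(u, r)
    return -(s[0] * u[0] + s[1] * u[1] + s[2] * u[2])
```

For a unit vector $u$ with $u_3 = \sin\varphi$, `center_height(u, r)` evaluates to $r\cos\varphi$, so the center of mass is highest when the disk stands upright and approaches the plane as the disk approaches the flat position.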

To relate this to the basic set up of the body of revolution treated in §7 of chapter 6, we observe that the functions $\rho$ and $\zeta$ introduced in §7.1 in (130a) and (130b) of chapter 6 are $\rho(u_3) = -r(1-u_3^2)^{-1/2}$ and $\zeta(u_3) = 0$, respectively, for the rolling disk. Note that $s$ is not defined when $u = \pm e_3$, that is, when the disk is lying flat. This is the origin of the condition $Ae_3 \neq \pm e_3$ in the definition of the constraint manifold $C$. The physical reason for this is that the point of contact with the horizontal plane of a flat lying disk is not defined. This gives rise to apparent singularities in the motion of the disk.

7.2

Reduction of the E(2) × S 1 symmetry

As is shown for a smooth convex surface of revolution in §6 of chapter 6, the 2-dimensional Euclidean group $E(2)$, consisting of translations $b_{\mathrm{hor}}$ of the


horizontal plane and rotations $B$ around the vertical, and the internal rotational symmetry $S^1$, consisting of rotations $C$ about the $e_3$-axis, together form a symmetry group for the rolling disk. In this section we remove this $E(2)\times S^1$ symmetry and obtain the fully reduced equations of motion. We do this in two ways: first, by removing the $E(2)$ symmetry and then the $S^1$ symmetry, and second, by removing the $S^1$ symmetry first and then the $E(2)$ symmetry. Both of these reductions lead to the same fully reduced vector field and both are useful in carrying out the reconstruction of the full motion from the fully reduced motion.

7.2.1

First E(2), then S 1

Here we carry out the reduction of the $E(2)\times S^1$ symmetry of the rolling disk, by first reducing the $E(2)$ symmetry and then reducing the induced $S^1$ symmetry. The argument used in §6 of chapter 6 applies to the disk. It shows that the $E(2)$-reduced motions of the disk are governed by a vector field $V$ on $M = (S^2\setminus\{\pm e_3\})\times\mathbb{R}^3$ (with coordinates $(u, \omega)$), whose integral curves satisfy

$$\begin{aligned}\dot u &= u\times\omega \\ \dot\omega &= Y(s)^{-1}\big[(Y(s)\omega)\times\omega + m(\omega\times Ds(u)(u\times\omega))\times s - mg\,u\times s\big].\end{aligned} \tag{6}$$

Here $Y(s)v = Iv + m(s\times(v\times s))$ for every $v\in\mathbb{R}^3$. As usual, $g$ is the strength of the gravitational field. The $E(2)$-reduced energy of the disk is

$$\begin{aligned}E(u, \omega) &= \tfrac12\langle I\omega, \omega\rangle + \tfrac12 m\langle\omega\times s, \omega\times s\rangle - mg\langle s, u\rangle \\ &= \tfrac12 I_1(\omega_1^2 + \omega_2^2) + \tfrac12(I_3 + mr^2)\,\omega_3^2 + \tfrac12 mr^2\,\frac{(u_1\omega_2 - u_2\omega_1)^2}{1-u_3^2} + mgr\sqrt{1-u_3^2},\end{aligned} \tag{7}$$

which is an integral of $V$, see (117a) and (117b) of chapter 6. The $E(2)$-reduced vector field $V$ (6) and the $E(2)$-reduced energy $E$ (7) are invariant under the proper free $S^1$-action

$$\Psi : S^1\times M\to M : (C, (u, \omega))\mapsto (Cu, C\omega), \tag{8}$$

where $S^1 = \{C\in SO(3) \mid Ce_3 = e_3\}$. This $S^1$ symmetry arises from the rotational symmetry of the reference disk about its $e_3$-principal axis and the fact that the disk is dynamically symmetric because its moment of inertia


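tensor $I$ commutes with rotations about $e_3$. Thus the reduction of the axial $S^1$ symmetry (8) follows along the same lines as the reduction of the axial symmetry of a smooth convex surface of revolution rolling on a horizontal plane given in §7 of chapter 6. We give the details. We first construct the space $\overline{M}$ of orbits of the $S^1$-action $\Psi$ (8) using invariant theory. Note that

$$\begin{aligned}\sigma_1 &= u_3 & \sigma_2 &= u_2\omega_1 - u_1\omega_2 & \sigma_3 &= u_1\omega_1 + u_2\omega_2 \\ \sigma_4 &= \omega_3 & \sigma_5 &= \omega_1^2 + \omega_2^2 & \sigma_6 &= u_1^2 + u_2^2\end{aligned} \tag{9}$$

generate the algebra of polynomials on $M$ which are invariant under the action $\Psi$. They satisfy the relations

$$\sigma_2^2 + \sigma_3^2 = \sigma_5\sigma_6,\quad \sigma_5\ge 0,\quad \sigma_6\ge 0 \quad\text{and}\quad \sigma_6 + \sigma_1^2 = 1,\quad |\sigma_1| < 1.$$

Eliminating $\sigma_6$ gives

$$\sigma_2^2 + \sigma_3^2 = \sigma_5(1-\sigma_1^2),\quad \sigma_5\ge 0,\quad |\sigma_1| < 1, \tag{10}$$

which defines the space $\overline{M}$ of $S^1$-orbits on $M$. In other words, $\overline{M}$ is the image of the orbit map

$$\widetilde\pi : M = (S^2\setminus\{\pm e_3\})\times\mathbb{R}^3 \subseteq \mathbb{R}^3\times\mathbb{R}^3 \to \mathbb{R}^5 : (u, \omega)\mapsto (\sigma_1, \ldots, \sigma_5). \tag{11}$$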
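The invariants (9) and the relation (10) can be illustrated with a few lines of code. This is a sketch under stated assumptions: the function names are hypothetical, the formulas are copied from (9) as printed, and the rotation angle and sample vectors are arbitrary.

```python
import math

def invariants(u, w):
    """The six invariants (sigma_1, ..., sigma_6) of (9) for u on S^2 and omega = w."""
    u1, u2, u3 = u
    w1, w2, w3 = w
    return (u3,
            u2 * w1 - u1 * w2,
            u1 * w1 + u2 * w2,
            w3,
            w1 * w1 + w2 * w2,
            u1 * u1 + u2 * u2)

def rotate_about_e3(angle, x):
    """Apply the rotation C about the e3-axis through the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * x[0] - s * x[1], s * x[0] + c * x[1], x[2])

def relation_10_residual(u, w):
    """sigma_2^2 + sigma_3^2 - sigma_5 (1 - sigma_1^2), which vanishes for |u| = 1."""
    s1, s2, s3, s4, s5, s6 = invariants(u, w)
    return s2 * s2 + s3 * s3 - s5 * (1.0 - s1 * s1)
```

Evaluating `invariants` at $(u, \omega)$ and at $(Cu, C\omega)$ returns the same six numbers up to rounding, which is the invariance under the action $\Psi$ of (8), and `relation_10_residual` returns a value at rounding level for unit $u$.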

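Since $|\sigma_1| < 1$, the orbit space $\overline{M}$ is the graph of the function

$$\sigma_5 = \frac{\sigma_2^2 + \sigma_3^2}{1-\sigma_1^2},\quad |\sigma_1| < 1 \tag{12}$$

and thus is diffeomorphic to $(-1, 1)\times\mathbb{R}^3$ (with coordinates $(\sigma_1, \sigma_2, \sigma_3, \sigma_4)$). Because the $E(2)$-reduced vector field $V$ (6) is invariant under the $S^1$ action $\Psi$, it induces the fully reduced vector field $\overline{V}$ on the orbit space $\overline{M}$.

Claim 7.2.1.1. The integral curves of $\overline{V}$ on $\overline{M}$ satisfy

$$\begin{aligned}\frac{d\sigma_1}{dt} &= \sigma_2 \\ \frac{d\sigma_2}{dt} &= -\frac{\sigma_1\sigma_2^2}{1-\sigma_1^2} - \frac{I_1}{I_1 + mr^2}\,\frac{\sigma_1\sigma_3^2}{1-\sigma_1^2} + \frac{I_3 + mr^2}{I_1 + mr^2}\,\sigma_3\sigma_4 + \frac{mgr}{I_1 + mr^2}\,\sigma_1\sqrt{1-\sigma_1^2} \\ \frac{d\sigma_3}{dt} &= -\frac{I_3}{I_1}\,\sigma_2\sigma_4 \\ \frac{d\sigma_4}{dt} &= -\frac{mr^2}{I_3 + mr^2}\,\frac{\sigma_2\sigma_3}{1-\sigma_1^2}.\end{aligned} \tag{13}$$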


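Proof. We give the details of how we obtained (13). Let $u_{\mathrm{hor}} = (u_1, u_2, 0) = u - u_3 e_3$. Then the inverse of the Gauss map is $s = s(u) = -\frac{r}{\|u_{\mathrm{hor}}\|}\,u_{\mathrm{hor}}$. Hence $\langle s, s\rangle = r^2$ and $\langle s, e_3\rangle = 0$. In what follows we will use the formulæ

$$\langle I\omega\times\omega, e_3\rangle = 0,\quad \langle\omega\times s, u_{\mathrm{hor}}\rangle = 0,\quad \langle s\times u, e_3\rangle = 0,\quad \langle u\times s, u_{\mathrm{hor}}\rangle = 0,$$
$$\Big\langle\frac{ds}{dt}, e_3\Big\rangle = 0,\quad \Big\langle\frac{ds}{dt}, u_{\mathrm{hor}}\Big\rangle = 0,$$
$$\|u_{\mathrm{hor}}\|\,\Big\langle\frac{ds}{dt}, u\times e_3\Big\rangle = \|u_{\mathrm{hor}}\|\,\langle s\times\omega, u\times e_3\rangle - r\,u_3\langle e_3\times\omega, u\times e_3\rangle$$

without explicit mention. They are easily verified. Also we will need

$$\frac{d}{dt}(I\omega) = I\omega\times\omega - mr^2\frac{d\omega}{dt} + m\Big\langle\frac{d\omega}{dt}, s\Big\rangle s + m\langle s, \omega\rangle\frac{ds}{dt} + m\langle\omega, s\rangle\,\omega\times s - mg\,u\times s, \tag{14}$$

which follows from the second equation in (6). We now begin our derivation of (13). From the definition of $\sigma_2$ and the first equation in (6) we obtain

$$\sigma_2 = \langle u\times\omega, e_3\rangle = \Big\langle\frac{du}{dt}, e_3\Big\rangle = \frac{d}{dt}\langle u, e_3\rangle = \frac{d\sigma_1}{dt}. \tag{15}$$

Differentiating $I_1\sigma_3 = \langle I\omega, u_{\mathrm{hor}}\rangle$ and using (14) gives

$$\begin{aligned}I_1\frac{d\sigma_3}{dt} &= \Big\langle\frac{d(I\omega)}{dt}, u_{\mathrm{hor}}\Big\rangle + \Big\langle I\omega, \frac{du_{\mathrm{hor}}}{dt}\Big\rangle \\ &= \langle I\omega\times\omega, u_{\mathrm{hor}}\rangle - mr^2\Big\langle\frac{d\omega}{dt}, u_{\mathrm{hor}}\Big\rangle + m\Big\langle\frac{d\omega}{dt}, s\Big\rangle\langle s, u_{\mathrm{hor}}\rangle + m\langle s, \omega\rangle\Big\langle\frac{ds}{dt}, u_{\mathrm{hor}}\Big\rangle \\ &\quad + m\langle\omega, s\rangle\langle\omega\times s, u_{\mathrm{hor}}\rangle - mg\langle u\times s, u_{\mathrm{hor}}\rangle + \Big\langle I\omega, \frac{du}{dt} - \dot u_3 e_3\Big\rangle \\ &= \langle I\omega\times\omega, u_{\mathrm{hor}}\rangle + \langle I\omega, u\times\omega\rangle - \dot u_3\langle I\omega, e_3\rangle \\ &= -I_3\,\omega_3\,\dot u_3 = -I_3\,\sigma_2\sigma_4.\end{aligned} \tag{16}$$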


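Multiplying the derivative of $I_3\sigma_4 = \langle I\omega, e_3\rangle$ by $1-u_3^2$ gives

$$\begin{aligned}I_3(1-u_3^2)\frac{d\sigma_4}{dt} &= (1-u_3^2)\Big\langle\frac{d(I\omega)}{dt}, e_3\Big\rangle \\ &= (1-u_3^2)\Big[\langle I\omega\times\omega, e_3\rangle - mr^2\Big\langle\frac{d\omega}{dt}, e_3\Big\rangle + m\Big\langle\frac{d\omega}{dt}, s\Big\rangle\langle s, e_3\rangle \\ &\qquad + m\langle s, \omega\rangle\Big\langle\frac{ds}{dt}, e_3\Big\rangle + m\langle\omega, s\rangle\langle\omega\times s, e_3\rangle - mg\langle u\times s, e_3\rangle\Big] \\ &= -mr^2(1-u_3^2)\,\dot\omega_3 - mr^2\langle\omega, u_{\mathrm{hor}}\rangle\langle u_{\mathrm{hor}}\times\omega, e_3\rangle \\ &= -mr^2(1-u_3^2)\,\dot\omega_3 - mr^2\langle\omega, u_{\mathrm{hor}}\rangle\,\dot u_3.\end{aligned}$$

In other words,

$$(I_3 + mr^2)(1-\sigma_1^2)\frac{d\sigma_4}{dt} = -mr^2\,\sigma_2\sigma_3. \tag{17}$$

Multiplying the derivative of $I_1\sigma_2 = I_1\langle u\times\omega, e_3\rangle = \langle u\times I\omega, e_3\rangle$ by $1-u_3^2$ gives

$$\begin{aligned}I_1(1-u_3^2)\frac{d\sigma_2}{dt} &= (1-u_3^2)\Big[\Big\langle\frac{du}{dt}\times I\omega, e_3\Big\rangle + \Big\langle u\times\frac{d(I\omega)}{dt}, e_3\Big\rangle\Big] \\ &= \Big[\langle(u\times\omega)\times I\omega, e_3\rangle + \langle u\times(I\omega\times\omega), e_3\rangle - mr^2\Big\langle u\times\frac{d\omega}{dt}, e_3\Big\rangle + m\langle s, \omega\rangle\Big\langle u\times\frac{ds}{dt}, e_3\Big\rangle \\ &\qquad + m\langle\omega, s\rangle\langle u\times(\omega\times s), e_3\rangle - mg\langle u\times(u\times s), e_3\rangle\Big](1-u_3^2) \\ &= (1-u_3^2)\Big[-\langle I\omega, \omega\rangle u_3 + \langle\omega, u\rangle I_3\omega_3 - mr^2\frac{d}{dt}\langle u\times\omega, e_3\rangle + mr^2\Big\langle\frac{du}{dt}\times\omega, e_3\Big\rangle \\ &\qquad - m\langle s, \omega\rangle\langle s\times\omega, u\times e_3\rangle + m\langle\omega, s\rangle\langle s\times\omega, u\times e_3\rangle\Big] \\ &\quad + mr\,u_3\langle s, \omega\rangle\langle e_3\times\omega, u\times e_3\rangle\,\|u_{\mathrm{hor}}\| + mgr\,u_3\,\|e_3\times u_{\mathrm{hor}}\|^2\,\|u_{\mathrm{hor}}\|.\end{aligned}$$

Thus

$$\begin{aligned}(I_1 + mr^2)(1-\sigma_1^2)\frac{d\sigma_2}{dt} &= \big[(I_3 + mr^2)\,\omega_3\langle u_{\mathrm{hor}}, \omega\rangle - (I_1 + mr^2)\,u_3(\omega_1^2 + \omega_2^2)\big](1-\sigma_1^2) \\ &\quad + mr^2 u_3\langle u_{\mathrm{hor}}, \omega\rangle^2 + mgr\,u_3(1-u_3^2)^{3/2} \\ &= -(I_1 + mr^2)\,\sigma_1\sigma_5(1-\sigma_1^2) + (I_3 + mr^2)\,\sigma_3\sigma_4(1-\sigma_1^2) + mr^2\sigma_1\sigma_3^2 + mgr\,\sigma_1(1-\sigma_1^2)^{3/2} \\ &= -(I_1 + mr^2)\,\sigma_1\sigma_2^2 - I_1\sigma_1\sigma_3^2 + (I_3 + mr^2)\,\sigma_3\sigma_4(1-\sigma_1^2) + mgr\,\sigma_1(1-\sigma_1^2)^{3/2},\end{aligned} \tag{18}$$

since $\sigma_5(1-\sigma_1^2) = \sigma_2^2 + \sigma_3^2$. Thus equations (15)–(18) yield (13).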


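The $E(2)$-reduced energy $E$ on $S^2\times\mathbb{R}^3$ is invariant under the $S^1$ action $\Psi$ (8). Therefore $E$ (7) is a function of the invariants $\sigma_j$ (9). So

$$E = \tfrac12(I_1 + mr^2)\,\frac{\sigma_2^2}{1-\sigma_1^2} + \tfrac12 I_1\,\frac{\sigma_3^2}{1-\sigma_1^2} + \tfrac12(I_3 + mr^2)\,\sigma_4^2 + mgr\,(1-\sigma_1^2)^{1/2}. \tag{19}$$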

E is an integral of the fully reduced vector field V on M.
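That $E$ is constant along solutions of (13) can be checked numerically. The sketch below integrates (13) with a classical fourth order Runge-Kutta step and compares the energy (19) at the start and end of a short run. It is only an illustration: the function names, step size and initial condition are assumptions, the coefficient of $\sigma_2\sigma_4$ in the third equation is taken as $I_3/I_1$ in agreement with (16), and the parameter values are those of a uniform disk, see (1), with $m = r = 1$.

```python
import math

def reduced_rhs(sig, m, r, g, I1, I3):
    """Right hand side of the fully reduced equations (13)."""
    s1, s2, s3, s4 = sig
    d = 1.0 - s1 * s1                 # 1 - sigma_1^2, positive on the orbit space
    k1 = I1 + m * r * r
    k3 = I3 + m * r * r
    ds1 = s2
    ds2 = (-s1 * s2 * s2 / d
           - (I1 / k1) * s1 * s3 * s3 / d
           + (k3 / k1) * s3 * s4
           + (m * g * r / k1) * s1 * math.sqrt(d))
    ds3 = -(I3 / I1) * s2 * s4        # coefficient I3/I1, consistent with (16)
    ds4 = -(m * r * r / k3) * s2 * s3 / d
    return (ds1, ds2, ds3, ds4)

def reduced_energy(sig, m, r, g, I1, I3):
    """Fully reduced energy (19)."""
    s1, s2, s3, s4 = sig
    d = 1.0 - s1 * s1
    return (0.5 * (I1 + m * r * r) * s2 * s2 / d
            + 0.5 * I1 * s3 * s3 / d
            + 0.5 * (I3 + m * r * r) * s4 * s4
            + m * g * r * math.sqrt(d))

def rk4_step(f, y, h):
    """One classical Runge-Kutta step of size h for y' = f(y)."""
    k1 = f(y)
    k2 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k1)))
    k3 = f(tuple(a + 0.5 * h * b for a, b in zip(y, k2)))
    k4 = f(tuple(a + h * b for a, b in zip(y, k3)))
    return tuple(a + h / 6.0 * (b + 2.0 * c + 2.0 * d + e)
                 for a, b, c, d, e in zip(y, k1, k2, k3, k4))

def integrate(sig0, h, steps, m, r, g, I1, I3):
    """Return the list of states visited, starting at sig0."""
    f = lambda y: reduced_rhs(y, m, r, g, I1, I3)
    path = [tuple(sig0)]
    for _ in range(steps):
        path.append(rk4_step(f, path[-1], h))
    return path
```

A fixed step scheme of this kind is adequate only while $|\sigma_1|$ stays well away from 1; near the flat position the factors $1/(1-\sigma_1^2)$ become large and a smaller or adaptive step would be needed.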

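From the definition of the invariants $\sigma_2$ and $\sigma_3$ (9) we have

$$u_2\omega_1 - u_1\omega_2 = \sigma_2, \qquad u_1\omega_1 + u_2\omega_2 = \sigma_3,$$

which can be solved for $\omega_1$ and $\omega_2$. Together with the definition of $\sigma_6$ in (9) we obtain

$$\begin{aligned}\omega_1 &= \frac{1}{u_1^2 + u_2^2}\,(u_2\sigma_2 + u_1\sigma_3) = \frac{1}{1-\sigma_1^2}\,(u_2\sigma_2 + u_1\sigma_3) \\ \omega_2 &= \frac{1}{u_1^2 + u_2^2}\,(-u_1\sigma_2 + u_2\sigma_3) = \frac{1}{1-\sigma_1^2}\,(-u_1\sigma_2 + u_2\sigma_3) \\ \omega_3 &= \sigma_4.\end{aligned} \tag{20}$$

Therefore the $E(2)$-reduced equations of motion for $\dot\omega$, see (6), are a consequence of the fully reduced equations of motion (13), equation (20), and the first equation in (6).

7.2.2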

First S 1 , then E(2)

Here we carry out the reduction of the $E(2)\times S^1$ symmetry by first reducing the $S^1$ symmetry and then the induced $E(2)$ symmetry. Recall that the $S^1$ action on the constraint manifold $C$ (3) is given by

$$(C, (A, a, \omega, \dot a))\mapsto (AC^{-1}, a, C\omega, \dot a).$$

Therefore the space $N$ of $S^1$ orbits on $C$ is parametrized by $(v, a_{\mathrm{hor}}, \nu)$, where $v = Ae_3$, $\nu^\times = \dot A A^{-1}$, and $a_{\mathrm{hor}}$ is the projection of $a$ on the horizontal $e_1$–$e_2$ plane. Recall that if $\nu = (\nu_1, \nu_2, \nu_3)$, then

$$\nu^\times = \begin{pmatrix}0 & -\nu_3 & \nu_2 \\ \nu_3 & 0 & -\nu_1 \\ -\nu_2 & \nu_1 & 0\end{pmatrix} \in so(3).$$

In terms of $u$, $\omega$ and $A$, we have $v = A^2 u$ and $\nu = A\omega$. Consequently, $N$ is diffeomorphic to $S^2\times\mathbb{R}^2\times\mathbb{R}^3$.

Next we find the $S^1$-reduced equations of motion on $N$, which describe the motion of the disk ignoring its internal rotation. Differentiating the definition of the vector $v$ and using the definition of $\nu$ we get

$$\dot v = \dot A e_3 = \nu^\times A e_3 = \nu^\times v = \nu\times v. \tag{21}$$


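To find the equation for $\dot a$, the velocity of the center of mass, we use the nonholonomic constraint $\dot a = -A\omega\times As(u)$. But $\nu = A\omega$ and $s(u) = -\frac{r}{\sqrt{1-u_3^2}}\,(u - u_3 e_3)$. Since

$$v_3 = \langle v, e_3\rangle = \langle Ae_3, e_3\rangle = \langle e_3, A^{-1}e_3\rangle = \langle e_3, u\rangle = u_3,$$

we find that

$$As(u) = -\frac{r}{\sqrt{1-v_3^2}}\,(Au - v_3 Ae_3) = -\frac{r}{\sqrt{1-v_3^2}}\,(e_3 - v_3 v).$$

Therefore

$$\dot a = \frac{r}{\sqrt{1-v_3^2}}\,\big(\nu\times(e_3 - v_3 v)\big). \tag{22}$$

Taking the horizontal component of (22) gives

$$\dot a_{\mathrm{hor}} = \frac{r}{\sqrt{1-v_3^2}}\,\big(\nu\times(e_3 - v_3 v)\big)_{\mathrm{hor}}. \tag{23}$$

A straightforward calculation of the $S^1$-reduced equation for $\dot\nu$ is not available. Instead we proceed indirectly. First we complete the reduction of the $E(2)$ action induced on $N$ by the $E(2)$-action on the constraint manifold $C$. The $E(2)$-action on $C$ is given by

$$((B, b_{\mathrm{hor}}), (A, a, \omega, \dot a))\mapsto (BA, Ba + b_{\mathrm{hor}}, \omega, B\dot a);$$

while the induced $E(2)$ action on $N$ is given by

$$((B, b_{\mathrm{hor}}), (v, a_{\mathrm{hor}}, \nu))\mapsto (Bv, Ba_{\mathrm{hor}} + b_{\mathrm{hor}}, B\nu).$$

Since the $E(2)$ action $((B, b_{\mathrm{hor}}), a_{\mathrm{hor}})\mapsto Ba_{\mathrm{hor}} + b_{\mathrm{hor}}$ on $\mathbb{R}^2$ is transitive, the space of $E(2)$-orbits on $N$ is diffeomorphic to the fully reduced space $\overline{M}$ (10). The $E(2)$-invariant vector field on $N$ induced by the $S^1$ invariant vector field on $C$, which governs the motion of the disk, is the same as the $E(2)\times S^1$-reduced vector field $\overline{V}$ (13) on the fully reduced space $\overline{M}$. We can express the invariants $\sigma_j$, $1\le j\le 4$ (9) in terms of the vectors $v$ and $\nu$ as follows

$$\begin{aligned}\sigma_1 &= u_3 = v_3 \\ \sigma_2 &= \dot\sigma_1 = \dot v_3 = (\nu\times v)_3 = \nu_1 v_2 - \nu_2 v_1 \\ \sigma_3 + \sigma_1\sigma_4 &= \langle u, \omega\rangle = \langle A^{-2}v, A^{-1}\nu\rangle = \langle A^{-1}v, \nu\rangle = \langle e_3, \nu\rangle = \nu_3 \\ \sigma_4 &= \langle\omega, e_3\rangle = \langle A^{-1}\nu, e_3\rangle = \langle\nu, v\rangle.\end{aligned} \tag{24}$$

Solving the equations

$$\begin{aligned}v_2\nu_1 - v_1\nu_2 &= \sigma_2 \\ v_1\nu_1 + v_2\nu_2 &= \sigma_4 - v_3\nu_3 = \sigma_4 - \sigma_1(\sigma_3 + \sigma_1\sigma_4) = (1-\sigma_1^2)\sigma_4 - \sigma_1\sigma_3\end{aligned}$$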


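for $\nu_1$ and $\nu_2$, using $v_1^2 + v_2^2 = 1 - v_3^2 = 1-\sigma_1^2$, combined with the third equation in (24) gives

$$\begin{aligned}\nu_1 &= \Big(\sigma_4 - \frac{\sigma_1\sigma_3}{1-\sigma_1^2}\Big)v_1 + \frac{\sigma_2}{1-\sigma_1^2}\,v_2 \\ \nu_2 &= -\frac{\sigma_2}{1-\sigma_1^2}\,v_1 + \Big(\sigma_4 - \frac{\sigma_1\sigma_3}{1-\sigma_1^2}\Big)v_2 \\ \nu_3 &= \sigma_3 + \sigma_1\sigma_4.\end{aligned} \tag{25}$$

Therefore the equation for $\dot\nu$ of the $S^1$-reduced vector field on $N$ is a consequence of the fully reduced equations of motion (13), equation (25), and $\dot v = \nu\times v$.

7.3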

Reconstruction

In this section we reconstruct the actual motion of the disk in Euclidean 3-space from its fully reduced motion.

7.3.1

The E(2)-reduced flow

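First we reconstruct the flow of the $E(2)$-reduced vector field $V$ (6) on $M = (S^2\setminus\{\pm e_3\})\times\mathbb{R}^3$ from the flow of the fully reduced vector field $\overline{V}$ (13) on $\overline{M}$ (10). We can reconstruct the vector $\omega(t)$ from the vector $u(t)$ and the fully reduced motion $(\sigma_1(t), \ldots, \sigma_4(t))$ because

$$\begin{aligned}\omega_1 &= \frac{1}{1-\sigma_1^2}\,(u_2\sigma_2 + u_1\sigma_3) \\ \omega_2 &= \frac{1}{1-\sigma_1^2}\,(-u_1\sigma_2 + u_2\sigma_3) \\ \omega_3 &= \sigma_4,\end{aligned} \tag{26}$$

see (20). In order to reconstruct the vector $u(t) = A(t)^{-1}e_3$, we use the equation $\dot u = u\times\omega$. This leads to

$$\begin{aligned}u_1\dot u_2 - u_2\dot u_1 &= u_1(u_3\omega_1 - u_1\omega_3) - u_2(u_2\omega_3 - u_3\omega_2) \\ &= (u_1\omega_1 + u_2\omega_2)u_3 - (u_1^2 + u_2^2)\omega_3 = \sigma_3\sigma_1 - (1-\sigma_1^2)\sigma_4,\end{aligned}$$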


using the definition (9) of the invariants. It is now convenient to introduce spherical coordinates $(\varphi, \psi)$ on the unit sphere $S^2$ so that

$$u = (\cos\varphi\cos\psi, \cos\varphi\sin\psi, \sin\varphi). \tag{27}$$

Here $\sin\varphi = \sigma_1$. Thus $\varphi$ is the angle between the oriented plane of the rim of the disk and the vertical axis. Note that the condition $Ae_3 \neq \pm e_3$, or equivalently $|\sigma_1| < 1$, implies that $-\pi/2 < \varphi < \pi/2$. Therefore $\cos\varphi > 0$. Here the cosine of the angle $\varphi$ is positive because the upward vertical direction is on the positive side of the plane of the rim. Thus the angle $\psi\in\mathbb{R}/2\pi\mathbb{Z}$ is well defined and depends analytically on $u$. Using (27) we obtain

$$\begin{aligned}u_1\dot u_2 - u_2\dot u_1 &= \cos\varphi\cos\psi\,\big[-(\sin\varphi)\dot\varphi\sin\psi + \cos\varphi\cos\psi\,\dot\psi\big] - \cos\varphi\sin\psi\,\big[-(\sin\varphi)\dot\varphi\cos\psi - \cos\varphi(\sin\psi)\dot\psi\big] \\ &= (\cos^2\varphi)\,\dot\psi = (1-\sigma_1^2)\,\dot\psi.\end{aligned}$$

Therefore

$$\frac{d\psi}{dt} = \frac{\sigma_1\sigma_3}{1-\sigma_1^2} - \sigma_4. \tag{28}$$

So we obtain $\psi(t)$ from $\sigma_j(t)$ by integration. Once $\psi(t)$ has been determined, we can read off $u(t)$ from (27), since $\varphi(t)$ is uniquely determined by $\sin^{-1}\sigma_1(t)$. Therefore $\omega(t)$ follows from (26) and taking its horizontal component. This completes the reconstruction of the $E(2)$-reduced motion.

7.3.2

The full motion

In this subsection we reconstruct the full motion $t\mapsto (A(t), a(t))\in E(3)$ of the rolling disk from its $E(2)$-reduced motion $t\mapsto (u(t), \omega(t))\in M = (S^2\setminus\{\pm e_3\})\times\mathbb{R}^3$. The rotational motion $t\mapsto A(t)$ of the disk is governed by

$$\dot A = A\,\omega^\times. \tag{29}$$
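Equation (29) has no closed form solution for general $\omega(t)$, so in practice it is integrated numerically. One common approach, sketched here as an assumption and not as the method of the text, freezes $\omega$ over each step and multiplies by the exact rotation $\exp(h\,\omega^\times)$ given by the Rodrigues formula, which keeps $A$ orthogonal up to rounding. All function names and the midpoint sampling of $\omega$ are choices made for this illustration.

```python
import math

def hat(w):
    """The skew matrix w^x with (w^x) x = w x x."""
    return [[0.0, -w[2], w[1]],
            [w[2], 0.0, -w[0]],
            [-w[1], w[0], 0.0]]

def matmul(a, b):
    """Product of two 3x3 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def exp_hat(w, h):
    """Rodrigues formula for exp(h w^x), the rotation through angle h|w| about w."""
    norm = math.sqrt(w[0] ** 2 + w[1] ** 2 + w[2] ** 2)
    theta = h * norm
    W = hat(w)
    W2 = matmul(W, W)
    if theta < 1e-12:
        a, b = h, 0.5 * h * h          # leading terms of the series for tiny angles
    else:
        a = math.sin(theta) / norm
        b = (1.0 - math.cos(theta)) / (norm * norm)
    return [[(1.0 if i == j else 0.0) + a * W[i][j] + b * W2[i][j]
             for j in range(3)] for i in range(3)]

def integrate_attitude(omega_of_t, h, steps):
    """Approximate the solution of dA/dt = A omega^x with A(0) = identity."""
    A = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
    for k in range(steps):
        w = omega_of_t((k + 0.5) * h)  # omega sampled at the midpoint of the step
        A = matmul(A, exp_hat(w, h))   # right multiplication, matching (29)
    return A
```

For constant $\omega = (0, 0, c)$ the scheme reproduces the rotation about $e_3$ through the angle $ct$ exactly; for time dependent $\omega$ it is an approximation whose accuracy depends on the step size $h$.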

Since $t\mapsto\omega(t)$ is known from (26), integrating (29) with the initial condition $A(0) = I$ gives $t\mapsto A_1(t)$. Therefore $A(t) = A(0)A_1(t)$. To reconstruct the translational motion $t\mapsto a(t)$ of the center of mass of the disk, we parametrize points on the rim $\partial D$ of the reference disk $D$ by $x(\theta) = r(\cos\theta, \sin\theta, 0)$, where $\theta\in\mathbb{R}/2\pi\mathbb{Z}$. The point of contact $p(t)$ of the rolling disk at time $t$ occurs at the value of $\theta$ where the function


$\theta\mapsto\langle A(t)x(\theta) + a(t), e_3\rangle$ assumes its minimum value. This value is 0, because the point of contact lies on the horizontal plane. But

$$\langle A(t)x(\theta) + a(t), e_3\rangle = \langle x(\theta), u(t)\rangle + a_3(t) \tag{30}$$

attains its minimum value when $x(\theta)$ lies in the same direction as $-u_{\mathrm{hor}}(t)$. From the fact that $u_3(t) = \sigma_1(t)$, it follows that $\|u_{\mathrm{hor}}(t)\| = \sqrt{1-\sigma_1^2(t)}$. Therefore

$$x(\theta) = -r(1-\sigma_1^2(t))^{-1/2}\,u_{\mathrm{hor}}(t) = -r(\cos\psi(t), \sin\psi(t), 0), \tag{31}$$

using (27). Because the left hand side of (30) is 0 and $x(\theta)$ is given by (31), it follows from (30) that the height of the center of mass of the rolling disk above the horizontal plane is

$$a_3(t) = r(1-\sigma_1^2(t))^{1/2}. \tag{32}$$
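The minimization behind (31) and (32) can be made concrete by a brute force search over the rim. The sketch below is an illustration only: the function name, the grid size and the sample angles are assumptions.

```python
import math

def lowest_rim_point(u, r, n=3600):
    """Grid search for the theta minimizing <x(theta), u>, with x(theta) = r(cos, sin, 0)."""
    best_theta, best_value = 0.0, float("inf")
    for k in range(n):
        theta = 2.0 * math.pi * k / n
        value = r * (math.cos(theta) * u[0] + math.sin(theta) * u[1])
        if value < best_value:
            best_theta, best_value = theta, value
    return best_theta, best_value
```

With $u$ given by (27), the search returns $\theta \approx \psi + \pi$, that is $x(\theta) \approx -r(\cos\psi, \sin\psi, 0)$ as in (31), and the minimum value is approximately $-r\cos\varphi$, so that $a_3 = r\cos\varphi = r(1-\sigma_1^2)^{1/2}$ as in (32).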

For fixed $\theta$ the condition that the disk rolls without slipping implies that the derivative of the motion of the point of contact $p(t) = A(t)x(\theta) + a(t)$ is zero. Here $x(\theta)$ is given by (31). In other words,

$$\frac{da}{dt} = r(1-\sigma_1^2)^{-1/2}\,\frac{dA}{dt}(u_{\mathrm{hor}}) = r(1-\sigma_1^2)^{-1/2}A(\omega\times u_{\mathrm{hor}}) = rA(t)\big(\omega(t)\times(\cos\psi(t), \sin\psi(t), 0)\big). \tag{33}$$

Therefore, given the rotational motion $t\mapsto A(t)$ of the disk, the angular velocity $t\mapsto\omega(t)$ and the angle $t\mapsto\psi(t)$ (see (28)), we obtain the horizontal component $t\mapsto a_{\mathrm{hor}}(t)$ of the center of mass of the rolling disk by integrating (33). This completes the reconstruction of the actual motion of the disk.

7.3.3

The S 1 -reduced flow

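In this subsection we reconstruct the motion $t\mapsto (v(t), a(t), \nu(t))$ of the $S^1$-invariant vector field on $N$ from a motion of the fully reduced vector field on $\overline{M}$. The vector $\nu(t)$ can be reconstructed from the vector $v(t)$ and the $\sigma_j(t)$ of the fully reduced motion because

$$\begin{aligned}\nu_1 &= \Big(\sigma_4 - \frac{\sigma_1\sigma_3}{1-\sigma_1^2}\Big)v_1 + \frac{\sigma_2}{1-\sigma_1^2}\,v_2 \\ \nu_2 &= -\frac{\sigma_2}{1-\sigma_1^2}\,v_1 + \Big(\sigma_4 - \frac{\sigma_1\sigma_3}{1-\sigma_1^2}\Big)v_2 \\ \nu_3 &= \sigma_3 + \sigma_1\sigma_4,\end{aligned} \tag{34}$$

see (25).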
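Formula (34) inverts the relations (24), and this can be confirmed with a short round trip. The sketch below is illustrative; the function name and the sample values are assumptions.

```python
import math

def nu_from_invariants(sig, v):
    """Reconstruct nu from (sigma_1, ..., sigma_4) and the unit vector v via (34)."""
    s1, s2, s3, s4 = sig
    d = 1.0 - s1 * s1
    a = s4 - s1 * s3 / d      # common coefficient of v1 in nu1 and of v2 in nu2
    b = s2 / d
    return (a * v[0] + b * v[1],
            -b * v[0] + a * v[1],
            s3 + s1 * s4)
```

Given $v$ as in (35) with $v_3 = \sigma_1$, the vector returned satisfies $\nu_1 v_2 - \nu_2 v_1 = \sigma_2$, $\langle\nu, v\rangle = \sigma_4$ and $\nu_3 = \sigma_3 + \sigma_1\sigma_4$, which are the second, fourth and third lines of (24).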


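To reconstruct the vector v(t), we use spherical coordinates (ϕ, χ) on S² to write

\[
v = (\cos\varphi\cos\chi,\ \cos\varphi\sin\chi,\ \sin\varphi). \tag{35}
\]

Here ϕ is the same angle as used in (27), because v3 = σ1 = u3 = sin ϕ. Using the same argument as for the angle ψ, we see that the angle χ depends analytically on v. An argument similar to the one used to prove (28) gives

\[
\begin{aligned}
(1-\sigma_1^2)\,\dot\chi = \cos^2\varphi\,\dot\chi &= v_1\dot v_2 - v_2\dot v_1 = (v\times\dot v)_3 \\
&= (v\times(\nu\times v))_3 = \langle v,v\rangle\,\nu_3 - \langle v,\nu\rangle\, v_3 \\
&= \sigma_3 + \sigma_1\sigma_4 - \sigma_4\sigma_1, \quad \text{using (24)} \\
&= \sigma_3 .
\end{aligned}
\]

Therefore

\[
\frac{d\chi}{dt} = \frac{\sigma_3}{1-\sigma_1^2}. \tag{36}
\]

To reconstruct a(t) from v(t) and ν(t) we use (22). This gives

\[
\frac{da}{dt} = r(1-\sigma_1^2)^{-1/2}\,\big(\nu\times(e_3-\sigma_1 v)\big),
\]

since v3 = σ1. In other words

\[
\frac{da}{dt} = r\big(\sigma_4\sin\chi - \dot\varphi\cos\varphi\cos\chi,\ -\sigma_4\cos\chi - \dot\varphi\cos\varphi\sin\chi,\ -\dot\varphi\sin\varphi\big). \tag{37}
\]

Equation (37) follows because

\[
\nu = \Big( \big(\sigma_4 - \tfrac{\sigma_1\sigma_3}{1-\sigma_1^2}\big)\cos\varphi\cos\chi + \tfrac{\dot\sigma_1}{1-\sigma_1^2}\cos\varphi\sin\chi,\ \ 
-\tfrac{\dot\sigma_1}{1-\sigma_1^2}\cos\varphi\cos\chi + \big(\sigma_4 - \tfrac{\sigma_1\sigma_3}{1-\sigma_1^2}\big)\cos\varphi\sin\chi,\ \ 
\sigma_3+\sigma_1\sigma_4 \Big)
\]

and

\[
e_3 - \sigma_1 v = (-\sigma_1\cos\varphi\cos\chi,\ -\sigma_1\cos\varphi\sin\chi,\ 1-\sigma_1\sin\varphi).
\]

The two equations above were obtained using σ2 = σ̇1 = cos ϕ ϕ̇ and equations (34) and (35).

In terms of (v(t), a(t), ν(t)) the point of contact of the rolling disk with the horizontal plane is

\[
\begin{aligned}
p(t) &= A(t)x(\theta) + a(t) = -r(1-\sigma_1^2)^{-1/2} A(t)u_{hor}(t) + a(t), \quad \text{using (31)} \\
&= -r(1-\sigma_1^2)^{-1/2}\,(e_3 - v_3 v(t)) + a(t), \quad \text{because } Au_{hor} = A(u-u_3e_3) = e_3 - v_3 v \\
&= -a_3(t)\,e_3 + r(1-\sigma_1^2)^{-1/2}\sigma_1 v_{hor}(t) + a(t), \\
&\qquad \text{since } e_3 - v_3 v = e_3 - v_3(v_{hor}+v_3e_3) = (1-v_3^2)e_3 - v_3 v_{hor} \text{ and } a_3 = r\sqrt{1-\sigma_1^2} \\
&= r\,\frac{\sigma_1}{\sqrt{1-\sigma_1^2}}\, v_{hor}(t) + a_{hor}(t).
\end{aligned} \tag{38}
\]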
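As a hedged numerical cross-check of the step from (34) and (35) to (37), the following sketch evaluates r(1 − σ1²)^(-1/2) ν × (e3 − σ1 v) directly and compares it with the closed form (37). The function names and the sample values are illustrative assumptions, not part of the text.

```python
import math

def cross(a, b):
    """Cross product of two 3-vectors given as tuples."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def a_dot_from_cross(r, phi, chi, phi_dot, s3, s4):
    """r (1 - s1^2)^(-1/2) * nu x (e3 - s1 v), with nu taken from (34)."""
    s1 = math.sin(phi)
    s2 = math.cos(phi) * phi_dot          # sigma2 = d(sigma1)/dt
    k = 1.0 - s1 * s1
    v = (math.cos(phi) * math.cos(chi), math.cos(phi) * math.sin(chi), s1)
    nu = ((s4 - s1 * s3 / k) * v[0] + (s2 / k) * v[1],
          -(s2 / k) * v[0] + (s4 - s1 * s3 / k) * v[1],
          s3 + s1 * s4)
    w = (-s1 * v[0], -s1 * v[1], 1.0 - s1 * v[2])
    scale = r / math.sqrt(k)
    return tuple(scale * x for x in cross(nu, w))

def a_dot_closed_form(r, phi, chi, phi_dot, s4):
    """Right hand side of (37)."""
    return (r * (s4 * math.sin(chi) - phi_dot * math.cos(phi) * math.cos(chi)),
            r * (-s4 * math.cos(chi) - phi_dot * math.cos(phi) * math.sin(chi)),
            -r * phi_dot * math.sin(phi))
```

Note that σ3 drops out of the result, in agreement with (37).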


7.3.4

Geometry of the E(2) × S¹ reduction map

In this subsection we show that the map

\[
\pi : C \to M : (A, a, \omega, \dot a) \mapsto (\sigma_1, \sigma_2, \sigma_3, \sigma_4), \tag{39}
\]

which carries out the reduction of the E(2) × S¹ symmetry of the rolling disk, defines a trivial fibration with fiber E(2) × S¹. Note that π is the orbit map of the E(2) × S¹ action on C. First we prove

Lemma 7.3.4.2. If A ∈ SO(3) with Ae3 ≠ ±e3, then there are unique rotations B and C about the e3-axis through angles χ − π and ψ, respectively, and a unique angle −π/2 < ϕ < π/2 such that

\[
A = B \begin{pmatrix} \cos(\varphi+\pi/2) & 0 & -\sin(\varphi+\pi/2) \\ 0 & 1 & 0 \\ \sin(\varphi+\pi/2) & 0 & \cos(\varphi+\pi/2) \end{pmatrix} C^{-1}.
\]

Proof. Let Ā = B⁻¹AC. We have

\[
\tilde v = \bar A e_3 = B^{-1}ACe_3 = B^{-1}Ae_3 = B^{-1}v,
\]

where v = (cos ϕ cos χ, cos ϕ sin χ, sin ϕ). Since Ae3 ≠ ±e3, it follows that Āe3 ≠ ±e3. Thus there is a unique rotation B about the e3-axis such that (B⁻¹v)hor is a negative multiple of e1. Consequently,

\[
\bar A e_3 = B^{-1}v = -(\cos\varphi)e_1 + (\sin\varphi)e_3. \tag{40}
\]

B is a rotation through an angle χ − π, since

\[
B\tilde v = \begin{pmatrix} \cos(\chi-\pi) & -\sin(\chi-\pi) & 0 \\ \sin(\chi-\pi) & \cos(\chi-\pi) & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} -\cos\varphi \\ 0 \\ \sin\varphi \end{pmatrix}
= \begin{pmatrix} \cos\varphi\cos\chi \\ \cos\varphi\sin\chi \\ \sin\varphi \end{pmatrix} = v.
\]

Similarly

\[
\tilde u = \bar A^{-1} e_3 = C^{-1}A^{-1}Be_3 = C^{-1}A^{-1}e_3 = C^{-1}u,
\]

where u = (cos ϕ cos ψ, cos ϕ sin ψ, sin ϕ). Thus there is a unique rotation C about the e3-axis such that (C⁻¹u)hor is a positive multiple of e1. Consequently,

\[
\bar A^{-1} e_3 = (\cos\varphi)e_1 + (\sin\varphi)e_3. \tag{41}
\]

C is a rotation through ψ since

\[
C\tilde u = \begin{pmatrix} \cos\psi & -\sin\psi & 0 \\ \sin\psi & \cos\psi & 0 \\ 0 & 0 & 1 \end{pmatrix}
\begin{pmatrix} \cos\varphi \\ 0 \\ \sin\varphi \end{pmatrix}
= \begin{pmatrix} \cos\varphi\cos\psi \\ \cos\varphi\sin\psi \\ \sin\varphi \end{pmatrix} = u.
\]

Applying Ā to both sides of (41) and then using (40) gives

\[
e_3 = (\cos\varphi)\bar A e_1 + (\sin\varphi)\bar A e_3 = (\cos\varphi)\bar A e_1 - (\sin\varphi\cos\varphi)e_1 + (\sin^2\varphi)e_3,
\]


that is, (cos ϕ)Āe1 = cos ϕ [(sin ϕ)e1 + (cos ϕ)e3]. But cos ϕ > 0. So

\[
\bar A e_1 = (\sin\varphi)e_1 + (\cos\varphi)e_3. \tag{42}
\]

From (42) and (40) it follows that the linear map Ā leaves the 2-plane Π = span{e1, e3} invariant. Since Ā ∈ SO(3), we deduce that Ā leaves Π⊥ = span{e2} invariant. From the fact that det Ā = 1 and det Ā|Π = 1 it follows that Āe2 = e2. Therefore the matrix of Ā with respect to the basis {e1, e2, e3} is

\[
\begin{pmatrix} \sin\varphi & 0 & -\cos\varphi \\ 0 & 1 & 0 \\ \cos\varphi & 0 & \sin\varphi \end{pmatrix}
= \begin{pmatrix} \cos(\varphi+\pi/2) & 0 & -\sin(\varphi+\pi/2) \\ 0 & 1 & 0 \\ \sin(\varphi+\pi/2) & 0 & \cos(\varphi+\pi/2) \end{pmatrix}.
\]

This proves the lemma. Note that the angles ϕ + π/2, χ − π, and ψ are Euler angles for the rotation A.
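The factorization in the lemma can be illustrated numerically. The sketch below is only an illustration under stated assumptions: it takes the middle factor in the explicit form with entries sin ϕ and cos ϕ that is forced by (40), (41) and (42), builds A = B Ā C⁻¹ from angles (ϕ, χ, ψ), and lets one check that Ae3 = v and A⁻¹e3 = u. All function names are hypothetical.

```python
import math

def rot_z(angle):
    """Rotation about the e3-axis through the given angle."""
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def a_bar(phi):
    """Middle factor with A_bar e3 = (-cos phi, 0, sin phi), A_bar e1 = (sin phi, 0, cos phi)."""
    c, s = math.cos(phi), math.sin(phi)
    return [[s, 0.0, -c], [0.0, 1.0, 0.0], [c, 0.0, s]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(p):
    return [[p[j][i] for j in range(3)] for i in range(3)]

def compose(phi, chi, psi):
    """A = B * A_bar * C^{-1} with B = rot_z(chi - pi) and C = rot_z(psi)."""
    b = rot_z(chi - math.pi)
    c_inv = transpose(rot_z(psi))       # inverse of a rotation is its transpose
    return matmul(matmul(b, a_bar(phi)), c_inv)

def v_of(a):
    """v = A e3, the third column of A."""
    return [a[i][2] for i in range(3)]

def u_of(a):
    """u = A^{-1} e3 = A^T e3, the third row of A."""
    return list(a[2])
```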

Corollary 7.3.4.3. Let SO(3)* = SO(3) \ {A ∈ SO(3) | Ae3 = ±e3}. For x ∈ R³ let SO(3)_x = {A ∈ SO(3) | Ax = x}. The map

SO(3)* → SO(3)_{e3} × SO(3)_{e2} × SO(3)_{e3} : A ↦ (B, Ā, C)

is smooth. Here A = BĀC⁻¹.

Proof. Just observe that the vectors v = (cos ϕ cos χ, cos ϕ sin χ, sin ϕ) and u = (cos ϕ cos ψ, cos ϕ sin ψ, sin ϕ) depend smoothly on A ∈ SO(3)*. Consequently, the Euler angles χ − π, ϕ + π/2, and ψ do also. Thus the rotations B and C about the e3-axis through the angles χ − π and ψ and the rotation Ā about the e2-axis through ϕ + π/2 depend smoothly on A ∈ SO(3)*.

To proceed further we note that the constraint manifold C (3) is parametrized by (A, ω, ahor), where A ∈ SO(3)*, ahor = (a1, a2, 0) with (a1, a2) ∈ R² and ω ∈ R³. To verify this, observe that A determines the vector u = A⁻¹e3, which is not a multiple of e3 since Ae3 ≠ ±e3. Therefore u3 ≠ ±1. Consequently, s(u) = −(r/‖uhor‖) uhor is defined. The third component of the vector a = (a1, a2, a3) is determined by the holonomic constraint a3 = −⟨s(u), u⟩ = r‖uhor‖. The velocity of the center of mass of the rolling disk is defined by the nonholonomic constraint ȧ = −Aω × As(u), which is determined by A, ω, and s(u). Thus C is parametrized by A, ω, and ahor.


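Consider the smooth map

\[
\Sigma : M \to C : \sigma = (\sigma_1,\sigma_2,\sigma_3,\sigma_4) \mapsto
\left( \begin{pmatrix} \sigma_1 & 0 & -\sqrt{1-\sigma_1^2} \\ 0 & 1 & 0 \\ \sqrt{1-\sigma_1^2} & 0 & \sigma_1 \end{pmatrix},\ 0,\ 
\begin{pmatrix} \dfrac{\sigma_3}{\sqrt{1-\sigma_1^2}} \\[2mm] -\dfrac{\sigma_2}{\sqrt{1-\sigma_1^2}} \\[2mm] \sigma_4 \end{pmatrix} \right)
= (\bar A, 0, \tilde\omega). \tag{43}
\]

Lemma 7.3.4.4. The map Σ is a smooth section of the bundle π : C → M (39).

Proof. Now ũ = Ā⁻¹e3 = (√(1 − σ1²), 0, σ1) and ω̃ = (σ3/√(1 − σ1²), −σ2/√(1 − σ1²), σ4). But (ũ, ω̃) = (C⁻¹u, C⁻¹ω) for a rotation C about the e3-axis through an angle ψ. To verify this write

\[
\tilde\omega = \Big( \cos\varphi\,\frac{\sigma_3}{1-\sigma_1^2},\ -\cos\varphi\,\frac{\sigma_2}{1-\sigma_1^2},\ \sigma_4 \Big).
\]

Then

\[
\begin{aligned}
C\tilde\omega &= \begin{pmatrix}
\cos\varphi\cos\psi\,\dfrac{\sigma_3}{1-\sigma_1^2} + \cos\varphi\sin\psi\,\dfrac{\sigma_2}{1-\sigma_1^2} \\[2mm]
\cos\varphi\sin\psi\,\dfrac{\sigma_3}{1-\sigma_1^2} - \cos\varphi\cos\psi\,\dfrac{\sigma_2}{1-\sigma_1^2} \\[2mm]
\sigma_4 \end{pmatrix} \\
&= \begin{pmatrix}
u_1\dfrac{\sigma_3}{1-\sigma_1^2} + u_2\dfrac{\sigma_2}{1-\sigma_1^2} \\[2mm]
u_2\dfrac{\sigma_3}{1-\sigma_1^2} - u_1\dfrac{\sigma_2}{1-\sigma_1^2} \\[2mm]
\sigma_4 \end{pmatrix}, \quad \text{using } u = (\cos\varphi\cos\psi, \cos\varphi\sin\psi, \sin\varphi) = C\tilde u \\
&= \omega, \quad \text{using (26).}
\end{aligned}
\]

Therefore ũ3 = u3 = σ1 = sin ϕ and ω̃3 = ω3 = σ4. Moreover, ũ2ω̃1 − ũ1ω̃2 = u2ω1 − u1ω2 = σ2 and ũ1ω̃1 + ũ2ω̃2 = u1ω1 + u2ω2 = σ3. So π ∘ Σ(σ) = σ for every σ ∈ M.

Next consider the map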

\[
\Xi : (E(2)\times S^1)\times M \to C : (((B, b_{hor}), C), \sigma) \mapsto ((B, b_{hor}), C)\cdot\Sigma(\sigma). \tag{44}
\]

Then Ξ(((B, bhor), C), σ) = ((B, bhor), C) · (Ā, 0, ω̃) = (BĀC⁻¹, bhor, Cω̃) = (A, bhor, ω).

Proposition 7.3.4.5. Ξ is a trivialization of the bundle π : C → M (39).

Proof. Because π ∘ Ξ(((B, bhor), C), σ) = π(((B, bhor), C) · Σ(σ)) = π ∘ Σ(σ) = σ = π2(((B, bhor), C), σ), where π2 is projection on the second factor, that is,

π2 : (E(2) × S¹) × M → M : (((B, bhor), C), σ) ↦ σ,

it suffices to show that Ξ is a diffeomorphism. Clearly Ξ is smooth. Consider the map

ξ : C → (E(2) × S¹) × M : (A, bhor, ω) ↦ (((B, bhor), C), π(Ā, 0, ω̃)).

Using corollary 7.3.4.3 it follows that ξ is smooth. Next we verify that Ξ ∘ ξ is the identity on C and ξ ∘ Ξ is the identity on (E(2) × S¹) × M. We compute

\[
\begin{aligned}
\Xi\circ\xi(A, b_{hor}, \omega) &= \Xi(((B, b_{hor}), C), \pi(\bar A, 0, \tilde\omega)) = ((B, b_{hor}), C)\cdot\Sigma\circ\pi(\bar A, 0, \tilde\omega) \\
&= ((B, b_{hor}), C)\cdot(\bar A, 0, \tilde\omega) = (A, b_{hor}, \omega)
\end{aligned}
\]

and

\[
\begin{aligned}
\xi\circ\Xi(((B, b_{hor}), C), \sigma) &= \xi(((B, b_{hor}), C)\cdot\Sigma(\sigma)) \\
&= (((B, b_{hor}), C), \pi\circ\Sigma(\sigma)) = (((B, b_{hor}), C), \sigma).
\end{aligned}
\]

This shows that ξ is the inverse of Ξ and completes the proof of the proposition.
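The section property π ∘ Σ = id and the invariance of π under the group action can be spot-checked numerically. The following sketch is an illustration under explicit assumptions: the invariants are taken in the form used in the proof of lemma 7.3.4.4 (σ1 = u3, σ2 = u2ω1 − u1ω2, σ3 = u1ω1 + u2ω2, σ4 = ω3 with u = A⁻¹e3), the third component of ω̃ is taken to be σ4 as the identity ω̃3 = σ4 requires, and the translational part of E(2) is ignored because it does not enter the invariants. All names are hypothetical.

```python
import math

def rot_z(angle):
    c, s = math.cos(angle), math.sin(angle)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]

def matmul(p, q):
    return [[sum(p[i][k] * q[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]

def transpose(p):
    return [[p[j][i] for j in range(3)] for i in range(3)]

def matvec(p, x):
    return tuple(sum(p[i][k] * x[k] for k in range(3)) for i in range(3))

def section(s1, s2, s3, s4):
    """Candidate section Sigma: returns (A_bar, omega_tilde); a_hor = 0 is implicit."""
    q = math.sqrt(1.0 - s1 * s1)
    a_bar = [[s1, 0.0, -q], [0.0, 1.0, 0.0], [q, 0.0, s1]]
    return a_bar, (s3 / q, -s2 / q, s4)

def invariants(a, omega):
    """(sigma1, ..., sigma4) computed from u = A^{-1} e3 (third row of A) and omega."""
    u = a[2]
    return (u[2],
            u[1] * omega[0] - u[0] * omega[1],
            u[0] * omega[0] + u[1] * omega[1],
            omega[2])

def act(b_angle, c_angle, a, omega):
    """Rotational part of the action: (A, omega) -> (B A C^{-1}, C omega)."""
    b, c = rot_z(b_angle), rot_z(c_angle)
    return matmul(matmul(b, a), transpose(c)), matvec(c, omega)
```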

7.4

Relative equilibria

In this section we study the set of equilibrium points of the fully reduced vector field V (13). These points are the E(2) × S 1 -relative equilibria of the rolling disk.

7.4.1

The manifold of relative equilibria

Because σ2 is a factor in the right hand side of the last two equations in (13), it follows that (σ1, σ2, σ3, σ4) is an equilibrium point of V if and only if σ2 = dσ1/dt = 0 and dσ2/dt = d²σ1/dt² = 0. Substituting σ2 = 0 into the second equation in (13) and using dσ2/dt = 0 gives

\[
-I_1\,\frac{\sigma_1}{1-\sigma_1^2}\,\sigma_3^2 + (I_3+mr^2)\,\sigma_3\sigma_4 + mgr\,\sigma_1(1-\sigma_1^2)^{1/2} = 0. \tag{45}
\]

When σ3 ≠ 0, equation (45) defines σ4 as an analytic function of (σ1, σ3) ∈ (−1, 1) × (R \ {0}). If σ3 = 0, then the set of solutions of (45) is the σ4-axis. At the point (σ1⁰, σ3⁰, σ4⁰) = (0, 0, σ4⁰) the differential of the left hand side of (45) is Ω = (I3 + mr²)σ4⁰ dσ3 + mgr dσ1, which is nonzero. This shows that the set N of equilibrium points of the fully reduced vector field V is an analytic surface in (σ1, σ2, σ3, σ4)-space. Because N ⊆ {σ2 = 0}, it has codimension 2. In fact N is connected. This follows because at points of the σ4-axis, the tangent space to N, which is {σ2 = 0} ∩ ker Ω, contains a vector with nonzero third component. Thus the intersection of the σ4-axis with N is connected. So N is connected, being a graph over the half planes {(σ1, 0, σ3) | σ3 ≥ 0} and {(σ1, 0, σ3) | σ3 ≤ 0}.
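For σ3 ≠ 0 the equilibrium condition (45) is linear in σ4, so the graph description of N can be made concrete. The sketch below solves (45) for σ4 and evaluates the residual of (45); the function names and the numerical parameter values are illustrative assumptions (the moments of inertia used in the check are those of a thin uniform disk).

```python
import math

def residual_45(s1, s3, s4, m, r, g, i1, i3):
    """Left hand side of the equilibrium condition (45)."""
    k = 1.0 - s1 * s1
    return (-i1 * s1 * s3 * s3 / k
            + (i3 + m * r * r) * s3 * s4
            + m * g * r * s1 * math.sqrt(k))

def sigma4_on_equilibria(s1, s3, m, r, g, i1, i3):
    """Solve (45) for sigma4; only valid when sigma3 != 0."""
    if s3 == 0.0:
        raise ValueError("sigma3 = 0: sigma4 is not determined by (45)")
    k = 1.0 - s1 * s1
    return (i1 * s1 * s3 * s3 / k - m * g * r * s1 * math.sqrt(k)) / ((i3 + m * r * r) * s3)
```

On the σ4-axis (σ1 = σ3 = 0) the residual vanishes for every σ4, which matches the statement that this axis belongs to N.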


7.4.2

One parameter groups

Here we take a closer look at the relative equilibria of the rolling disk. Let (A, a) be an element of the 3-dimensional Euclidean group E(3), which defines the configuration space of the unconstrained disk. Let S 1 = SO(2) be the subgroup of SO(3) consisting of rotations, which leave the vertical basis vector e3 of R3 fixed. Let E(2) be the 2-dimensional Euclidean subgroup of E(3) whose elements are (B, bhor), where B ∈ SO(2) and bhor = (b1 , b2 , 0). The group E(2) × S 1 acts on E(3) by

((B, bhor), C)•(A, a) = (B, bhor) · (A, a) · (C, 0) = (BAC −1 , a + bhor ).

Here · is multiplication in E(3). This action induces the E(2) × S 1 action Φ : (E(2) × S 1 ) × C → C :

(((B, bhor), C), (A, ahor , ω)) 7→ (BAC −1 , Bahor + bhor , Cω)

(46)

on the constraint manifold C (3). The E(2) × S¹-action Φ is free and proper. Therefore from proposition 2.1.1 of chapter 4 we know that to every equilibrium point σ⁰ = (σ1⁰, σ2⁰, σ3⁰, σ4⁰) of the fully reduced vector field V on the fully reduced space M there is a motion t ↦ c(t) of the disk in the fiber π⁻¹(σ⁰) ⊆ C of the E(2) × S¹ reduction map π (39). Also, there is a unique element ξ in the Lie algebra e(2) × R of E(2) × S¹ such that c(t) = Φ_{exp tξ}(c(0)). The choice of c(0) ∈ π⁻¹(σ⁰) depends on dim C − dim(E(2) × S¹) = 8 − 4 = 4 parameters: two fix the initial horizontal component of the center of mass of the disk; one fixes the amount of initial rotation of the disk about the vertical; and the fourth fixes the amount of initial internal rotation of the disk about its symmetry axis.

Let

\[
E_3 = (e_3)^{\times} = \begin{pmatrix} 0 & -1 & 0 \\ 1 & 0 & 0 \\ 0 & 0 & 0 \end{pmatrix}.
\]

Then every element ξ ∈ e(2) × R is of the form ((ηE3, βhor), ζE3), where (η, ζ) ∈ R² and βhor = (β1, β2, 0). Physically, η is the spatial angular velocity of the disk about the vertical; βhor is the horizontal translational velocity of the center of mass of the moving disk; and ζ is the intrinsic angular velocity of the disk about its symmetry axis. Since λ = (ηE3, βhor) ∈ e(2), the corresponding one parameter subgroup of E(2) is t ↦ exp tλ = g(t), where g(t) satisfies the differential equation g′(t) = λ · g(t) with g(0) being the identity element of E(2). In other words, g(t) = (B(t), bhor(t)) satisfies

B′(t) = ηE3 B(t) and b′hor(t) = ηE3 bhor(t) + βhor


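with B(0) = I and bhor(0) = 0. Consequently, B(t) = e^{tηE3} and

\[
b_{hor}(t) = \Big( \int_0^t e^{(t-s)\eta E_3}\, ds \Big)\beta_{hor}
= \begin{cases} \dfrac{1}{\eta}\,(1 - e^{t\eta E_3})\,E_3\,\beta_{hor}, & \text{if } \eta \neq 0 \\[2mm] t\,\beta_{hor}, & \text{if } \eta = 0. \end{cases} \tag{47}
\]

Therefore the action of exp tξ on E(3) is given by

\[
\begin{aligned}
(A(t), a(t)) &= \exp t\xi\cdot(A^0, a^0) = (B(t), b_{hor}(t))\cdot(A^0, a^0)\cdot(e^{-t\zeta E_3}, 0) \\
&= (B(t)A^0e^{-t\zeta E_3},\ B(t)a^0 + b_{hor}(t)).
\end{aligned} \tag{48}
\]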
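The closed form (47) can be checked against the differential equation it is supposed to solve. The sketch below works in the horizontal plane only, where E3 acts as (x, y) ↦ (−y, x) and e^{tηE3} is the rotation through the angle tη; the function names and sample values are illustrative assumptions.

```python
import math

def e3_hat(p):
    """E3 restricted to the horizontal plane: (x, y) -> (-y, x)."""
    return (-p[1], p[0])

def rotate(angle, p):
    """exp(angle * E3) restricted to the horizontal plane."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * p[0] - s * p[1], s * p[0] + c * p[1])

def b_hor(t, eta, beta):
    """Closed form (47) for the translational part of exp(t * lambda)."""
    if eta == 0.0:
        return (t * beta[0], t * beta[1])
    q = e3_hat(beta)
    rq = rotate(t * eta, q)
    return ((q[0] - rq[0]) / eta, (q[1] - rq[1]) / eta)

def ode_defect(t, eta, beta, h=1e-6):
    """Central-difference estimate of b'(t) - (eta * E3 b(t) + beta)."""
    bp, bm, b0 = b_hor(t + h, eta, beta), b_hor(t - h, eta, beta), b_hor(t, eta, beta)
    rhs = e3_hat(b0)
    return tuple((bp[i] - bm[i]) / (2 * h) - (eta * rhs[i] + beta[i]) for i in range(2))
```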

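Thus we have proved

Lemma 7.4.2.6. For ξ = ((ηE3, βhor), ζE3) ∈ e(2) × R, the curve t ↦ c(t) = Φ_{exp tξ}(c(0)) on the constraint manifold C, which reconstructs the motion of the rolling disk corresponding to a relative equilibrium determined by ξ ∈ e(2) × R, is given by

\[
c(0) = (A^0, a^0_{hor}, \omega^0) \mapsto c(t) = (A(t), a_{hor}(t), \omega(t))
= \begin{cases}
\Big( e^{t\eta E_3}A^0e^{-t\zeta E_3},\ e^{t\eta E_3}a^0_{hor} + \dfrac{1}{\eta}(1 - e^{t\eta E_3})E_3\beta_{hor},\ e^{t\zeta E_3}\omega^0 \Big), & \text{if } \eta \neq 0 \\[2mm]
\big( A^0e^{-t\zeta E_3},\ a^0_{hor} + t\beta_{hor},\ e^{t\zeta E_3}\omega^0 \big), & \text{if } \eta = 0.
\end{cases} \tag{49}
\]

Here the last component is Cω with C = e^{tζE3}, as prescribed by the action (46).

7.4.3

Angular speeds in terms of invariants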

In this subsection we calculate the angular speeds η and ζ of the disk about the vertical direction and its axis of symmetry, respectively, in terms of the components of the equilibrium point σ⁰ of the fully reduced vector field. Since

\[
\begin{aligned}
u(t) = A(t)^{-1}e_3 &= e^{t\zeta E_3}(A^0)^{-1}e^{-t\eta E_3}e_3, \quad \text{using (49)} \\
&= e^{t\zeta E_3}u^0, \quad \text{since } e^{-t\eta E_3} \text{ is a rotation about } e_3
\end{aligned}
\]

we get

\[
(\cos t\zeta\, u_1^0 - \sin t\zeta\, u_2^0,\ \sin t\zeta\, u_1^0 + \cos t\zeta\, u_2^0,\ u_3^0) = u(t)
= (\cos\varphi^0\cos\psi(t),\ \cos\varphi^0\sin\psi(t),\ \sin\varphi^0), \tag{50}
\]

using (27). Setting t = 0 in the above equation gives u⁰ = (cos ϕ⁰ cos ψ⁰, cos ϕ⁰ sin ψ⁰, u3⁰). Therefore the first component of (50) reads

\[
\cos\varphi^0\cos\psi(t) = \cos\varphi^0(\cos t\zeta\cos\psi^0 - \sin t\zeta\sin\psi^0) = \cos\varphi^0\cos(t\zeta + \psi^0). \tag{51}
\]


Since |σ1⁰| < 1, it follows that cos ϕ⁰ > 0 and that ψ(t) is uniquely determined. Therefore

\[
\psi(t) = t\zeta + \psi^0. \tag{52}
\]

Hence

\[
\zeta = \frac{d\psi}{dt}(0) = \frac{\sigma_1^0\sigma_3^0}{1-(\sigma_1^0)^2} - \sigma_4^0, \tag{53}
\]

using (28). Since v(t) = A(t)e3 = e^{tηE3}v⁰, we get

\[
(\cos t\eta\, v_1^0 - \sin t\eta\, v_2^0,\ \sin t\eta\, v_1^0 + \cos t\eta\, v_2^0,\ v_3^0) = (\cos\varphi^0\cos\chi(t),\ \cos\varphi^0\sin\chi(t),\ \sin\varphi^0).
\]

An argument similar to the one given in the preceding paragraph shows that

\[
\chi(t) = t\eta + \chi^0. \tag{54}
\]

Hence

\[
\eta = \frac{d\chi}{dt}(0) = \frac{\sigma_3^0}{1-(\sigma_1^0)^2}, \tag{55}
\]

using (36). Solving (53) and (55) gives

\[
\sigma_3^0 = \eta\,(1-(\sigma_1^0)^2) \quad\text{and}\quad \sigma_4^0 = \eta\,\sigma_1^0 - \zeta. \tag{56}
\]
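The passage between the invariants (σ1⁰, σ3⁰, σ4⁰) and the angular speeds (η, ζ) in (53), (55) and (56) is a pair of mutually inverse maps. A minimal sketch, with hypothetical function names, makes the round trip explicit:

```python
def speeds_from_invariants(s1, s3, s4):
    """(eta, zeta) from (sigma1, sigma3, sigma4), following (55) and (53)."""
    k = 1.0 - s1 * s1
    eta = s3 / k
    zeta = s1 * s3 / k - s4
    return eta, zeta

def invariants_from_speeds(s1, eta, zeta):
    """(sigma3, sigma4) from (sigma1, eta, zeta), following (56)."""
    return eta * (1.0 - s1 * s1), eta * s1 - zeta
```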

Substituting both expressions in (56) into (45) shows that the one parameter group (49) generated by ξ = ((η, βhor), ζ) ∈ e(2) × R is a relative equilibrium if and only if mgr σ10 (1 − (σ10 )2 )−1/2 = η 2 (I1 − I3 − mr2 )σ10 + ηζ (I3 + mr2 ).

(57)

The above equation expresses the balance between the force of gravity and the centrifugal force acting on the rolling disk. Observe that the mapping

(−1, 1) × R² → (−1, 1) × R² : (σ1⁰, η, ζ) ↦ (σ1⁰, η(1 − (σ1⁰)²), ησ1⁰ − ζ)

is a diffeomorphism from the solution set of (57) onto the manifold N of relative equilibria defined by (45). Therefore the solution set of (57) is a connected 2-dimensional analytic submanifold of (−1, 1) × R². Because the left hand side of (57) tends to ±∞ as σ1⁰ → ±1, equation (57) has at least one solution for every (η, ζ) ∈ R². If I3 + mr² ≥ I1 (which holds for the uniform disk and the hoop), then (57) has a unique solution for every (η, ζ) ∈ R², which depends analytically on (η, ζ). If I3 + mr² < I1, then (57) can have at most three solutions. When σ1⁰ → ±1, which means that the disk is almost flat, then η and ζ cannot both remain bounded. If η ≠ 0, then (57) defines ζ as an analytic function of (σ1⁰, η) ∈ (−1, 1) × (R \ {0}). Explicitly,

\[
\zeta = \Big[ mgr\,\sigma_1^0(1-(\sigma_1^0)^2)^{-1/2} - \eta^2(I_1 - I_3 - mr^2)\,\sigma_1^0 \Big]\,\big(\eta(I_3+mr^2)\big)^{-1}. \tag{58}
\]

If η = 0, then σ1⁰ = 0 is the only solution of (57); moreover, the value of ζ is arbitrary.

7.4.4

Motion of the relative equilibria

In this subsection we discuss how the disk is moving in space when its motion is a relative equilibrium corresponding to ξ = ((η, βhor), ζ) ∈ e(2) × R. Since

\[
\begin{aligned}
\sigma_1(t) = u_3(t) = (A(t)^{-1}e_3)_3 &= (e^{t\zeta E_3}(A^0)^{-1}e^{-t\eta E_3}e_3)_3, \quad \text{using (49)} \\
&= (e^{t\zeta E_3}u^0)_3 = u_3^0 = \sigma_1^0 = \sin\varphi^0,
\end{aligned} \tag{59}
\]

the angle ϕ(t) of inclination of the oriented plane of the rim of the disk with the horizontal plane is constant, namely ϕ⁰. From equation (32) it follows that

\[
a_3(t) = r\sqrt{1-(\sigma_1^0)^2}, \tag{60}
\]

that is, the height of the center of mass of the rolling disk above the horizontal plane is constant. For a relative equilibrium, we have

\[
v(t) = A(t)e_3 = e^{t\eta E_3}A^0e^{-t\zeta E_3}e_3 = e^{t\eta E_3}v^0, \tag{61}
\]

which implies that v3(t) = v3⁰ = σ1⁰ and vhor(t) = e^{tηE3}v⁰hor. Also from the definition of ν we get

\[
\begin{aligned}
\nu^{\times} = \dot A A^{-1} &= (\eta E_3A - \zeta AE_3)A^{-1}, \quad \text{by differentiating } A(t) = e^{t\eta E_3}A^0e^{-t\zeta E_3} \\
&= \eta E_3 - \zeta AE_3A^{-1} = \eta\,(e_3)^{\times} - \zeta\,(Ae_3)^{\times} = (\eta e_3 - \zeta v)^{\times}.
\end{aligned}
\]


Therefore

\[
\begin{aligned}
\nu(t) &= \eta e_3 - \zeta v(t) && (62) \\
&= e^{t\eta E_3}(\eta e_3 - \zeta v^0), \quad \text{using (61)} \\
&= e^{t\eta E_3}\nu^0. && (63)
\end{aligned}
\]

Using (61) and (63), equation (23) for the derivative of the horizontal component of the position of the center of mass of the rolling disk becomes

\[
\frac{da_{hor}}{dt} = r(1-(\sigma_1^0)^2)^{-1/2}\, e^{t\eta E_3}\big(\nu^0\times(e_3 - \sigma_1^0v^0)\big)_{hor}. \tag{64}
\]

Evaluating (62) at t = 0 gives ν⁰ = ηe3 − ζv⁰. So

\[
\nu^0\times(e_3 - \sigma_1^0v^0) = (\eta e_3 - \zeta v^0)\times(e_3 - \sigma_1^0v^0) = (\zeta - \eta\sigma_1^0)\,e_3\times v^0 = -\sigma_4^0\, e_3\times v^0, \quad \text{using (56).}
\]

Therefore we may rewrite (64) as

\[
\frac{da_{hor}}{dt} = -r(1-(\sigma_1^0)^2)^{-1/2}\,\sigma_4^0\, e^{t\eta E_3}E_3v^0_{hor},
\]

which integrated gives

\[
a_{hor}(t) - a^0_{hor} = \begin{cases}
r(1-(\sigma_1^0)^2)^{-1/2}\,\dfrac{\sigma_4^0}{\eta}\,(1 - e^{t\eta E_3})\,v^0_{hor}, & \text{if } \eta \neq 0 \\[2mm]
-r(1-(\sigma_1^0)^2)^{-1/2}\,\sigma_4^0\,(e_3\times v^0)_{hor}\, t, & \text{if } \eta = 0.
\end{cases} \tag{65}
\]

Now suppose that η ≠ 0. From equation (65) it follows that ahor(t) traces out a circle Ca in the horizontal plane with uniform angular speed η having its center at

\[
C^0 = a^0_{hor} + r(1-(\sigma_1^0)^2)^{-1/2}\,\frac{\sigma_4^0}{\eta}\, v^0_{hor} \tag{66}
\]

and of radius

\[
R_a = r(1-(\sigma_1^0)^2)^{-1/2}\,\Big|\frac{\sigma_4^0}{\eta}\Big|\,\|v^0_{hor}\| = r\,\Big|\frac{\sigma_4^0}{\eta}\Big|, \tag{67}
\]

since v⁰ = v⁰hor + σ1⁰e3. We now find the motion p(t) of the point of contact. From (38) we have

\[
\begin{aligned}
p(t) &= r(1-(\sigma_1^0)^2)^{-1/2}\sigma_1^0\, v_{hor}(t) + a_{hor}(t) \\
&= r(1-(\sigma_1^0)^2)^{-1/2}\Big(\sigma_1^0 - \frac{\sigma_4^0}{\eta}\Big)e^{t\eta E_3}v^0_{hor}
+ r(1-(\sigma_1^0)^2)^{-1/2}\,\frac{\sigma_4^0}{\eta}\, v^0_{hor} + a^0_{hor},
\end{aligned} \tag{68}
\]


using vhor(t) = e^{tηE3}v⁰hor and (65). Therefore the point of contact traces out a circle Cp in the horizontal plane at uniform angular speed η whose center is at C⁰ (66). Its radius is

\[
R_p = r(1-(\sigma_1^0)^2)^{-1/2}\,\Big|\frac{\zeta}{\eta}\Big|\,\|v^0_{hor}\| = r\,\Big|\frac{\zeta}{\eta}\Big|, \tag{69}
\]

because by (56) we have

\[
\sigma_1^0 - \frac{\sigma_4^0}{\eta} = \sigma_1^0 - \frac{\eta\sigma_1^0 - \zeta}{\eta} = \frac{\zeta}{\eta}.
\]

Lemma 7.4.4.7. Suppose that η ≠ 0 and ζ ≠ 0. If I3 + mr² ≥ 2I1, then the disk leans inward toward the center of the circle Cp traced out by its point of contact. If I3 + mr² < 2I1, then the disk leans outward from the center of Cp if and only if σ1⁰ ≠ 0, (1 − (σ1⁰)²)^{1/2} η² > 2mgr/(2I1 − I3 − mr²), and (σ1⁰, η) satisfies equation (57).

Proof. The disk leans inward if and only if Rp > Ra, that is, if and only if

\[
r^2\,\frac{\zeta^2}{\eta^2} > r^2\Big(\sigma_1^0 - \frac{\zeta}{\eta}\Big)^2. \tag{70}
\]

After multiplying both sides by η², equation (70) is equivalent to

\[
(\sigma_1^0)^2\eta^2 - 2\sigma_1^0\zeta\eta < 0. \tag{71}
\]

Using (57) to eliminate ζη, we find that (71) is equivalent to

\[
0 > (\sigma_1^0)^2\Big[ (2I_1 - I_3 - mr^2)\eta^2 - 2mgr(1-(\sigma_1^0)^2)^{-1/2} \Big]. \tag{72}
\]

If (2I1 − I3 − mr²) < 0 and σ1⁰ ≠ 0, then (72) holds for every η. If σ1⁰ = 0 and η ≠ 0, then (57) implies that ζ = 0. This contradicts the hypothesis of the lemma. The disk leans outward if and only if Rp < Ra, that is, if and only if

\[
0 < (\sigma_1^0)^2\Big[ (2I_1 - I_3 - mr^2)\eta^2 - 2mgr(1-(\sigma_1^0)^2)^{-1/2} \Big]. \tag{73}
\]

Clearly the above inequality does not hold if (2I1 − I3 − mr²) ≤ 0. On the other hand, if (2I1 − I3 − mr²) > 0 then (73) holds for an arbitrary nonzero σ1⁰ ∈ (−1, 1). This can be arranged by choosing |η| large enough and then making sure that (57) is satisfied by letting ζ be defined by (58). For the hoop 2I1 = I3 < I3 + mr². So when the hoop is moving in a relative equilibrium, which is a circle, it leans inward toward the center of this circle. If the mass of the reference disk is concentrated in two points which lie opposite to each other on the e3-axis and at a distance greater than ½ r from the origin, then I3 + mr² < 2I1. So the disk leans outward, when moving in a relative equilibrium which is a circle.
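The criterion of the lemma can be illustrated with numbers. The sketch below computes ζ from (58), the radii Ra and Rp from (67) and (69), and the residual of (57). The function names are hypothetical, and the parameter values in the check (a hoop with I1 = mr²/2 and I3 = mr², unit mass and radius) are assumptions chosen only for illustration.

```python
import math

def zeta_from_58(s1, eta, m, r, g, i1, i3):
    """zeta as a function of (sigma1, eta) according to (58); requires eta != 0."""
    s = 1.0 / math.sqrt(1.0 - s1 * s1)
    return (m * g * r * s1 * s - eta * eta * (i1 - i3 - m * r * r) * s1) / (eta * (i3 + m * r * r))

def residual_57(s1, eta, zeta, m, r, g, i1, i3):
    """Left hand side minus right hand side of the balance equation (57)."""
    s = 1.0 / math.sqrt(1.0 - s1 * s1)
    return m * g * r * s1 * s - (eta * eta * (i1 - i3 - m * r * r) * s1 + eta * zeta * (i3 + m * r * r))

def radii(s1, eta, zeta, r):
    """(R_a, R_p): radii of the circles of the center of mass (67) and the contact point (69)."""
    s4 = eta * s1 - zeta            # from (56)
    return r * abs(s4 / eta), r * abs(zeta / eta)

def leans_inward(s1, eta, zeta, r):
    """True when R_p > R_a, which the proof identifies with leaning inward."""
    r_a, r_p = radii(s1, eta, zeta, r)
    return r_p > r_a
```

For the hoop the test values give Rp > Ra, in line with the statement that the hoop leans inward.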


7.4.5

Nearly flat relative equilibria

Interesting relative equilibria occur when the rim of the disk approaches a horizontal position and its fully reduced energy remains bounded. From equations (56) and (57) it follows that

\[
I_1\eta^2\sigma_1 - (I_3+mr^2)\,\eta\sigma_4 = mgr\,\sigma_1(1-\sigma_1^2)^{-1/2}. \tag{74}
\]

In view of (19), the boundedness of the fully reduced energy E implies that σ4 remains bounded as σ1 → ±1. Then (74) implies that

\[
\eta^2 \sim \frac{mgr}{I_1}\,(1-\sigma_1^2)^{-1/2} \to \infty \quad \text{as } \sigma_1 \to \pm 1. \tag{75}
\]

Moreover, equation (56) implies that ζ/η = σ1 − σ4/η → ±1 and from (69) it follows that Rp → r. Therefore the point of contact p(t) traces out a circle of radius approximately r at an angular speed which is approximately (mgr/I1)^{1/2}(1 − σ1²)^{−1/4}. This speed tends to ∞ as σ1 → ±1. At the same time the center of mass of the disk traces out a circle of radius equal to r|σ1 − ζ/η| = r|σ4/η|, see (67) and (56), which is small of order (1 − σ1²)^{1/4} as σ1 → ±1. Conversely, for any given σ3 and σ4, there is a real solution η of (57). If σ4 is bounded as σ1 → ±1, then (75) holds. Therefore σ3² = η²(1 − σ1²)² is of order (1 − σ1²)^{3/2}. From (19) with σ2 = 0 it follows that the fully reduced energy E remains bounded as σ1 → ±1. This shows that relative equilibria always exist with bounded fully reduced energy and almost horizontal rim. This completes our discussion of the relative equilibria of the rolling disk.

7.5

A potential function on an interval

7.5.1

Chaplygin's equations

If in the fully reduced equations of motion (13) we substitute the first equation dσ1/dt = σ2 into the third and fourth equations and then divide both sides of the resulting equations by dσ1/dt, we obtain Chaplygin's equations

\[
\begin{aligned}
\frac{d\sigma_3}{d\sigma_1} &= -\frac{I_3}{I_1}\,\sigma_4 \\
\frac{d\sigma_4}{d\sigma_1} &= -\frac{mr^2}{I_3+mr^2}\,\frac{1}{1-\sigma_1^2}\,\sigma_3 .
\end{aligned} \tag{76}
\]


Observe that for the rolling disk, Chaplygin's system has the following discrete symmetries

\[
(\sigma_1, \sigma_3, \sigma_4) \to (-\sigma_1, -\sigma_3, \sigma_4) \to (-\sigma_1, \sigma_3, -\sigma_4) \to (\sigma_1, -\sigma_3, -\sigma_4). \tag{77}
\]

Chaplygin's equations are a homogeneous linear system of differential equations for (σ3, σ4) as functions of σ1, whose coefficients are analytic functions of σ1 on (−1, 1). From the theory for such equations, see [27], it follows that for each (σ̄3, σ̄4) ∈ R² there is a unique solution

\[
(-1, 1) \to \mathbb{R}^2 : \sigma_1 \mapsto \big( \sigma_3(\sigma_1; \bar\sigma_3, \bar\sigma_4),\ \sigma_4(\sigma_1; \bar\sigma_3, \bar\sigma_4) \big) \tag{78}
\]

such that σ3(0; σ̄3, σ̄4) = σ̄3 and σ4(0; σ̄3, σ̄4) = σ̄4. Moreover the solution (78) is an analytic function of σ1 and depends linearly on the initial condition (σ̄3, σ̄4) ∈ R².

Geometrically this means that the solution curves of Chaplygin's system define an analytic fibration of (σ1, σ3, σ4)-space (−1, 1) × R². A trivialization of this fibration is given by the inverse of the analytic diffeomorphism

\[
\psi : (-1, 1)\times\mathbb{R}^2 \to (-1, 1)\times\mathbb{R}^2 :
\big( \sigma_1, (\bar\sigma_3, \bar\sigma_4) \big) \mapsto \big( \sigma_1,\ \sigma_3(\sigma_1; \bar\sigma_3, \bar\sigma_4),\ \sigma_4(\sigma_1; \bar\sigma_3, \bar\sigma_4) \big), \tag{79}
\]

which is the flow of Chaplygin's system. If

\[
\pi_2 : (-1, 1)\times\mathbb{R}^2 \to \mathbb{R}^2 : (\sigma_1, (\bar\sigma_3, \bar\sigma_4)) \mapsto (\bar\sigma_3, \bar\sigma_4) \tag{80}
\]

denotes the projection onto the second factor, then π = π2 ∘ ψ⁻¹ : (−1, 1) × R² → R² is a projection map, whose fiber over (σ̄3, σ̄4) is a solution curve of Chaplygin's equations with initial condition (σ̄3, σ̄4). The fact that the fully reduced equations of motion imply the Chaplygin system means that under the projection map

P : (−1, 1) × R³ → (−1, 1) × R² : (σ1, σ2, σ3, σ4) ↦ (σ1, σ3, σ4)

each solution of the fully reduced system is mapped onto a fiber of the map π. Equivalently, this means that the components of the R²-valued function π ∘ P are constants of motion of the fully reduced system. We shall write π ∘ P as (σ̄3, σ̄4), where σ̄3 and σ̄4 are viewed as analytic functions on the fully reduced phase space (−1, 1) × R³.
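Since Chaplygin's equations have no elementary closed-form solution in general, the map ψ of (79) is most easily explored numerically. The sketch below integrates the system with a classical fourth order Runge-Kutta scheme starting at σ1 = 0, taking the coefficients as −I3/I1 and −mr²/((I3 + mr²)(1 − σ1²)); this form of the coefficients, the function names, the step count and the sample parameters are assumptions made for illustration. The checks exercise the linear dependence on the initial condition and the first discrete symmetry in (77).

```python
def chaplygin_rhs(s1, y, i1, i3, m, r):
    """Right hand side of Chaplygin's equations for y = (sigma3, sigma4)."""
    s3, s4 = y
    return (-(i3 / i1) * s4,
            -(m * r * r / (i3 + m * r * r)) * s3 / (1.0 - s1 * s1))

def chaplygin_flow(target, y0, i1, i3, m, r, steps=2000):
    """Integrate from sigma1 = 0 to sigma1 = target (|target| < 1) with RK4."""
    h = target / steps
    s, y = 0.0, tuple(y0)
    for _ in range(steps):
        k1 = chaplygin_rhs(s, y, i1, i3, m, r)
        k2 = chaplygin_rhs(s + h / 2, (y[0] + h / 2 * k1[0], y[1] + h / 2 * k1[1]), i1, i3, m, r)
        k3 = chaplygin_rhs(s + h / 2, (y[0] + h / 2 * k2[0], y[1] + h / 2 * k2[1]), i1, i3, m, r)
        k4 = chaplygin_rhs(s + h, (y[0] + h * k3[0], y[1] + h * k3[1]), i1, i3, m, r)
        y = (y[0] + h / 6 * (k1[0] + 2 * k2[0] + 2 * k3[0] + k4[0]),
             y[1] + h / 6 * (k1[1] + 2 * k2[1] + 2 * k3[1] + k4[1]))
        s += h
    return y
```

A call such as `chaplygin_flow(0.6, (1.0, 0.0), 0.25, 0.5, 1.0, 1.0)` then approximates (σ3, σ4) at σ1 = 0.6 for the initial condition (σ̄3, σ̄4) = (1, 0).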


7.5.2

A conservative Newtonian system

In this subsection we use the integrals found in the preceding subsection to obtain a conservative one degree of freedom Newtonian system on (−1, 1) × R, whose energy is the sum of kinetic and potential energies. If we fix the values of the constants of motion σ̄3 and σ̄4 of the fully reduced system on (−1, 1) × R³, then the corresponding fiber of the map π ∘ P is an analytic surface Σ in the fully reduced space, which is parametrized by (σ1, σ2) ∈ (−1, 1) × R. The surface Σ is invariant under the flow of the fully reduced system. Another constant of motion of the fully reduced system is the fully reduced energy. In terms of the parameters (σ1, σ2), the restriction to Σ of the fully reduced energy (19) is

\[
E = T(\sigma_1, \sigma_2) + V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1), \tag{81}
\]

where

\[
T(\sigma_1, \sigma_2) = \tfrac12 (I_1 + mr^2)\,\frac{\sigma_2^2}{1-\sigma_1^2} \tag{82}
\]

plays the role of kinetic energy and

\[
V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) = \tfrac12 I_1\,\frac{1}{1-\sigma_1^2}\,\sigma_3(\sigma_1; \bar\sigma_3, \bar\sigma_4)^2
+ \tfrac12 (I_3+mr^2)\,\sigma_4(\sigma_1; \bar\sigma_3, \bar\sigma_4)^2 + mgr(1-\sigma_1^2)^{1/2} \tag{83}
\]

is viewed as potential energy. The potential function V_{σ̄3,σ̄4} is invariant under the discrete symmetries (77) with the variables (σ3, σ4) there being replaced by (σ̄3, σ̄4). The kinetic energy (82) simplifies if we parametrize the one dimensional configuration space (−1, 1) by arc length. This amounts to introducing the angle ϕ by setting σ1 = sin ϕ with −π/2 < ϕ < π/2, which is the same as the angle in (27). Then (1 − σ1²)^{−1/2} σ̇1 = (cos ϕ)⁻¹(cos ϕ)ϕ̇ = ϕ̇. So

\[
E = \tfrac12 (I_1 + mr^2)\,\dot\varphi^2 + V(\varphi), \tag{84}
\]

where V(ϕ) = V_{σ̄3,σ̄4}(sin ϕ). Conservation of energy dE/dt = 0 is equivalent to the conservative Newtonian system

\[
M\,\frac{d^2\varphi}{dt^2} + \frac{\partial V}{\partial\varphi}(\varphi) = 0 \tag{85}
\]

with inertial mass M = I1 + mr² and force equal to −∂V/∂ϕ (ϕ), provided dϕ/dt ≠ 0. Because the analytic vector field X_{σ̄3,σ̄4} on (−π/2, π/2) × R associated to the second order equation (85) has integral curves governed by

\[
\begin{aligned}
\frac{d\varphi}{dt} &= \frac{1}{M}\,\theta \\
\frac{d\theta}{dt} &= -\frac{\partial V}{\partial\varphi}(\varphi),
\end{aligned} \tag{86}
\]

it follows that (85) holds when dϕ/dt = 0.

Recall that if t ↦ ϕ(t) is a solution of the Newtonian system (85), then σ2(t) = dσ1/dt = ϕ̇(t) cos ϕ(t) is known. Moreover, we see that substituting t ↦ σ1(t) = sin ϕ(t) into a solution of Chaplygin's system (76) gives

t ↦ (σ3(σ1(t); σ̄3, σ̄4), σ4(σ1(t); σ̄3, σ̄4)),

which is a solution of the third and fourth equations in the fully reduced system (13). Thus, given a solution of (85), we know a solution t ↦ (σ1(t), σ2(t), σ3(t), σ4(t)) of the fully reduced equations of motion. From this solution we can reconstruct the full motion of the disk in space as in §3.2.

7.5.3

Qualitative behavior

In this subsection we discuss the qualitative behavior of the integral curves of the vector field X_{σ̄3,σ̄4} on (−π/2, π/2) × R (86), or equivalently the second order differential equation (85) on the configuration space (−π/2, π/2). We now recall the classical description of these motions. Choose a value E of the energy (84) which is larger than the infimum of the potential function V on (−π/2, π/2). Let (ϕ−, ϕ+) ⊆ [−π/2, π/2] be a connected component of the open set {ϕ ∈ (−π/2, π/2) | V(ϕ) < E}. Note that V(ϕ−) = E and V(ϕ+) = E if ϕ± both lie in (−π/2, π/2). As long as ϕ ∈ (ϕ−, ϕ+) we have

\[
\frac{d\varphi}{dt} = \pm\Big( \frac{2}{M}\,(E - V(\varphi)) \Big)^{1/2}.
\]

Separating the variables and integrating yields

\[
t_1 - t_0 = \pm\int_{\varphi(t_0)}^{\varphi(t_1)} \Big( \frac{2}{M}\,(E - V(\varphi)) \Big)^{-1/2} d\varphi, \tag{87}
\]

which is a formula for the time needed to go from ϕ(t0) to ϕ(t1). A solution t ↦ ϕ(t) of (85) is obtained by inverting the function ϕ(t1) ↦ t1 defined by (87).


Take the plus sign in (87). If ϕ+ < π/2 and V′(ϕ+) ≠ 0, then V′(ϕ+) > 0 and the right hand side of (87) has a finite positive limit as ϕ(t1) ↑ ϕ+. This means that ϕ(t) reaches ϕ+ for the first (finite) time at t = T. From (85) it follows that at t = T we have d²ϕ/dt² < 0. Thus the continuation of the solution for t > T is determined by taking the minus sign in (87). So ϕ(t) = ϕ(2T − t) for all t. If in addition ϕ− > −π/2 and V′(ϕ−) ≠ 0, then V′(ϕ−) < 0 and for t > T the solution will reach ϕ− for the first (finite) time t = S > T. So we have ϕ(t) = ϕ(2S − t) for all t. It follows that

ϕ(t) = ϕ(2T − t) = ϕ(2S − (2T − t)) = ϕ(2(S − T) + t) for all t.

Consequently, the solution is periodic with minimal positive period equal to 2(S − T). Thus ϕ(t) oscillates between ϕ− and ϕ+, monotonically increasing from ϕ− to ϕ+ and following the time reversed motion when it goes from ϕ+ to ϕ−. Here S − T is equal to the right hand side of (87) with the limits ϕ(t0) and ϕ(t1) replaced by ϕ− and ϕ+. Therefore the minimal positive period τ of the motion is given by

\[
\tau = 2\int_{\varphi_-}^{\varphi_+} \Big( \frac{2}{M}\,(E - V(\varphi)) \Big)^{-1/2} d\varphi. \tag{88}
\]

If V′(ϕ+) = 0, then (ϕ+, 0) is an equilibrium point of the vector field X_{σ̄3,σ̄4} (86). Because E − V(ϕ) = V(ϕ+) − V(ϕ) = O((ϕ − ϕ+)²) as ϕ ↑ ϕ+, the right hand side of (87) diverges as ϕ(t1) ↑ ϕ+. Therefore the integral curve t ↦ (ϕ(t), ϕ̇(t)) of X_{σ̄3,σ̄4} converges to the equilibrium point (ϕ+, 0) as t → ∞. Choosing the minus sign in the right hand side of (87), a similar argument shows that the integral curve t ↦ (ϕ(t), ϕ̇(t)) converges to the equilibrium point (ϕ+, 0) as t → −∞. Similar conclusions hold when ϕ+ is replaced by ϕ−.

Now assume that ϕ+ = π/2 and that lim_{ϕ↑π/2} V(ϕ) < E. Then the right hand side of (87) has a finite limit as ϕ(t1) ↑ π/2. Thus there is a finite time T such that ϕ(t) ↑ π/2 as t ↑ T, provided we take the plus sign in (87). In this case the planar vector field X_{σ̄3,σ̄4} is incomplete, that is, it has integral curves which leave every compact subset of (−π/2, π/2) × R in finite time. Because ϕ is the angle between the oriented plane of the rim of the disk and the vertical direction, the plane of the rim of the disk for such solutions reaches the horizontal position in finite time. In other words, the disk falls flat in finite time. The choice of minus sign in (87) leads to an integral curve of X_{σ̄3,σ̄4} where the plane of the disk rises in a finite time from the horizontal position.


In §12 we will see that there are no other cases to be considered for the potential V_{σ̄3,σ̄4} (83). This concludes our review of the properties of the conservative Newtonian system (85).

7.5.4

A special case of falling flat

A simple special case of an integral curve of the vector field X_{σ̄3,σ̄4} (86) occurs when σ̄3 = σ̄4 = 0. Because the solutions of the Chaplygin system (76) depend linearly on the initial condition, we see that σ3(t) = σ4(t) = 0. Therefore (26) implies that at all times the angular velocity vector ω of the disk is horizontal and is perpendicular to the vector u(t), see (9). From the definition of the potential function V_{σ̄3,σ̄4} (83) it follows that V_{0,0} = mgr(1 − σ1²)^{1/2}, or, using the substitution σ1 = sin ϕ, that V(ϕ) = mgr cos ϕ for ϕ ∈ (−π/2, π/2). Therefore the conservative Newtonian system (85) is equal to the mathematical pendulum on the upper semicircle of radius r. Because all of its solutions, except for the unstable equilibrium point (0, 0) and its asymptotic stable manifold, reach ϕ = ±π/2 in finite time, the disk falls flat in finite time. Thus the planar vector field X_{σ̄3,σ̄4} (86) on (−π/2, π/2) × R is incomplete in the sense that not all of its solutions are defined for all time.

We now reconstruct this motion of the disk. The differential equation (28) satisfied by the angle ψ is dψ/dt = 0, using the fact that σ3(t) = σ4(t) = 0 for all time. Therefore the angle ψ is constant. Applying a rotation in body coordinates, which is an element of the S¹ symmetry group, we can arrange that ψ = π. In other words,

\[
A(t)^{-1}e_3 = u(t) = -\cos\varphi(t)\,e_1 + \sin\varphi(t)\,e_3. \tag{89}
\]

From equations (33) with ψ(t) = π and (89) we obtain ȧ = −r Ȧe1, which integrated gives a(t) = a(0) − r(A(t) − A(0))e1. In view of equation (38) for the point of contact, the above equation implies that the point of contact is constant, namely, p = −rA(0)e1 + a(0). Using an element of the translation subgroup of the symmetry group E(2), we can arrange that p = 0. Having done this the position of the center of mass of the disk in space is given by a(t) = rA(t)e1. From the differential equation (36) governing the angle χ, it follows that χ(t) is constant. Using a suitable rotation B about the vertical axis, which leaves the vector u(t) fixed and takes v(t) to Bv(t), we can arrange that χ = 0. This means that

\[
A(t)e_3 = v(t) = \cos\varphi(t)\,e_1 + \sin\varphi(t)\,e_3. \tag{90}
\]

Equations (89) and (90) imply that A(t) is a rotation about the e2-axis through an angle of ϕ(t) + π/2, see lemma 7.3.4.2. Thus the action of the element (A(t), a(t)) of E(3) on a material point x on the rim of the disk is A(t)x + a(t) = A(t)(x + re1). The above argument shows that after bringing the rim of the disk into the vertical position with its point of contact at the origin and its rim lying in the e2–e3 plane, the disk moves so that at time t the plane of the rim of the disk has rotated about the e2-axis through an angle ϕ(t).

The above motion of the disk falling flat has the full motion of the mathematical pendulum as its analytic continuation. Here the disk is attached to the horizontal plane at its fixed point of contact and is allowed to pass through the horizontal plane. In §11.1 we will investigate motions when the disk comes close to the flat position but then rises up. In the fully reduced phase space these nearby motions converge to a motion ϕ(t) where at the horizontal position we have an elastic reflection. This means that dϕ/dt changes sign, after which the motion continues as before but in the opposite direction, see lemma 7.11.1.25. For the motion of the disk in space this limiting behavior is subtler, see §11.2 and §11.3.

7.6

Scaling

In this section we rescale Chaplygin's equations (76) to eliminate superfluous parameters. This will simplify the discussion later on of the asymptotic behavior of its solutions. Consider the scaling defined by

\[
\sigma_3 = -\frac{I_3}{I_1}\,\tilde u\, x, \qquad \sigma_4 = \tilde u\, y, \qquad\text{and}\qquad \sigma_1 = z, \tag{91}
\]

where x, y, and z are new variables and ũ > 0 is a positive constant. Then Chaplygin's equations (76) become the rescaled Chaplygin equations

\[
\begin{aligned}
x'(z) &= y(z) \\
y'(z) &= c\,\frac{1}{1-z^2}\,x(z),
\end{aligned} \tag{92}
\]

297

7.6. Scaling 2

I3 where c = I3mr +mr2 I1 . Note that the rescaled Chaplygin equations imply the second order linear differential equation 1 x(z). (93) x00 (z) = c 1 − z2 Consider the function σ32 + 12 (I3 + mr2 )σ42 + mgr(1 − σ12 )1/2 Ve (σ1 , σ3 , σ4 ) = 21 I1 1 − σ12

on $(-1,1)\times\mathbb{R}^2$. Then $\widetilde{V}\circ\psi = V_{\overline{\sigma}_3,\overline{\sigma}_4}$. Here $\psi$ is the map defined in (79). If we now rescale $\widetilde{V}$ using (91) and set $\widetilde{u} = \sqrt{\dfrac{mgr}{I_3+mr^2}}$, we obtain $\widetilde{V}(\sigma_1,\sigma_3,\sigma_4) = mgr\,\widetilde{W}(x,y,z)$, where

$$\widetilde{W}(x,y,z) = \tfrac12\,d\,\frac{1}{1-z^2}\,x^2 + \tfrac12\,y^2 + (1-z^2)^{1/2} \tag{94}$$

and $d = \dfrac{I_3^2}{I_1(I_3+mr^2)}$. Let

$$W_{X,Y}(z) = \tfrac12\,d\,\frac{1}{1-z^2}\,x(z;X,Y)^2 + \tfrac12\,y(z;X,Y)^2 + (1-z^2)^{1/2} \tag{95}$$

be the rescaled potential function. Here $z \mapsto (x(z;X,Y),\,y(z;X,Y))$ is the solution of the rescaled Chaplygin equations (92) with $(X,Y) = (x(0;X,Y),\,y(0;X,Y))$. Thus $\widetilde{W}\circ\psi = W_{X,Y}$.

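We wish to compute the derivative of $W_{X,Y}$ with respect to $z$. To do this we evaluate the partial differential operator

$$C = \frac{\partial}{\partial z} + y\,\frac{\partial}{\partial x} + \frac{c\,x}{1-z^2}\,\frac{\partial}{\partial y} \tag{96}$$

on the function $\widetilde{W}$. The operator $C$ represents the vector field on $(-1,1)\times\mathbb{R}^2$ whose integral curves are governed by the rescaled Chaplygin equations (92). This gives

$$C\widetilde{W} = d\,\frac{z}{(1-z^2)^2}\,x^2 + (c+d)\,\frac{1}{1-z^2}\,x\,y - \frac{z}{(1-z^2)^{1/2}}. \tag{97}$$

Note that $C\widetilde{W} = \dfrac{dW_{X,Y}}{dz}$ along solutions of (92). If we take

$$\overline{\sigma}_3 = -\frac{I_3}{I_1}\,\widetilde{u}\,X \quad\text{and}\quad \overline{\sigma}_4 = \widetilde{u}\,Y \tag{98}$$

with $\widetilde{u} = \sqrt{\dfrac{mgr}{I_3+mr^2}}$, then we obtain $V_{\overline{\sigma}_3,\overline{\sigma}_4}(\sigma_1) = mgr\,W_{X,Y}(z)$.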

As a result of the rescaling, we have only two parameters $c$ and $d$ in our rescaled Newtonian equations of motion $\ddot{\varphi} = -\dfrac{\partial W_{X,Y}}{\partial\varphi}(\sin\varphi)$. The parameter $d$ appears in the rescaled potential function and $c$ in the rescaled Chaplygin equations. For the uniform disk we have $c = \tfrac43$ and $d = \tfrac23$, while for the hoop $c = 1$ and $d = 1$. In general, the parameters $c$ and $d$ are subject to the inequalities

$$c > 0, \qquad d > 0, \qquad\text{and}\qquad c + d \le 2. \tag{99}$$

The last inequality above follows because

$$c + d = \frac{I_3\,mr^2}{I_1(I_3+mr^2)} + \frac{I_3^2}{I_1(I_3+mr^2)} = \frac{I_3}{I_1} \le 2.$$
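The quoted values of $c$ and $d$ can be reproduced with exact rational arithmetic. The following is a minimal sketch, not part of the original text: the function name `scaling_parameters` is hypothetical, and the moments of inertia used for the uniform disk ($I_1 = mr^2/4$, $I_3 = mr^2/2$) and the hoop ($I_1 = mr^2/2$, $I_3 = mr^2$) are the standard textbook values, assumed here rather than taken from this section.

```python
from fractions import Fraction

def scaling_parameters(i1, i3):
    """Return (c, d) for moments of inertia I1, I3 given in units of m*r**2."""
    c = i3 / (i1 * (i3 + 1))         # c = I3 m r^2 / (I1 (I3 + m r^2))
    d = i3 * i3 / (i1 * (i3 + 1))    # d = I3^2 / (I1 (I3 + m r^2))
    return c, d

# Assumed standard values: uniform disk I1 = 1/4, I3 = 1/2; hoop I1 = 1/2, I3 = 1.
disk = scaling_parameters(Fraction(1, 4), Fraction(1, 2))
hoop = scaling_parameters(Fraction(1, 2), Fraction(1))

for name, (c, d) in (("uniform disk", disk), ("hoop", hoop)):
    print(name, "c =", c, "d =", d, "c + d =", c + d)
```

In both cases $c + d = I_3/I_1 = 2$, the boundary case of (99).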

Equality $c + d = 2$ holds if all the mass of the disk is in the plane of the rim.

7.7 Solutions of the rescaled Chaplygin equations

In this section we give a detailed description of the solutions of the rescaled Chaplygin equations (92) as $z \to \pm 1$. We begin by finding the recessive solution of the rescaled Chaplygin equations and then use the Wronskian method to finish solving the Chaplygin system.

7.7.1 The recessive solution

If we rewrite equation (93) as $2(1+z)\,x''(z) = c\,x(z) + (1+z)^2\,x''(z)$ and then substitute the power series

$$r(z) = \sum_{n=1}^{\infty} r_n (1+z)^n, \tag{100}$$

we obtain the recurrence relation

$$r_{n+1} = \frac{c + n(n-1)}{2(n+1)n}\,r_n \quad\text{with } r_1 = 1. \tag{101}$$

The ratio test shows that the radius of convergence of (100) is 2. This agrees with the fact that the other singular point of equation (93) is $z = 1$, which is at a distance 2 from the singular point at $z = -1$. Thus the power series (100) for $r(z)$ with coefficients satisfying (101) is a solution of (93) for all $z \in (-1,1)$. Because $c > 0$, from (101) we see that $r_n > 0$ for every $n \ge 1$. In particular it follows that $r(z) > 0$ and $r'(z) > 0$ for every $z \in (-1,1)$. A formula for the constants $r(0)$ and $r'(0)$ as an entire analytic function of $c$ will be given in §7.4. For later use we prove
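The recurrence (101) is easy to check numerically. The sketch below is an illustration, not the authors' method: the function names and the truncation at 200 terms are assumptions. It sums the series (100) termwise together with its first two derivatives and evaluates the residual $(1-z^2)\,r''(z) - c\,r(z)$ of equation (93).

```python
def recessive_coefficients(c, n_terms=200):
    """Coefficients r_1, ..., r_{n_terms} of r(z) = sum r_n (1+z)^n from (101)."""
    coeffs = [1.0]                                   # r_1 = 1
    for n in range(1, n_terms):
        coeffs.append(coeffs[-1] * (c + n * (n - 1)) / (2.0 * (n + 1) * n))
    return coeffs

def r_and_derivatives(c, z, n_terms=200):
    """Return (r(z), r'(z), r''(z)) by termwise summation, valid for |1 + z| < 2."""
    t = 1.0 + z
    r = dr = ddr = 0.0
    for n, rn in enumerate(recessive_coefficients(c, n_terms), start=1):
        r += rn * t**n
        dr += n * rn * t**(n - 1)
        if n >= 2:
            ddr += n * (n - 1) * rn * t**(n - 2)
    return r, dr, ddr

c = 4.0 / 3.0   # value quoted in the text for the uniform disk
for z in (-0.9, -0.5, 0.0, 0.5):
    r, dr, ddr = r_and_derivatives(c, z)
    residual = (1.0 - z * z) * ddr - c * r           # should vanish by (93)
    print(f"z = {z:+.1f}  r = {r:.6f}  r' = {dr:.6f}  residual = {residual:.2e}")
```

The residual should be at the level of rounding error, and both $r$ and $r'$ should be positive on the sampled points, in line with the statements above.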


Lemma 7.7.1.8. There is a function $u(z)$, which is analytic at $z = -1$, such that

$$\frac{1}{r(z)^2} = \frac{1}{(1+z)^2} - \tfrac12 c\,\frac{1}{1+z} + u(z). \tag{102}$$

Proof. Since $r(-1) = 0$, it follows that $r(z) = (1+z)\,s(z)$, where $s(z)$ is analytic at $z = -1$ and $s(-1) = 1$. Therefore $r(z)^{-2} = (1+z)^{-2}\,t(z)$, where $t(z) = s(z)^{-2}$ is analytic at $z = -1$ with $t(-1) = 1$ and $t'(-1) = -2s'(-1) = -2r_2 = -\tfrac12 c$, using (101). From this equation (102) follows.

7.7.2 Asymptotics

In this subsection we study the asymptotic behavior of the solutions of the rescaled Chaplygin equations as $z \downarrow -1$. To do this we first prove

Lemma 7.7.2.9. There is a function $w(z)$, which is analytic on $(-1,1)$ and at $z = -1$ with $w(-1) = 1$, such that the solution $x(z)$ of the rescaled Chaplygin equation (93) with $x(0) = X$ and $x'(0) = Y$ is given by

$$x(z) = \frac{X}{r(0)}\,r(z) + (Xr'(0) - Yr(0))\left[\tfrac12 c\,r(z)\ln(1+z) + w(z)\right]. \tag{103}$$

Proof. We use the Wronskian method to find a second solution of (93). Because

$$(x'r - xr')' = x''r - xr'' = c\,\frac{xr - rx}{1-z^2} = 0,$$

the Wronski determinant $w = x'(z)r(z) - x(z)r'(z)$ is constant, say, $Yr(0) - Xr'(0)$. Integrating the identity $(x/r)' = w/r^2$ from $0$ to $z$ gives

$$x(z) = \frac{X}{r(0)}\,r(z) + w\,r(z)\int_0^z \frac{1}{r(\zeta)^2}\,d\zeta \quad\text{for } z \in (-1,1). \tag{104}$$

Inserting (102) into the above equation and integrating gives

$$x(z) = \frac{X}{r(0)}\,r(z) + w\,r(z)\left(-\frac{1}{1+z} - \tfrac12 c\,\ln(1+z) + v(z)\right),$$

where $v(z) = 1 + \int_0^z u(\zeta)\,d\zeta$; the constant $1$ comes from the lower limit of the integral of $(1+\zeta)^{-2}$. Note that $v(z)$ is analytic at $z = -1$. Using the fact that $r(-1) = 0$ and $r'(-1) = 1$, we obtain (103), where the function $w(z) = r(z)\left((1+z)^{-1} - v(z)\right)$ is not to be confused with the constant Wronski determinant $w$. Observe that $w(z)$ is analytic in $(-1,1)$ and $w(-1) = r'(-1) = 1$.


We now draw some conclusions about the behavior of the solution (103). From (103) we see that $x(z)$ has the finite limit $Xr'(0) - Yr(0)$ as $z \downarrow -1$. This limit is equal to zero if and only if the Wronskian determinant equals zero, which implies that $x(z)$ is a constant multiple of $r(z)$. For this reason $r(z)$ is called the normalized recessive solution of (93) at $z = -1$. Every solution of (93) which has a nonzero limit as $z \downarrow -1$ is called a dominant solution at $z = -1$. Differentiating (103) gives

$$x'(z) = \frac{X}{r(0)}\,r'(z) + (Xr'(0) - Yr(0))\left\{\tfrac12 c\left[r(z)(1+z)^{-1} + r'(z)\ln(1+z)\right] + w'(z)\right\}. \tag{105}$$

This shows that

$$x'(z) \sim \tfrac12 c\,(Xr'(0) - Yr(0))\,\ln(1+z) \quad\text{as } z \downarrow -1. \tag{106}$$

Therefore $x'(z)$ has an infinite limit as $z \downarrow -1$ provided $w_- = Xr'(0) - Yr(0) \ne 0$. The sign of this limit is opposite to the sign of $w_-$. If $w_- = 0$, then $x'(z) = \frac{X}{r(0)}\,r'(z)$, which converges to $\frac{X}{r(0)}$ as $z \downarrow -1$.

Because equation (93) has the discrete symmetry $z \mapsto -z$, the function $z \mapsto x(-z)$ is another solution with $x(0) = X$ and $x'(0) = -Y$. Equation (103) remains valid if we first replace $x(z)$ and $Y$ by $x(-z)$ and $-Y$, respectively, and then replace $z$ by $-z$. This gives

$$x(z) = \frac{X}{r(0)}\,r(-z) + (Xr'(0) + Yr(0))\left[\tfrac12 c\,r(-z)\ln(1-z) + w(-z)\right]. \tag{107}$$

Recall that the function $w(z)$ is analytic at $z = -1$ with $w(-1) = 1$. This implies that $\lim_{z\uparrow 1} x(z) = Xr'(0) + Yr(0)$. Similarly, we obtain

$$x'(z) = -\frac{X}{r(0)}\,r'(-z) - (Xr'(0) + Yr(0))\left\{\tfrac12 c\left[r(-z)(1-z)^{-1} + r'(-z)\ln(1-z)\right] + w'(-z)\right\},$$

which gives

$$x'(z) \sim -\tfrac12 c\,(Xr'(0) + Yr(0))\,\ln(1-z) \quad\text{as } z \uparrow 1, \tag{108}$$

because $r(-1) = 0$ and $r'(-1) = 1$. Therefore $x'(z)$ has an infinite limit as $z \uparrow 1$, provided $w_+ = Xr'(0) + Yr(0) \ne 0$. Moreover the limit has the same sign as $w_+$. If $w_+ = 0$, then $x'(z) = -\frac{X}{r(0)}\,r'(-z)$, which converges to $-\frac{X}{r(0)}$ as $z \uparrow 1$.

7.7.3 The normalized even and odd solutions

In this subsection we define the normalized even and odd solutions of the rescaled Chaplygin equations. Let $e(z)$ and $o(z)$ be solutions of

$$x''(z) = \frac{c}{1-z^2}\,x(z) \tag{109}$$

such that $e(-z) = e(z)$ with $e(0) = 1$, $e'(0) = 0$ and $o(z) = -o(-z)$ with $o(0) = 0$, $o'(0) = 1$. Then $e(z)$ and $o(z)$ are the normalized even and odd solutions of (109), respectively. Every solution of the rescaled Chaplygin equations (92) can be written as

$$x(z) = X\,e(z) + Y\,o(z), \qquad y(z) = x'(z) = X\,e'(z) + Y\,o'(z). \tag{110}$$

Now $e(z) = \sum_{k=0}^{\infty} e_k z^{2k}$ with $e_0 = 1$, $e_1 = c/2$, and $e_k > 0$ for every $k \ge 1$. The Taylor expansion of $e(z)$ at $z = 0$ is

$$e(z) = 1 + \tfrac12 c\,z^2 + \tfrac{1}{24}\,c(c+2)\,z^4 + O(z^6), \tag{111}$$

which implies

$$e'(z) = c\,z + \tfrac16\,c(c+2)\,z^3 + O(z^5). \tag{112}$$

Moreover, $e(z) > 1$ and $\frac{e'(z)}{z} > c$ when $z > 0$. Also $o(z) = \sum_{k=0}^{\infty} o_k z^{2k+1}$ with $o_0 = 1$ and $o_k > 0$ for every $k \ge 1$. Thus the Taylor expansion of $o(z)$ at $z = 0$ is

$$o(z) = z + \tfrac16\,c\,z^3 + \tfrac{1}{120}\,c(c+6)\,z^5 + O(z^7) \tag{113}$$

and

$$o'(z) = 1 + \tfrac12\,c\,z^2 + \tfrac{1}{24}\,c(c+6)\,z^4 + O(z^6). \tag{114}$$
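The Taylor coefficients of the normalized even and odd solutions follow from the two-step recurrence $a_{k+2} = \dfrac{c + k(k-1)}{(k+2)(k+1)}\,a_k$, obtained by substituting $x(z) = \sum a_k z^k$ into $(1-z^2)\,x'' = c\,x$. The sketch below (function name hypothetical, exact rational arithmetic) reproduces the coefficients up to order five.

```python
from fractions import Fraction

def taylor_coefficients(c, a0, a1, order):
    """Coefficients a_0..a_order of the solution of (1 - z^2) x'' = c x
    with x(0) = a0 and x'(0) = a1."""
    a = [Fraction(a0), Fraction(a1)]
    for k in range(order - 1):
        a.append(a[k] * (c + k * (k - 1)) / ((k + 2) * (k + 1)))
    return a

c = Fraction(4, 3)                       # illustrative value (uniform disk)
even = taylor_coefficients(c, 1, 0, 5)   # e(z): 1, 0, c/2, 0, c(c+2)/24, 0
odd = taylor_coefficients(c, 0, 1, 5)    # o(z): 0, 1, 0, c/6, 0, c(c+6)/120
print("e:", even)
print("o:", odd)
```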

Moreover, $\frac{o(z)}{z} > 1$ and $o'(z) > 1$ when $z > 0$.

7.7.4 Computation of $r(0)$ and $r'(0)$

In this subsection we give an explicit formula for the constants $r(0)$ and $r'(0)$ as a function of $c$. In order to calculate $r(0)$ and $r'(0)$ we need to understand the relation of equation (109) with the hypergeometric equation. Because $e$ is an even analytic function, $e(z) = f(z^2)$ for some analytic function $f(\zeta)$ of $\zeta = z^2$ in a neighborhood of $\zeta = 0$ which satisfies $f(0) = 1$. Since $e'(z) = 2z\,f'(z^2)$ and $e''(z) = 2f'(z^2) + 4z^2 f''(z^2)$, we see that (109) is equivalent to

$$\zeta(1-\zeta)\,f''(\zeta) + \tfrac12 (1-\zeta)\,f'(\zeta) - \tfrac14\,c\,f(\zeta) = 0. \tag{115}$$

Equation (115) is the hypergeometric equation

$$\zeta(1-\zeta)\,f''(\zeta) + (\gamma - (\alpha+\beta+1)\zeta)\,f'(\zeta) - \alpha\beta\,f(\zeta) = 0, \tag{116}$$

with $\gamma = \alpha + \beta + 1 = \tfrac12$ and $\alpha\beta = \tfrac14 c$. In other words,

$$\alpha = -\tfrac14 + \tfrac14\sqrt{1-4c}, \qquad \beta = -\tfrac14 - \tfrac14\sqrt{1-4c}, \qquad \gamma = \tfrac12. \tag{117}$$

The parameters $\alpha$ and $\beta$ are nonreal complex conjugate numbers if $c > \tfrac14$. For arbitrary values of $\alpha$, $\beta$, and $\gamma$, the solution $f(\zeta)$ of (116), which is analytic at $\zeta = 0$ and satisfies $f(0) = 1$, is Gauss' hypergeometric function

$$F(\alpha,\beta,\gamma;\zeta) = \sum_{n=0}^{\infty} \frac{(\alpha)_n(\beta)_n}{(\gamma)_n\,n!}\,\zeta^n. \tag{118}$$

Here $(\delta)_0 = 1$ and $(\delta)_n = \delta(\delta+1)\cdots(\delta+n-1)$ when $n \ge 1$. It is known that

$$\lim_{\zeta\uparrow 1} F(\alpha,\beta,\gamma;\zeta) = \frac{\Gamma(\gamma)\,\Gamma(\gamma-\alpha-\beta)}{\Gamma(\gamma-\alpha)\,\Gamma(\gamma-\beta)}, \tag{119}$$

if $\operatorname{Re}(\gamma-\alpha-\beta) > 0$, see pp. 249–251 in [28]. From (116) and (117) it follows that

$$e(z) = F\!\left(-\tfrac14 + \tfrac14\sqrt{1-4c},\; -\tfrac14 - \tfrac14\sqrt{1-4c},\; \tfrac12;\; z^2\right). \tag{120}$$

Because $\gamma - \alpha - \beta = 1 > 0$, we can combine (119) with $\lim_{z\uparrow 1} x(z) = Xr'(0) + Yr(0)$, see (103), and $(X,Y) = (1,0)$ to obtain

$$r'(0) = \lim_{z\uparrow 1} e(z) = \frac{\Gamma(\tfrac12)}{\Gamma\!\left(\tfrac34 - \tfrac14\sqrt{1-4c}\right)\Gamma\!\left(\tfrac34 + \tfrac14\sqrt{1-4c}\right)}. \tag{121}$$

Similarly the normalized odd solution $o(z)$ of (109) can be written as $o(z) = z\,g(z^2)$, where $g(\zeta)$ is an analytic function of $\zeta = z^2$ in a neighborhood of $0$, which satisfies $g(0) = 1$. Since $o'(z) = g(z^2) + 2z^2 g'(z^2)$ and $o''(z) = 6z\,g'(z^2) + 4z^3 g''(z^2)$, we see that (109) is equivalent to the hypergeometric equation

$$\zeta(1-\zeta)\,g''(\zeta) + \tfrac32 (1-\zeta)\,g'(\zeta) - \tfrac14\,c\,g(\zeta) = 0, \tag{122}$$

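with $\gamma = \alpha + \beta + 1 = \tfrac32$ and $\alpha\beta = \tfrac14 c$. In other words, $\alpha = \tfrac14 + \tfrac14\sqrt{1-4c}$, $\beta = \tfrac14 - \tfrac14\sqrt{1-4c}$, and $\gamma = \tfrac32$. Therefore

$$o(z) = z\,F\!\left(\tfrac14 + \tfrac14\sqrt{1-4c},\; \tfrac14 - \tfrac14\sqrt{1-4c},\; \tfrac32;\; z^2\right). \tag{123}$$

Again, because $\gamma - \alpha - \beta = 1$, we can combine (119) with (103) and $(X,Y) = (0,1)$ to obtain

$$r(0) = \lim_{z\uparrow 1} o(z) = \frac{\Gamma(\tfrac32)}{\Gamma\!\left(\tfrac54 - \tfrac14\sqrt{1-4c}\right)\Gamma\!\left(\tfrac54 + \tfrac14\sqrt{1-4c}\right)}. \tag{124}$$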
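The Gamma function formulas for $r'(0)$ and $r(0)$ can be compared with a direct summation of the power series $r(z) = \sum_{n\ge 1} r_n(1+z)^n$ at $z = 0$, using $r_1 = 1$ and $r_{n+1} = \frac{c+n(n-1)}{2(n+1)n}\,r_n$. The sketch below is only a sanity check under stated assumptions: the function names are hypothetical, and because `math.gamma` accepts real arguments only, the comparison is restricted to $0 \le c \le \tfrac14$, where $\sqrt{1-4c}$ is real. The physically quoted values $c = \tfrac43$ and $c = 1$ would need a complex Gamma function, which is not attempted here.

```python
import math

def series_values_at_zero(c, n_terms=200):
    """Return (r(0), r'(0)) by summing the series at 1 + z = 1."""
    rn, r0, dr0 = 1.0, 0.0, 0.0
    for n in range(1, n_terms + 1):
        r0 += rn
        dr0 += n * rn
        rn *= (c + n * (n - 1)) / (2.0 * (n + 1) * n)   # next coefficient r_{n+1}
    return r0, dr0

def gamma_values_at_zero(c):
    """Return (r(0), r'(0)) from the Gamma function formulas; needs c <= 1/4."""
    s = 0.25 * math.sqrt(1.0 - 4.0 * c)
    r0 = math.gamma(1.5) / (math.gamma(1.25 - s) * math.gamma(1.25 + s))
    dr0 = math.gamma(0.5) / (math.gamma(0.75 - s) * math.gamma(0.75 + s))
    return r0, dr0

c = 3.0 / 16.0   # illustrative value with 1 - 4c = 1/4
print("series:", series_values_at_zero(c))
print("gamma: ", gamma_values_at_zero(c))
```

For $c = 0$ both routes give $r(0) = r'(0) = 1$, since then $r(z) = 1 + z$.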

7.8 Bifurcations of a vertical disk

In this section we discuss the bifurcations that a disk in a vertical position undergoes, when it is either spinning about a fixed point of contact or rolling in a straight line.

7.8.1 Degenerate equilibria

Suppose that $p$ is an equilibrium point of a smooth vector field $X$ on a smooth manifold $P$. Then its linearization $DX(p)$ at $p$ is a well defined linear map of the tangent space of $P$ at $p$ into itself. The equilibrium points of the fully reduced vector field $\mathcal{V}$ (13) on the fully reduced phase space $M = (-1,1)\times\mathbb{R}^3$ form a connected two dimensional analytic submanifold $N$ of relative equilibria described in §4. Because $(\overline{\sigma}_3,\overline{\sigma}_4)$ can be used as coordinates on $N$, the rank of $D\mathcal{V}(\sigma^0)$ at $\sigma^0 \in N$ is at most 2. From the first equation in (13) we see that its rank is at least 1. From the discussion in §5 it follows that an equilibrium point $\sigma^0$ of $\mathcal{V}$ corresponds to an equilibrium point $(\varphi^0, 0)$ of the vector field $X_{\overline{\sigma}_3,\overline{\sigma}_4}$ (86) on $(-\pi/2,\pi/2)\times\mathbb{R}$. In particular, $\varphi^0$ corresponds to a point where the derivative of the potential function $V(\varphi) = V_{\overline{\sigma}_3,\overline{\sigma}_4}(\sin\varphi) = V_{\overline{\sigma}_3,\overline{\sigma}_4}(\sigma_1)$ (83) vanishes, that is, $\varphi^0 \in (-\pi/2,\pi/2)$ is a critical point of $V$. The equilibrium point $(\varphi^0,0)$ of the vector field $X_{\overline{\sigma}_3,\overline{\sigma}_4}$ is stable (of elliptic type) or unstable (of hyperbolic type) if the second derivative $\frac{d^2V}{d\varphi^2}$ at $\varphi^0$ is positive or negative, respectively. We will use the same terms for the corresponding relative equilibria.

The relative equilibrium $\sigma^0$ of the fully reduced vector field $\mathcal{V}$ is degenerate if the rank of $D\mathcal{V}(\sigma^0)$ is 1. This is equivalent to saying that at the equilibrium point $(\varphi^0,0)$ of $X_{\overline{\sigma}_3,\overline{\sigma}_4}$ the potential $V$ has a degenerate critical point at $\varphi^0$, that is,

$$\frac{dV}{d\varphi}(\varphi^0) = \frac{d^2V}{d\varphi^2}(\varphi^0) = 0.$$

In other words, $\sigma_1^0 = \sin\varphi^0$ is a degenerate critical point of the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$, that is, $V'_{\overline{\sigma}_3,\overline{\sigma}_4}(\sigma_1^0) = V''_{\overline{\sigma}_3,\overline{\sigma}_4}(\sigma_1^0) = 0$. We call the set $\Delta$ of parameter values $(\overline{\sigma}_3,\overline{\sigma}_4)$ for which the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ has a degenerate critical point in $(-1,1)$ the degeneracy locus of $V_{\overline{\sigma}_3,\overline{\sigma}_4}$. In the remainder of this section we will study the geometry of $\Delta$.

7.8.2 Vertical degenerate relative equilibria

In this subsection we investigate the bifurcation of the relative equilibria of the disk for which $\sigma_1 = 0$. From §4 we see that these relative equilibria correspond to the disk being in a vertical position either spinning about a fixed point of contact or rolling along a straight line. Since $0 = \sigma_1 = z$, we look at the rescaled potential function $W_{X,Y}$ (95) near $0$. Recall that the solutions $z \mapsto (x(z;X,Y),\,y(z;X,Y))$ of the rescaled Chaplygin equations (92) with initial conditions $(x(0;X,Y),\,y(0;X,Y)) = (X,Y)$ are

$$x(z;X,Y) = X\,e(z) + Y\,o(z) \quad\text{and}\quad y(z;X,Y) = X\,e'(z) + Y\,o'(z). \tag{125}$$

Here $e(z)$ and $o(z)$ are the normalized even and odd solutions, whose Taylor series about $z = 0$ are given in equations (111) and (113). Consequently, the Taylor expansion of $W_{X,Y}$ about $z = 0$ to fourth order terms is

$$W_{X,Y}(z) = \sum_{j=0}^{4} w_j(X,Y)\,z^j + O(z^5), \tag{126}$$

where

$$w_0(X,Y) = 1 + \tfrac12\,d\,X^2 + \tfrac12\,Y^2 \tag{127}$$

$$w_1(X,Y) = (c+d)\,XY \tag{128}$$

$$w_2(X,Y) = -\tfrac12 + \tfrac12 (d + cd + c^2)\,X^2 + \tfrac12 (c+d)\,Y^2 \tag{129}$$

$$w_3(X,Y) = \tfrac13 (c + 3d + 2cd + 2c^2)\,XY \tag{130}$$

$$w_4(X,Y) = -\tfrac18 + \tfrac{1}{12}(6d + 7cd + 4c^2 + 2c^2 d + 2c^3)\,X^2 + \tfrac{1}{12}(3c + 6d + 2cd + 2c^2)\,Y^2. \tag{131}$$
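The coefficients $w_0,\dots,w_4$ can be checked by expanding $W_{X,Y}(z) = \tfrac12 d\,\frac{x^2}{1-z^2} + \tfrac12 y^2 + (1-z^2)^{1/2}$ directly as a truncated power series. The sketch below uses exact rational arithmetic; all function names are hypothetical. The closed forms coded in `closed_form` are the ones this direct expansion reproduces. In particular the $X^2$ coefficient of $w_4$ is taken to be $\tfrac{1}{12}(6d + 7cd + 4c^2 + 2c^2d + 2c^3)$, which is the value consistent with the expression for $w_4(X_0,0)$ quoted in the discussion of the spinning disk.

```python
from fractions import Fraction

N = 5  # number of retained coefficients: z^0 .. z^4

def multiply(p, q):
    """Product of two power series, truncated after z^(N-1)."""
    out = [Fraction(0)] * N
    for i, a in enumerate(p[:N]):
        for j, b in enumerate(q[:N - i]):
            out[i + j] += a * b
    return out

def series_coefficients(c, d, X, Y):
    """Coefficients w_0..w_4 of W_{X,Y}, computed from (1 - z^2) x'' = c x."""
    a = [Fraction(X), Fraction(Y)]
    for k in range(N - 1):               # a_{k+2} from the two-step recurrence
        a.append(a[k] * (c + k * (k - 1)) / ((k + 2) * (k + 1)))
    x = a[:N]
    y = [(k + 1) * a[k + 1] for k in range(N)]                  # y = x'
    inverse = [Fraction(v) for v in (1, 0, 1, 0, 1)]            # 1/(1 - z^2)
    root = [Fraction(1), Fraction(0), Fraction(-1, 2), Fraction(0), Fraction(-1, 8)]
    xx = multiply(multiply(x, x), inverse)
    yy = multiply(y, y)
    return [d * xx[j] / 2 + yy[j] / 2 + root[j] for j in range(N)]

def closed_form(c, d, X, Y):
    """Closed-form coefficients w_0..w_4 as polynomials in c, d, X, Y."""
    return [
        1 + d * X**2 / 2 + Y**2 / 2,
        (c + d) * X * Y,
        Fraction(-1, 2) + (d + c*d + c**2) * X**2 / 2 + (c + d) * Y**2 / 2,
        (c + 3*d + 2*c*d + 2*c**2) * X * Y / 3,
        Fraction(-1, 8)
        + (6*d + 7*c*d + 4*c**2 + 2*c**2*d + 2*c**3) * X**2 / 12
        + (3*c + 6*d + 2*c*d + 2*c**2) * Y**2 / 12,
    ]

c, d = Fraction(4, 3), Fraction(2, 3)     # uniform disk values from the text
X, Y = Fraction(3, 5), Fraction(-2, 7)    # arbitrary test parameters
print(series_coefficients(c, d, X, Y) == closed_form(c, d, X, Y))
```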


From equations (127)–(131) we see that the rescaled potential function $W_{X,Y}$ has a critical point at $z = 0$ if and only if $0 = w_1(X,Y)$, that is, $XY = 0$. There are two cases.

Case 1. Suppose that $X = 0$. Since $\sigma_1(t) = \sigma_1^0 = z = 0$, we obtain $\sigma_2(t) = \frac{d\sigma_1}{dt} = 0$. Then the last two equations in (13) read $\frac{d\sigma_3}{dt} = 0$ and $\frac{d\sigma_4}{dt} = 0$. Therefore $\sigma_3(t) = \overline{\sigma}_3 = 0$, because $\overline{\sigma}_3 = -\frac{I_3}{I_1}\widetilde{u}X$, see (98). In view of the first equation in (56), this means that $\eta = 0$. Thus the disk is in a vertical position and is rolling in a straight line with constant speed. From (129) we see that if $w_2(0,Y) < 0$, that is, $|Y| < (c+d)^{-1/2}$, then this relative equilibrium is unstable; whereas if $w_2(0,Y) > 0$, that is, $|Y| > (c+d)^{-1/2}$, then it is stable. Because $\sigma_4(t) = \overline{\sigma}_4$ at $z = 0$, from the second equation in (56), $\eta = 0$ and (98), we obtain $Y = \frac{\overline{\sigma}_4}{\widetilde{u}} = \frac{\zeta}{\widetilde{u}}$. Since $\widetilde{u} = \sqrt{\frac{mgr}{I_3+mr^2}}$, we see that the relative equilibrium is unstable, respectively stable, if the angular speed $|\zeta|$ of rolling is less than, respectively greater than, the critical speed

$$|\zeta_{\mathrm{crit}}| = \frac{1}{\sqrt{c+d}}\,\sqrt{\frac{mgr}{I_3+mr^2}} = |\overline{\sigma}_4|.$$

Thus the vertical rolling disk undergoes gyroscopic stabilization when $|\zeta|$ increases through $|\zeta_{\mathrm{crit}}|$. At $\left(0, \pm\frac{1}{\sqrt{c+d}}\right) = (0,Y_0)$, we have $w_1(0,Y_0) = w_2(0,Y_0) = w_3(0,Y_0) = 0$, but by (131)

$$w_4(0,Y_0) = -\tfrac18 + \tfrac{1}{12}(3c + 6d + 2cd + 2c^2)(c+d)^{-1} = \tfrac{1}{24}(3c + 9d + 4cd + 4c^2)(c+d)^{-1},$$

which is positive. Thus the rescaled potential $W_{0,Y_0}$ has a stable degenerate equilibrium point at $z = 0$. In other words, the points

$$(\overline{\sigma}_3,\overline{\sigma}_4) = \left(0,\; \pm\frac{1}{\sqrt{c+d}}\,\sqrt{\frac{mgr}{I_3+mr^2}}\right) \tag{132}$$

lie on the degeneracy locus $\Delta$ of the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$.

Case 2. Now consider the case when $Y = 0$. Because $\sigma_4(t) = \overline{\sigma}_4 = \widetilde{u}\,Y = 0$, from (56) it follows that $\zeta = 0$. Thus the disk is spinning in a vertical position with its point of contact fixed, see (69). From (129) we see that if $w_2(X,0) < 0$, that is, $|X| < (d + cd + c^2)^{-1/2}$, then the relative equilibrium is unstable; whereas, if $w_2(X,0) > 0$, that is, $|X| > (d + cd + c^2)^{-1/2}$, then it is stable. Because $\sigma_3(t) = \overline{\sigma}_3$ at $z = 0$, we have $X = -\frac{\overline{\sigma}_3}{\widetilde{u}}\frac{I_1}{I_3} = -\frac{\eta}{\widetilde{u}}\frac{I_1}{I_3}$, using (98). Since $\widetilde{u} = \sqrt{\frac{mgr}{I_3+mr^2}}$, we see that the relative equilibrium is unstable, respectively stable, if the speed of spinning $|\eta|$ is less than, respectively greater than, the critical speed

$$|\eta_{\mathrm{crit}}| = \frac{I_3}{I_1}\,\frac{1}{\sqrt{d+cd+c^2}}\,\sqrt{\frac{mgr}{I_3+mr^2}}.$$

Thus the vertical spinning motion of the disk undergoes gyroscopic stabilization when $|\eta|$ increases through $|\eta_{\mathrm{crit}}|$. At $\left(\pm\frac{1}{\sqrt{d+cd+c^2}},\, 0\right) = (X_0,0)$, we have $w_1(X_0,0) = w_2(X_0,0) = w_3(X_0,0) = 0$, but

$$w_4(X_0,0) = \tfrac{1}{24}(9d + 5c^2 + 11cd + 4c^2 d + 4c^3)(d + cd + c^2)^{-1},$$

which is positive. Therefore $W_{X_0,0}$ has a stable degenerate equilibrium point at $z = 0$. In other words, the points

$$(\overline{\sigma}_3,\overline{\sigma}_4) = \left(\pm\frac{I_3}{I_1}\,\frac{1}{\sqrt{d+cd+c^2}}\,\sqrt{\frac{mgr}{I_3+mr^2}},\; 0\right) \tag{133}$$

lie on the degeneracy locus $\Delta$ of the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$.
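To make the two stabilization thresholds concrete, the critical speeds $|\zeta_{\mathrm{crit}}| = (c+d)^{-1/2}\,\widetilde{u}$ and $|\eta_{\mathrm{crit}}| = \frac{I_3}{I_1}(d+cd+c^2)^{-1/2}\,\widetilde{u}$, with $\widetilde{u}^2 = \frac{mgr}{I_3+mr^2}$, can be evaluated for the uniform disk and the hoop. The sketch below is illustrative only: the function name is hypothetical and the moments of inertia ($I_1 = mr^2/4$, $I_3 = mr^2/2$ for the disk; $I_1 = mr^2/2$, $I_3 = mr^2$ for the hoop) are standard values assumed here. Squared speeds are reported in units of $g/r$.

```python
from fractions import Fraction

def critical_speeds_squared(i1, i3):
    """Return (zeta_crit^2, eta_crit^2) in units of g/r, for I1, I3 in units of m r^2."""
    c = i3 / (i1 * (i3 + 1))
    d = i3 * i3 / (i1 * (i3 + 1))
    u2 = 1 / (i3 + 1)                  # u~^2 = m g r / (I3 + m r^2), in units of g/r
    zeta2 = u2 / (c + d)               # rolling along a straight line
    eta2 = (i3 / i1) ** 2 * u2 / (d + c * d + c * c)   # spinning about a fixed contact point
    return zeta2, eta2

disk = critical_speeds_squared(Fraction(1, 4), Fraction(1, 2))
hoop = critical_speeds_squared(Fraction(1, 2), Fraction(1))
print("uniform disk:", disk)
print("hoop:        ", hoop)
```

Under these assumptions the uniform disk gives $\zeta_{\mathrm{crit}}^2 = g/(3r)$ and $\eta_{\mathrm{crit}}^2 = 4g/(5r)$, and the hoop gives $g/(4r)$ and $2g/(3r)$.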

7.8.3 Normal form of the potential

In order to describe the qualitative properties of the degeneracy locus $\Delta$ in a neighborhood of the four points $(\overline{\sigma}_3,\overline{\sigma}_4)$ given in (132) and (133), we need to describe the qualitative properties of the family of rescaled potential functions

$$W : (X,Y) \mapsto W_{X,Y} \tag{134}$$

when the parameters $(X,Y)$ are close to $(0,Y_0)$ or $(X_0,0)$. To do this we use the theory of singularities of mappings, see [8] or [40]. The key concept in this theory is the notion of equivalence of two analytic families $f$ and $g$ of analytic functions. By saying that $f$ is an analytic family of analytic functions we mean that $f : \mathbb{R}^n\times\mathbb{R}^k \to \mathbb{R} : (x,\alpha) \mapsto f(x,\alpha)$ is an analytic function near $(x_0,\alpha_0)$, which we think of as an $n$-parameter analytic family of analytic functions of $k$ variables given by $x \mapsto f_x$. Here $f_x$ is the analytic function on $\mathbb{R}^k$ defined by holding the variable $x$ in the function $f$ fixed, that is, $\alpha \mapsto f_x(\alpha) = f(x,\alpha)$. Also $x$ varies in a neighborhood of $x_0$ with $f_x$ being defined in a neighborhood of $\alpha_0$. Let $g : \mathbb{R}^n\times\mathbb{R}^k \to \mathbb{R} : (y,\beta) \mapsto g(y,\beta)$ define an analytic family near $(y_0,\beta_0)$. Informally we say that the analytic families $f$ and $g$ are equivalent if for each $y$ near $y_0$ the function $\beta \mapsto g(y,\beta)$, defined for $\beta$ near $\beta_0$, can be transformed into the function $\alpha \mapsto f(x,\alpha)$, defined for $\alpha$ near $\alpha_0$, plus a constant $h(x)$ for $x$ near $x_0$ by a substitution of variables of the form $(x,\alpha) \mapsto (y(x),\beta(x,\alpha))$, which is defined near $(x_0,\alpha_0)$. More precisely, we say that two analytic families $f$ and $g$ given above are equivalent if there is a local analytic diffeomorphism $H : \mathbb{R}^n\times\mathbb{R}^k \to \mathbb{R}^n\times\mathbb{R}^k : (x,\alpha) \mapsto (y(x),\beta(x,\alpha))$ such that $H(x_0,\alpha_0) = (y_0,\beta_0)$ and

$$g(y(x),\beta(x,\alpha)) = f(x,\alpha) + h(x) \tag{135}$$

in some open neighborhood of $(x_0,\alpha_0)$ in $\mathbb{R}^n\times\mathbb{R}^k$. Here $h : \mathbb{R}^n \to \mathbb{R}$ is an analytic function near $x_0$.


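The following proposition is a consequence of results from the theory of singularities of mappings.

Proposition 7.8.3.10. Suppose that we are given a two parameter analytic family $g$ of analytic functions near $(y_0,\beta_0)$ in $\mathbb{R}^2\times\mathbb{R}$. In other words we are given a function

$$g : \mathbb{R}^2\times\mathbb{R} \to \mathbb{R} : (y,\beta) \mapsto g(y,\beta),$$

which is analytic near $(y_0,\beta_0)$. In addition suppose that the function $g$ satisfies the following conditions.

1. There is a positive integer $p$ such that the partial derivatives
$$\frac{\partial g}{\partial\beta} = \frac{\partial^2 g}{\partial\beta^2} = \cdots = \frac{\partial^{p+1} g}{\partial\beta^{p+1}} = 0$$
at $(y_0,\beta_0)$.

2. At $(y_0,\beta_0)$ the partial derivative $\dfrac{\partial^{p+2} g}{\partial\beta^{p+2}}$ is nonzero. If $p$ is even then the sign in equation (136) below is equal to the sign of $\dfrac{\partial^{p+2} g}{\partial\beta^{p+2}}$ at $(y_0,\beta_0)$. Otherwise the plus sign holds.

3. Let $D$ be the $2\times p$ matrix
$$D = \left.\begin{pmatrix} \dfrac{\partial^2 g}{\partial y_1\partial\beta}(y_0,\beta) & \cdots & \dfrac{\partial^{p+1} g}{\partial y_1\partial\beta^p}(y_0,\beta) \\[2ex] \dfrac{\partial^2 g}{\partial y_2\partial\beta}(y_0,\beta) & \cdots & \dfrac{\partial^{p+1} g}{\partial y_2\partial\beta^p}(y_0,\beta) \end{pmatrix}\right|_{\beta=\beta_0}.$$
We suppose that $\operatorname{rank} D = 2$.

Then the analytic family $g$ near $(y_0,\beta_0)$ is equivalent to the analytic family

$$f_{p+2} : \mathbb{R}^p\times\mathbb{R} \to \mathbb{R} : ((x_1,\ldots,x_p),\alpha) \mapsto \pm\alpha^{p+2} + x_1\alpha + x_2\alpha^2 + \cdots + x_p\alpha^p \tag{136}$$

near $(0,0)$.

Proof. See [8]. The family (136) is the normal form of the $A_{p+1}$-singularity.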

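We now show that we can apply proposition 7.8.3.10 to the family $W$ of rescaled potential functions (134) near $((X,Y),z) = ((0,Y_0),0)$ with $p = 2$. From (126) and (127)–(131) it follows that at $((0,Y_0),0)$ the partial derivatives

$$\frac{\partial W_{X,Y}}{\partial z} = \frac{\partial^2 W_{X,Y}}{\partial z^2} = \frac{\partial^3 W_{X,Y}}{\partial z^3} = 0$$

and $\dfrac{\partial^4 W_{X,Y}}{\partial z^4} > 0$. Moreover, differentiating (126) termwise and using (127)–(131), we find that at $((0,Y_0),z)$ we have

$$\frac{\partial^2 W_{X,Y}}{\partial X\partial z} = (c+d)\,Y_0 + O(z), \qquad \frac{\partial^3 W_{X,Y}}{\partial X\partial z^2} = O(z),$$

$$\frac{\partial^2 W_{X,Y}}{\partial Y\partial z} = 2(c+d)\,Y_0\,z + O(z^2), \qquad \frac{\partial^3 W_{X,Y}}{\partial Y\partial z^2} = 2(c+d)\,Y_0 + O(z^2).$$

So at $((0,Y_0),0)$ the matrix $D = \begin{pmatrix} (c+d)Y_0 & 0 \\ 0 & 2(c+d)Y_0 \end{pmatrix}$. But then $\det D = 2(c+d)^2 Y_0^2 = 2(c+d) > 0$. Therefore $D$ has rank equal to 2. Thus all the hypotheses of proposition 7.8.3.10 with $p = 2$ are satisfied. Hence the analytic family $W$ of rescaled potential functions near $(0,Y_0)$ is equivalent to the normal form family $f_4$ (136) near $(0,0)$ for the $A_3$-singularity.

Corollary 7.8.3.11. The derivative of the local diffeomorphism

$$\mathbb{R}^2 \to \mathbb{R}^2 : (x_1,x_2) \mapsto \big(X(x_1,x_2),\,Y(x_1,x_2)\big), \tag{137}$$

which is defined by the equivalence between the normal form family near $(0,0)$ and the family $W$ of rescaled potential functions near $(0,Y_0)$, maps the standard basis vector $e_2$ into a positive multiple of $Y_0\,e_2$.

Proof. Taking the partial derivative $\dfrac{\partial^2}{\partial x_2\partial\alpha}$ on both sides of the equation

$$W\big((X(x_1,x_2),\,Y(x_1,x_2)),\; z((x_1,x_2),\alpha)\big) = f((x_1,x_2),\alpha) + h(x_1,x_2), \tag{138}$$

which defines the equivalence between the families $f$ and $W$, gives

$$\frac{\partial^2 f}{\partial x_2\partial\alpha} = \frac{\partial^2 W}{\partial z\partial X}\frac{\partial X}{\partial x_2}\frac{\partial z}{\partial\alpha} + \frac{\partial^2 W}{\partial z\partial Y}\frac{\partial Y}{\partial x_2}\frac{\partial z}{\partial\alpha} + \frac{\partial^2 W}{\partial z^2}\frac{\partial z}{\partial x_2}\frac{\partial z}{\partial\alpha} + \frac{\partial W}{\partial z}\frac{\partial^2 z}{\partial x_2\partial\alpha}.$$

Evaluating both sides of the above equation at $(x,\alpha) = (0,0)$ and $((X,Y),z) = ((0,Y_0),0)$ gives

$$0 = \frac{\partial^2 W}{\partial z\partial X}\frac{\partial X}{\partial x_2}\frac{\partial z}{\partial\alpha}, \tag{139}$$

because $\dfrac{\partial^2 f}{\partial x_2\partial\alpha} = 0$ at $(0,0)$ and $\dfrac{\partial^2 W}{\partial z\partial Y} = \dfrac{\partial^2 W}{\partial z^2} = \dfrac{\partial W}{\partial z} = 0$ at $((0,Y_0),0)$. Since $\dfrac{\partial^2 W}{\partial z\partial X} = (c+d)Y_0 \ne 0$ at $((0,Y_0),0)$ and $\dfrac{\partial z}{\partial\alpha} \ne 0$ at $(0,0)$ by hypothesis, from (139) we obtain $\dfrac{\partial X}{\partial x_2} = 0$ at $(0,0)$. Taking the partial derivative $\dfrac{\partial^3}{\partial x_2\partial\alpha^2}$ on both sides of (138) and evaluating at the same point gives

$$\frac{\partial^3 W}{\partial Y\partial z^2}\,\frac{\partial Y}{\partial x_2}\left(\frac{\partial z}{\partial\alpha}\right)^2 = 2. \tag{140}$$

Since $\dfrac{\partial^3 W}{\partial Y\partial z^2} = 2(c+d)Y_0$ at $((0,Y_0),0)$, equation (140) implies that $\dfrac{\partial Y}{\partial x_2}$ at $(0,0)$ has the same sign as $Y_0$. Consequently, the derivative of the map (137) at $(0,0)$ has the form

$$\begin{pmatrix} \dfrac{\partial X}{\partial x_1} & 0 \\[2ex] \dfrac{\partial Y}{\partial x_1} & \dfrac{\partial Y}{\partial x_2} \end{pmatrix}.$$

Therefore it maps the standard basis vector $e_2$ into a positive multiple of $Y_0\,e_2$.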


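Taking the partial derivative of the equivalence equation (138) with respect to $x_2$ and evaluating at $(0,0)$, we get $Y_0\,\dfrac{\partial Y}{\partial x_2} = \dfrac{\partial h}{\partial x_2}$. Using corollary 7.8.3.11, which states that $\dfrac{\partial Y}{\partial x_2}$ at $(0,0)$ has the same sign as $Y_0$, it follows that $\dfrac{\partial h}{\partial x_2} > 0$ at $(0,0)$. Therefore the function $h$ cannot be eliminated from the normal form family by adjusting the equivalence change of variables.

For the parameter values $(X,Y) = (X_0,0)$, differentiating (126) termwise and using (127)–(131) gives

$$\frac{\partial^2 W_{X,Y}}{\partial X\partial z} = 2(d + cd + c^2)\,X_0\,z + O(z^2), \qquad \frac{\partial^2 W_{X,Y}}{\partial Y\partial z} = (c+d)\,X_0 + O(z^2),$$

$$\frac{\partial^3 W_{X,Y}}{\partial X\partial z^2} = 2(d + cd + c^2)\,X_0 + O(z), \qquad \frac{\partial^3 W_{X,Y}}{\partial Y\partial z^2} = 2(c + 3d + 2cd + 2c^2)\,X_0\,z + O(z^2).$$

This shows that at $z = 0$ the matrix $D$ in hypothesis 3 of proposition 7.8.3.10 is $\begin{pmatrix} 0 & 2(d+cd+c^2)X_0 \\ (c+d)X_0 & 0 \end{pmatrix}$. Thus $D$ has rank 2, since its determinant is nonzero. Moreover,

$$\frac{\partial W_{X,Y}}{\partial z} = \frac{\partial^2 W_{X,Y}}{\partial z^2} = \frac{\partial^3 W_{X,Y}}{\partial z^3} = 0$$

at $((X_0,0),0)$ and

$$\frac{\partial^4 W_{X,Y}}{\partial z^4} = 24\,w_4(X_0,0) = (9d + 5c^2 + 11cd + 4c^2 d + 4c^3)(d + cd + c^2)^{-1} > 0.$$

Therefore the hypotheses of proposition 7.8.3.10 hold and the family $W$ of rescaled potential functions at $(X_0,0)$ is equivalent to the normal form family $f_4$ (136) for the $A_3$-singularity. The following proposition summarizes our results about the normal form of the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ (83).

Proposition 7.8.3.12. The potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ on $(-1,1)$ has a degenerate critical point at $\sigma_1 = 0$ if and only if $(\overline{\sigma}_3,\overline{\sigma}_4) = (\overline{\sigma}_3^{\,*},\overline{\sigma}_4^{\,*})$, where $(\overline{\sigma}_3^{\,*},\overline{\sigma}_4^{\,*})$ is one of the points $(\overline{\sigma}_3,0)$ of (133) or $(0,\overline{\sigma}_4)$ of (132). Moreover, there is

1. an open neighborhood $U\times A \subseteq \mathbb{R}^2\times\mathbb{R}$ of $((0,0),0)$;
2. an analytic substitution of variables $A \to (-1,1) : \alpha \mapsto \sigma_1(x,\alpha)$, depending analytically on $x \in U$, which maps $(0,0)$ to $0$;
3. a local analytic diffeomorphism $\mathbb{R}^2 \to \mathbb{R}^2 : x \mapsto (\overline{\sigma}_3(x),\overline{\sigma}_4(x))$, which maps $(0,0)$ to $(\overline{\sigma}_3^{\,*},\overline{\sigma}_4^{\,*})$,

such that for every $(x,\alpha) \in U\times A$

$$V_{\overline{\sigma}_3(x),\overline{\sigma}_4(x)}(\sigma_1(x,\alpha)) = \alpha^4 + x_1\alpha + x_2\alpha^2 + h(x), \tag{141}$$

where $h : U \subseteq \mathbb{R}^2 \to \mathbb{R}$ is an analytic function on $U$.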

7.8.4 Cusps of the degeneracy locus

In this subsection we discuss the geometry of the normal form family

$$f : \mathbb{R}^2\times\mathbb{R} \to \mathbb{R} : ((x_1,x_2),\alpha) \mapsto \alpha^4 + x_1\alpha + x_2\alpha^2 + h(x_1,x_2) \tag{142}$$

for the $A_3$ singularity. The function $f_{(x_1,x_2)}$ has a degenerate critical point if and only if

$$f'_{(x_1,x_2)}(\alpha) = 4\alpha^3 + x_1 + 2x_2\alpha = 0 \tag{143}$$

and

$$f''_{(x_1,x_2)}(\alpha) = 12\alpha^2 + 2x_2 = 0. \tag{144}$$

Equations (143) and (144) are equivalent to

$$x_1 = 8\alpha^3 \quad\text{and}\quad x_2 = -6\alpha^2. \tag{145}$$

Equations (145) parametrize the standard cusp

$$\Gamma_0 = \{(x_1,x_2) \in \mathbb{R}^2 \mid 27x_1^2 + 8x_2^3 = 0\} \tag{146}$$

in the $(x_1,x_2)$-plane. The curve $\Gamma_0$, which is contained in the half plane $\{x_2 \le 0\}$, has the origin as its cusp point with branches which straddle the $x_2$-axis, and the cusp points vertically upward. The branches of the cusp have contact of order $\tfrac32$ with the $x_2$-axis at the origin.

The image $\Gamma$ of the standard cusp $\Gamma_0$ under the local diffeomorphism $(x_1,x_2) \mapsto \big(X(x_1,x_2),\,Y(x_1,x_2)\big)$, which maps $(0,0)$ to $(0,Y_0)$ (or $(X_0,0)$), is equal to the set of parameter values $(X,Y)$ near $(0,Y_0)$ (or $(X_0,0)$) where the rescaled potential $W_{X,Y}$ has a degenerate critical point near $z = 0$. It follows from proposition 7.8.3.10 and corollary 7.8.3.11 that $\Gamma$ is an analytic nondegenerate cusp with cusp point at $(0,Y_0)$ (or $(X_0,0)$) pointing away from the origin in parameter space. The image of the $x_2$-axis under the equivalence diffeomorphism is a smooth curve $\gamma$, which is tangent to the $Y$-axis at $(0,Y_0)$ (or the $X$-axis at $(X_0,0)$). Therefore $\gamma$ has contact of order at least 2 with the $Y$ (or $X$) axis at the cusp point. But $\Gamma$ has contact of order $\tfrac32$ there, which is smaller than 2. Therefore the branches of the cusp $\Gamma$ straddle the $Y$ (or $X$) axis. The following proposition summarizes what we have determined so far about the degeneracy locus $\Delta$ of the potential function $V_{\overline{\sigma}_3,\overline{\sigma}_4}$.

Proposition 7.8.4.13. The normal form (142) shows that the part of the degeneracy locus $\Delta$, which comes from the local behavior of the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ at $\sigma_1 = 0$ with the parameters $(\overline{\sigma}_3,\overline{\sigma}_4)$ near $(\overline{\sigma}_3^{\,*},\overline{\sigma}_4^{\,*})$, is a nondegenerate analytic cusp $\Gamma$, which points away from the origin of parameter space, has its cusp point at $(\overline{\sigma}_3^{\,*},\overline{\sigma}_4^{\,*})$, and has branches which straddle the relevant coordinate axis. Thus the degeneracy locus $\Delta$ has four cusps.
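The geometry of the standard cusp can be explored numerically. The following sketch is an illustration under stated assumptions (function names, grid bounds and resolution are hypothetical choices): it checks that the parametrization $x_1 = 8\alpha^3$, $x_2 = -6\alpha^2$ lies on $27x_1^2 + 8x_2^3 = 0$, and counts the critical points of $\alpha \mapsto \alpha^4 + x_1\alpha + x_2\alpha^2$ on either side of the cusp by counting sign changes of its derivative on a grid.

```python
def degenerate_point(alpha):
    """Parameters (x1, x2) at which alpha is a degenerate critical point."""
    return 8.0 * alpha**3, -6.0 * alpha**2

def cusp_function(x1, x2):
    """Left-hand side of the cusp equation 27 x1^2 + 8 x2^3 = 0."""
    return 27.0 * x1**2 + 8.0 * x2**3

def count_critical_points(x1, x2, lo=-10.0, hi=10.0, steps=200001):
    """Count sign changes of f'(alpha) = 4 alpha^3 + 2 x2 alpha + x1 on a grid."""
    def df(a):
        return 4.0 * a**3 + 2.0 * x2 * a + x1
    count, prev = 0, df(lo)
    for i in range(1, steps):
        cur = df(lo + (hi - lo) * i / (steps - 1))
        if prev * cur < 0.0:
            count += 1
        if cur != 0.0:          # skip exact zeros so the next sample detects the change
            prev = cur
    return count

x1, x2 = degenerate_point(0.5)
print("on the cusp:", cusp_function(x1, x2))
print("outside (27 x1^2 + 8 x2^3 > 0):", count_critical_points(0.1, 1.0))
print("inside  (27 x1^2 + 8 x2^3 < 0):", count_critical_points(0.1, -1.0))
```

The count is one critical point where $27x_1^2 + 8x_2^3 > 0$ and three where it is negative, which is the standard picture for the cubic $f'$.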


The function $f_x$ with $f$ the family given by (142) has one critical point, which is a nondegenerate minimum if $27x_1^2 + 8x_2^3 > 0$, that is, if $x$ is in the complement of the region bounded by $\Gamma_0$ (146) and containing the negative $x_2$ semiaxis. If $27x_1^2 + 8x_2^3 < 0$, that is, $x$ lies in the region bounded by $\Gamma_0$ containing the negative $x_2$ semiaxis, then $f_x$ has three critical points: two nondegenerate local minima with a local maximum in between. If $x \in \Gamma_0$ but $x \ne (0,0)$, then one of the local minima of $f_x$ has merged with the local maximum into a third order degenerate critical point; whereas the other local minimum survives as a nondegenerate critical point. Near the third order critical point the family $f$ (142) is equivalent to the family

$$\mathbb{R}^2\times\mathbb{R} \to \mathbb{R} : ((x_1,x_2),\alpha) \mapsto \alpha^3 + x_1\alpha + x_2\alpha^2,$$

which is the $A_2$ singularity. This follows by checking that the hypotheses of proposition 7.8.3.10 hold for $p = 1$. Using the local analytic diffeomorphism $x \mapsto (\overline{\sigma}_3(x),\overline{\sigma}_4(x))$ of proposition 7.8.3.12, we see that near $\sigma_1 = 0$, the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ has one critical point, which is a nondegenerate local minimum, if $(\overline{\sigma}_3,\overline{\sigma}_4)$ lies near one of the four cusp points of the degeneracy locus $\Delta$ but neither on the cusp $\Gamma$ nor in the region bounded by the branches of the cusp which contains a coordinate axis. If $(\overline{\sigma}_3,\overline{\sigma}_4)$ lies inside the region bounded by the cusp, but is near the cusp point, then $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ has three critical points $\sigma_1$, two of which are nondegenerate minima with a nondegenerate maximum in between. If $(\overline{\sigma}_3,\overline{\sigma}_4)$ is near a cusp point and lies on the cusp curve $\Gamma$ less its cusp point, then one of the local minima has merged with the local maximum into a third order degenerate critical point. During this merger the other local minimum survives. Thus we have proved

Proposition 7.8.4.14. On the degeneracy locus $\Delta$ near, but not equal to, a cusp point, the family $(\overline{\sigma}_3,\overline{\sigma}_4) \mapsto V_{\overline{\sigma}_3,\overline{\sigma}_4}$ has an $A_2$-singularity.

7.9 The global geometry of the degeneracy locus

In this section we give a global description of the degeneracy locus $\Delta$. Recall that $\Delta$ is the set of parameter values $(\overline{\sigma}_3,\overline{\sigma}_4)$ for which the potential $V_{\overline{\sigma}_3,\overline{\sigma}_4}$ has a degenerate critical point in $(-1,1)$.

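7.9.1 The circle of degenerate critical points

Let $\widetilde{\Delta}$ be the set of points $(\sigma_1,\overline{\sigma}_3,\overline{\sigma}_4) \in (-1,1)\times\mathbb{R}^2$ such that

$$\text{the potential } V_{\overline{\sigma}_3,\overline{\sigma}_4} \text{ has a degenerate critical point at } \sigma_1. \tag{147}$$

Then the degeneracy locus $\Delta$ is the image of $\widetilde{\Delta}$ under the projection map $\pi_2 : (-1,1)\times\mathbb{R}^2 \to \mathbb{R}^2 : (\sigma_1,\overline{\sigma}_3,\overline{\sigma}_4) \mapsto (\overline{\sigma}_3,\overline{\sigma}_4)$, see (80). The goal of this subsection is to prove

Proposition 7.9.1.15. The locus $\widetilde{\Delta}$ is an analytically embedded circle, for which the set $\widetilde{\Delta}\cap\{\sigma_1 = 0\}$ is four points and the connected components $I$ of $\widetilde{\Delta}\cap\{\sigma_1 \ne 0\}$ are mapped consecutively onto each other by the sequence of reflections

$$(\sigma_1,\overline{\sigma}_3,\overline{\sigma}_4) \mapsto (-\sigma_1,\overline{\sigma}_3,-\overline{\sigma}_4) \mapsto (\sigma_1,-\overline{\sigma}_3,-\overline{\sigma}_4) \mapsto (-\sigma_1,-\overline{\sigma}_3,\overline{\sigma}_4).$$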

We make some preliminary observations. Recall the partial differential operator $C = \dfrac{\partial}{\partial z} + y\,\dfrac{\partial}{\partial x} + \dfrac{c\,x}{1-z^2}\,\dfrac{\partial}{\partial y}$ (96). The equations defining the locus of degenerate critical points of the rescaled potential $W_{X,Y}$ (95) are given by $C\widetilde{W} = C^2\widetilde{W} = 0$. Here $\widetilde{W}$ is the function of $(z,x,y)$ given by equation (94). A straightforward calculation shows that

$$C\widetilde{W} = d\,\frac{z}{(1-z^2)^2}\,x^2 + (c+d)\,\frac{1}{1-z^2}\,xy - \frac{z}{\sqrt{1-z^2}} \tag{148}$$

and

$$C^2\widetilde{W} = \left(4d\,\frac{z^2}{(1-z^2)^3} + (d+cd+c^2)\,\frac{1}{(1-z^2)^2}\right)x^2 + 2(c+2d)\,\frac{z}{(1-z^2)^2}\,xy + (c+d)\,\frac{1}{1-z^2}\,y^2 - \frac{1}{(1-z^2)^{3/2}}. \tag{149}$$

First we look at a special solution of $C\widetilde{W} = C^2\widetilde{W} = 0$.

Lemma 7.9.1.16. Near $x = z = 0$ there are functions $z = \varphi_\pm(x)$, which are analytic near $0$ with $\varphi_\pm(0) = 0$, $\varphi'_\pm(0) = \pm(c+d)^{1/2}$, such that

$$C\widetilde{W}\big(\varphi_\pm(x),\,x,\,\pm(c+d)^{-1/2}\big) = C^2\widetilde{W}\big(\varphi_\pm(x),\,x,\,\pm(c+d)^{-1/2}\big) = 0.$$

Proof. When $x = z = 0$ holds, the equation $C^2\widetilde{W} = 0$ (149) gives $y = \pm(c+d)^{-1/2}$. At the point $(z,x,y) = \left(0,0,\pm\frac{1}{\sqrt{c+d}}\right)$ we see that

$$\operatorname{grad} C\widetilde{W} = \big(-1,\,\pm(c+d)^{1/2},\,0\big) \quad\text{and}\quad \operatorname{grad} C^2\widetilde{W} = \big(0,\,0,\,\pm 2(c+d)^{1/2}\big)$$

are linearly independent. Therefore from the implicit function theorem, it follows that near each of the points $p_\pm = (0,0,\pm(c+d)^{-1/2})$, the set of solutions of $C\widetilde{W} = C^2\widetilde{W} = 0$ is a smooth analytic curve, whose tangent space at $p_\pm$ is $\operatorname{span}\{(\pm(c+d)^{1/2},\,1,\,0)\}$. Under the projection map $(z,x,y) \mapsto (z,x)$ the image of these curves is the graph of two analytic functions $z = \varphi_\pm(x)$ such that $\varphi_\pm(0) = 0$ and $\varphi'_\pm(0) = \pm(c+d)^{1/2}$.

In general, solving $C\widetilde{W} = 0$ (148) for $y$ as a function of $z$ and $x$ and then substituting the result into (149) gives

$$C^2\widetilde{W} = \left(4d\,\frac{z^2}{(1-z^2)^3} + (d+cd+c^2)\,\frac{1}{(1-z^2)^2} - \frac{d(2c+3d)}{c+d}\,\frac{z^2}{(1-z^2)^3}\right)x^2 + \frac{2z^2-1}{(1-z^2)^{3/2}} + \frac{1}{c+d}\,\frac{z^2}{x^2}. \tag{150}$$

Multiplying (150) by $(c+d)(1-z^2)^{3/2}x^2$ and then replacing $x$ by the expression $(1-z^2)^{3/4}\,\widetilde{x}$, the equation $C^2\widetilde{W} = 0$ becomes, after removing the common nonzero factor $(1-z^2)^{3/2}$,

$$c_2(z)\,(\widetilde{x}^{\,2})^2 + c_1(z)\,\widetilde{x}^{\,2} + z^2 = 0, \tag{151}$$

where

$$c_1(z) = (c+d)(2z^2-1) \quad\text{and}\quad c_2(z) = c\,\big(d-(c+d)^2\big)\,z^2 + (c+d)(d+cd+c^2).$$

We now analyze (151). From the equation above we see that $c_2$ is an inhomogeneous linear function $\lambda$ of $z^2$ with $\lambda(0) = (c+d)(d+cd+c^2) > 0$ and $\lambda(1) = d(2c+d) > 0$. Consequently, $\lambda(z^2) > 0$ when $z^2 \in [0,1]$. Therefore $c_2(z) > 0$ when $z \in [-1,1]$. So (151) holds for $\widetilde{x} \ne 0$ and $z \in (-1,1)$ only if $c_1(z) < 0$. From the defining equation for $c_1(z)$ this implies that $z^2 < \tfrac12$. As equation (151) is quadratic in $\widetilde{x}^{\,2}$, it has complex solutions $\widetilde{x}^{\,2} = y_\pm(z)$ given by

$$y_\pm(z) = \frac{-c_1(z) \pm \sqrt{\operatorname{discr}(z)}}{2c_2(z)}, \tag{152}$$

where

$$\operatorname{discr}(z) = c_1^2(z) - 4z^2 c_2(z) = 4\big((1+c)(c+d)^2 - cd\big)z^4 - 4(c+d)(c+2d+cd+c^2)\,z^2 + (c+d)^2.$$

Let $\delta$ be the function defined by $\operatorname{discr}(z) = \delta(z^2)$. Then $\delta(0) = (c+d)^2 > 0$ and $\delta(\tfrac12) = -2c_2\big(\tfrac{1}{\sqrt2}\big) = -2\lambda(\tfrac12) < 0$. Since $\delta$ is a quadratic polynomial with positive leading coefficient, there is a unique $\zeta \in (0,\tfrac12)$ such that $\delta(\zeta) = 0$. Let $\widetilde{z} = \sqrt{\zeta}$. Then $\operatorname{discr}(z) > 0$ when $z \in (-\widetilde{z},\widetilde{z})$, $\operatorname{discr}(\pm\widetilde{z}) = 0$, and $\operatorname{discr}(z) < 0$ when $\widetilde{z}^{\,2} < z^2 < \tfrac12$. Suppose that $z \in (-\widetilde{z},\widetilde{z})$; then $c_1(z) < 0$, $c_2(z) > 0$, and $0 < \operatorname{discr}(z) \le c_1(z)^2$.

These conditions imply that the functions $y_\pm(z)$ defined by (152) are nonnegative and depend analytically on $z \in (-\widetilde{z},\widetilde{z})$. Because $c_1(z)$, $c_2(z)$, and $\operatorname{discr}(z)$ are polynomials in $z^2$, it follows that $y_\pm(z) = y_\pm(-z)$. Also $0 < y_-(z) < y_+(z)$ when $0 < |z| < \widetilde{z}$. For $z \in (-\widetilde{z},\widetilde{z})$ the function $y_-(z)$ equals $0$ only when $z = 0$. Thus we have proved

Lemma 7.9.1.17. When $z^2 \in (0,\widetilde{z}^{\,2})$, equation (151) has four real solutions $\widetilde{x} = \pm\sqrt{y_+(z)}$ and $\widetilde{x} = \pm\sqrt{y_-(z)}$, which depend analytically on $z$. The solutions $\widetilde{x} = \pm\sqrt{y_+(z)}$ extend to nonzero analytic functions on $(-\widetilde{z},\widetilde{z})$; while the solutions $\widetilde{x} = \pm\sqrt{y_-(z)}$ converge to $0$ as $z \to 0$. When either $z \uparrow \widetilde{z}$ or $z \downarrow -\widetilde{z}$, we have $\operatorname{discr}(z) \downarrow 0$. There the solution $\widetilde{x} = \pm\sqrt{y_-(z)}$ merges with the solution $\widetilde{x} = \pm\sqrt{y_+(z)}$.
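The quadratic $c_2(z)\,y^2 + c_1(z)\,y + z^2 = 0$ in $y = \widetilde{x}^{\,2}$ and its discriminant can be checked numerically. The sketch below is an illustration with hypothetical function names. The coefficients used inside `z_tilde` are those obtained by expanding $c_1^2 - 4z^2c_2$ with $c_1 = (c+d)(2z^2-1)$ and $c_2 = c(d-(c+d)^2)z^2 + (c+d)(d+cd+c^2)$, namely $4((1+c)(c+d)^2 - cd)$ for $z^4$, $-4(c+d)(c+2d+cd+c^2)$ for $z^2$, and $(c+d)^2$ for the constant term. The script confirms that the resulting $\widetilde{z}$ is a zero of the directly evaluated discriminant, and that both roots $y_\pm$ also satisfy the equivalent form $A\,z^2 = y\,B$ with $A = c(d-(c+d)^2)y^2 + 2(c+d)y + 1$ and $B = (c+d)(1-(d+cd+c^2)y)$.

```python
import math

def c1(z, c, d):
    return (c + d) * (2.0 * z * z - 1.0)

def c2(z, c, d):
    return c * (d - (c + d) ** 2) * z * z + (c + d) * (d + c * d + c * c)

def discr(z, c, d):
    return c1(z, c, d) ** 2 - 4.0 * z * z * c2(z, c, d)

def z_tilde(c, d):
    """Positive zero of discr, from the quadratic delta(zeta) in zeta = z^2."""
    a = 4.0 * ((1.0 + c) * (c + d) ** 2 - c * d)
    b = -4.0 * (c + d) * (c + 2.0 * d + c * d + c * c)
    e = (c + d) ** 2
    zeta = (-b - math.sqrt(b * b - 4.0 * a * e)) / (2.0 * a)  # root in (0, 1/2)
    return math.sqrt(zeta)

def y_plus_minus(z, c, d):
    """The two roots y_+ >= y_- of c2 y^2 + c1 y + z^2 = 0."""
    root = math.sqrt(discr(z, c, d))
    return ((-c1(z, c, d) + root) / (2.0 * c2(z, c, d)),
            (-c1(z, c, d) - root) / (2.0 * c2(z, c, d)))

c, d = 4.0 / 3.0, 2.0 / 3.0          # uniform disk values quoted in the text
zt = z_tilde(c, d)
z = 0.5 * zt
print("z~ =", zt, " discr(z~) =", discr(zt, c, d))
for y in y_plus_minus(z, c, d):
    A = c * (d - (c + d) ** 2) * y * y + 2.0 * (c + d) * y + 1.0   # y stands for x~^2
    B = (c + d) * (1.0 - (d + c * d + c * c) * y)
    print(y, c2(z, c, d) * y * y + c1(z, c, d) * y + z * z, A * z * z - y * B)
```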

An immediate consequence of the above lemma is

Corollary 7.9.1.18. The set of solutions $(z, \tilde x)$ of (151) in the strip $(-1, 1) \times \mathbb{R}$ is connected and compact.

Proof. This follows because this solution set is a closed subset of (−1, 1)×R which stays away from {±1} × R. We now give a

Proof of proposition 7.9.1.15. First we rewrite equation (151) as
\[ A(\tilde x)\, z^2 = \tilde x^2 B(\tilde x), \tag{153} \]
where
\[ A(\tilde x) = c\,(d-(c+d)^2)\,\tilde x^4 + 2(c+d)\,\tilde x^2 + 1 \tag{154} \]
and
\[ B(\tilde x) = (c+d)\big(1 - (d+cd+c^2)\,\tilde x^2\big). \tag{155} \]
This leads to two solutions
\[ z = \pm w(\tilde x) = \pm\,\tilde x \sqrt{\frac{B(\tilde x)}{A(\tilde x)}}, \]
which depend analytically on $\tilde x$ as long as $A(\tilde x) > 0$ and $\frac{B(\tilde x)}{A(\tilde x)} > 0$. Because $A(0) = 1 > 0$ and $B(0) = c + d > 0$, there is a maximal open interval $J$ containing 0 such that $A(\tilde x) > 0$ and $B(\tilde x) > 0$. Since $z = 0$ when $\tilde x = 0$, it follows that $\phi_\pm(x) = \pm w(\tilde x)$, where $x = (1-z^2)^{3/4}\tilde x$. Here $\phi_\pm$ are the functions given by lemma 7.9.1.16.
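That (153) is only a regrouping of (151) can be confirmed with exact rational arithmetic. The sketch below is illustrative: it works with the variable $s = \tilde x^2$, the sample points are arbitrary, and the function names are hypothetical. It also checks the closed form for the value of $A$ at $\tilde x^2 = (d+cd+c^2)^{-1}$ that is used in the continuation of the proof.

```python
from fractions import Fraction as F

def c1(c, d, z):
    return (c + d) * (2 * z**2 - 1)

def c2(c, d, z):
    return c * (d - (c + d)**2) * z**2 + (c + d) * (d + c * d + c**2)

def A(c, d, s):
    # A written in terms of s = x_tilde^2.
    return c * (d - (c + d)**2) * s**2 + 2 * (c + d) * s + 1

def B(c, d, s):
    return (c + d) * (1 - (d + c * d + c**2) * s)

def lhs_151(c, d, z, s):
    return c2(c, d, z) * s**2 + c1(c, d, z) * s + z**2

def lhs_153(c, d, z, s):
    return A(c, d, s) * z**2 - s * B(c, d, s)

# Arbitrary rational sample points (c, d, z, s); no special meaning intended.
samples = [
    (F(4, 3), F(2, 3), F(1, 5), F(3, 7)),
    (F(1, 2), F(1, 3), F(-2, 3), F(5, 4)),
    (F(1), F(1), F(1, 9), F(2, 5)),
]
identity_holds = all(lhs_151(*p) == lhs_153(*p) for p in samples)

def endpoint_value_matches(c, d):
    """Compare A at s = 1/q with (cd + 2d(c+d) + c(c+d)^2 + q^2) / q^2."""
    q = d + c * d + c**2
    closed_form = (c * d + 2 * d * (c + d) + c * (c + d)**2 + q**2) / q**2
    return A(c, d, 1 / q) == closed_form

endpoint_ok = all(endpoint_value_matches(c, d) for c, d, _, _ in samples)
```

Exact fractions are used so that equality can be tested without a tolerance.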


Now $B(\tilde x) > 0$ when $|\tilde x| < (d+cd+c^2)^{-1/2}$. Consequently, $A(\tilde x) > 0$ when $|\tilde x| < (d+cd+c^2)^{-1/2}$ because for solutions $(z, \tilde x) \in (-1,1)\times\mathbb{R}$ of equation (151), the value of $z$ stays away from $\pm1$. Therefore $|z| < 1$ in equation (153). So
\[ |A(\tilde x)| > |A(\tilde x)\,z^2| = \tilde x^2 B(\tilde x). \]
This implies that $A$ is positive when $B$ is. On the other hand, the value of $A$ at $(d+cd+c^2)^{-1/2}$ equals
\[ \frac{cd + 2d(c+d) + c(c+d)^2 + (d+cd+c^2)^2}{(d+cd+c^2)^2}, \]
which is positive. Therefore the set of solutions $(z, \tilde x) \in (-1,1)\times\mathbb{R}$ of (153) is connected. Let $J$ be the interval $|\tilde x| < (d+cd+c^2)^{-1/2}$. Then $(\pm w(\tilde x), \tilde x) \in (-1,1)\times\mathbb{R}$ satisfies equation (151) if and only if $\tilde x \in J$ and $z = \pm w(\tilde x)$. Consequently, the solution set of (151) in the $(z, \tilde x)$-strip $(-1,1)\times\mathbb{R}$ is an analytically immersed closed curve, which consists of the following successively overlapping analytically embedded curves: 1) $(w(\tilde x), \tilde x)$ for $\tilde x \in J$; 2) $(z, \sqrt{y_+(z)})$ for $z \in (-\tilde z, \tilde z)$; 3) $(-w(\tilde x), \tilde x)$ for $\tilde x \in J$; 4) $(z, -\sqrt{y_+(z)})$ for $z \in (-\tilde z, \tilde z)$ and finally returning to 1). This analytically immersed curve has possibly one transverse self intersection at $\tilde x = 0$. But in lemma 7.9.1.16 we have shown that the solution set of $C\widetilde W = C^2\widetilde W = 0$ is the union of two disjoint analytic curves, which project to the curves $(\phi_\pm(\tilde x), \tilde x)$ near $\tilde x = 0$. Therefore the analytic curve 1)–4) is embedded. The last statement of proposition 7.9.1.15 follows by observing that the reflections $(z,x,y) \mapsto (-z,x,-y) \mapsto (z,-x,-y) \mapsto (-z,-x,y)$ correspond to $w|J_+ \to -w|J_- \to w|J_- \to -w|J_+$, where $w$ is the curve $\tilde x \mapsto (w(\tilde x), \tilde x)$ and $J_\pm = \{\tilde x \in J \mid \pm\tilde x > 0\}$. This proves proposition 7.9.1.15.

7.9.2 A global description of the degeneracy locus

In this subsection we give a global description of the degeneracy locus $\Delta$. We begin by proving

Proposition 7.9.2.19. If at $\sigma_1^0 \in (-1,1)$, the first, second and third derivatives of $V_{\bar\sigma_3,\bar\sigma_4}$ vanish, then $\sigma_1^0 = 0$ and $(\bar\sigma_3, \bar\sigma_4)$ is one of the four cusp points of $\Delta$ found in proposition 7.8.4.13.

Proof. We need only show that the equations
\[ C\widetilde W = C^2\widetilde W = C^3\widetilde W = 0 \tag{156} \]


imply that $z = 0$. Straightforward calculations show that (156) is equivalent to
\[ 0 = dz\,\tilde x^2 + (c+d)\,\tilde x\tilde y - z \tag{157} \]
\[ 0 = \big[(3d - c(c+d))z^2 + d + cd + c^2\big]\tilde x^2 + 2(c+2d)z\,\tilde x\tilde y + (c+d)\,\tilde y^2 - 1 \tag{158} \]
\[ 0 = 2\big[(6d-4cd-3c^2)z^3 + (6d+4cd+3c^2)z\big]\tilde x^2 + 2\big[(3c+9d-2cd-2c^2)z^2 + (c+3d+2cd+2c^2)\big]\tilde x\tilde y + 2(2c+3d)z\,\tilde y^2 - 3z, \tag{159} \]
where we have replaced $y$ by $(1-z^2)^{-1/4}\tilde y$. Equations (157)–(159) are inhomogeneous linear equations in $\tilde u = \tilde x^2$, $\tilde v = \tilde x\tilde y$, and $\tilde w = \tilde y^2$. Using (157) we can eliminate $\tilde v$ from equations (158) and (159). Using the equation which results from (158), we can eliminate $\tilde w$ from (159). The resulting equation for $\tilde u$ has a common factor of $z$, which we divide out under the hypothesis that $z \neq 0$. This leads to the equation $2c\,D(z)\,\tilde u = U(z)$, where
\[ D(z) = (c+d)(3d + c^2 - d^2)(1-z^2) + (2d^2 + 4cd)\,z^2 \]
and
\[ U(z) = -(c+d)\big[3c + 9d + 4cd + 4c^2\big](1-z^2) - (3d^2 + 8cd + c^2)\,z^2. \]
Clearly, $U(z) < 0$ when $z^2 \le 1$. Moreover, $d^2 = (c+d)d - cd \le 2d - cd$, since $c + d \le 2$. Therefore $3d + c^2 - d^2 \ge d + c^2 + cd > 0$. So $D(z) > 0$ when $z^2 \le 1$. Hence $\tilde u < 0$ when $z^2 \le 1$ and $z \neq 0$. But this contradicts the definition of $\tilde u$, namely, $\tilde u = \tilde x^2 \ge 0$. Therefore equation (156) implies that $z = 0$.

In §9.1 we have shown that the degeneracy locus $\Delta$ in the $(\bar\sigma_3, \bar\sigma_4)$-plane is the image of an analytically embedded circle $\widetilde\Delta$ in $(-1,1)\times\mathbb{R}^2$ under the projection map $\pi : (-1,1)\times\mathbb{R}^2 \to \mathbb{R}^2 : (\sigma_1, \bar\sigma_3, \bar\sigma_4) \mapsto (\bar\sigma_3, \bar\sigma_4)$. The circle $\widetilde\Delta$ intersects the plane $\{\sigma_1 = 0\}$ in exactly four points, which map to the four cusp points $(0, \pm\bar\sigma_4^*)$ and $(\pm\bar\sigma_3^*, 0)$ described in §8.2. Let $\widetilde I$ be a connected component of $\widetilde\Delta \cap \{\sigma_1 \neq 0\}$. Then the restriction of the projection map $\pi$ to $\widetilde I$ is an embedding of $\widetilde I$ onto an analytic curve $I$ in an open quadrant $Q$ of the $(\bar\sigma_3, \bar\sigma_4)$-plane.

Proposition 7.9.2.20. The curve $I$ is transverse to the radial direction and is traced out from a cusp point of $\Delta$ on the half axis adjacent to $Q$ to the cusp point on the other half axis adjacent to $Q$. The other connected components of $\Delta \cap \{\sigma_1 \neq 0\}$ are obtained by applying the reflections $(\bar\sigma_3, \bar\sigma_4) \mapsto (\bar\sigma_3, -\bar\sigma_4) \mapsto (-\bar\sigma_3, -\bar\sigma_4) \mapsto (-\bar\sigma_3, \bar\sigma_4)$ to $I$.

[Figure 7.2: plot of the degeneracy locus, with axes labeled $\bar\sigma_3$ and $\bar\sigma_4$.]

Fig. 7.2 The degeneracy locus $\Delta$ for the uniform disk. Here we have $I_1 = I_2 = \frac14 mr^2$, $I_3 = \frac12 mr^2$, $\frac{g}{r} = \frac54$, $c = \frac43$, and $d = \frac23$. Using (132) and (133), we get the cusp points $\sqrt5\,(\pm1, 0)$ and $(0, \pm\frac12)$.

Thus the degeneracy locus $\Delta$ is formed by four cusp points together with four connected components of $\Delta \cap \{\sigma_1 \neq 0\}$. Therefore $\Delta$ is a continuously embedded circle in the $(\bar\sigma_3, \bar\sigma_4)$-plane, which is an analytic set with four cusp points as its set of singular points.

Proof. Let
\[ G : (-1,1)\times\mathbb{R}^2 \to \mathbb{R}^2 : (z,x,y) \mapsto \big(C\widetilde W(z,x,y),\, C^2\widetilde W(z,x,y)\big). \tag{160} \]
Then the degeneracy locus $\Delta$ is defined by $G(z,x,y) = (0,0)$. In addition to the partial differential operator $C$ coming from Chaplygin's equations (96), we consider the Euler operator $E = x\frac{\partial}{\partial x} + y\frac{\partial}{\partial y}$. Because Chaplygin's equations (76) are linear, the operators $C$ and $E$ commute when $(x,y) \neq (0,0)$. Note that $C^2\widetilde W = 0$ (149) implies that $(x,y) \neq (0,0)$. When $G = (0,0)$ we have
\[ (EG,\, CG) = \begin{pmatrix} EC\widetilde W & C^2\widetilde W \\ EC^2\widetilde W & C^3\widetilde W \end{pmatrix} = \begin{pmatrix} EC\widetilde W & 0 \\ EC^2\widetilde W & C^3\widetilde W \end{pmatrix}, \]


which has determinant equal to $(EC\widetilde W)(C^3\widetilde W)$. From (148) we see that $C\widetilde W$ is equal to the sum of $-z(1-z^2)^{-1/2}$ and a homogeneous polynomial $q$ of degree 2 in $(x,y)$. Therefore
\[ EC\widetilde W = 2q = 2\big(C\widetilde W + z(1-z^2)^{-1/2}\big) = 2z(1-z^2)^{-1/2}, \tag{161} \]
when $G = (0,0)$. Therefore $EC\widetilde W \neq 0$ when $z \neq 0$. From proposition 7.9.2.19 it follows that $C^3\widetilde W \neq 0$ when $G = (0,0)$ and $z \neq 0$. Therefore when $G = (0,0)$ and $z \neq 0$, the derivative of $G$, restricted to the plane $\Pi$ spanned by the vectors $E$ and $C$, is invertible. Consequently, $\{G = (0,0)\} \cap \{z \neq 0\}$ is a smooth one dimensional manifold, whose tangent space $\ker DG$ is complementary to $\Pi$. Now a fiber of the map
\[ \tilde\pi : \mathbb{R}^3 \to \mathbb{R}^2 : (z, x(z;X,Y), y(z;X,Y)) \mapsto (X,Y), \]
where $z \mapsto (x(z;X,Y), y(z;X,Y))$ is an integral curve of the rescaled Chaplygin vector field on $\mathbb{R}^2$ with initial condition $(X,Y)$, is an integral curve of the vector field defined by the operator $C$. Moreover, the map $\tilde\pi$ intertwines $E$ with the Euler vector field in the $(X,Y)$-plane. Therefore $\tilde\pi$ maps the tangent space to $\{G = (0,0)\} \cap \{z \neq 0\}$ to a one dimensional linear subspace, which is complementary to $E$. So $\tilde\pi|(\{G = (0,0)\} \cap \{z \neq 0\})$ is an immersion from $\{G = (0,0)\} \cap \{z \neq 0\}$ to the $(X,Y)$-plane, whose image is transverse to the radial direction.

Let $J$ be a connected component of $\{G = (0,0)\} \cap \{z \neq 0\}$. Then $\tilde\pi|J$ traces out the rescaled degeneracy locus from a cusp point on one of the half axes to another cusp point on the adjacent half axis in such a way that it stays in the complement of the origin of the $(X,Y)$-plane and always has a nonzero angular velocity with respect to the origin. If we continue on to an adjacent component $J'$ of $\{G = (0,0)\} \cap \{z \neq 0\}$, then $\tilde\pi|J'$ still encircles the origin in the same direction, because the branches of the cusps straddle the half axes. Note that $J'$ is the image of $J$ under one of the reflections $(z,x,y) \mapsto (-z,-x,y)$ or $(z,x,y) \mapsto (-z,x,-y)$. Moreover the map $\tilde\pi$ intertwines each of these reflections with the reflections $(X,Y) \mapsto (-X,Y)$ or $(X,Y) \mapsto (X,-Y)$, respectively. If $\tilde\pi|J$ intersects a nonnegative integer number $k$ of half axes, then so does $\tilde\pi|J'$, because this latter map is obtained from the former by a reflection. Therefore $\tilde\pi|(\{G = (0,0)\} \cap \{z \neq 0\})$ intersects $4 + 4k$ half axes in the same direction. Hence the winding number of $\tilde\pi|\{G = (0,0)\}$ about the origin has absolute value $k + 1$. This winding number is equal to the linking number of $\{G = (0,0)\}$ with the $z$-axis in $(-1,1)\times\mathbb{R}^2$ with coordinates


$(z,x,y)$. In turn this linking number is equal to the winding number of $\tilde p|\{G = (0,0)\}$, where
\[ \tilde p : (-1,1)\times\mathbb{R}^2 \to \mathbb{R}^2 : (z,x,y) \mapsto (x,y). \]
If $x = 0$ then $z = 0$, because $C\widetilde W = 0$, see (148). If $y = 0$, $G = (0,0)$, and $z \neq 0$, then from equations (148) and (149) we find that both $d\,x^2$ and $[4d\,z^2 + (d+cd+c^2)(1-z^2)]\,x^2$ are equal to $(1-z^2)^{3/2}$. Therefore $x \neq 0$ and
\[ d = 4d\,z^2 + (d+cd+c^2)(1-z^2) > d\,z^2 + d(1-z^2) = d, \]
which is a contradiction. Therefore $\tilde p|\{G = (0,0)\}$ intersects a half axis only when $z = 0$. Because $\{G = (0,0)\} \cap \{z = 0\}$ is four points, it follows that $\tilde p|\{G = (0,0)\}$ intersects four consecutive half axes. Therefore the absolute value of the winding number of $\tilde p|\{G = (0,0)\}$ is equal to 1, that is, $k = 0$. This completes the proof of proposition 7.9.2.20.

We now prove the following supplement to proposition 7.9.1.15.

Proposition 7.9.2.21. Let $\widetilde I$ be a connected component of $\widetilde\Delta$ (147) less the four points where $\sigma_1 = 0$. For each $\tilde\imath = (\sigma_1, i) \in \widetilde\Delta$, where $i \in I$, there is an open neighborhood $U \times A$ of $(0,0) \in \mathbb{R}^2\times\mathbb{R}$, an analytic substitution of variables $\mathbb{R} \to \mathbb{R} : \alpha \mapsto \sigma_1(x,\alpha)$, depending analytically on $x = (x_1, x_2) \in U$, a local analytic diffeomorphism $\Phi : x \mapsto (\bar\sigma_3(x), \bar\sigma_4(x))$ of $\mathbb{R}^2$ near $(0,0)$, and an analytic function $h : U \subseteq \mathbb{R}^2 \to \mathbb{R}$ such that $\sigma_1(0,0) = \sigma_1$, $(\bar\sigma_3(0), \bar\sigma_4(0)) = i$, and
\[ V_{\bar\sigma_3(x),\bar\sigma_4(x)}(\sigma_1(x,\alpha)) = \alpha^3 + x_1\alpha + h(x) \]
for every $(x,\alpha) \in U \times A$. Moreover, the preimage $i'$ of the outward radial vector $i$ under $D\Phi(0)$ of the local diffeomorphism $\Phi$ has nonzero $x_1$-component and $Dh(0)\,i' > 0$.

Proof. This follows immediately from proposition 7.9.2.19, proposition 7.8.4.14, and corollary 7.8.3.11.

7.10 Falling flat

In this section we discuss the behavior of the disk when it falls flat.

7.10.1 When the disk does not fall flat

In this subsection we give the exact conditions when the disk does not fall flat. Because the kinetic energy $T$ (82) is nonnegative and the total energy $E$ (81) is a constant of the motion with value $E$, it follows that during a motion on $(-1,1)\times\mathbb{R}$ we have $V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) \le E$. Therefore if $\liminf_{\sigma_1\to\pm1} V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) > E$, then $\sigma_1$ remains bounded away from $\pm1$. In other words, the angle between the plane of the rim of the rolling disk and the horizontal plane remains bounded away from 0. Thus the disk does not fall flat. We now prove a much stronger statement about the asymptotic behavior of the potential $V_{\bar\sigma_3,\bar\sigma_4}$ (83) as $\sigma_1 \to \pm1$, provided that $(\bar\sigma_3,\bar\sigma_4) \in \mathbb{R}^2$ does not lie on one of the one dimensional linear subspaces $\ell_\pm$ defined by
\[ I_1 r'(0)\,\bar\sigma_3 \mp I_3 r(0)\,\bar\sigma_4 = 0. \tag{162} \]
Here $r'(0)$ and $r(0)$ are positive constants given by (121) and (124). In what follows the quantity
\[ \eta_\pm = I_1 r'(0)\,\bar\sigma_3 \mp I_3 r(0)\,\bar\sigma_4 \tag{163} \]
will be used as a coordinate relative to the line $\ell_\pm$. If $\eta_\pm \neq 0$, then $(\bar\sigma_3,\bar\sigma_4)$ lies on one side of $\ell_\pm$; while if $\eta_\pm = 0$, then $(\bar\sigma_3,\bar\sigma_4)$ lies on $\ell_\pm$. The distance of $(\bar\sigma_3,\bar\sigma_4)$ from $\ell_\pm$ is of the same order as $|\eta_\pm|$.

Lemma 7.10.1.22. If $(\bar\sigma_3,\bar\sigma_4)$ does not lie on $\ell_\pm$, then
\[ V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) \sim \frac{1}{4I_1}\,\eta_\pm^2\,\frac{1}{1\mp\sigma_1} \quad\text{as } \sigma_1\to\pm1 \tag{164} \]
and
\[ V'_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) \sim \pm\frac{1}{4I_1}\,\eta_\pm^2\,\frac{1}{(1\mp\sigma_1)^2} \quad\text{as } \sigma_1\to\pm1. \tag{165} \]
These asymptotic relations are locally uniform for $(\bar\sigma_3,\bar\sigma_4) \in \mathbb{R}^2\setminus\ell_\pm$.

Proof. Write $a = \frac{1}{r(0)}X$ and $r'(0)X \pm r(0)Y = \varepsilon$, where $a = -\frac{I_1}{I_3\,\tilde u\,r(0)}\,\bar\sigma_3$ and $\varepsilon = -\eta_\pm\frac{1}{I_3\tilde u}$ with $\tilde u = \sqrt{\frac{mgr}{I_3+mr^2}}$, see (98). Then the hypothesis that $(\bar\sigma_3,\bar\sigma_4) \notin \ell_\pm$ implies that $\varepsilon \neq 0$. From lemma 7.7.2.9 it follows that
\[ x(z) = a\,r(\mp z) + \varepsilon\big(\tfrac12 c\,r(\mp z)\ln(1\mp z) + w(\mp z)\big), \]


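where $r(\mp z) \sim 1 \mp z$ and $w(\mp z) \sim 1$ as $z \to \pm1$. Therefore $x(z) \sim \varepsilon$ as $z \to \pm1$. Similarly,
\[ x'(z) = \mp\Big[a\,r'(\mp z) + \varepsilon\Big(\tfrac12 c\,r'(\mp z)\ln(1\mp z) + \tfrac12 c\,\frac{r(\mp z)}{1\mp z} + w'(\mp z)\Big)\Big], \]
where $r'(z) \sim 1$, $\frac{r(z)}{1+z} \sim 1$, and $w'(z) = O(1)$ as $z \downarrow -1$. Therefore $x'(z) \sim \mp\frac12\varepsilon c\ln(1\mp z)$ as $z \to \pm1$. Because
\[ 1 - z^2 = (1-z)(1+z) \sim 2(1\mp z) \quad\text{as } z\to\pm1, \tag{166} \]
it follows that the rescaled potential $W_{X,Y}$ (95) satisfies
\[ W_{X,Y}(z) \sim \tfrac14 d\,\varepsilon^2\,\frac{1}{1\mp z} + \tfrac18\varepsilon^2c^2\ln^2(1\mp z) + O(1) \sim \tfrac14 d\,\varepsilon^2\,\frac{1}{1\mp z} \quad\text{as } z\to\pm1. \tag{167} \]
This implies (164), because $d = \frac{I_3^2}{I_1(I_3+mr^2)}$, $\varepsilon = -\eta_\pm\frac{1}{I_3}\sqrt{\frac{I_3+mr^2}{mgr}}$, and $V_{\bar\sigma_3,\bar\sigma_4} = mgr\,W_{X,Y}$.

Now consider the function $C\widetilde W = W'_{X,Y}(z)$. Then
\[ C\widetilde W = d\,\frac{z}{(1-z^2)^2}\,x^2 + (c+d)\,\frac{1}{1-z^2}\,xy - \frac{z}{(1-z^2)^{1/2}} \sim \pm\tfrac14 d\,\varepsilon^2\,\frac{1}{(1\mp z)^2} + O\Big(\frac{\varepsilon^2\ln(1\mp z)}{1\mp z}\Big) + O\big((1\mp z)^{-1/2}\big) \sim \pm\tfrac14 d\,\varepsilon^2\,\frac{1}{(1\mp z)^2} \quad\text{as } z\to\pm1. \tag{168} \]
This implies (165).

Let $(\bar\sigma_3,\bar\sigma_4) \in \mathbb{R}^2\setminus(\ell_-\cup\ell_+)$. From (164) it follows that a level set of the energy $E$ (81) is a compact subset of $(-1,1)\times\mathbb{R}$. Since $E$ is a constant of motion, this implies that the vector field $X_{\bar\sigma_3,\bar\sigma_4}$ (86) is complete. Thus its integral curves are either equilibrium points, or asymptotic to unstable equilibrium points, or nonconstant periodic functions of $t$ with $\sigma_1(t)$ oscillating between a minimum and a maximum value in $(-1,1)$. Note that the hypothesis that $(\bar\sigma_3,\bar\sigma_4) \notin \ell_\pm$ and equation (164) imply that there are no relative equilibria with the plane of the rim of the disk almost horizontal.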
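The step from (167) to (164) is a short piece of algebra: substituting the expressions for $d$ and $\varepsilon$ into $mgr\cdot\frac14 d\,\varepsilon^2$ should give $\frac{1}{4I_1}\eta_\pm^2$. A minimal numerical sketch of that check follows; the numerical values of $I_1$, $I_3$, $m$, $g$, $r$, $\eta_\pm$ are arbitrary illustrations, and the function name is hypothetical.

```python
import math

def leading_coefficients(I1, I3, m, g, r, eta):
    """Return (mgr * d * eps^2 / 4, eta^2 / (4 I1)) for comparison."""
    d = I3**2 / (I1 * (I3 + m * r**2))
    eps = -(eta / I3) * math.sqrt((I3 + m * r**2) / (m * g * r))
    from_W = m * g * r * 0.25 * d * eps**2   # coefficient obtained from (167)
    from_V = eta**2 / (4.0 * I1)             # coefficient stated in (164)
    return from_W, from_V

# Illustrative values only: uniform disk with m = r = 1.
from_W, from_V = leading_coefficients(I1=0.25, I3=0.5, m=1.0, g=9.81, r=1.0, eta=0.3)
```

The two returned numbers agree up to rounding, independently of $g$, which is what one expects since $mgr$ cancels.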

7.10.2 When the disk falls flat

We now discuss when the disk falls flat. The next lemma shows that we have a strong dichotomy.

Lemma 7.10.2.23. Suppose that $(\bar\sigma_3,\bar\sigma_4) \in \ell_\pm$. Then the potential function $V : (-\pi/2, \pi/2) \to \mathbb{R} : \varphi \mapsto V(\varphi) = V_{\bar\sigma_3,\bar\sigma_4}(\sin\varphi)$ has an extension to an analytic function in some neighborhood of $\pm\pi/2$ such that
\[ V(\pm\pi/2) = \frac{I_1^2(I_3+mr^2)}{2I_3^2\,r(0)^2}\,\bar\sigma_3^2 \tag{169} \]
and
\[ V'(\pm\pi/2) = \mp mgr. \tag{170} \]
In particular, $V(\varphi)$ has a strict local minimum at $\pm\pi/2$.

Proof. We use the notation of the proof of lemma 7.10.1.22. The hypothesis $(\bar\sigma_3,\bar\sigma_4) \in \ell_\pm$ implies that $r'(0)X \pm r(0)Y = 0$. From (104) and (105) it follows that $x(z) = \frac{X}{r(0)}\,r(\mp z)$ and $x'(z) = \mp\frac{X}{r(0)}\,r'(\mp z)$. Here $r(z)$ is analytic and $r(z) = (1+z) + O((1+z)^2)$ for $z$ in an open neighborhood of $-1$. If we substitute $z = \sin\varphi$, then for $\varphi$ near $\pm\pi/2$ we have
\[ 1 \mp z = 1 \mp \sin\varphi = 1 - (1-\cos^2\varphi)^{1/2} \sim \tfrac12\cos^2\varphi. \tag{171} \]
This implies that $r(\mp z) \sim \frac12\cos^2\varphi$. Therefore $\varphi \mapsto \frac{r(\mp z)^2}{\cos^2\varphi}$ extends to an analytic function with a double zero at $\pm\pi/2$. In view of (171) we find that
\[ r'(\mp z)^2 = 1 + O(1\mp z) = 1 + O(\cos^2\varphi), \]
which shows that $\varphi \mapsto W_{X,Y}(\sin\varphi)$ has an analytic extension in a neighborhood of $\pm\pi/2$. Moreover, $W_{X,Y}(\pm\pi/2) = \frac{X^2}{2r(0)^2}$. The value of $W'_{X,Y}$ at $\pm\pi/2$ is equal to the value of the derivative of $\cos\varphi$ at $\pm\pi/2$, namely, $\mp1$. Using (98) the lemma follows.

Suppose that $(\bar\sigma_3,\bar\sigma_4) \in \ell_-\setminus\{(0,0)\}$ and that $E > V(-\pi/2)$. From the fact that $V'(-\pi/2) > 0$, see (170), it follows that the set $\{\varphi \in (-\pi/2,\pi/2) \mid V(\varphi) < E\}$ has a connected component of the form $(-\pi/2, \varphi_+)$ for some $\varphi_+ \in (-\pi/2, \pi/2)$. Note that the hypothesis $(\bar\sigma_3,\bar\sigma_4) \notin \ell_+$ implies that $\varphi_+ < \pi/2$. Because $\lim_{\varphi\downarrow-\pi/2}V(\varphi) = V(-\pi/2) < E$, from the description of the solutions of the conservative Newtonian system in §5.3, every solution


of energy $E$, which starts at $\varphi_0 \in (-\pi/2, \varphi_+)$ and for which $\dot\varphi_0 < 0$, will fall flat in finite time. More precisely, suppose that $\varphi_+ < \pi/2$ (which implies $E = V(\varphi_+)$) and $V'(\varphi_+) \neq 0$, which means that $\sigma_1^+ = \sin\varphi_+$ is not a critical point of $V_{\bar\sigma_3,\bar\sigma_4}$. Then for every solution of the conservative Newtonian system (85) of energy $E$ with $\varphi_0 \in (-\pi/2, \varphi_+)$ there exist finite times $T_1$ and $T_2$ with $T_1 < T_2$ such that $\lim_{t\downarrow T_1}\varphi(t) = -\pi/2$, $\lim_{t\uparrow T_2}\varphi(t) = -\pi/2$, and $\varphi(t)$ increases monotonically from $-\pi/2$ to $\varphi_+$ as $t$ goes from $T_1$ to $\frac12(T_1+T_2)$ and decreases monotonically from $\varphi_+$ to $-\pi/2$ as $t$ goes from $\frac12(T_1+T_2)$ to $T_2$. In other words, the plane of the rim of the disk rises monotonically from the horizontal to a maximum angle with the horizontal plane and then falls back to the horizontal, all in finite time.

Now suppose that $\varphi_+ < \pi/2$ and $\sigma_1^+ = \sin\varphi_+$ is a critical point of $V_{\bar\sigma_3,\bar\sigma_4}$. If $\varphi_0 \in (-\pi/2, \varphi_+)$, then there is a finite time $T$ such that the solution of the conservative Newtonian system (85) is defined for $t \in (-\infty, T)$. Moreover, we have $\lim_{t\to-\infty}\varphi(t) = \varphi_+$, $\lim_{t\uparrow T}\varphi(t) = -\pi/2$, and $\varphi(t)$ decreases monotonically from $\varphi_+$ to $-\pi/2$. Similar conclusions can be drawn if $(\bar\sigma_3,\bar\sigma_4) \in \ell_+\setminus\{(0,0)\}$ by replacing $\varphi$ with $-\varphi$.

If $(\bar\sigma_3,\bar\sigma_4) \in \ell_+\cap\ell_- = \{(0,0)\}$, then Chaplygin's equations (76) have a trivial solution, that is, $\sigma_3(\sigma_1;0,0) = 0 = \sigma_4(\sigma_1;0,0)$. Therefore the potential $V_{\bar\sigma_3,\bar\sigma_4}$ (83) becomes $V(\varphi) = mgr\cos\varphi$. In this situation the conservative Newtonian system (85) is the half of the mathematical pendulum where $\cos\varphi \ge 0$. If $E > mgr$, then $\varphi$ traverses $[-\pi/2,\pi/2]$ monotonically in finite time, increasing or decreasing depending on the sign of $\dot\varphi_0$, which is constant during the motion. If $E = mgr$, then there is a finite time $T$ and a solution, which is defined for $t \in (-\infty, T)$ such that $\varphi(t) \uparrow \pi/2$ as $t \uparrow T$ and $\varphi(t) \downarrow 0$ as $t \to -\infty$. The other solutions are obtained by reversing the sign of $t$ or the sign of $\varphi$. If $E < mgr$, then $\varphi(t)$ increases in finite time from $-\pi/2$ to $-\cos^{-1}\frac{E}{mgr}$ and then decreases symmetrically back to $-\pi/2$, or decreases in finite time from $\pi/2$ to $\cos^{-1}\frac{E}{mgr}$ and then increases symmetrically back to $\pi/2$.

If we consider $(\bar\sigma_3,\bar\sigma_4)$ as a mapping from the fully reduced space $(-1,1)\times\mathbb{R}^3$ (with coordinates $(\sigma_1,\sigma_2,\sigma_3,\sigma_4)$) to $\mathbb{R}^2$, then the preimage $L_\pm$ of the lines $\ell_\pm$ under this map is a nonempty open codimension one analytic subset of $(-1,1)\times\mathbb{R}^3$. The set of initial conditions in the fully reduced space which lead to motions of the disk that fall flat in finite time contains $L_\pm$ and therefore has codimension one. It is quite surprising that this set of initial conditions has such a high dimension.
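For the special case $(\bar\sigma_3,\bar\sigma_4) = (0,0)$ with $E > mgr$, the finite traversal time of $[-\pi/2, \pi/2]$ can be estimated by quadrature. The sketch below assumes the energy relation $E = \frac12 M\dot\varphi^2 + V(\varphi)$ with $V(\varphi) = mgr\cos\varphi$ and $M = I_1 + mr^2$, as used elsewhere in this chapter; the numerical values and the function name are hypothetical.

```python
import math

def traversal_time(E, M, mgr, n=20_000):
    """Midpoint-rule estimate of the time to go from -pi/2 to pi/2.

    Uses dt = dphi / phi_dot with phi_dot = sqrt(2 (E - mgr cos(phi)) / M),
    which is valid only when E > mgr, so that the speed never vanishes.
    """
    if E <= mgr:
        raise ValueError("need E > mgr so that the speed never vanishes")
    h = math.pi / n
    total = 0.0
    for k in range(n):
        phi = -math.pi / 2 + (k + 0.5) * h
        speed = math.sqrt(2.0 * (E - mgr * math.cos(phi)) / M)
        total += h / speed
    return total

# Illustrative values: M = I1 + m r^2 = 1.25, mgr = 1, E = 1.5.
M, mgr, E = 1.25, 1.0, 1.5
T = traversal_time(E, M, mgr)
lower = math.pi / math.sqrt(2.0 * E / M)           # speed is at most sqrt(2E/M)
upper = math.pi / math.sqrt(2.0 * (E - mgr) / M)   # speed is at least sqrt(2(E-mgr)/M)
```

The elementary bounds `lower` and `upper` bracket the computed time, which makes the finiteness claim concrete for this sample energy.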

7.10.3 Limiting behavior when falling flat

In this subsection we discuss the limiting behavior of the disk when it falls flat. Up until now we have only considered the angle $\varphi$ between the oriented plane of the rim of the disk and the vertical direction. Here we treat other aspects of the motion.

Proposition 7.10.3.24. Suppose that the disk falls flat or rises up from the flat position when $t \uparrow T_*$ or $t \downarrow T_*$. The disk falls onto or rises from the flat position with a bang in the sense that $\lim_{t\to T_*}\frac{d\varphi}{dt}$ is finite and nonzero. Moreover, the solution on the constraint manifold $C$ (3) and the point of contact converge as $t \to T_*$. Also the vector $\frac{du}{dt}$ converges to a nonzero vector in the horizontal plane as $t \uparrow T_*$ or $t \downarrow T_*$.

Proof. Since the disk falls flat as $t \to T_*$, we know that $u(t) \to \pm e_3$. Suppose that $u(t) \to -e_3$. This corresponds to $\varphi \to -\pi/2$, or equivalently, $\sigma_1 \downarrow -1$ as $t \to T_*$. In other words, we have assumed that $(\bar\sigma_3,\bar\sigma_4) \in \ell_-$. The condition $\varphi \to -\pi/2$ means that the energy $E$ of the solution is greater than $V(-\pi/2)$. Because $E = \frac12 M\dot\varphi^2 + V(\varphi)$ is constant on a solution, using lemma 7.10.2.23 we find that
\[ \dot\varphi \to \dot\varphi_* = \pm\Big[\frac{2}{M}\big(E - V(-\pi/2)\big)\Big]^{1/2} \neq 0, \tag{172} \]
as $t \to T_*$. The minus sign holds in the above limit when the disk falls flat and the plus sign when it rises from being flat. Because $\sigma_1(t) = \sin\varphi(t)$, we get
\[ \sigma_2(t) = \dot\sigma_1(t) = (\cos\varphi(t))\,\dot\varphi = (1-\sigma_1^2(t))^{1/2}\,\dot\varphi, \]
which implies
\[ (1-\sigma_1^2(t))^{-1/2}\,\sigma_2(t) \to \dot\varphi_* \quad\text{as } t\to T_*. \tag{173} \]
The condition that $(\bar\sigma_3,\bar\sigma_4) \in \ell_-$ implies $r'(0)X - r(0)Y = 0$, using (162) and (98), and is equivalent to
\[ \sigma_3(\sigma_1;\bar\sigma_3,\bar\sigma_4) = \frac{\bar\sigma_3}{r(0)}\,r(\sigma_1). \]
Therefore
\[ \sigma_4(\sigma_1;\bar\sigma_3,\bar\sigma_4) = -\frac{I_1}{I_3}\frac{d\sigma_3}{d\sigma_1} = -\frac{I_1}{I_3}\frac{\bar\sigma_3}{r(0)}\,r'(\sigma_1). \]
Because $r(\sigma_1)$ is analytic in a neighborhood of $-1$ with $r(-1) = 0$ and $r'(-1) = 1$ we find that
\[ \frac{\sigma_3(t)}{1-\sigma_1^2(t)} \to \frac{1}{2r(0)}\,\bar\sigma_3 \quad\text{as } t\to T_* \tag{174} \]


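and
\[ \sigma_4(t) \to -\frac{I_1}{I_3}\frac{\bar\sigma_3}{r(0)} \quad\text{as } t\to T_*. \tag{175} \]
If we combine the equation $\dot\psi = \frac{\sigma_3}{1-\sigma_1^2}\,\sigma_1 - \sigma_4$ for $\dot\psi$ with the above two equations we find that
\[ \dot\psi \to \Big(-\frac12 + \frac{I_1}{I_3}\Big)\frac{\bar\sigma_3}{r(0)} \quad\text{as } t\to T_*. \tag{176} \]
Note that this limit is 0 in the case of a hoop.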

Because $\psi(t)$ is obtained from $\dot\psi$ by integrating a differential equation, whose right hand side converges as $t \to T_*$, it follows that $\psi(t)$ converges to an angle $\psi_*$ as $t \to T_*$. In view of $u_{\mathrm{hor}} = \cos\varphi\,(\cos\psi, \sin\psi, 0)$ and $u(t) \to -e_3$, we get $\dot u \to \dot\varphi_*(\cos\psi_*, \sin\psi_*, 0) \neq 0$ as $t \to T_*$. Combining the reconstruction equation (26) for $\omega$ with $u(t) \to -e_3$ as $t \to T_*$, and using (173) and (174), we find that
\[ \omega_1 \to -\dot\varphi_*\sin\psi_*, \quad \omega_2 \to \dot\varphi_*\cos\psi_*, \quad \omega_3 \to -\frac{I_1}{I_3}\frac{\bar\sigma_3}{r(0)}, \tag{177} \]
as $t \to T_*$. From $u(t) \to -e_3$ and (177), using the differential equation $\dot u = u\times\omega$ (6) from the $E(2)$-reduced equations of motion, we recover the fact that $\dot u$ has a limit as $t \to T_*$. By definition of the angular velocity vector $\omega$, the rotational motion of the disk is obtained by integrating $\dot A = A(\omega(t))^\times$ (29). Because $\omega(t)$ converges to (177) as $t \to T_*$, it follows that $A(t)$ converges to $A_* \in SO(3)$. Similarly the position $a(t)$ of the center of mass of the disk is obtained by integrating the differential equation (33), whose right hand side converges. Thus $a(t)$ converges to some element $a_*$ of $\mathbb{R}^3$. Therefore, the motion of the disk in the constraint manifold $C \subseteq TE(3)$ converges. Finally, equation (38) implies that the point of contact $p(t)$ converges to $-rA_*(\cos\psi_*, \sin\psi_*, 0) + a_*$ as $t \to T_*$. The case when $\sigma_1(t) \to 1$, that is, $(\bar\sigma_3,\bar\sigma_4) \in \ell_+$, follows by applying the symmetry $(\sigma_1,\bar\sigma_3,\bar\sigma_4) \mapsto (-\sigma_1,-\bar\sigma_3,\bar\sigma_4)$ or the symmetry $(\sigma_1,\bar\sigma_3,\bar\sigma_4) \mapsto (-\sigma_1,\bar\sigma_3,-\bar\sigma_4)$.

The angular velocity vector $\omega$ in body coordinates converges when the disk falls flat. Also the length of its horizontal component converges and is nonzero. Moreover, its limiting direction is perpendicular to the limit of $\dot u$. This is not surprising in view of the falling rotation of the disk. The vertical component of the limiting angular velocity is $-\frac{I_1}{I_3 r(0)}\,\bar\sigma_3$, which is nonzero as soon as $\bar\sigma_3 \neq 0$. Because $\bar\sigma_4$ is proportional to $\bar\sigma_3$ on $\ell_\pm$, the condition $\bar\sigma_3 \neq 0$ is equivalent to $(\bar\sigma_3,\bar\sigma_4) \neq (0,0)$. In other words, we are not in the special case of falling flat described in §5.4. Thus, at the same time that the disk is rotating about a horizontal axis when it falls flat, it is also spinning about its vertical symmetry axis with an angular speed equal to $-\frac{I_1}{I_3 r(0)}\,\bar\sigma_3$.

7.11 Near falling flat

In this section we investigate the asymptotic behavior of the solutions where $p = (\bar\sigma_3,\bar\sigma_4) \notin \ell_\pm$ approaches $q \in \ell_\pm$, when the disk falls flat. For $\sigma_1$ bounded away from $\pm1$, the potential function $V_p(\sigma_1)$ is close to $V_q(\sigma_1)$. Accordingly, the solutions of the Newtonian system (85) in $(-1,1)\times\mathbb{R}$ and the reconstructed full motions are close to each other. However, when $\sigma_1$ is close to $\pm1$, the motions differ greatly, because $V_p(\sigma_1) \to \infty$ as $\sigma_1 \to \pm1$, see lemma 7.10.1.22; whereas $\varphi \mapsto V_q(\sin\varphi)$ has an analytic extension to a neighborhood of $\pm\pi/2$, when the disk falls flat. See lemma 7.10.2.23.

7.11.1 Elastic reflection

The next lemma implies that as $p$ approaches the point $q$, the motion $\varphi(t)$ converges to a motion in a potential well with potential as given in lemma 7.10.2.23, but with an elastic reflection at $\pm\pi/2$.

Lemma 7.11.1.25. Let $B$ be a bounded subset of $\mathbb{R}$ with $\bar\sigma_3 \in B$. Let $q = \big(\bar\sigma_3, \pm\frac{I_1 r'(0)}{I_3 r(0)}\,\bar\sigma_3\big)$ be a point on $\ell_\pm$, which we have parametrized by $\bar\sigma_3$. Suppose that $p = (\bar\sigma_3,\bar\sigma_4) \notin \ell_\pm$ is near $q$, and $E > V_q(\pm1)$. Finally suppose that $\sigma_1 = \sin\varphi$ is close to $\pm1$. If $V_q(\pm1) < V_p(\sigma_1) \le E$ and $V_p(\sigma_1)$ is not close to $V_q(\pm1)$, then $\cos\varphi$ is of order $|\eta_\pm|$. In addition, during the time interval that $V_p(\sigma_1)$ is not close to $V_q(\sigma_1)$, the velocity $\frac{d\varphi}{dt}$ changes monotonically from $\pm\big[\frac{2}{M}(E - V_q(\pm1))\big]^{1/2}$ to its negative in time of order $|\eta_\pm|$.

Proof. All the asymptotic statements below are for $\eta_\pm \to 0$ and are locally uniform in $\bar\sigma_3$ and $E$. We use the notation of the proof of lemma 7.10.1.22. This time we have two small variables: $\varepsilon$ and $1 \mp z$. Because $r(z) \sim 1 + z$, $r(z)\ln(1+z) \sim (1+z)\ln(1+z)$, and $w(z) \sim 1$ as $z \downarrow -1$, we have $x(z) \sim a(1\mp z) + \varepsilon$ as $z \to \pm1$. Since


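$r'(z) \sim 1$, $\frac{r(z)}{1+z} \sim 1$, and $w'(z) = O(1)$ as $z \downarrow -1$, we have $y(z) = x'(z) \sim \mp\big[a + \frac12\varepsilon c\ln(1\mp z)\big]$ as $z \to \pm1$. From (94) and the above it follows that
\[ \widetilde W \sim \frac{d}{4(1\mp z)}\big(a(1\mp z) + \varepsilon\big)^2 + \tfrac12\big(a + \tfrac12\varepsilon c\ln(1\mp z)\big)^2 + \sqrt{2(1\mp z)} \sim \frac{d\,\varepsilon^2}{4}\,\frac{1}{1\mp z} + \tfrac12 a^2 + \tfrac12 a\,\varepsilon c\ln(1\mp z). \tag{178} \]
In (178) we have deleted all terms which are small, in particular the term $\frac18\varepsilon^2c^2\ln^2(1\mp z)$, since it is small compared to $\frac14 d\,\varepsilon^2\frac{1}{1\mp z}$. If $\frac{\varepsilon^2}{1\mp z}$ is small, then $1 \mp z \gg \varepsilon^2$. Therefore $|\varepsilon\ln(1\mp z)| \ll |\varepsilon\ln(\varepsilon^2)| \ll 1$. Hence $|\widetilde W - \frac12 a^2|$ is not small and bounded only if $1 \mp z$ is of order $\varepsilon^2$. In this situation we have
\[ x(z) \sim \varepsilon \quad\text{and}\quad y(z) = x'(z) \sim \mp a, \tag{179} \]
because $\varepsilon\ln(1\mp z)$ is small. Therefore $\widetilde W \sim \frac12 a^2 + \frac14 d\,\varepsilon^2\frac{1}{1\mp z}$. Since $z = \sigma_1 = \sin\varphi$ is close to $\pm1$, using (171) we see that $\cos\varphi$ is the same order as $|\varepsilon|$, and
\[ \widetilde W \sim \tfrac12 a^2 + \tfrac12 d\,\varepsilon^2\,\frac{1}{\cos^2\varphi}. \tag{180} \]
To estimate the derivative of the potential $V_p(\varphi) = V_{\bar\sigma_3,\bar\sigma_4}(\sin\varphi)$, we substitute $x(z) \sim a(1\mp z) + \varepsilon$ and $y(z) \sim \mp\big[a + \frac12\varepsilon c\ln(1\mp z)\big]$ into the equation (97) for $C\widetilde W$ and get
\[ C\widetilde W \sim \tfrac12\Big(a + \frac{\varepsilon}{1\mp z}\Big)\Big[\pm\tfrac12 d\Big(a + \frac{\varepsilon}{1\mp z}\Big) \mp (c+d)\big(a + \tfrac12\varepsilon c\ln(1\mp z)\big)\Big] \mp \frac{1}{\sqrt{2(1\mp z)}}. \tag{181} \]
If we use the fact that $1 \mp z$ is of the same order as $\varepsilon^2$, then the above equation becomes
\[ (\cos\varphi)\,C\widetilde W \sim \pm\frac{d\,\varepsilon^2}{\cos^3\varphi}. \tag{182} \]
Because (166) implies that $\cos\varphi$ is of the same order as $\varepsilon$, we conclude that $\pm V'(\varphi)$ is positive and of order $|\varepsilon|^{-1}$. Using the Newtonian equations of motion (85), from the preceding statement we conclude that $\mp\frac{d^2\varphi}{dt^2}$ is positive and of order $|\varepsilon|^{-1}$ during the time interval that $V_p(\sigma_1)$ is not close to $V_q(-1)$. From the expression for the energy $E$ (84) we see that this is the time interval when $\dot\varphi$ is not close to $C_\pm = \pm\big[\frac{2}{M}(E - V_q(-1))\big]^{1/2}$. Let $\tau$ be the time needed to go from $C_\pm$ to $C_\mp$. Then
\[ 2C_\pm = \dot\varphi(\tau) - \dot\varphi(0) = \int_0^\tau \frac{d^2\varphi}{dt^2}\,dt. \]
So $2C_\pm \sim \frac{\tau}{|\varepsilon|}$. Therefore $\tau$ is of order $|\varepsilon|$, which is of order $|\eta_\pm|$. This proves the lemma.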

If $p \notin \ell_\pm$ is close to $q \in \ell_\pm$ and the energy $E > V_q(\pm1)$ is close to $V_q(\pm1)$, then $\varphi(t)$ stays close to $\pm\pi/2$. In addition, if we take $|\eta_\pm| > 0$ and sufficiently small compared to $E - V_q(\pm1)$, then for most of the time
\[ \frac{d^2\varphi}{dt^2} = -\frac{\partial V_p(\sin\varphi)}{\partial\varphi} \sim -\frac{\partial V_q(\sin\varphi)}{\partial\varphi} \sim \pm mgr, \]
see (170). If we start at the point where $\varphi(0)$ is farthest away from $\pm\pi/2$, which occurs when $V_p(\varphi(0)) = E$ and $\dot\varphi = 0$, then $\dot\varphi(t) \sim \pm mgr$, until $\varphi(t) \sim \pm\pi/2$, where $\dot\varphi(t) \sim \pm\big[\frac{2}{M}(E - V_q(\pm1))\big]^{1/2}$, see (87). The period of the periodic solution of the Newtonian system (85), that is, the time needed for $\varphi(t)$ to go to the nearest $\pm\pi/2$ and then return back to its initial position, is asymptotically equal to
\[ 2^{3/2}\,\frac{1}{mgr\sqrt{I_1+mr^2}}\,\big(E - V_q(\pm1)\big)^{1/2}. \]
Here we first let $\eta_\pm \to 0$ and then let $E \downarrow V_q(\pm1)$. The fact that this period goes to zero, that is, the frequency of the oscillations in $\varphi(t)$ goes to infinity, is quite uncommon for oscillations in a potential well. If $\eta_\pm \to 0$, then the motion of $\varphi(t)$ converges to that of the conservative Newtonian system (85) with $(\bar\sigma_3,\bar\sigma_4) = q \in \ell_\pm$ together with the condition that we have an elastic reflection $\dot\varphi \mapsto -\dot\varphi$ when $\varphi(t) = \pm\pi/2$, see §10.2. For $E > V_q(\pm1)$ and $E$ close to $V_q(\pm1)$, the motion of $\varphi(t)$ resembles that of a ball which is dropped on the floor and bounces back up elastically. The frequency of the bouncing tends to infinity as the height from which the ball falls goes to zero.

7.11.2 The increase of the angles ψ and χ

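In order to reconstruct the motion of the disk during the short time of order $|\eta_\pm|$ that $V_p(\sigma_1)$ differs markedly from $V_q(\sigma_1)$, we look at what the angles $\chi$ and $\psi$ in (35) and (27) do. The following lemma shows that in this short time they increase very rapidly from an initial limiting value to a finite final limiting value by an amount which in absolute value is equal to a constant whose sign depends on which side the point $p$ approaches the line $\ell_\pm$ from.

Lemma 7.11.2.26. We retain the assumptions of lemma 7.11.1.25. If $\cos\varphi \ll |\eta_\pm|^{1/2}$, then the time derivatives of $\chi(t)$ and $\psi(t)$ are asymptotically equal to $\frac{\eta_\pm}{I_1}\frac{1}{\cos^2\varphi}$ and $\pm\frac{\eta_\pm}{I_1}\frac{1}{\cos^2\varphi}$, respectively. Moreover, they are of order $\pm\frac{1}{\eta_\pm}$ and $\frac{1}{\eta_\pm}$, respectively. However, the change in $\chi(t)$ and $\psi(t)$ is small over any time interval for which $1 \gg \cos\varphi(t) \gg |\eta_\pm|$.

Proof. According to equation (36) we have
\[ \frac{d\chi}{dt} = \frac{\sigma_3}{1-\sigma_1^2} = \frac{\sigma_3}{\cos^2\varphi}. \]
It follows from the beginning of the proof of lemma 7.11.1.25 and (171) that $x \sim a(1\mp z) + \varepsilon \sim \frac12 a\cos^2\varphi + \varepsilon$. This implies
\[ \frac{\sigma_3}{\cos^2\varphi} = -\frac{I_3}{I_1}\,\tilde u\,x\,\frac{1}{\cos^2\varphi} \sim -\frac{I_3}{I_1}\,\tilde u\,\varepsilon\,\frac{1}{\cos^2\varphi} = \frac{\eta_\pm}{I_1}\frac{1}{\cos^2\varphi}, \tag{183} \]
as long as $\cos\varphi \ll |\varepsilon|^{1/2}$, see (91) and the beginning of the proof of lemma 7.10.2.23. This shows that $\dot\chi \sim \frac{\eta_\pm}{I_1}\frac{1}{\cos^2\varphi}$, provided $\cos\varphi \ll |\eta_\pm|^{1/2}$. If $\cos\varphi \gg |\varepsilon|$, then (181) implies that $V_p(\sin\varphi) \sim V_q(\pm1)$. Therefore
\[ \dot\varphi = \pm\Big[\frac{2}{M}\big(E - V(\varphi)\big)\Big]^{1/2} \sim \pm\Big[\frac{2}{M}\big(E - V_q(\pm1)\big)\Big]^{1/2} \tag{184} \]
remains bounded away from 0. As long as $|\varepsilon|^{1/2} \gg \cos\varphi \gg |\varepsilon|$, then
\[ \chi(t_1) - \chi(t_0) = \int_{t_0}^{t_1}\dot\chi\,dt = \int_{\varphi(t_0)}^{\varphi(t_1)}\frac{\sigma_3(\varphi)}{\cos^2\varphi}\Big[\frac{2}{M}\big(E - V(\varphi)\big)\Big]^{-1/2}d\varphi \tag{185} \]
is of order $\frac{|\varepsilon|}{\cos\varphi} \ll 1$, because, using $\gamma = \cos\varphi$ as the integration variable, the integral becomes $\int_{\gamma_0}^{\gamma_1}\gamma^{-2}\,d\gamma = \gamma_0^{-1} - \gamma_1^{-1}$. If $1 \gg \cos\varphi$ and $|\varepsilon| = O(\cos^2\varphi)$, then $\sigma_3 = -\frac{I_3}{I_1}\,\tilde u\,x = O(\cos^2\varphi)$. Therefore $\frac{d\chi}{dt} = \frac{\sigma_3}{\cos^2\varphi} = O(1)$ and the change in $\chi(t)$ is small during the small time interval $[\varphi(t_0), \varphi(t_1)]$ when $\cos\varphi(t)$ remains small.

The statements about $\psi(t)$ follow because (28) implies
\[ \frac{d\psi}{dt} = \frac{\sigma_3}{1-\sigma_1^2}\,\sigma_1 - \sigma_4 \sim \pm\frac{\sigma_3}{\cos^2\varphi}. \]
Here we have used the facts that $\sigma_1 = \sin\varphi \sim \pm1$ and $\sigma_4$ remains bounded in view of (19). This proves the lemma.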


Proposition 7.11.2.27. Let $\Delta\chi = \pi\sqrt{\frac{I_1+mr^2}{I_1}}$ and let $0 < \mu < \nu$ be positive constants. Consider the solutions of the Newtonian system (85) with constants of motion $p = (\bar\sigma_3,\bar\sigma_4) \notin \ell_\pm$ near $q \in \ell_\pm$ and energy $E \in [V_q(\pm1)+\mu, V_q(\pm1)+\nu]$. Then for every $\delta$ and $\mu$ positive there are numbers $0 < \tilde\mu \le \mu$ and $\tilde\eta$ such that if 1) $0 < |\eta_\pm| \le \tilde\eta$ and 2) $[t_0,t_1]$ is a time interval such that the value $V_p(\sigma_1(t))$ grows from $V_q(\pm1)+\tilde\mu$ to $E$ and then falls back to $V_q(\pm1)+\tilde\mu$ as $t$ traces out $[t_0,t_1]$, then
\[ |\chi(t_1) - \chi(t_0) - (\mathrm{sgn}\,\eta_\pm)\,\Delta\chi| \le \delta \tag{186} \]
and
\[ |\psi(t_1) - \psi(t_0) - (\mathrm{sgn}\,\eta_\pm)\,\Delta\chi| \le \delta. \tag{187} \]
Here $\mathrm{sgn}\,\eta_\pm$ is the sign of $\eta_\pm$.

Proof. We begin by estimating the integral on the right hand side of equation (185), where $t_0 < t_1$ and $\varphi(t_1)$ is the angle which is closest to $\pm\pi/2$. In addition, $V(\varphi(t_1)) = E$, $\cos\varphi(t_1) = O(|\varepsilon|)$, and $\cos\varphi(t_0) \gg |\varepsilon|$. Let $V_0 = V_q(\pm1)$. Then
\[ \big(E - V(\varphi)\big)^{-1/2} = (E - V_0)^{-1/2}\Big(1 - \frac{V(\varphi) - V_0}{E - V_0}\Big)^{-1/2}. \tag{188} \]
It follows from $V = mgr\,\widetilde W$, (95), and (182) that
\[ V'(\varphi) \sim \pm mgr\,d\,\varepsilon^2\,\frac{1}{\cos^3\varphi}. \tag{189} \]
This implies that $V'(\varphi)$ has constant sign. Therefore there is an analytic substitution of variables $\varphi = \Phi(w)$ with $w > 0$ such that
\[ \frac{V(\varphi) - V_0}{E - V_0} = w^2, \tag{190} \]
where $\varphi(t_1) = \Phi(1)$. From $V = mgr\,\widetilde W$, (95), and (180) it follows that
\[ V(\varphi) - V_0 \sim \tfrac12\,mgr\,d\,\varepsilon^2\,\frac{1}{\cos^2\varphi}. \tag{191} \]
Therefore (190) and (191) imply that $w \sim C\frac{|\varepsilon|}{\cos\varphi}$, where $C$ is a positive constant. Consequently, $\varphi(t_0) = \Phi(w_0)$, where $0 < w_0 \ll 1$. If we differentiate (190) with respect to $w$, we obtain
\[ V'(\varphi)\,\Phi'(w) = 2(E - V_0)\,w = 2\sqrt{(E - V_0)(V(\varphi) - V_0)}, \tag{192} \]


using (190) to obtain the second equality. Therefore using (188) and (192) we get

$$(E - V(\varphi))^{-1/2}\,\Phi'(w) = 2\,\frac{\sqrt{V(\varphi) - V_0}}{V'(\varphi)}\,\frac{1}{\sqrt{1 - w^2}}.$$

In view of (191) and (189) we obtain

$$\left(\frac{2}{M}\right)^{-1/2}(E - V(\varphi))^{-1/2}\,\Phi'(w) \sim \pm\sqrt{\frac{M}{mgrd}}\;\frac{\cos^2\varphi}{|\varepsilon|}\,\frac{1}{\sqrt{1 - w^2}}.$$

Combining the above relation with σ3(ϕ) = −(I3/I1) ũ x ∼ −(I3/I1) ũ ε and using the fact that sgn ε = −sgn η±, we see that the right hand side of (185) is asymptotically equal to

$$(\operatorname{sgn}\eta_\pm)\,D\int_{w_0}^{1}\frac{dw}{\sqrt{1 - w^2}} \sim (\operatorname{sgn}\eta_\pm)\,D\,\frac{\pi}{2},$$

where D = √(M/(mgrd)) (I3/I1) ũ. Using M = I1 + mr², d = I3²/(I1(I3 + mr²)), and ũ = √(mgr/(I3 + mr²)), we obtain D = √((I1 + mr²)/I1).
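For convenience, the simplification of D can be written out in one line. The display below is only a worked check: it takes D = √(M/(mgrd)) (I3/I1) ũ and substitutes the expressions for M, d and ũ quoted in the proof.

```latex
\[
D^2 \;=\; \frac{M}{mgr\,d}\cdot\frac{I_3^2}{I_1^2}\cdot\tilde u^{\,2}
\;=\; \frac{I_1+mr^2}{mgr}\cdot\frac{I_1\,(I_3+mr^2)}{I_3^2}\cdot\frac{I_3^2}{I_1^2}\cdot\frac{mgr}{I_3+mr^2}
\;=\; \frac{I_1+mr^2}{I_1}.
\]
```

In particular D depends only on the ratio mr²/I1, and the total increase of χ over the rise and the fall is 2 · D · π/2 = ∆χ.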

During the time interval that Vp(σ1(t)) decreases from E to being close to Vq(±1), we have the same increase in the angle χ(t). This follows because due to the elastic reflection at ±π/2, the angle ϕ(t) performs its time reversed motion during the time interval that Vp(σ1(t)) increases from close to Vq(±1) to E. Note that dχ(t)/dt does not change sign during this reversed motion, see the estimate for dχ/dt in (183). The proof of the statement about the increase of χ(t) is complete. Again the statements about ψ(t) follow because (28) implies

$$\frac{d\psi}{dt} = \sigma_1\,\frac{\sigma_3}{1-\sigma_1^2} - \sigma_4 \sim \pm\,\frac{\sigma_3}{\cos^2\varphi}.$$

Here we have used the facts that σ1 = sin ϕ ∼ ±1 and σ4 remains bounded in view of (19).

During the elastic reflection described by proposition 7.11.2.27, the limiting angles χ and ψ increase by (sgn η±) ∆χ and (sgn η±) ∆ψ, respectively. Note that ∆χ is greater than π and depends only on mr²/I1 (and not on the initial conditions of the solution). For the uniform disk and the hoop we have ∆χ = √5 π and ∆χ = √3 π, respectively.
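As a quick numerical illustration of the formula ∆χ = π √((I1 + mr²)/I1), the following sketch evaluates ∆χ for the two bodies just mentioned. It assumes the standard moments of inertia about a diameter, I1 = mr²/4 for the uniform disk and I1 = mr²/2 for the hoop; the function name `delta_chi` is ours and does not appear in the text.

```python
import math

def delta_chi(i1_over_mr2: float) -> float:
    """Return Delta chi = pi * sqrt((I1 + m r^2) / I1).

    The argument is the dimensionless ratio I1 / (m r^2), because the
    angle depends only on m r^2 / I1.
    """
    if i1_over_mr2 <= 0:
        raise ValueError("I1 / (m r^2) must be positive")
    return math.pi * math.sqrt((i1_over_mr2 + 1.0) / i1_over_mr2)

# Assumed standard values of I1 / (m r^2) about a diameter.
uniform_disk = delta_chi(0.25)  # should equal sqrt(5) * pi
hoop = delta_chi(0.5)           # should equal sqrt(3) * pi

if __name__ == "__main__":
    print(uniform_disk / math.pi, hoop / math.pi)
```

Since (I1 + mr²)/I1 > 1 for every positive I1, the function always returns a value greater than π, in agreement with the remark above.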

Fig. 7.3 The angle χ(t) for several periods of the motion of the uniform disk. In figure (a) 0 < η+ ≪ 1; while in figure (b) −1 ≪ η+ < 0.

7.11.3 Motions near falling flat

In this subsection we describe the motions of the disk which are near to those where it falls flat. Because the E(2)-reduced energy (7)

$$E(u,\omega) = \tfrac12\,\langle I\omega, \omega\rangle + \tfrac12\, m\,\langle \omega\times s,\ \omega\times s\rangle + mg\,\langle -s, u\rangle$$

is the sum of positive terms and is conserved, being the total energy E of the disk, it follows that ⟨Iω, ω⟩ ≤ 2E. Therefore the angular velocity vector ω and the rotational angular velocity ν = Aω are bounded during the short time interval that the rim of the disk is close to the horizontal position. From Ȧ = A(ω(t))× it follows that the rotational motion A(t) of the near flat falling disk, during the time interval from the moment that the disk rises from close to the horizontal position until it falls close to the flat position, converges to the rotational motion A∗ of the disk falling flat, as described in §10.3. For q ∈ ℓ± the solution of the conservative Newtonian system is discontinuous at the instant when the disk has fallen flat. However, nearby solutions for p ∉ ℓ± but near q exist for all time: they rise up from being near flat after having fallen to being nearly flat, and repeat this periodically for all time. To understand the rotational motion for all time, we look at the rotational angular velocity ν(t) as η± → 0.

Lemma 7.11.3.28. We make the same hypotheses as in lemma 7.11.1.25. In addition, suppose that for η± = 0, the disk falls flat as t ↑ T∗ and rises from the flat position as t increases from T∗. Let χ and ν be the angle and the vector defined in (35) and (21), respectively. When η± ≠ 0 but η± → 0, then

$$\chi(t) \to \begin{cases} \chi_\downarrow & \text{as } t \uparrow T_* \\ \chi_\uparrow & \text{as } t \downarrow T_* \end{cases} \qquad\text{and}\qquad \nu(t) \to \begin{cases} \nu_\downarrow & \text{as } t \uparrow T_* \\ \nu_\uparrow & \text{as } t \downarrow T_*. \end{cases}$$

Moreover, these limiting values satisfy

χ↑ = χ↓ + (sgn η±) ∆χ    (193)

ν↓ = (±A sin χ↓, ∓A cos χ↓, B)    (194)

ν↑ = (∓A sin χ↑, ±A cos χ↑, B),    (195)

where

$$A = \sqrt{\frac{2}{I_1 + mr^2}}\;\bigl(E - V_q(\pm 1)\bigr)^{1/2} \qquad\text{and}\qquad B = \frac{I_1}{I_3\, r(0)}\,\overline{\sigma}_3. \tag{196}$$

Proof. Equation (193) comes from (186).

Consider the situation when the disk is close to being flat, but |η±| ≪ cos ϕ ≪ 1. Then the disk is not in the time interval where the angle χ in proposition 7.11.2.27 increases by order 1, that is, when |ε| ≪ 1 ∓ z ≪ 1, see (171). Therefore x(z) ∼ a(1 ∓ z) = O(1 − z²) and y(z) ∼ ∓a. In view of (91) this implies that σ3(t) = O(1 − σ1²) and

$$\sigma_4(t) \sim \pm\frac{I_1}{I_3\, r(0)}\,\overline{\sigma}_3 = \pm B = (\sigma_4)_*. \tag{197}$$

In other words, σ4(t) is close to its limiting value (175) when the disk is falling flat or rising up from being flat. From σ1(t) = sin ϕ(t) and σ2(t) = dσ1/dt it follows that (1 − σ1²)⁻¹ σ2 = (cos ϕ)⁻¹ ϕ̇, which together with v1 = cos ϕ cos χ and v2 = cos ϕ sin χ, see (35), gives

(1 − σ1²)⁻¹ σ2 v1 = ϕ̇ cos χ  and  (1 − σ1²)⁻¹ σ2 v2 = ϕ̇ sin χ.

Let χ∗ be the limit of the angle χ just before or just after the small time interval where the angle increases by order 1. Then from

$$\nu_1 = \left(\sigma_4 - \frac{\sigma_1\sigma_3}{1-\sigma_1^2}\right) v_1 + \frac{\sigma_2}{1-\sigma_1^2}\, v_2 \qquad\text{and}\qquad \nu_2 = -\frac{\sigma_2}{1-\sigma_1^2}\, v_1 + \left(\sigma_4 - \frac{\sigma_1\sigma_3}{1-\sigma_1^2}\right) v_2,$$

see (34), it follows that

ν ∼ (ϕ̇ sin χ∗, −ϕ̇ cos χ∗, B).    (198)

To obtain the third coordinate in (198) we used ν3 = σ3 + σ1σ4 ∼ ±(σ4)∗, see (197). Just before the disk falls flat we have ϕ̇ ∼ ±A, see (172), which changes sign to ϕ̇ ∼ ∓A just when the disk starts rising again. This proves the lemma.
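The limiting values (193)–(195) say that the horizontal part of ν↑ is the horizontal part of ν↓ rotated about the vertical axis by (sgn η±)(∆χ − π), while the third component B is unchanged. The sketch below checks this numerically. The sample values of A, B and χ↓ are hypothetical and chosen only for illustration, and the helper names are ours.

```python
import math

def nu_down(A, B, chi_down, sign=+1):
    """Limiting angular velocity (194); sign=+1 selects the upper signs."""
    return (sign * A * math.sin(chi_down), -sign * A * math.cos(chi_down), B)

def nu_up(A, B, chi_up, sign=+1):
    """Limiting angular velocity (195); sign=+1 selects the upper signs."""
    return (-sign * A * math.sin(chi_up), sign * A * math.cos(chi_up), B)

def rotate_about_vertical(v, angle):
    """Rotate the vector v about the third (vertical) axis through `angle`."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1], v[2])

# Hypothetical sample values, used only for this check.
A, B, chi_down = 1.3, 0.7, 0.4
delta_chi = math.sqrt(5.0) * math.pi   # value quoted for the uniform disk
sgn_eta = +1                           # the case eta tending to 0 from above

chi_up = chi_down + sgn_eta * delta_chi            # equation (193)
rotated = rotate_about_vertical(nu_down(A, B, chi_down),
                                sgn_eta * (delta_chi - math.pi))
target = nu_up(A, B, chi_up)
max_error = max(abs(a - b) for a, b in zip(rotated, target))
```

The quantity `max_error` is at the level of rounding error. Repeating the computation with `sgn_eta = -1` gives the rotation through −(∆χ − π).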


As η± ↓ 0 the rotational motion A(t) of the disk converges to a rotation A∗ as if it were falling flat, that is, when η± = 0. This motion is continued in time as follows.

Lemma 7.11.3.29. Every time the disk falls flat, there are two cases. When η± ↓ 0, it rises up with a jump in the rotational velocity vector ν from ν↓ to ν↑. Here ν↑ is obtained from ν↓ by applying a rotation around the vertical axis through an angle ∆χ − π. When η± ↑ 0, then ν↑ is obtained from ν↓ by applying a rotation around the vertical axis through an angle −(∆χ − π).

Because ω and (1 − σ1²)^{−1/2} u_hor = (cos ψ, sin ψ, 0) are bounded, from

$$\dot a = r\,(1 - u_3^2)^{-1/2}\,A(\omega \times u_{\mathrm{hor}})$$

it follows that ȧ remains bounded. Therefore the motion of the center of mass a(t) converges to the motion of the center of mass of the disk falling flat as described in §10.3, during the time interval that the disk rises up from being close to the flat position until it falls back close to the flat position. In order to understand how the limiting motion of the center of mass, which exists for all time, is a continuation of the motion of the center of mass of the flat falling disk, we investigate the limiting behavior of da/dt.

Lemma 7.11.3.30. We use the same hypotheses as in lemma 7.11.1.25. As η± → 0 we have ȧ(t) → ȧ↓ as t ↑ T∗ and ȧ(t) → ȧ↑ as t ↓ T∗, where

ȧ↓ = r(±B sin χ↓, ∓B cos χ↓, A)  and  ȧ↑ = r(±B sin χ↑, ∓B cos χ↑, A),

and A and B are given by (196).

Proof. To estimate the right hand side of the equation

ȧ = r(σ4 sin χ − ϕ̇ cos ϕ cos χ, −σ4 cos χ − ϕ̇ cos ϕ sin χ, −ϕ̇ sin ϕ),

see (37), we observe that

σ4 sin χ − ϕ̇ cos ϕ cos χ ∼ ±B sin χ  and  −σ4 cos χ − ϕ̇ cos ϕ sin χ ∼ ∓B cos χ.

The asymptotic behavior of the component −ϕ̇ sin ϕ follows from lemma 7.11.1.25 and (196).

Fig. 7.4 The horizontal projection of v(t) = A(t) e3 for the uniform disk, during 1.4 periods. In the upper left figure 0 < η+ ≪ 1; while in the lower left figure −1 ≪ η+ < 0. Enlargements of the movement near the origin are given in the figures on the right.

Fig. 7.5 The horizontal projection of v(t) = A(t) e3 for the hoop, during 1.4 periods. In the upper left figure 0 < η+ ≪ 1; while in the lower left figure −1 ≪ η+ < 0. Enlargements of the movement near the origin are given in the figures on the right.

Lemma 7.11.3.31. Every time the disk falls flat, it rises up again, and there are two cases for continuing the motion a∗ of its center of mass when it is flat, see §10.3. First, when η± ↓ 0, the velocity ȧ of its center of mass jumps from ȧ↓ to ȧ↑. Here ȧ↑ is obtained from ȧ↓ by applying a rotation about the vertical axis through an angle ∆χ to (ȧ↓)hor and changing the sign of (ȧ↓)3. Second, when η± ↑ 0, then (ȧ↑)hor is obtained from (ȧ↓)hor by applying a rotation about the vertical axis through an angle −∆χ.

Let a∗ denote the limiting position of the center of mass when the disk is flat. The formula

$$p(t) = r\,\frac{\sigma_1}{\sqrt{1 - \sigma_1^2}}\; v_{\mathrm{hor}} + a_{\mathrm{hor}},$$

see (38), gives the position of the point of contact of the rolling disk with the horizontal plane. During the time interval when the disk is near the flat position, we have p(t) ∼ ±r(cos χ(t), sin χ(t), 0) + a∗, since v is given by (35). From lemma 7.11.2.26 and proposition 7.11.2.27 it follows that during this time interval the point of contact races around the rim of the disk with velocity of order 1/η±. The angle of p(t) − a∗ increases from the limiting angle χ↓ ± π, after falling down, to the limiting angle χ↑ ± π = χ↓ ± π + (sgn η±) ∆χ before rising up. The point of contact has the limiting position p↓ = ±r(cos χ↓, sin χ↓, 0) + a∗, after falling down, and p↑ = ±r(cos χ↑, sin χ↑, 0) + a∗, when it is about to rise up. From (194) and (195) it follows that (ν↓)hor is orthogonal to p↓ − a∗ and (ν↑)hor is orthogonal to p↑ − a∗ with orientation so that p↓ and p↑ are the hinge points about which the disk is falling down or rising up, respectively.

The vertical part ν3 of ν, which does not jump, is the spinning part of the limiting motion. The component ν3 vanishes if and only if σ̄3 = 0, see (196). In view of the fact that η± = 0, the condition σ̄3 = 0 implies that σ̄4 = 0. Thus we are looking at limiting solutions near the special case of falling flat treated in §5.4. Even in this situation, nearby solutions exhibit the same jumps in the angles χ(t) and ψ(t) as do all the other near falling flat motions.

Fig. 7.6 The upper left and right figures give the trajectory of the center of mass of the uniform disk when it nearly falls flat, as in figure 7.4. In the upper left figure η± ↓ 0; while in the upper right figure η± ↑ 0. The lower left and right figures give the trajectory of the center of mass of the hoop when it nearly falls flat, as in figure 7.5. In the lower left figure η± ↓ 0; while in the lower right figure η± ↑ 0.

Fig. 7.7 The upper left and upper right figures give the trajectory of the point of contact of the uniform disk as in the lower right and lower left figures of figure 7.4, respectively. The lower left and right figures give the trajectory of the point of contact of the hoop as in the upper right and upper left figures of figure 7.5, respectively.

7.12 The bifurcation diagram

In this section we study in detail the behavior of the critical points of the family (σ̄3, σ̄4) ↦ Vσ̄3,σ̄4 of potential functions on (−1, 1) × R, see (83).

7.12.1 The bifurcation set B

Even though for points (σ̄3, σ̄4) off the degeneracy locus ∆, the potential Vσ̄3,σ̄4 is a Morse function, that is, all of its critical points are nondegenerate, it can happen that as (σ̄3, σ̄4) varies over a connected component of R² \ ∆ the Morse data² of Vσ̄3,σ̄4 changes.

Fig. 7.8 The bifurcation set B of the family (σ̄3, σ̄4) ↦ Vσ̄3,σ̄4 of potentials for the uniform disk. [Axes: σ̄3 and σ̄4; labeled curves: ℓ−, ℓ+ and ∆.]

Proposition 7.12.1.32. If (σ̄3, σ̄4) varies in a connected component of R² \ (∆ ∪ ℓ+ ∪ ℓ−), then the Morse data of Vσ̄3,σ̄4 does not change.

Proof. From (165) it follows that as (σ̄3, σ̄4) varies in R² \ (ℓ+ ∪ ℓ−), no critical point σ1⁰ of Vσ̄3,σ̄4 can escape to or come in from the boundary of (−1, 1). Therefore as (σ̄3, σ̄4) varies in R² \ (ℓ+ ∪ ℓ−), all critical points of Vσ̄3,σ̄4 remain in a compact subset of (−1, 1). As every critical point is then nondegenerate, it is isolated. Because its set of critical points is compact, it follows that Vσ̄3,σ̄4 has only finitely many critical points for each (σ̄3, σ̄4) ∈ R² \ (∆ ∪ ℓ+ ∪ ℓ−). From the implicit function theorem we see that these critical points vary in an analytic way as the parameters (σ̄3, σ̄4) vary over R² \ (∆ ∪ ℓ+ ∪ ℓ−). Therefore their number is constant on each connected component of R² \ (∆ ∪ ℓ+ ∪ ℓ−). Since each of the subspaces ℓ+ and ℓ− intersects the degeneracy locus ∆ transversely at two diametrically opposite points, the set R² \ (∆ ∪ ℓ+ ∪ ℓ−) has eight connected components: four in the exterior of ∆ and four in the interior. The Morse type of each nondegenerate critical point, indicating whether it is a local minimum or maximum, that is, of Morse index 0 or 1, respectively, does not change under smooth perturbations of Vσ̄3,σ̄4. Therefore on a connected component of R² \ (∆ ∪ ℓ+ ∪ ℓ−) the Morse data of Vσ̄3,σ̄4 is constant.

The conclusion of proposition 7.12.1.32 is the reason we call B = ∆ ∪ ℓ+ ∪ ℓ− the bifurcation set of the rolling disk in (σ̄3, σ̄4)–space. If p = (σ̄3, σ̄4) is a smooth point of the degeneracy locus ∆, where it is permitted that p ∈ ℓ+ ∪ ℓ−, then at p the function Vp has a third order degenerate critical point σ1⁰. This critical point disappears if p moves to the exterior of ∆; whereas two nondegenerate critical points appear near σ1⁰ if p moves into the interior of ∆: one of which is a local minimum and the other is a local maximum. Therefore, if we know the number of critical points of Vp and their Morse types for some p in each connected component of R² \ B in the exterior of ∆, then we know these data for every p ∈ R² \ B.

² Namely, the number of critical points or their Morse index, that is, the number of negative eigenvalues of the second derivative at a critical point.

7.12.2 Off the bifurcation set B

First we consider the case when (σ̄3, σ̄4) does not lie on B.

Lemma 7.12.2.33. Let (σ̄3∗, σ̄4∗) be one of the four cusp points of the degeneracy locus ∆, see proposition 7.9.2.19. If either |σ̄4| ≥ |σ̄4∗| or |σ̄3| ≥ |σ̄3∗|, then V0,σ̄4 or Vσ̄3,0, respectively, has σ1 = 0 as its only critical point, where each of the functions attains its minimal value.

Proof. Using the notation of §6 and §7.3, if X = 0 and z ≠ 0, then the solution of the rescaled Chaplygin equations (92) is given by (x(z), y(z)) = (Y o(z), Y o′(z)). Therefore

$$0 \ne \frac{\sqrt{1-z^2}}{z}\, C\widetilde{W} = \left[\, dY^2\,\frac{1}{(1-z^2)^{3/2}}\, o(z)^2 + (c+d)\,Y^2\,\frac{1}{\sqrt{1-z^2}}\,\frac{o(z)}{z}\, o'(z) - 1 \right], \tag{199}$$

see (148). From §7.3 we see that the normalized odd solution of the rescaled Chaplygin equation satisfies o(z)/z > 1 and o′(z) > 1. Because 1/√(1 − z²) > 1, the first term in the square brackets in (199) is positive, while the second term is greater than or equal to 1 if (c + d) Y² ≥ 1. Therefore the expression between the square brackets in (199) is strictly positive if

(c + d) Y² ≥ 1.    (200)

Since σ̄4 = ũ Y (98) and ũ = √(mgr/(I3 + mr²)), equation (200) is equivalent to

$$\overline{\sigma}_4^{\,2} \ge \frac{1}{c+d}\,\frac{mgr}{I_3 + mr^2} = (\overline{\sigma}_4^{\,*})^2,$$

using (132). Therefore V0,σ̄4 has no critical points other than σ1 = 0, if |σ̄4| ≥ |σ̄4∗|. From (164) it follows that this critical point is a global minimum of V0,σ̄4.

If Y = 0 and z ≠ 0, then (x(z), y(z)) = (X e(z), X e′(z)) is a solution of the rescaled Chaplygin equations. Therefore

$$0 \ne \frac{\sqrt{1-z^2}}{z}\, C\widetilde{W} = \left[\, dX^2\,\frac{1}{(1-z^2)^{3/2}}\, e(z)^2 + (c+d)\,X^2\,\frac{1}{\sqrt{1-z^2}}\, e(z)\,\frac{e'(z)}{z} - 1 \right]. \tag{201}$$

But the normalized even solution of the rescaled Chaplygin equations satisfies e(z) > 1 and e′(z)/z > c, see §7.3. Because 1/√(1 − z²) > 1 and 1/(1 − z²)^{3/2} > 1, the expression in (201) between square brackets is strictly positive if

(d + cd + c²) X² ≥ 1.    (202)

Since σ̄3 = −(I3/I1) ũ X (98) and ũ = √(mgr/(I3 + mr²)), we see that

$$\overline{\sigma}_3^{\,2} = \frac{I_3^2}{I_1^2}\,\frac{mgr}{I_3 + mr^2}\, X^2 \ge \frac{I_3^2}{I_1^2}\,\frac{mgr}{I_3 + mr^2}\,\frac{1}{d + cd + c^2} = (\overline{\sigma}_3^{\,*})^2,$$

using (133). Therefore Vσ̄3,0 has no critical points other than σ1 = 0, if |σ̄3| ≥ |σ̄3∗|. Again from (164) it follows that this critical point is the global minimum of Vσ̄3,0.

Because each connected component of R² \ B in the exterior of ∆ contains points where lemma 7.12.2.33 holds, we conclude that if (σ̄3, σ̄4) lies in the exterior of ∆ and does not lie on either of the lines ℓ+ or ℓ−, then Vσ̄3,σ̄4 has precisely one critical point, which is a global minimum. If (σ̄3, σ̄4) ∈ R² \ (ℓ+ ∪ ℓ−) is in the interior of ∆, then Vσ̄3,σ̄4 has three critical points, one of which is a local maximum and the other two are local minima, which are separated by the local maximum. This follows because on ∆ \ (ℓ+ ∪ ℓ−) less its four cusp points, the family (σ̄3, σ̄4) → Vσ̄3,σ̄4 has an A2-singularity. In other words, Vσ̄3,σ̄4 has two critical points, one of which is a third order degenerate critical point and the other a nondegenerate global minimum, see proposition 7.8.4.14. If (σ̄3, σ̄4) is a cusp point of ∆, then from proposition 7.8.3.12 and lemma 7.12.2.33 it follows that Vσ̄3,σ̄4 has only one critical point σ1 = 0, which is degenerate of degree four, and is a global minimum of Vσ̄3,σ̄4.

7.12.3 On a coordinate axis or in an open quadrant

In this subsection we consider the case when (σ̄3, σ̄4) lies on either the σ̄3-axis or the σ̄4-axis or in an open quadrant of the (σ̄3, σ̄4)-plane.

Fig. 7.9 The qualitative behavior of the potential V0,σ̄4 along the σ̄4-axis for the uniform disk, when it crosses the bifurcation set B at one of the cusps (0, ±√(5/12)). [Three panels plot V0,σ̄4 against σ1 ∈ (−1, 1), for 0 < |σ̄4| < √(5/12), |σ̄4| = √(5/12), and |σ̄4| > √(5/12).]
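The qualitative change of an even potential as the parameter crosses a cusp of ∆ can be mimicked by a toy model. The sketch below uses the quartic family V(x) = x⁴/4 − a x²/2, which is not the potential Vσ̄3,σ̄4 of the text; it is chosen only because it shows the same three regimes: one nondegenerate minimum, one critical point degenerate of degree four, and two minima separated by a maximum. The parameter a and the function name are hypothetical.

```python
import math

def toy_critical_points(a: float):
    """Critical points of the toy potential V(x) = x**4/4 - a*x**2/2.

    V'(x) = x**3 - a*x vanishes at x = 0 and, when a > 0, at x = +-sqrt(a).
    Each point is classified by the sign of V''(x) = 3*x**2 - a.
    Returns a list of (x, kind) pairs ordered by x.
    """
    points = [0.0]
    if a > 0:
        root = math.sqrt(a)
        points = [-root, 0.0, root]
    classified = []
    for x in points:
        second = 3.0 * x * x - a
        if second > 0:
            kind = "min"          # nondegenerate, Morse index 0
        elif second < 0:
            kind = "max"          # nondegenerate, Morse index 1
        else:
            kind = "degenerate"   # here V(x) = x**4/4, degenerate of degree four
        classified.append((x, kind))
    return classified

if __name__ == "__main__":
    for a in (-1.0, 0.0, 1.0):
        print(a, toy_critical_points(a))
```

In this analogy a < 0 plays the role of a parameter outside ∆, a = 0 that of a cusp point, and a > 0 that of a parameter inside ∆ on a coordinate axis, where the two minima are symmetric about the maximum at 0.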

Suppose that either σ̄3 = 0 or σ̄4 = 0. Then under either of the symmetries (σ1, σ̄3, σ̄4) ↦ (−σ1, −σ̄3, σ̄4) or (σ1, σ̄3, σ̄4) ↦ (−σ1, σ̄3, −σ̄4) the potential Vσ̄3,σ̄4 is invariant, that is, Vσ̄3,σ̄4(σ1) = Vσ̄3,σ̄4(−σ1). Therefore, if σ1⁰ is a critical point of Vσ̄3,σ̄4, then so is −σ1⁰. Moreover, if σ1⁰ is nondegenerate, then the Morse index of −σ1⁰ is the same as the Morse index of σ1⁰. If (σ̄3, σ̄4) is outside ∆, then Vσ̄3,σ̄4 has only one critical point, which must be σ1 = 0, see lemma 7.12.2.33. If (σ̄3, σ̄4) is inside ∆, then Vσ̄3,σ̄4 has a unique local maximum, which is located at σ1 = 0, and two local minima σ1⁻, σ1⁺ with σ1⁻ < σ1⁺, for which σ1⁻ = −σ1⁺. If (σ̄3, σ̄4) = (0, 0) then V0,0(σ1) = mgr √(1 − σ1²), whose graph is the upper semicircle. The situation when (σ̄3, σ̄4) is on one of the coordinate axes is illustrated in figure 7.9.

Conversely, if σ1⁰ = 0 is a critical point of Vσ̄3,σ̄4, then from (45) it follows that σ3σ4 = 0. Because σ1⁰ = 0 implies that σ3 = σ̄3 and σ4 = σ̄4 (see (78) and (79)), we obtain σ̄3σ̄4 = 0. In other words, if (σ̄3, σ̄4) is not on one of the coordinate axes, then every critical point σ1⁰ of Vσ̄3,σ̄4 is nonzero.

Lemma 7.12.3.34. If σ̄3σ̄4 > 0 and σ1 > 0, then

Vσ̄3,σ̄4(−σ1) > Vσ̄3,σ̄4(σ1)  and  V′σ̄3,σ̄4(−σ1) < −V′σ̄3,σ̄4(σ1).    (203)

The other cases can be obtained by applying either of the symmetries (σ1, σ̄3, σ̄4) ↦ (−σ1, −σ̄3, σ̄4) or (σ1, σ̄3, σ̄4) ↦ (−σ1, σ̄3, −σ̄4).

Proof. We insert x(z) = X e(z) + Y o(z) and y(z) = x′(z) = X e′(z) + Y o′(z), see (110), into equation (95), which defines W_{X,Y}. Then we use the fact that

x(−z) = X e(−z) + Y o(−z) = X e(z) − Y o(z)

y(−z) = X e′(−z) + Y o′(−z) = −X e′(z) + Y o′(z),

which implies

x(−z)² − x(z)² = −4XY e(z) o(z)

y(−z)² − y(z)² = −4XY e′(z) o′(z).    (204)

In this way we obtain

$$W_{X,Y}(-z) - W_{X,Y}(z) = -2XY\left( d\,\frac{1}{1-z^2}\, e(z)\, o(z) + e'(z)\, o'(z) \right). \tag{205}$$

Because the functions e(z), o(z), e′(z), and o′(z) are positive when z > 0, it follows from (205) that W_{X,Y}(−z) > W_{X,Y}(z), when z > 0 and XY < 0. In view of the fact that σ̄3 = −(I3/I1) ũ X and σ̄4 = ũ Y (98), where z = σ1, the first inequality in (203) follows. If we insert

−z x(−z)² + z x(z)² = 4XY z e(z) o(z)

and

x(−z) y(−z) + x(z) y(z) = 2XY (e(z) o′(z) + o(z) e′(z))

into the expression for (C W̃)(−z) + (C W̃)(z) we get

$$(C\widetilde{W})(-z) + (C\widetilde{W})(z) = 2XY\left( 2d\,\frac{z}{(1-z^2)^2}\, e(z)\, o(z) + e(z)\, o'(z) + o(z)\, e'(z) \right). \tag{206}$$

This yields the second inequality in (203).

Suppose that (σ̄3, σ̄4) ∉ ℓ+ ∪ ℓ− and that σ̄3σ̄4 > 0. If Vσ̄3,σ̄4 attains its global minimum at σ1⁰, then σ1⁰ ≠ 0. Moreover, the first inequality in (203) shows that σ1⁰ > 0. From proposition 7.8.3.12 it follows that if (σ̄3, σ̄4) is near one of the four cusp points of ∆ and inside ∆, then Vσ̄3,σ̄4 has two local minima σ1⁻ and σ1⁺ with σ1⁻ < 0 < σ1⁺. Because these critical points do not pass through 0, when (σ̄3, σ̄4) varies in an open quadrant of the (σ̄3, σ̄4)-plane and lies in that part of R² \ B which is inside ∆, it follows that Vσ̄3,σ̄4 has two local minima σ1⁻ and σ1⁺ with σ1⁻ < 0 < σ1⁺. Since there are no other local minima, σ1⁻ is a global minimum of Vσ̄3,σ̄4|(−1, 0] and σ1⁺ is a global minimum of Vσ̄3,σ̄4|[0, 1).

Now suppose that σ̄3σ̄4 > 0. Because Vσ̄3,σ̄4 attains its global minimum at σ1⁰ > 0, it follows that σ1⁰ = σ1⁺. So Vσ̄3,σ̄4(σ1⁺) < Vσ̄3,σ̄4(σ1⁻). From (203) we see that

0 > V′σ̄3,σ̄4(−σ1⁻) + V′σ̄3,σ̄4(σ1⁻) = V′σ̄3,σ̄4(−σ1⁻),

since σ1⁻ is a critical point of Vσ̄3,σ̄4. This in combination with (164) implies that Vσ̄3,σ̄4 has a local minimum in (−σ1⁻, 1). But σ1⁺ is the unique minimum of Vσ̄3,σ̄4|[0, 1). Therefore σ1⁺ > −σ1⁻, assuming that σ̄3σ̄4 > 0.

If (σ̄3, σ̄4) ∉ ℓ+ ∪ ℓ− is inside ∆, Vσ̄3,σ̄4 has one more critical point σ1∗, which is a local maximum. If (σ̄3, σ̄4) approaches a point of ∆ in the positive quadrant, then σ1∗ merges with a local minimum to form a third degree degenerate critical point, which is not the global minimum of Vσ̄3,σ̄4. Therefore σ1∗ merges with σ1⁻, which implies that σ1∗ < 0. Because the critical points of Vσ̄3,σ̄4 do not cross 0 as long as (σ̄3, σ̄4) stays in the same open quadrant, we obtain that σ1∗ < 0 for the local maximum σ1∗ of Vσ̄3,σ̄4 as long as σ̄3σ̄4 > 0.
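The identities (204) used in the proof of lemma 7.12.3.34 rely only on e being even and o being odd, so that e′ is odd and o′ is even. The sketch below checks them numerically with stand-in functions e = cosh and o = sinh. These stand-ins are an assumption made for illustration; they are not the normalized solutions of the rescaled Chaplygin equations, and the function name is ours.

```python
import math

def parity_identity_residuals(X, Y, z, e, o, de, do):
    """Residuals of the two identities in (204).

    x(z) = X*e(z) + Y*o(z) and y(z) = X*de(z) + Y*do(z), where e is even,
    o is odd, and de, do are their derivatives. Both returned numbers
    should vanish up to rounding error.
    """
    x = lambda t: X * e(t) + Y * o(t)
    y = lambda t: X * de(t) + Y * do(t)
    res_x = (x(-z) ** 2 - x(z) ** 2) - (-4.0 * X * Y * e(z) * o(z))
    res_y = (y(-z) ** 2 - y(z) ** 2) - (-4.0 * X * Y * de(z) * do(z))
    return res_x, res_y

# Stand-in even and odd functions with their derivatives (illustration only).
residuals = parity_identity_residuals(
    X=0.8, Y=-0.5, z=0.3,
    e=math.cosh, o=math.sinh, de=math.sinh, do=math.cosh,
)
```

With XY < 0 and z > 0, as in this sample, both differences x(−z)² − x(z)² and y(−z)² − y(z)² are positive for the stand-ins, which is the sign used to obtain the first inequality in (203).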

7.12.4 Near ℓ±

In this subsection we look at the behavior of Vσ̄3,σ̄4 where (σ̄3, σ̄4) is near one of the lines ℓ±.

Proposition 7.12.4.35. If p = (σ̄3, σ̄4) ∉ ℓ± approaches a point q = (σ̄3, σ̄4) of ℓ±, then there is a critical point of Vσ̄3,σ̄4 which converges to ±1.

Proof. Let µ > 0 be given. From lemma 7.10.2.23 it follows that there exist numbers a and b with 1 − µ ≤ a < b < 1 such that Vq(a) − Vq(b) > 0. Because the potential Vσ̄3,σ̄4 depends continuously on the parameters (σ̄3, σ̄4), there is a δ > 0 such that if |p − q| < δ, then Vp(a) − Vp(b) > 0. If p ∉ ℓ+, then from (164) it follows that there is a number α ∈ (b, 1) such that Vp(α) > Vp(b). Because Vp is continuous on [a, α], it attains its minimum at some point σ ∈ [a, α]. So Vp(σ) ≤ Vp(b). Therefore σ ≠ a and σ ≠ α, which implies that V′p(σ) = 0. Thus for every p ∉ ℓ+ such that |p − q| < δ, the potential Vp has a critical point σ ∈ (−1, 1) such that |σ − 1| < µ. The case that q ∈ ℓ− is proved similarly.

If σ̄3σ̄4 > 0, then p = (σ̄3, σ̄4) can only approach a point q ∈ ℓ+. In this situation Vp has only one critical point σ1⁰ in [0, 1), at which Vp attains its global minimum. We conclude that σ1⁰ → 1 if p → q. Using lemma 7.10.2.23, we know that Vq(σ1) has a continuous extension to (−1, 1] and it attains its global minimum at σ1 = 1 with minimal value

$$V_q(1) = \frac{I_1^2\,(I_3 + mr^2)}{2\, I_3^2\, r(0)^2}\;\overline{\sigma}_3^{\,2},$$

see (169).

If σ1⁰ stays away from ±1, then the only bifurcations which can happen are the creation or annihilation of critical points at degenerate critical points, that is, when q ∈ ∆ ∩ ℓ+. In this situation there is a unique point σ1⁰(q) at which Vq has a third degree degenerate critical point. For p ∈ ∆ \ ℓ+ close to q, the degenerate critical point σ1⁰(p) satisfies σ1⁰(p) < 0. Because σ1⁰(p) depends analytically on p ∈ ∆, when (σ1⁰(p), p) ∈ ∆̃, see proposition 7.9.1.15, it follows that σ1⁰(p) < 0 as well. Therefore, if q ∈ ℓ+ but is outside of ∆, then Vq has no critical points at all and is monotonically decreasing from +∞ at σ1 = −1 to its global minimum Vq(1) at σ1 = 1. If q ∈ ℓ+ is inside of ∆, then Vq has a local minimum at σ1⁻ and a local maximum at σ1∗ such that σ1⁻ < σ1∗ < 0 and Vq(σ1∗) > Vq(σ1⁻) > Vq(1).

7.12.5 Global qualitative description of Vσ̄3,σ̄4

In this subsection we summarize the results we have obtained about the family (σ̄3, σ̄4) → Vσ̄3,σ̄4 of potential functions. Figure 7.10 illustrates this family of potentials for the uniform disk.

Proposition 7.12.5.36. Write p = (σ̄3, σ̄4).

a) Suppose that p = (0, 0). Then Vp(σ1) = mgr √(1 − σ1²).

b) Suppose that p ≠ (0, 0) but σ̄3σ̄4 = 0.

i) If p is outside ∆, then Vp has one critical point at σ1 = 0, which is a nondegenerate global minimum of Vp.
ii) If p is inside ∆, then Vp has three nondegenerate critical points σ1⁻, 0, and σ1⁺ such that σ1⁻ < 0 < σ1⁺. Here σ1⁻ and σ1⁺ are local minima and 0 is a local maximum. Moreover σ1⁻ = −σ1⁺, Vp(σ1⁻) = Vp(σ1⁺), and Vp attains its global minimum at both σ1⁻ and σ1⁺. As σ1 → ±1, we have Vp(σ1) ↑ ∞.
iii) If p ∈ ∆, then Vp has a unique critical point, which is degenerate of fourth degree. Moreover, Vp(σ1) ↑ ∞ as σ1 → ±1.

c) Suppose that p ∉ ℓ+ and σ̄3σ̄4 > 0.

i) If p is outside ∆, then Vp has only one critical point σ1⁰, which is nondegenerate and satisfies σ1⁰ > 0. At σ1⁰ the potential Vp attains its global minimum value.
ii) If p is inside ∆, then Vp has three nondegenerate critical points, σ1⁻, σ1∗, and σ1⁺ such that σ1⁻ < σ1∗ < 0 < σ1⁺ with σ1⁺ > −σ1⁻ and Vp(σ1⁺) < Vp(σ1⁻) < Vp(σ1∗). The critical points σ1⁻ and σ1⁺ are local minima of Vp, while σ1∗ is a local maximum. Vp attains its global minimum value at σ1⁺ and as σ1 → ±1, we have Vp(σ1) → ∞.
iii) If p ∈ ∆, then Vp has two critical points σ1⁻ and σ1∗ with σ1⁻ < σ1∗ < 0 and Vp(σ1∗) > Vp(σ1⁻) > Vp(1). The critical point σ1⁻ is a nondegenerate local minimum and σ1∗ is a degenerate critical point of third degree. Vp|[σ1∗, 1) monotonically decreases to Vp(1).

d) Suppose that p ∈ ℓ+ and p ≠ (0, 0). Then Vp(σ1) → ∞ as σ1 → −1 and

$$V_p(\sigma_1) \to V_p(1) = \frac{I_1^2\,(I_3 + mr^2)}{2\, I_3^2\, r(0)^2}\;\overline{\sigma}_3^{\,2} \quad\text{as } \sigma_1 \uparrow 1.$$

For every σ1 ∈ (−1, 1) we have Vp(1) < Vp(σ1).

Fig. 7.10 The bifurcation diagram for the family (σ̄3, σ̄4) → Vσ̄3,σ̄4 of potential functions. In each inset the horizontal axis is σ1 and the vertical axis gives the values of the potential Vσ̄3,σ̄4, where (σ̄3, σ̄4) is in the numbered region or lettered position. ◦ is a limiting value of the potential; • is a nondegenerate critical point; a black box is a degenerate critical point; a vertical dotted line is a vertical asymptote. The insets labeled with a prime are reflections in the vertical axis of the corresponding unprimed inset.

i) If p is outside ∆, then Vp has no critical points in (−1, 1).
ii) If p is inside ∆, then Vp has two nondegenerate critical points σ1⁻ and σ1∗ such that σ1⁻ < σ1∗ < 0 and Vp(σ1⁻) < Vp(σ1∗). Moreover, σ1⁻ is a local minimum and σ1∗ is a local maximum of Vp.
iii) If p ∈ ∆, then σ1⁻ and σ1∗ merge to form a third degree degenerate critical point σ1⁰ < 1. Also Vp decreases monotonically from ∞ to Vp(1) as σ1 increases from −1 to 1.

7.12.6 Global description of the orbits of Xσ̄3,σ̄4

In this subsection we give a global qualitative description of the solutions of the conservative Newtonian system (85) in the (ϕ, ϕ̇)-plane, which arises from the potential V(ϕ) = Vσ̄3,σ̄4(sin ϕ) (83). This is the same as giving a description of the orbits in (σ1, σ2)-space, where dσ1/dt = σ2. We use the results we have obtained in §5.3. The results of this discussion for the uniform disk are summarized in figure 7.11 below.

Fig. 7.11 The level sets of the total energy Hσ̄3,σ̄4 on the (σ1, σ2)-plane. In each inset the horizontal axis is σ1 and the vertical axis is σ2. The curves are the level lines of the Hamiltonian Hσ̄3,σ̄4, where (σ̄3, σ̄4) is in the numbered region or lettered position. • is a nondegenerate critical point; a black box is a degenerate critical point; ◦ is a limit point of a level set. The insets labeled with a prime are reflections in the vertical axis of the corresponding unprimed inset.

When p = (σ̄3, σ̄4) is in the first quadrant with σ̄3 > 0 and σ̄4 > 0, is inside ∆, and does not lie on ℓ+, then the potential Vp has three nondegenerate critical points σ1⁻, σ1∗, and σ1⁺ such that σ1⁻ < σ1∗ < 0 < σ1⁺ and Vp(σ1⁺) < Vp(σ1⁻) < Vp(σ1∗). When the total energy E (81) is greater than Vp(σ1∗), then the level set of the energy is a closed curve on which the integral curve of Xσ̄3,σ̄4 is periodic, that is, σ1 oscillates between two values, where the value of the potential Vp is E. If E = Vp(σ1∗), then because σ1∗ is a nondegenerate critical point of Vp of Morse index 1, it follows that the level set of the energy is a figure eight, consisting of a hyperbolic equilibrium point (σ1∗, 0) and its stable and unstable manifolds (which are equal). If E ∈ (Vp(σ1⁻), Vp(σ1∗)), then the energy level set consists of two closed curves, on which the motion is periodic. More precisely, the set {σ1 ∈ (−1, 1) | Vp(σ1) ≤ E} consists of two intervals I⁺ and I⁻, where σ1± ∈ I±. Because Vp(σ1⁺) < Vp(σ1⁻) we call I⁺ the lowest potential well and I⁻ the highest. The periodic orbits of Xσ̄3,σ̄4 correspond to the angle ϕ(t) oscillating in one of the potential wells. When E ↓ Vp(σ1⁻), the highest potential well shrinks to the point σ1⁻ and the corresponding periodic orbit shrinks to the elliptic equilibrium point (σ1⁻, 0). The periodic orbit in the lowest potential well still exists. When E ∈ (Vp(σ1⁺), Vp(σ1⁻)) the level set of the energy is one closed curve, where the angle oscillates in the lowest potential well. As E ↓ Vp(σ1⁺), the lowest potential well shrinks to the elliptic equilibrium point (σ1⁺, 0). Because Vp attains its global minimum at σ1 = σ1⁺, the energy level set is empty when E < Vp(σ1⁺).

When p = (σ̄3, σ̄4) lies on ℓ+ but is outside ∆, then all solutions rise from the flat position σ1 = 1 and fall back to the flat position in finite time.

When p ∈ ℓ+ but is inside ∆, then Vp has three nondegenerate critical points σ1⁻, σ1∗, and σ1⁺, as described above, but σ1⁺ has moved to 1. If E > Vp(σ1∗), then the energy level set is an open interval with end points at (1, 0) in the (σ1, σ2)–plane. The solution leaves (1, 0) and returns there in finite time. If E ∈ (Vp(σ1⁻), Vp(σ1∗)), then the highest potential well survives; whereas the solution which was in the lowest well now comes from and goes to (1, 0) in finite time. When E ∈ (Vp(1), Vp(σ1⁻)) only the solution rising from the flat position survives. It shrinks to (1, 0) as E ↓ Vp(1). Because Vp(1) is a global minimum of Vp, the energy level set is empty when E < Vp(1).

When p ∈ ∆ ∩ ℓ+, then σ1∗ is a degree three degenerate critical point of Vp; whereas 1 is an absolute minimum. If E > Vp(σ1∗) or E ∈ (Vp(1), Vp(σ1∗)), then the energy level set is an open interval whose ends are at the point (1, 0). The disk rises from being flat and reverses its motion as it falls flat. If E = Vp(σ1∗), then the energy level set is a closed curve with a cusp at σ1 = σ1∗. As E ↓ Vp(1) the energy level set shrinks to the point σ1 = 1; while for E < Vp(1) it is empty, because Vp(1) is the absolute minimum value of Vp.

The other cases follow using the symmetries (σ1, σ̄3, σ̄4) ↦ (−σ1, −σ̄3, σ̄4) and (σ1, σ̄3, σ̄4) ↦ (−σ1, σ̄3, −σ̄4), which reflect the graph of the potential in the vertical coordinate axis and reflect the orbits in the first quadrant to those in the second or the fourth quadrant, respectively.

7.13 The integral map

In this section we consider σ 3 and σ 4 as functions on the fully reduced phase space (−1, 1) × R3 . In other words, (σ 3 , σ 4 ) are components of the map π1 ◦ ψ −1 ◦ P introduced in §5.1. By construction, the functions σ 3 and σ 4 are integrals, or constants of motion, of the fully reduced vector field V (13). These integrals are not defined algebraically, but instead are found by solving Chaplygin’s equations (76).


The rolling disk

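If we combine these integrals with the energy integral $E$ (19), we obtain the integral map

$$I : (-1,1) \times \mathbb{R}^3 \to \mathbb{R}^3 : (\sigma_1, \sigma_2, \sigma_3, \sigma_4) \mapsto \bigl(\bar\sigma_3,\ \bar\sigma_4,\ E_{\bar\sigma_3,\bar\sigma_4}(\sigma_1, \sigma_2)\bigr), \qquad (207)$$

where

$$E_{\bar\sigma_3,\bar\sigma_4}(\sigma_1, \sigma_2) = E\bigl(\sigma_1, \sigma_2, \sigma_3(\sigma_1; \bar\sigma_3, \bar\sigma_4), \sigma_4(\sigma_1; \bar\sigma_3, \bar\sigma_4)\bigr).$$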

Because the derivative of the map $(\sigma_1, \sigma_2, \sigma_3, \sigma_4) \mapsto (\bar\sigma_3, \bar\sigma_4)$ has rank two everywhere (in fact this map is an analytic fibration), it follows that the derivative of the integral map $I$ has rank less than three if and only if $(\sigma_1, \sigma_2)$ is a critical point of $E_{\bar\sigma_3,\bar\sigma_4}$. This in turn is equivalent to saying that $\sigma_2 = 0$ and $\sigma_1$ is a critical point of the potential $V_{\bar\sigma_3,\bar\sigma_4}$ (83).

7.13.1 Regular values of I

If $(\bar\sigma_3, \bar\sigma_4, E)$ is a regular value of $I$ which lies in its image, then the fiber $I^{-1}(\bar\sigma_3, \bar\sigma_4, E)$ is a smooth one dimensional submanifold of $(\sigma_1, \sigma_2)$-space $(-1, 1) \times \mathbb{R}$. This submanifold is a union of integral curves of the vector field $X_{\bar\sigma_3,\bar\sigma_4}$ (86). If $(\bar\sigma_3, \bar\sigma_4) \notin \ell_{+} \cup \ell_{-}$, then the fiber $I^{-1}(\bar\sigma_3, \bar\sigma_4, E)$ consists of one or two periodic orbits of $X_{\bar\sigma_3,\bar\sigma_4}$; whereas if $(\bar\sigma_3, \bar\sigma_4) \in \ell_{+} \cup \ell_{-}$, then the fiber also has orbits which are open intervals corresponding to motions where the disk falls flat in finite time.

Let $R$ be the set of regular values of the integral map $I$ which lie in its image. Let $C$ be a connected component of the complement of the bifurcation set $B = \Delta \cup \ell_{+} \cup \ell_{-}$. Let $\pi$ be the projection map $\pi : \mathbb{R}^3 \to \mathbb{R}^2 : (\bar\sigma_3, \bar\sigma_4, E) \mapsto (\bar\sigma_3, \bar\sigma_4)$. The set $R$ has three connected components. One is

$$R_2 = \{(\bar\sigma_3, \bar\sigma_4, E) \mid p = (\bar\sigma_3, \bar\sigma_4) \in C,\ p \text{ inside } \Delta,\ v_{-}(p) < E < v_{*}(p)\}.$$

We now define what $v_{-}$ and $v_{*}$ are. If $\sigma_1^0$ is a nondegenerate critical point of $V_{p^0}$ with $p^0 = (\bar\sigma_3^0, \bar\sigma_4^0)$, then by the implicit function theorem there is a neighborhood $U$ of $p^0$ in $\mathbb{R}^2$ and a neighborhood $W$ of $\sigma_1^0$ in $(-1, 1)$ such that for every $p \in U$ the potential $V_p$ has exactly one critical point $\sigma_1(p)$ in $W$. Let $v(p) = V_p(\sigma_1(p))$. From proposition 7.12.5.36 b) case ii), for every $p = (\bar\sigma_3, \bar\sigma_4) \in \pi(R_2)$ the potential $V_p$ has three nondegenerate critical points $\sigma_1^{-}(p) < \sigma_1^{*}(p) < \sigma_1^{+}(p)$ with corresponding critical values $v_{-}(p)$, $v_{*}(p)$, $v_{+}(p)$, which satisfy $v_{+}(p) < v_{-}(p) < v_{*}(p)$. Therefore $R_2$ is a connected component of the set $R$ of regular values of the integral map $I$ which lie in $\pi^{-1}(C)$. The fiber of $I$ over a point in $R_2$ consists of two periodic orbits of $X_{\bar\sigma_3,\bar\sigma_4}$. The other two connected components are

$$R_1^{i} = \{(\bar\sigma_3, \bar\sigma_4, E) \mid p = (\bar\sigma_3, \bar\sigma_4) \in C,\ p \text{ inside } \Delta,\ v_{+}(p) < E < v_{-}(p)\}$$

and

$$R_1^{o} = \{(\bar\sigma_3, \bar\sigma_4, E) \mid p = (\bar\sigma_3, \bar\sigma_4) \in C,\ p \text{ outside } \Delta,\ v_0(p) < E\},$$

where $v_0(p) = V_p(\sigma_1^0(p))$ and $\sigma_1^0(p)$ is the unique global minimum of $V_p$. The fiber of $I$ over a point in either $R_1^{i}$ or $R_1^{o}$ consists of one periodic orbit of $X_{\bar\sigma_3,\bar\sigma_4}$. This completes the description of the set of regular values of the integral map $I$.

7.13.2 The global geometry of the critical value surface

In this subsection we define the surface of critical values of the integral map $I$, find its singularities, and describe its global geometry.

7.13.2.1 The critical value surface

Let $\Sigma$ be the set of singular values of the integral map $I$ (207), that is, the image under $I$ of the set of points in $(-1, 1) \times \mathbb{R}^3$ where the derivative of $I$ has rank less than three. Then

$$\Sigma = \{(\bar\sigma_3, \bar\sigma_4, v) \in \mathbb{R}^3 \mid v \text{ is a critical value of } V_{\bar\sigma_3,\bar\sigma_4}\}.$$

Let $S$ be the set of critical points of the family $V_{\bar\sigma_3,\bar\sigma_4}$, that is,

$$S = \{(\sigma_1, \bar\sigma_3, \bar\sigma_4) \in (-1, 1) \times \mathbb{R}^2 \mid V'_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) = 0\}.$$

Then $\Sigma$ is the image of $S$ under the map

$$\Phi : (-1, 1) \times \mathbb{R}^2 \to \mathbb{R}^3 : (\sigma_1, \bar\sigma_3, \bar\sigma_4) \mapsto \bigl(\bar\sigma_3,\ \bar\sigma_4,\ V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1)\bigr). \qquad (208)$$

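Above we showed that the set $S^{\mathrm{nd}}$ of points $(\sigma_1, \bar\sigma_3, \bar\sigma_4) \in (-1, 1) \times \mathbb{R}^2$, where $\sigma_1$ is a nondegenerate critical point of $V_{\bar\sigma_3,\bar\sigma_4}$, is a smooth analytic surface in $(-1, 1) \times \mathbb{R}^2$. Moreover, the restriction of the map $\Phi$ (208) to $S^{\mathrm{nd}}$ is an analytic immersion into $\mathbb{R}^3$ with $\Phi(S^{\mathrm{nd}}) \subseteq \Sigma$. The set of critical points $S$ is the disjoint union of $S^{\mathrm{nd}}$ and the set $S^{\mathrm{deg}}$ of points $(\sigma_1, \bar\sigma_3, \bar\sigma_4) \in (-1, 1) \times \mathbb{R}^2$ where $\sigma_1$ is a degenerate critical point of $V_{\bar\sigma_3,\bar\sigma_4}$. Therefore $\Sigma = \Phi(S^{\mathrm{nd}}) \cup \Phi(S^{\mathrm{deg}})$. From proposition 7.9.1.15 it follows that there is a continuous function $\Delta \to (-1, 1) : p \mapsto \sigma_1(p)$ on the degeneracy locus $\Delta$, which is analytic on the smooth part of $\Delta$, such that

$$\{(\sigma_1(p), p) \in (-1, 1) \times \mathbb{R}^2 \mid p \in \Delta\} = S^{\mathrm{deg}}.$$

Below in §13.2.3 we show that $\Phi(S^{\mathrm{deg}}) \cap \Phi(S^{\mathrm{nd}}) = \emptyset$. Therefore we call $\Phi(S^{\mathrm{nd}})$ the nondegenerate part $\Sigma^{\mathrm{nd}}$ of $\Sigma$ and $\Phi(S^{\mathrm{deg}})$ the degenerate part $\Sigma^{\mathrm{deg}}$ of $\Sigma$.

For $p = (\bar\sigma_3, \bar\sigma_4)$ on the smooth part of the degeneracy locus $\Delta$, every degenerate critical point of $V_p$ is of third degree and can be approximated by two nondegenerate critical points of $V_{p'}$, where $p'$ lies inside of $\Delta$ and close to $p$. The corresponding critical values of $V_{p'}$ and $V_p$ depend continuously on $p'$ and $p$. Therefore $\Sigma^{\mathrm{deg}}$ is contained in the closure $\overline{\Sigma^{\mathrm{nd}}}$ of $\Sigma^{\mathrm{nd}}$. Because $\Sigma$ is the union of the analytically immersed surface $\Sigma^{\mathrm{nd}}$ and the analytic curve $\Sigma^{\mathrm{deg}}$, we call $\Sigma$ the critical value surface of the family $(\bar\sigma_3, \bar\sigma_4) \mapsto V_{\bar\sigma_3,\bar\sigma_4}$ of potential functions.

7.13.2.2 The singularities of Σ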

In order to describe the critical value surface $\Sigma$ near the singular points $\Sigma^{\mathrm{deg}}$, we use the normal form given in §8.3. We start with the observation that the local diffeomorphism $\mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}^3 : (x, v) \mapsto (\bar\sigma_3(x), \bar\sigma_4(x), v)$ maps a local piece of the set of critical values of the normal form family onto the corresponding set of critical values of the family of potential functions $(\bar\sigma_3, \bar\sigma_4) \mapsto V_{\bar\sigma_3,\bar\sigma_4}$. The set of critical values of the normal forms is the image under the map $\mathbb{R}^2 \times \mathbb{R} \to \mathbb{R}^2 \times \mathbb{R} : (x, v) \mapsto (x, v + h(x))$ of the set of critical values of the normal form (136) for the $A_2$ and $A_3$-singularities, namely, $p = 1$ and $p = 2$, respectively.

The set of critical values of the $A_2$ normal form family $((x_1, x_2), \alpha) \mapsto \alpha^3 + x_1 \alpha$ is

$$\{(-3\alpha^2,\ x_2,\ -2\alpha^3) \in \mathbb{R}^3 \mid \alpha \in \mathbb{R},\ x_2 \in \mathbb{R}\}.$$

The $x_2 = \text{const.}$ slice of the set of critical values is a standard cusp. Therefore the critical value surface of the $A_2$-family is a surface with a cusped edge along the $x_2$-axis. Transferring this to the critical value surface $\Sigma$, we find that along the smooth part of $\Sigma^{\mathrm{deg}}$, the critical value surface has a cusped edge with a tangent plane $\Pi$ which contains the tangent line of the smooth part of the curve $\Sigma^{\mathrm{deg}}$. The fact that $Dh$ on the cusp edge is increasing along the axis tangent to the branches of the cusp at the cusp point, see corollary 7.8.3.11, implies that $\Pi$ tilts upward if we move outward in a radial direction in the $(\bar\sigma_3, \bar\sigma_4)$-plane. The upper and lower sheets of the smooth part of $\Sigma$ near $\Sigma^{\mathrm{deg}}$ correspond to nondegenerate local maxima and minima of our family of potential functions. These sheets meet at the cusped edge, which corresponds to the set of third degree degenerate critical points. Because $\Sigma$ is singular along $\Sigma^{\mathrm{deg}}$, we call $\Sigma^{\mathrm{deg}}$ the singular part of $\Sigma$. Even though $\Sigma^{\mathrm{nd}}$ is an analytically immersed surface in $\mathbb{R}^3$ with self intersections, we will call $\Sigma^{\mathrm{nd}}$ the regular part of $\Sigma$.
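The cusp in the $A_2$ critical value set can be checked by hand: setting the derivative $3\alpha^2 + x_1$ of $\alpha^3 + x_1\alpha$ to zero gives $x_1 = -3\alpha^2$, and the value at that point is $-2\alpha^3$. The following Python sketch evaluates this parametrization and confirms that the points lie on the semicubical parabola $27 v^2 + 4 x_1^3 = 0$. The function names are illustrative only and do not come from the text.

```python
def a2_critical_value(alpha):
    """Return (x1, v) for f(a) = a**3 + x1*a having a critical point at a = alpha."""
    x1 = -3.0 * alpha**2          # from f'(alpha) = 3*alpha**2 + x1 = 0
    v = alpha**3 + x1 * alpha     # critical value, which simplifies to -2*alpha**3
    return x1, v


def cusp_defect(x1, v):
    """Vanishes exactly on the semicubical parabola 27*v**2 + 4*x1**3 = 0."""
    return 27.0 * v**2 + 4.0 * x1**3


if __name__ == "__main__":
    # Sample a few critical points; the last column should be (numerically) zero.
    for alpha in (-1.0, -0.5, 0.0, 0.5, 1.0):
        x1, v = a2_critical_value(alpha)
        print(alpha, x1, v, cusp_defect(x1, v))
```

The two branches $\alpha > 0$ and $\alpha < 0$ are the two sheets (local minima and local maxima of the normal form) that meet at the cusp point $\alpha = 0$.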


The set of critical values of the $A_3$ normal form family $((x_1, x_2), \alpha) \mapsto \alpha^4 + x_1 \alpha + x_2 \alpha^2$ is

$$\{(-4\alpha^3 - 2x_2\alpha,\ x_2,\ -3\alpha^4 - x_2\alpha^2) \in \mathbb{R}^3 \mid \alpha \in \mathbb{R},\ x_2 \in \mathbb{R}\},$$

which is the standard swallow tail in $(x_1, x_2, x_3)$-space with its axis along the $x_2$-axis and its tail upward, see figure 7.12. For negative $x_2$ the $x_1$-$x_3$ planar cross section of the swallow tail has two cusps lying over the degeneracy locus Γ0 (146) of the normal form, which is a cusp. As $x_2$ moves toward the origin, the cusped edges merge at the origin, which is called the organizing center of the swallow tail. The upper sheet of the tail, which connects the cusped edges, corresponds to the nondegenerate local maxima of the normal form; whereas the remaining smooth part corresponds to the nondegenerate minima. Note that the transverse self-intersection of the two sheets, corresponding to the local minima, lies below the sheet corresponding to the local maxima.

Fig. 7.12 The standard swallow tail.

With the geometry of the standard swallow tail in hand, we look at the corresponding singularities of the critical value surface $\Sigma$ of our family of potentials. Let $p^0 = (\bar\sigma_3^0, \bar\sigma_4^0)$ be a cusp point of the degeneracy locus $\Delta$ and let $q^0 = (p^0, V_{p^0}(0))$ be the corresponding point of $\Sigma$. Using proposition 7.9.2.19 we see that the part $\Sigma^0$ of $\Sigma$ lying over a neighborhood of $p^0$ is analytically diffeomorphic to the standard swallow tail surface with $q^0$ as its organizing center. The axis of $\Sigma^0$ lies in the vertical plane over the coordinate axis straddled by the branches of the cusp in $\Delta$ with cusp point $p^0$. The fact that $Dh(0)$ is positive along the positive direction of this axis implies that the swallow tail surface is tilted upward if we move radially outward in the $(\bar\sigma_3, \bar\sigma_4)$-plane, see proposition 7.9.2.21. The two cusped edges of $\Sigma$, which meet at $q^0$, lie over the branches of the cusp in $\Delta$, which meet at $p^0$. The cusped edges of $\Sigma$, which form part of $\Sigma^{\mathrm{deg}}$ near $q^0$, meet in a cusped fashion at $q^0$. Over the interior of $\Delta$ near $p^0$, the surface $\Sigma^0$ has three sheets: the upper one, corresponding to the nondegenerate local maxima of the potential functions, and two lower ones, which correspond to the nondegenerate minima. At the edge $\Sigma^{\mathrm{deg}} \setminus \{q^0\}$ near $q^0$, the upper sheet merges with the highest of the lower sheets. In addition, the lower sheets meet transversely over the coordinate axis straddled by the cusp of $\Delta$ at $p^0$, because the potential functions with parameter values along this axis are even functions.

7.13.2.3 The global structure of Σ

Using proposition 7.12.5.36 we now describe the global structure of the surface of critical values. The smooth part $\Sigma^{\mathrm{nd}} = \Phi(S^{\mathrm{nd}})$ of $\Sigma$ is the disjoint union of the stable part $\Sigma^{s}$, consisting of critical values of local nondegenerate minima of members of our family of potential functions, and an unstable part $\Sigma^{u}$, consisting of critical values corresponding to nondegenerate maxima. $\Sigma^{u}$ is the graph of an analytic function $v_{*}$, defined on the region inside of $\Delta$. The closure $\overline{\Sigma^{u}}$ of $\Sigma^{u}$ is equal to $\Sigma^{u} \cup \Phi(S^{\mathrm{deg}}) = \Sigma^{u} \cup \Sigma^{\mathrm{deg}}$, which is a compact subset of the interior of the image of the integral map $I$.³

³ This leads to the following global form of gyroscopic stabilization. If the energy $E$ is larger than that of the restriction of the integral map $I = (\bar\sigma_3, \bar\sigma_4, v)$ to $\Sigma^{u} \cup \Sigma^{\mathrm{deg}}$, then for every $(\sigma_1, \bar\sigma_3, \bar\sigma_4)$ such that $V'_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) = 0$ and $V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) > E$, the point $\sigma_1$ is a nondegenerate local minimum of $V_{\bar\sigma_3,\bar\sigma_4}$.

On each closed quadrant $Q$ in the $(\bar\sigma_3, \bar\sigma_4)$-plane bounded by coordinate semiaxes, there is an analytic function $v_{+}$ defined on the complement of $\ell_{\pm}$, whose graph is a subset of $\Sigma^{s}$. This subset of $\Sigma$ corresponds to the global minimum of a member of our family of potentials. The function $v_{+}$ can be extended continuously to $p \in \ell_{\pm}$ by defining $v_{+}(p) = V_p(\pm 1)$. Therefore $v_{+}$ is continuous on $Q$. The image of the integral map $I$ is the subset of $\mathbb{R}^3$ which lies on or above the graph of $v_{+}$, excluding the points $(p, V_p(\pm 1))$ with $p \in \ell_{\pm}$. Thus it is not a closed set. Over the exterior of $\Delta$, the function $v_{+}$ extends to an analytic function whose graph is equal to $\Sigma$. The remaining subset of $\Sigma^{s}$ is the graph of an analytic function $v_{-}$, defined on the inside of $\Delta$ less the coordinate axes. For every $p$ in the domain of $v_{-}$ we have $v_{+}(p) < v_{-}(p) < v_{*}(p)$, because $v_{+}(p) = V_p(\sigma_1^{+}(p))$, $v_{-}(p) = V_p(\sigma_1^{-}(p))$, and $v_{*}(p) = V_p(\sigma_1^{*}(p))$ when $p = (\bar\sigma_3, \bar\sigma_4)$ and $\bar\sigma_3 \bar\sigma_4 > 0$, see proposition 7.12.5.36 c) case ii). This description shows that $\Sigma^{\mathrm{deg}} \cap \Sigma^{\mathrm{nd}} = \Phi(S^{\mathrm{deg}}) \cap \Phi(S^{\mathrm{nd}}) = \emptyset$.

Fig. 7.13 The critical value surface.

The function $v_{-}$ extends to an analytic function on each closed quadrant in the inside of $\Delta$ so that on its bounding half-axis we have $v_{-} = v_{+}$. This corresponds to the self intersections of $\Sigma$, which are a continuation of the self intersections of the swallow tail surface near the swallow tail point (= its organizing center). If $p$ is in the open quadrant inside of $\Delta$, then $v_{+}(p)$ and $v_{-}(p)$ are the values of the two nondegenerate local minima of the potential $V_p$, which lie symmetrically in different connected components of $\mathbb{R} \setminus \{0\}$. When $p$ moves to an adjacent quadrant by crossing a bounding semiaxis, the nondegenerate local minima persist and do not cross $0$, but the respective values of $V_p$ get interchanged. In other words, the function $v_{+}$ in quadrant $Q_1$ analytically continues to $v_{-}$ in an adjacent quadrant $Q_2$. Similarly, $v_{-}$ in $Q_1$ becomes $v_{+}$ in $Q_2$. The graphs of $v_{-}$ and $v_{*}$ merge at the cusped edge of $\Phi(S^{\mathrm{nd}}) = \Sigma^{\mathrm{nd}}$. At the swallow tail points all the sheets of $\Sigma$ merge together, see figure 7.13.

7.14 Constant energy slices

7.14.1 Numerical pictures of the constant energy slices

All the figures for the constant energy slice of the critical value surface have been computed numerically for the uniform disk with $m r^2 = m g r = 1$, and therefore $I_1 = 1/4 = I_2$, $I_3 = 1/2$.

Fig. 7.14 (a) is the h = 1.251 slice of the critical value surface in the $(\bar\sigma_3, \bar\sigma_4)$-plane. (b) is an enlargement of the region $3.6 < \bar\sigma_3 < 3.9$ and $3.6 < \bar\sigma_4 < 3.9$.

Fig. 7.15

The h = 1.11 slice of the critical value surface.


Fig. 7.16

The h = 1.065 slice of the critical value surface.

Fig. 7.17

The h = 1.061 slice of the critical value surface.


Fig. 7.18

The h = 1.01 slice of the critical value surface.

Fig. 7.19

The h = .05 slice of the critical value surface.


In figures 7.14–7.19 we have calculated the horizontal slice of the critical value surface at the respective heights 1.251, 1.1, 1.065, 1.061, 1.01, and 0.5. The scenario is as follows.

When the height, corresponding to the energy level, is larger than 1.25 (= the energy at critical straight rolling), the horizontal slice is a closed curve, which is smooth except at the four points over $\ell_{\pm}$, where it has cusps pointing outward. These cusps are practically invisible in figure 7.14, but the enlargement indicates that the cusp sits like a nail on an otherwise round fingertip, where the round fingertip recedes gradually as the energy decreases. The interior of the closed curve is the domain of $(\bar\sigma_3, \bar\sigma_4)$-values for which we have one periodic orbit, whereas the exterior does not belong to the image of the integral map. In proposition 7.14.3.37 we show that at each energy level the $(\bar\sigma_3, \bar\sigma_4)$-values form a bounded set.

If the energy is between 1.1 (= the energy at critical rectilinear rolling) and 1.25, then two opposite swallow tails have formed in the horizontal slice, see figure 7.15. The interiors of the swallow tails are the domain of $(\bar\sigma_3, \bar\sigma_4)$-values for which we have two periodic orbits.

If the energy is in between approximately 1.061... and 1.1, see figures 7.15 and 7.16, then the horizontal slice has four swallow tails. The interior of each of these is the domain of $(\bar\sigma_3, \bar\sigma_4)$-values for which we have two periodic orbits. When the energy is approximately equal to 1.061..., see figure 7.17, the cusps of the swallow tails meet in the beak to beak singularity, which is discussed in §14.5 and the paragraph preceding it.

If the energy is between 0.5 and 1.061..., see figures 7.17–7.19, then the horizontal slice consists of a crossing of two railway tracks, which when going out meet in cusps over $\ell_{\pm}$; whereas in the interior of the railway track crossing there is another component which is a smooth oval. The region within the crossing and outside the oval is the domain of $(\bar\sigma_3, \bar\sigma_4)$-values for which we have two periodic orbits, whereas the other bounded components of the complement of the horizontal slice form the domain of $(\bar\sigma_3, \bar\sigma_4)$-values for which we have one periodic orbit. Note that in the interior of the oval the energy is larger than the local maximum of the potential function (= the energy at the hyperbolic relative equilibrium). The oval decreases with decreasing energy, see figures 7.17–7.18. It shrinks to the origin when the energy reaches the value 1.

For energy values between 0 and 1, see figure 7.19, we only have the aforementioned railway crossing. The railway tracks become extremely narrow when the energy shrinks to 0, and the figure looks like a cross of very thin needles. Indeed, it follows from lemma 7.14.7.44 below that for small energy the width of each needle is of the order of magnitude of the third power of the length of the needle.

7.14.2 Geometric features of the constant energy slices

Here we note the prominent geometric features of the constant energy slices of the critical value surface $\Sigma$. If one intersects $\Sigma$ with the horizontal plane $E_{\bar\sigma_3,\bar\sigma_4} = h$ and then projects the result to the $(\bar\sigma_3, \bar\sigma_4)$-plane, one obtains the constant energy slice $\Sigma_h$. We note the following geometric features shown for the uniform disk in figures 7.14–7.19 of §14.1.

1. For each $h$ the slice $\Sigma_h$ is a union of compact analytic curves which expand as $h$ increases.
2. There is a “finger nail” at the ends of the extremes of $\Sigma_h$ when $1.01 < h < 1.251$.
3. There are sections of four swallow tails when $1.01 < h < 1.061$, whose cusps merge to form an oval when $h = 1.061\ldots$.
4. The oval shrinks to a point when $h = 1$ and disappears when $h < 1$.
5. A slice of $\Sigma$ which is transverse to one of the lines $\ell_{\pm}$ is a curve with a cusp point at the point where $\ell_{\pm}$ intersects the slice. This point is not in the image of the integral map.

In the next few subsections, we prove some of the geometric features noted above.

7.14.3 Outward radial growth

In this section we study the critical values of the potential $V_{\bar\sigma_3,\bar\sigma_4}$ as the parameter $(\bar\sigma_3, \bar\sigma_4)$ increases in a radial direction. We prove

Proposition 7.14.3.37. We have

$$\inf_{\sigma_1 \in (-1,1)} V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1) \to \infty \quad \text{as } \|(\bar\sigma_3, \bar\sigma_4)\| \to \infty.$$

Proof. Suppose that the rescaled potential $W_{X,Y}$ (95) stays bounded as $\|(\bar\sigma_3, \bar\sigma_4)\| \to \infty$. Then the solutions $(x(z), y(z))$ of the rescaled Chaplygin equations (92) stay bounded. Because the Wronskian of the normalized even and odd solutions of the rescaled Chaplygin equations, see §7.3, is

$$e(z)\,o'(z) - o(z)\,e'(z) = 1 \cdot 1 - 0 \cdot 0 = 1,$$

we may solve

$$\begin{pmatrix} x(z) \\ y(z) \end{pmatrix} = \begin{pmatrix} e(z) & o(z) \\ e'(z) & o'(z) \end{pmatrix} \begin{pmatrix} X \\ Y \end{pmatrix}$$

to obtain

$$X = x(z)\,o'(z) - y(z)\,o(z), \qquad Y = -x(z)\,e'(z) + y(z)\,e(z).$$

Since $e(z)$ and $o(z)$ are bounded as a consequence of our hypothesis, $\|(X, Y)\|$ can only become large if either $|x(z)\,o'(z)|$ or $|x(z)\,e'(z)|$ becomes large as $z \to \pm 1$. However, we know that $o'(z) \sim -\tfrac12\, c\, r(0) \ln(1 \mp z)$ and $e'(z) \sim \pm\tfrac12\, c\, r'(0) \ln(1 \mp z)$ as $z \to \pm 1$. Consequently, $o'(z)/e'(z) \sim \mp\, r(0)/r'(0)$ as $z \to \pm 1$. Therefore both $|x(z)\,o'(z)|$ and $|x(z)\,e'(z)|$ become large as $z \to \pm 1$. In particular, $X$ becomes large as $z \to \pm 1$. But then

$$W_{X,Y}(z) \ge \frac{d}{2}\,\frac{x(z)^2}{1 - z^2} = \frac{d}{2}\,\frac{1}{1 - z^2}\left(\frac{X + y(z)\,o(z)}{o'(z)}\right)^{2} \sim \frac{d}{2}\,\frac{X^2}{(1 - z^2)\,(o'(z))^2} \sim \frac{d}{2}\left(\frac{2}{c\,r(0)}\right)^{2} \frac{X^2}{(1 - z^2)\ln^2(1 - z^2)}, \qquad (209)$$

which tends to $\infty$ as $z \to \pm 1$. This contradicts the hypothesis that $W_{X,Y}$ is bounded as $\|(X, Y)\| \to \infty$.

An immediate consequence of the above proposition is that all constant energy slices $\Sigma_h$ of the critical value surface are compact. This proves half of observation 1.

Lemma 7.14.3.38. If $\sigma_1(p)$ is a nondegenerate critical point of $V_p$ with $p = (\bar\sigma_3, \bar\sigma_4) \neq (0, 0)$, then the derivative of the function $p \mapsto V_p(\sigma_1(p))$ in the outward radial direction is strictly positive.

Proof. Because $\sigma_1(p)$ is a nondegenerate critical point of $V_p$, the function $p \mapsto \sigma_1(p)$ is analytic. Therefore the function $p \mapsto V_p(\sigma_1(p))$ is analytic. Using the Euler operator $\mathcal{E} = x\,\frac{\partial}{\partial x} + y\,\frac{\partial}{\partial y}$ and

$$\widetilde{W} = \frac{d}{2}\,\frac{1}{1 - z^2}\,x^2 + \frac12\, y^2 + \sqrt{1 - z^2},$$

see (94), we obtain

$$\mathcal{E}\widetilde{W} = d\,\frac{1}{1 - z^2}\,x^2 + y^2 > 0$$

when $(X, Y) \neq (0, 0)$ (which implies $(x, y) \neq (0, 0)$). Because the rescaled Chaplygin equations (92) are linear, the diffeomorphism $\psi$ (79) giving its flow leaves $\mathcal{E}$ invariant. This proves the lemma.

Because every ray from the origin intersects $\Delta$, lemma 7.14.3.38 implies

Corollary 7.14.3.39. The supremum of the critical values $V_p(\sigma_1(p))$, in which $\sigma_1(p)$ is a local maximum of $V_p$, is equal to $V_q(\sigma_1(q))$ for some $q \in \Delta$. Here $\sigma_1(q)$ denotes the degenerate critical point of $V_q$.

It follows that the maximum of the critical values on $\Sigma^{\mathrm{unst}} \cup \Sigma^{\mathrm{deg}}$ is attained at the edge $\Sigma^{\mathrm{deg}}$. The minimum of the critical values on $\Sigma^{\mathrm{unst}} \cup \Sigma^{\mathrm{deg}}$ is attained over the origin, where it is equal to $mgr$. This leads to the following sharpening of the gyroscopic stabilization principle.

Proposition 7.14.3.40. The energy $E$ at any unstable or degenerate relative equilibrium satisfies $mgr \le E \le \max_{p \in \Delta} V_p(\sigma_1(p))$. Here $\sigma_1(p)$ is the degenerate critical point of $V_p$.

Proof of observation 1. Lemma 7.14.3.38 implies that in the complement of the origin the $h$-slice $\Sigma^{\mathrm{nd}}_h$ of the analytic surface $\Sigma^{\mathrm{nd}}$ of critical values of nondegenerate critical points of $V_{\bar\sigma_3,\bar\sigma_4}$ is a collection of immersed analytic curves, which are transverse to every radial direction. Moreover, as $h$ increases, these curves move outward in the $\bar\sigma_3$-$\bar\sigma_4$-plane away from the origin. If $h$ is larger than all the values of $V_p(\sigma_1)$, where $(\sigma_1, p) \in S^{\mathrm{nd}}$, then $\Sigma_h$ is a simple closed analytic curve which is everywhere transverse to the radial directions and has the degeneracy locus $\Delta$ in its interior. This proves observation 1.

Remark 7.14.3.41. It follows from the equation

$$\mathcal{E}\widetilde{W} = d\,(1 - z^2)^{-1} x^2 + y^2 = 2\widetilde{W} - 2(1 - z^2)^{1/2}$$

that $2\widetilde{W} - 2 \le \mathcal{E}\widetilde{W} < 2\widetilde{W}$. The second inequality implies that, if $(X, Y) \neq (0, 0)$ and $r \ge 1$, then

$$\inf W_{rX,\,rY} \le W_{rX,\,rY}(z) \le r^2\, W_{X,Y}(z) \quad \text{for all } z \in (-1, 1),$$

and therefore

$$\inf W_{rX,\,rY} \le r^2 \inf W_{X,Y} \quad \text{if } r \ge 1.$$

The first inequality implies that $\mathcal{E}(\widetilde{W} - 1) \ge 2(\widetilde{W} - 1)$, and therefore $W_{rX,\,rY}(z) - 1 \ge r^2\,(W_{X,Y}(z) - 1)$ when $W_{X,Y}(z) > 1$ and $r \ge 1$. This implies that

$$\inf W_{rX,\,rY} \ge 1 + r^2\,(\inf W_{X,Y} - 1) \quad \text{if } \inf W_{X,Y} > 1,\ r \ge 1.$$
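The two scaling estimates can also be checked directly. The following is only a sketch, under the assumption (suggested by the linearity of the rescaled Chaplygin equations (92)) that replacing $(X, Y)$ by $(rX, rY)$ replaces the solution $(x(z), y(z))$ by $(r\,x(z), r\,y(z))$, and that $W_{X,Y}(z)$ is $\widetilde{W}$ evaluated along that solution. The symbols $A(z)$ and $s(z)$ are shorthand introduced here for the check and do not appear in the text.

```latex
\begin{align*}
W_{X,Y}(z) &= A(z) + s(z), \qquad
  A(z) = \tfrac{d}{2}\,\frac{x(z)^2}{1 - z^2} + \tfrac12\, y(z)^2 \;\ge\; 0, \qquad
  s(z) = (1 - z^2)^{1/2} \in (0, 1], \\
W_{rX,\,rY}(z) &= r^2 A(z) + s(z), \\
r^2\, W_{X,Y}(z) - W_{rX,\,rY}(z) &= (r^2 - 1)\, s(z) \;\ge\; 0 \qquad (r \ge 1), \\
\bigl(W_{rX,\,rY}(z) - 1\bigr) - r^2 \bigl(W_{X,Y}(z) - 1\bigr) &= (r^2 - 1)\,\bigl(1 - s(z)\bigr) \;\ge\; 0 \qquad (r \ge 1).
\end{align*}
```

Taking the infimum over $z \in (-1, 1)$ in the last two lines gives the upper and the lower estimate of the remark, respectively.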


In combination with proposition 7.14.3.37 this shows that the global minimum of $V_p$ grows of order $|p|^2$ as $|p| \to \infty$.

7.14.4 The swallow tail sections

In this subsection we explain why sections of a swallow tail appear in the constant energy slices $\Sigma_h$. Let $q^0$ be a swallow tail point of the critical value surface $\Sigma$. Then $q^0$ lies over a cusp point $p^0$ of the degeneracy locus $\Delta$, see §13.2.2. The positivity of the outward radial directional derivative of the critical values of $V_p$, see lemma 7.14.3.38, implies that the swallow tail tilts upward in the radial direction. Therefore the local piece of $\Sigma_h$, when $h$ is slightly smaller than $V_{p^0}(0)$, exhibits cross sections of a swallow tail surface with its tail pointing inward and whose self intersections lie along an axis pointing away from the origin. When $h$ is larger than $V_{p^0}(0)$, the swallow tail surface disappears and the local piece of $\Sigma_h$ becomes smooth. This proves observation 3 except for the merger of the cusps of the swallow tail sections at $h = 1.061\ldots$ for the uniform disk, for which we have no proof.

7.14.5 Behavior of cusp points

As $h$ decreases further, the cusp points $p$ in $\Sigma_h$ move on $\Delta$, away from the cusp point $p^0$ of $\Delta$, whereas $\Sigma_h$ near $p$ lies inside $\Delta$. This situation continues as long as the derivative of the degenerate critical value function $v(p)$ is negative as $p$ moves on $\Delta$ away from $p^0$. If for $p \in \Delta$ the function $v(p)$ has a nondegenerate local minimum at $p_m \in \Delta$ and $h$ is slightly larger than $v(p_m)$, then there are two points $p_1, p_2 \in \Delta$ close to $p_m$ such that $v(p_1) = v(p_2) = h$, with $p_m$ in between $p_1$ and $p_2$ on $\Delta$. Here $p_1$ and $p_2$ are cusp points of $\Sigma_h$, where the cusps are approximately parallel to $\Delta$, lie in the interior of $\Delta$, and point approximately in the direction from $p_1$ and $p_2$ to $p_m$, respectively. When $h = v(p_m)$ the cusps meet beak to beak. When $h$ is slightly smaller than $v(p_m)$, two of the branches of the two cusps in $\Sigma_h$ closest to $\Delta$ have merged to form a smooth curve. The other branches of the two cusps have merged to form another smooth curve. Both curves near $p_m$ lie in the interior of $\Delta$, approximately parallel to $\Delta$.

Computer pictures for the uniform disk, see figures 7.14–7.19 in §14.1, suggest that each smooth interval $I$ of $\Delta$ curves outward, but we have no proof of this.

Lemma 7.14.5.42. If $I$ curves outward, then the degenerate critical value function $\mathrm{dcv} : \Delta \to \mathbb{R} : p \mapsto v(p)$ has only nondegenerate local minima as critical points. This implies that the function $\mathrm{dcv}$ has only one critical point $p_m$, which is a nondegenerate global minimum.

Proof. Write $V_p(z)$ for $V(z, p)$, $(\partial_z V)(z, p)$ for $\frac{\partial V_p}{\partial z}(z)$ and $(\partial_p V)(z, p)$ for $\frac{\partial V_p(z)}{\partial p}$. Let $J \subseteq \mathbb{R} \to I \subseteq \Delta : s \mapsto p(s)$ be a parametrization of a smooth interval $I$ on $\Delta$. Write $z(s)$ for the degenerate critical point of $V_{p(s)}$, which depends smoothly on $s$. In other words, $(\partial_z V)(z(s), p(s)) = 0$ and $(\partial_z^2 V)(z(s), p(s)) = 0$ for every $s \in J$. It follows that

$$0 = \frac{d}{ds}\,(\partial_z V)(z(s), p(s)) = (\partial_z^2 V)(z(s), p(s))\, z'(s) + (\partial_p \partial_z V)(z(s), p(s))\, p'(s) = (\partial_p \partial_z V)(z(s), p(s))\, p'(s). \qquad (210)$$

Moreover,

$$\frac{d}{ds}\, V(z(s), p(s)) = (\partial_z V)(z(s), p(s))\, z'(s) + (\partial_p V)(z(s), p(s))\, p'(s) = (\partial_p V)(z(s), p(s))\, p'(s),$$

and therefore

$$\frac{d^2}{ds^2}\, V(z(s), p(s)) = (\partial_z \partial_p V)(z', p') + (\partial_p^2 V)(p', p') + \partial_p V\, p'' = (\partial_p^2 V)(p', p') + \partial_p V\, p''$$

in view of (210). Now $V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1)$ is equal to $mgr\,(1 - \sigma_1^2)^{1/2}$ plus a positive definite quadratic form in the parameters $(\bar\sigma_3, \bar\sigma_4)$, see (83), and therefore $\partial_p^2 V(p', p') > 0$. At a critical point of $s \mapsto V(z(s), p(s))$ we have $\partial_p V\, p' = 0$, and because the critical value function has a positive outward radial derivative, we conclude that $\partial_p V\, p'' \ge 0$ if $\Delta$ curves outward.

If the degenerate critical value function $\mathrm{dcv} : \Delta \to \mathbb{R} : p \mapsto v(p)$ has only one critical point $p_m$, which is a nondegenerate global minimum, then we have the beak to beak configuration for $\Sigma_h$ where $h = v(p_m)$, see figure 7.17. This also implies that $\mathrm{dcv}$ attains its maximum at the cusp points of $\Delta$. In view of corollary 7.14.3.39 and remark 7.14.3.41 this leads to the conclusion that if $E$ is the energy at an unstable or degenerate relative equilibrium then

$$1 \le \frac{E}{mgr} \le \max\left\{\, 1 + \frac{I_1}{2 I_3},\ \ 1 + \frac{I_1}{2(I_1 + m r^2)} \,\right\}. \qquad (211)$$
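As a quick numerical illustration, the two-sided bound (211), read here as $1 \le E/(mgr) \le \max\{1 + I_1/(2I_3),\ 1 + I_1/(2(I_1 + mr^2))\}$, can be evaluated for the uniform disk of §14.1 with $mr^2 = mgr = 1$, $I_1 = 1/4$, $I_3 = 1/2$. The Python sketch below does only this arithmetic; the function name and argument list are illustrative choices, not notation from the text.

```python
def unstable_energy_bounds(I1, I3, m, g, r):
    """Evaluate both sides of the bound (211) on the energy E.

    Returns (lower, upper, ratio_a, ratio_b), where ratio_a = 1 + I1/(2*I3) and
    ratio_b = 1 + I1/(2*(I1 + m*r**2)) are the two candidates inside the max.
    """
    mgr = m * g * r
    ratio_a = 1.0 + I1 / (2.0 * I3)
    ratio_b = 1.0 + I1 / (2.0 * (I1 + m * r**2))
    return mgr, max(ratio_a, ratio_b) * mgr, ratio_a, ratio_b


if __name__ == "__main__":
    # Uniform disk normalized so that m*r**2 = m*g*r = 1 (here m = g = r = 1).
    print(unstable_energy_bounds(I1=0.25, I3=0.5, m=1.0, g=1.0, r=1.0))
```

For these values the two candidates are 1.25 and 1.1, which are the energies quoted in §14.1 for critical straight rolling and critical rectilinear rolling.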

This can be viewed as a further specification of the global gyroscopic stabilization principle.

7.14.6 Over the coordinate axes in the $(\bar\sigma_3, \bar\sigma_4)$-plane

Regarding the self-intersections of $\Sigma$ over the coordinate axes, we have

Lemma 7.14.6.43. If $\sigma_1 \neq 0$ and $\bar\sigma_4 \neq 0$, then $\left.\dfrac{\partial V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1)}{\partial \bar\sigma_3}\right|_{\bar\sigma_3 = 0} \neq 0$. If $\sigma_1 \neq 0$ and $\bar\sigma_3 \neq 0$, then $\left.\dfrac{\partial V_{\bar\sigma_3,\bar\sigma_4}(\sigma_1)}{\partial \bar\sigma_4}\right|_{\bar\sigma_4 = 0} \neq 0$.

Proof. Using (95) and (125) we have

$$\left.\frac{\partial W_{X,Y}(z)}{\partial X}\right|_{X = 0} = Y\,\bigl[d\,(1 - z^2)\,o(z)\,e(z) + o'(z)\,e'(z)\bigr].$$

The expression between square brackets is an odd function of $z$, which for $z > 0$ is strictly positive because $e(z)$ and $o(z)$ are an even and an odd power series with positive coefficients, respectively, and therefore the same holds for $o'(z)$ and $e'(z)$, respectively. Similarly we have

$$\left.\frac{\partial W_{X,Y}(z)}{\partial Y}\right|_{Y = 0} = X\,\bigl[d\,(1 - z^2)\,e(z)\,o(z) + e'(z)\,o'(z)\bigr],$$

which is nonzero when $X \neq 0$ and $z \neq 0$. The statement of the lemma follows in view of the scaling (98).

From §14 recall the analytic functions $v$ and $w$ on the closed quadrants in the interior of $\Delta$ defined by $v(p) = V_p(\sigma_1^{+}(p))$ and $w(p) = V_p(\sigma_1^{-}(p))$, where $p = (\bar\sigma_3, \bar\sigma_4)$ and $\bar\sigma_3 \bar\sigma_4 > 0$, see case c) in proposition 7.12.5.36. Note that $\sigma_1^{-}(p) < 0 < \sigma_1^{+}(p)$, even if $p$ is on one of the half-axes bounding the open quadrant, whereas $\sigma_1^{*}(p) = 0$ when that happens. Let us consider $v$ and $w$ in the positive quadrant. In view of the fact that both $v$ and $w$ are analytic, it follows from lemma 7.14.6.43 that if $p$ is on the $\bar\sigma_4$-axis, then the partial derivative $v'$ of $v$ with respect to $\bar\sigma_3$ at $\bar\sigma_3 = 0$ is nonzero. Because $w$ is equal to the image of the analytic extension of $v$ to the adjacent quadrant under the reflection about the $\bar\sigma_4$-axis, the partial derivative $w'$ of $w$ with respect to $\bar\sigma_3$ at $\bar\sigma_3 = 0$ satisfies $w' = -v'$. Because $v < w$ in the interior of the positive quadrant, it follows that $v' < 0$ and $w' = -v' > 0$. This implies that the self-intersection of $\Sigma$ over the part of the $\bar\sigma_4$-axis inside $\Delta$ is transversal. In view of the positivity of the radial derivative, the derivative with respect to $\bar\sigma_4$ of $v = w$ on the positive $\bar\sigma_4$-axis is positive. It follows that near the $\bar\sigma_4$-axis, the level set $\Sigma_h$ consists of two analytic curves which intersect each other and the positive $\bar\sigma_4$-axis transversally. Moreover, the two curves are each other's mirror images under the reflection about the $\bar\sigma_4$-axis. At the $\bar\sigma_3$-axis we have a similar behavior.

If $\bar\sigma_3 \bar\sigma_4 = 0$, then $\sigma_1 = 0$ is a critical point of $V_{\bar\sigma_3,\bar\sigma_4}$, a nondegenerate local maximum when $(\bar\sigma_3, \bar\sigma_4)$ is inside $\Delta$ and the nondegenerate global minimum when $(\bar\sigma_3, \bar\sigma_4)$ is outside $\Delta$. Because $\sigma_3 = \bar\sigma_3$ and $\sigma_4 = \bar\sigma_4$ when $\sigma_1 = 0$, we have $V_{\bar\sigma_3, 0}(0) = \tfrac12 I_1 \bar\sigma_3^{\,2} + mgr$ and $V_{0, \bar\sigma_4}(0) = \tfrac12 (I_3 + mr^2)\, \bar\sigma_4^{\,2} + mgr$. For $p$ inside $\Delta$ we saw that $V_p$ has a nondegenerate local maximum at $\sigma_1^{*}(p)$ with corresponding critical value $v_{*}(p) = V_p(\sigma_1^{*}(p))$. Note that $v_{*}$ attains its minimum at the origin, because of the positivity of the radial derivatives. Because $\sigma_1^{*}(p) = 0$ when $p$ is on a coordinate axis, the above identities imply that $\partial_1^2 v_{*}(0, 0) = I_1 > 0$ and $\partial_2^2 v_{*}(0, 0) = I_3 + mr^2 > 0$, where $\partial_j$ denotes the partial derivative with respect to the $j$-th variable. On the other hand it follows from the symmetry $v_{*}(\bar\sigma_3, \bar\sigma_4) = v_{*}(-\bar\sigma_3, \bar\sigma_4)$ that $\partial_1 v_{*}(0, \bar\sigma_4) = -\partial_1 v_{*}(0, \bar\sigma_4)$, hence $\partial_1 v_{*}(0, \bar\sigma_4) = 0$. Therefore $(\partial_2 \partial_1 v_{*})(0, 0) = 0$. We conclude that the Hessian of $v_{*}$ at the origin is a diagonal matrix with positive entries. Consequently, the minimum of $v_{*}$ at the origin is nondegenerate. Because $v_{*}(0, 0) = mgr$, we find for $h$ slightly larger than $mgr$ a connected component of $\Sigma_h$ which looks like a small ellipse around the origin, which shrinks to the origin as $h \downarrow mgr$, and disappears when $h < mgr$.
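The axis identities $V_{\bar\sigma_3,0}(0) = \tfrac12 I_1 \bar\sigma_3^{\,2} + mgr$ and $V_{0,\bar\sigma_4}(0) = \tfrac12 (I_3 + mr^2)\bar\sigma_4^{\,2} + mgr$ give the points where the small oval in $\Sigma_h$ crosses the coordinate axes: solving each identity for the parameter at height $h$ yields $|\bar\sigma_3| = \sqrt{2(h - mgr)/I_1}$ and $|\bar\sigma_4| = \sqrt{2(h - mgr)/(I_3 + mr^2)}$. The Python sketch below evaluates these two numbers. It assumes that the resulting axis points still lie inside $\Delta$, which the code does not check, and the function name is illustrative.

```python
import math


def oval_axis_intercepts(h, I1, I3, m, g, r):
    """Axis crossings of the level set {v_* = h} near the origin.

    Uses V(0) = I1*s3**2/2 + m*g*r on the s3-axis and
    V(0) = (I3 + m*r**2)*s4**2/2 + m*g*r on the s4-axis.
    Returns None when h < m*g*r, where the oval has disappeared.
    Assumption (not checked here): the crossing points lie inside Delta.
    """
    mgr = m * g * r
    if h < mgr:
        return None
    s3 = math.sqrt(2.0 * (h - mgr) / I1)
    s4 = math.sqrt(2.0 * (h - mgr) / (I3 + m * r**2))
    return s3, s4


if __name__ == "__main__":
    # Uniform disk with m*r**2 = m*g*r = 1, I1 = 1/4, I3 = 1/2, at height h = 1.01.
    print(oval_axis_intercepts(1.01, I1=0.25, I3=0.5, m=1.0, g=1.0, r=1.0))
```

The ratio of the two intercepts is $\sqrt{(I_3 + mr^2)/I_1}$ for every admissible $h$, so the oval is elongated along the $\bar\sigma_3$-axis whenever $I_3 + mr^2 > I_1$.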
The points $p$ inside and outside the ellipse correspond to cases where the energy $E$ is larger and smaller than the value $V_p(\sigma_1^{*}(p))$ of the potential energy $V_p$ at the local maximum $\sigma_1^{*}(p)$, respectively. The description of the solutions of the conservative Newtonian system in §5.3 tells what this means for the dynamics of the reduced system. This proves observation 4 in §14.2.

7.14.7 Σ over $\ell_{\pm}$

We discuss the behavior of the critical value surface $\Sigma$ near the lines $\ell_{\pm}$, corresponding to the solutions which fall flat.

Lemma 7.14.7.44. For $p \notin \ell_{\pm}$ but near $q \in \ell_{\pm}$ let $\sigma_1(p) = \sin\varphi(p)$ be the critical point of $V_p$ near $\pm 1$ and let $v(p) = V_p(\sigma_1(p))$ be its corresponding critical value. Then

$$\cos\varphi(p) \sim (I_1\, mgr)^{-1/3}\, |\eta_{\pm}|^{2/3} \qquad (212)$$

and

$$v(p) \sim v(q) + \tfrac32\, (mgr)^{2/3}\, I_1^{-1/3}\, |\eta_{\pm}|^{2/3} \qquad (213)$$

as $p = (\bar\sigma_3, \bar\sigma_4) \to q$. Here the asymptotics is locally uniform in $q$. Moreover, $v(q)$ is equal to the right hand side of (169) and $\eta_{\pm}$ is defined in (163).

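Proof. If $\varepsilon$ and $1 \mp z$ are small, then we have

$$(1 - z^2)^{-1} x(z) \sim \tfrac12\bigl(a + \varepsilon/(1 \mp z)\bigr), \qquad y(z) \sim \mp\bigl\{a + (c\,\varepsilon/2)\ln(1 \mp z)\bigr\}$$

and (167) as $z \to \pm 1$, see the beginning of the proof of lemma 7.11.1.25. If we substitute $\cos\varphi = C\,|\varepsilon|^{2/3}$, where $C$ is a positive constant, into (171), it follows that $1 \mp z = O(|\varepsilon|^{4/3}) = o(|\varepsilon|)$ and therefore $x(z) \sim \varepsilon$ and $y(z) \sim \mp a$. Then (97) leads to

$$C\widetilde W \sim \pm\, d\, C^{-4} |\varepsilon|^{-8/3} \varepsilon^2 \mp (d + c)\, C^{-2} |\varepsilon|^{-4/3} \varepsilon\, a \mp C^{-1} |\varepsilon|^{-2/3} \sim \pm\, |\varepsilon|^{-2/3}\, C^{-4} (d - C^3).$$

Because $d - C^3 > 0$ when $C < d^{1/3}$ and $< 0$ when $C > d^{1/3}$, it follows that for small $|\varepsilon|$ we have $\pm C\widetilde W > 0$ and $\pm C\widetilde W < 0$ when $\cos\varphi = C\,|\varepsilon|^{2/3}$ with $C < d^{1/3}$ and $C > d^{1/3}$, respectively. Using the continuity of $C\widetilde W$, we obtain the existence of a family of zeros $z = z(p) = \sin\varphi(p)$ of $C\widetilde W$, which are critical points of $V_p$, such that

$$\cos\varphi(p) \sim d^{1/3}\, |\varepsilon|^{2/3} \quad\text{as } \varepsilon \to 0. \tag{214}$$

In view of the relations

$$d = \frac{I_3^2}{I_1 (I_3 + mr^2)} \quad\text{and}\quad \varepsilon = -\eta_\pm\, I_3^{-1} \sqrt{\frac{I_3 + mr^2}{mgr}},$$

we obtain that $d\,\varepsilon^2 = |\eta_\pm|^2/(I_1\, mgr)$. Therefore (214) implies (212). Inserting (214) and $x \sim \varepsilon$, $y \sim \mp a$ into (94), we obtain

$$\widetilde W = (d/2)\, d^{-2/3} |\varepsilon|^{-4/3} \varepsilon^2 + (1/2)\, a^2 + d^{1/3} |\varepsilon|^{2/3} \sim (1/2)\, a^2 + (3/2)\, d^{1/3} |\varepsilon|^{2/3},$$

which implies (213) in view of the rescalings in (94) and $d\,\varepsilon^2 = |\eta_\pm|^2/(I_1\, mgr)$. Moreover, we need the observation that when $\eta_\pm = 0$ we have $\varepsilon = 0$. Thus $\frac12\, mgr\, a^2 = v(q)$.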
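The algebra that turns (214) into (212) and (213) can be checked numerically. The following is a minimal sketch rather than anything taken from the source: `eps` is our label for the small parameter proportional to $\eta_\pm$ in the proof, the function name and sample values are illustrative, and the factor $mgr$ between the rescaled potential and $v$ is an assumption read off from the identity $\frac12\, mgr\, a^2 = v(q)$.

```python
import math

def cusp_asymptotics(eta, I1, I3, m, g, r):
    """Compare the d/eps form of the asymptotics with the eta form.

    Assumptions (labels are ours, not the source's):
      d   = I3^2 / (I1 (I3 + m r^2))
      eps = -(eta / I3) * sqrt((I3 + m r^2) / (m g r))
    Returns (cos_from_214, cos_from_212, dv_from_214, dv_from_213).
    """
    mgr = m * g * r
    d = I3**2 / (I1 * (I3 + m * r**2))
    eps = -(eta / I3) * math.sqrt((I3 + m * r**2) / mgr)

    # Leading term of cos(phi): d^(1/3) |eps|^(2/3) versus (I1 mgr)^(-1/3) |eta|^(2/3)
    cos_214 = d ** (1 / 3) * abs(eps) ** (2 / 3)
    cos_212 = (I1 * mgr) ** (-1 / 3) * abs(eta) ** (2 / 3)

    # Leading correction v(p) - v(q), assuming v = mgr * (rescaled potential)
    dv_214 = mgr * 1.5 * d ** (1 / 3) * abs(eps) ** (2 / 3)
    dv_213 = 1.5 * mgr ** (2 / 3) * I1 ** (-1 / 3) * abs(eta) ** (2 / 3)
    return cos_214, cos_212, dv_214, dv_213
```

For instance, `cusp_asymptotics(1e-3, 0.25, 0.5, 1.0, 9.81, 1.0)` uses the moments of inertia of a uniform disk with unit mass and radius; the two members of each pair agree to rounding error, which is just the identity $d\,\varepsilon^2 = |\eta_\pm|^2/(I_1\, mgr)$.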

From lemma 7.14.7.44 it follows that, for $p$ near $q \in \ell_\pm$, the global minimum $v(p)$ of $V_p$ behaves asymptotically like

$$v(p) \sim v(q) + \mathrm{const.}\; |p - q|^{2/3},$$

which means that the lowest part of the critical value surface has a downward pointing cusped edge over $\ell_\pm$, where the edge itself does not belong to the image of the integral map. Because $v(q)$, $q \in \ell_\pm$, has a positive outward radial derivative, it follows that the corresponding horizontal slices $\Sigma_h$ have a cusp at $\ell_\pm \setminus \{(0,0)\}$, straddling $\ell_\pm$ and pointing away from the origin.

7.15 The spatial rotational shift

7.15.1 The shift

The action of the symmetry group $E(2) \times S^1$ on the constraint manifold $C$ (3) of the rolling disk is free and proper. Therefore its orbit space $M = C/(E(2) \times S^1)$ is a smooth manifold, called the fully reduced phase space, and the projection $\pi : C \to M : c \mapsto m = \pi(c)$ is an $E(2) \times S^1$-principal bundle over $M$. Because most solutions of the fully reduced vector field $V$ (13) on $M$ are periodic, we can apply the theory of §2.2 of chapter 4.

Let $c \in C$ and let $\gamma_c : \mathbb R \to C$ be an integral curve starting at $c$ of the vector field $V$ (4), which governs the motion of the rolling disk. The curve $\gamma_c$ is a relative periodic solution of $V$ if its projection $\gamma^m = \pi \circ \gamma_c : \mathbb R \to M : t \mapsto \pi(\gamma_c(t)) = \gamma^m(t)$ is a nonconstant periodic integral curve of the reduced vector field $V$ on $M$. Let $\tau = \tau(c)$ be the smallest positive period of $\gamma^m$ with $m = \pi(c)$. Let $C^{\mathrm{per}}$ be the collection of all $c \in C$ such that $\gamma^m$ with $m = \pi(c)$ is a nonconstant periodic integral curve of the reduced vector field $V$. It follows from the description in §5.3 of the solutions of a conservative one degree of freedom Newtonian system that for the rolling disk $C^{\mathrm{per}}$ is a dense open subset of $C$ and that the function $\tau : C^{\mathrm{per}} \to \mathbb R : c \mapsto \tau(c)$ is real analytic. We have $\pi(\gamma_c(\tau)) = \pi(\gamma_c(0)) = \pi(c) = m$, and the fibers of the map $\pi$ are the $E(2) \times S^1$ orbits on $C$. Since the $E(2) \times S^1$ action on $C$ is free, there is a unique element $g = g(c) \in E(2) \times S^1$ such that

$$\gamma_c(\tau(c)) = g(c) \cdot \gamma_c(0) = g(c) \cdot c,$$

where the $E(2) \times S^1$ action on $C$ is denoted by $\cdot$. The element $g(c)$ in $E(2) \times S^1$ is called the shift element of $E(2) \times S^1$ at $c$.

The $E(2) \times S^1$ invariance of the vector field $V$ on $C$ implies that for any $h \in E(2) \times S^1$ the curve $h \cdot \gamma_c : \mathbb R \to C : t \mapsto h \cdot \gamma_c(t)$ is an integral curve of $V$. Moreover, $\pi(h \cdot \gamma_c) = \pi(\gamma_c) = \gamma^m$. From $h \cdot \gamma_c(0) = h \cdot c$, it follows that

$$h \cdot \gamma_c(t) = \gamma_{h \cdot c}(t), \tag{215}$$

that is, the flow of $V$ commutes with the action of $h$ on $C$. Applying (215) when $t = \tau = \tau(c)$ we get

$$(h \cdot g(c)) \cdot c = h \cdot (g(c) \cdot c) = g(h \cdot c) \cdot (h \cdot c) = (g(h \cdot c) \cdot h) \cdot c.$$

Therefore $h \cdot g(c) = g(h \cdot c) \cdot h$, because the $E(2) \times S^1$ action on $C$ is free. In other words, $g(h \cdot c) = h \cdot g(c) \cdot h^{-1}$ is conjugate to $g(c)$ by the element $h \in E(2) \times S^1$.

7.15.2 Quasiperiodic motion

Fix $c \in C^{\mathrm{per}}$ and let $T$ be a 2-dimensional torus subgroup of $E(2) \times S^1$ such that $g = g(c) \in T$. Let $\mathfrak t$ be the Lie algebra of $T$ and let $\exp : \mathfrak t \to T$ be the exponential mapping. Because $\exp$ is a surjective homomorphism of abelian groups, it induces an isomorphism $\mathfrak t/\ker\exp \to T$. Since $T$ is compact, the additive discrete group $\ker\exp$ has a $\mathbb Z$-basis, which is also an $\mathbb R$-basis of $\mathfrak t$. Our assumption that $g(c) \in T$ implies that there is a $\xi = \xi(c) \in \mathfrak t$ such that $g(c) = \exp\xi(c)$. Of course $\xi(c)$ is determined modulo the lattice $\ker\exp$. Now consider the analytic map

$$\Phi : \mathfrak t \times \mathbb R \to C : (\eta, t) \mapsto e^{\eta} \cdot \gamma_c(t). \tag{216}$$

We have $\Phi(\eta, t) = \Phi(\zeta, s)$ if and only if

$$\gamma_c(t) = e^{\zeta - \eta} \cdot \gamma_c(s) = \gamma_{e^{\zeta - \eta} \cdot c}(s). \tag{217}$$

In view of the fact that the flow $\varphi_t : C \to C$ of $V$ given by $\varphi_t(c) = \gamma_c(t)$ gives rise to a homomorphism from $\mathbb R$ into the group of diffeomorphisms of $C$, it follows from (217) that $\gamma_c(t - s) = e^{\zeta - \eta} \cdot c$. Thus

$$\gamma^m(t - s) = \pi(\gamma_c(t - s)) = \pi(e^{\zeta - \eta} \cdot c) = \pi(c) = m.$$

So there is a unique $k \in \mathbb Z$ such that $t - s = k\tau(c)$. Because $g(g(c) \cdot c) = g(c) \cdot g(c) \cdot g(c)^{-1} = g(c)$, by induction we find that $\gamma_c(k\tau) = (g(c))^k \cdot c$ for every $k \in \mathbb Z$. Using the fact that the action of $E(2) \times S^1$ on $C$ is free, we obtain $e^{\zeta - \eta} = g(c)^k = e^{k\xi}$, that is, $\zeta - \eta = k\xi \bmod (\ker\exp)$. Therefore $\Phi(\eta, t) = \Phi(\zeta, s)$ if and only if there is a $k \in \mathbb Z$ and a $\sigma \in \ker\exp$ such that $t - s = k\tau(c)$ and $\zeta - \eta = k\xi + \sigma$. In other words, $\Phi(\eta, t) = \Phi(\zeta, s)$ if and only if $(\zeta - \eta, t - s) \in \Lambda = \Psi(\ker\exp \times \mathbb Z)$, where

$$\Psi : \mathfrak t \times \mathbb R \to \mathfrak t \times \mathbb R : (\sigma, k) \mapsto (k\xi + \sigma, k\tau). \tag{218}$$

Now $\Psi|(\ker\exp \times \mathbb Z) : \ker\exp \times \mathbb Z \to \Lambda$ is a surjective homomorphism of additive groups. Because $\Psi$ is a bijective affine transformation of $\mathfrak t \times \mathbb R$ into itself, it maps any $\mathbb Z$-basis of $\ker\exp \times \mathbb Z$ onto a $\mathbb Z$-basis of $\Lambda$, which is an $\mathbb R$-basis of $\mathfrak t \times \mathbb R$. Therefore $\Lambda$ is a discrete subgroup of $\mathfrak t \times \mathbb R$ of full rank. This implies that $T^\vee = (\mathfrak t \times \mathbb R)/\Lambda$ is a 3-dimensional torus, where $\dim T = 2$. Choosing a $\mathbb Z$-basis $b$ of $\ker\exp \times \mathbb Z$ gives rise to a $\mathbb Z$-basis $\Psi(b)$ of $\Lambda$. This identifies $T^\vee$ with the standard 3-dimensional torus $\mathbb R^3/\mathbb Z^3$. Therefore the analytic map $\Phi : \mathfrak t \times \mathbb R \to C$ (216) induces an analytic map $\widetilde\Phi : T^\vee = (\mathfrak t \times \mathbb R)/\Lambda \to C$, which is an injective immersion. Because $T^\vee$ is compact, the map $\widetilde\Phi$ is an embedding. Therefore $\Phi(\mathfrak t \times \mathbb R) = \widetilde\Phi(T^\vee)$ is a compact analytic submanifold of $C$, which is analytically diffeomorphic to the 3-dimensional torus $\mathbb R^3/\mathbb Z^3$. From (216) it follows that

$$\Phi(\xi, t + s) = \gamma_{e^{\xi} \cdot c}(t + s) = \gamma_{\gamma_{e^{\xi} \cdot c}(t)}(s) = \gamma_{\Phi(\xi, t)}(s).$$

Consequently, the manifold $\widetilde\Phi(T^\vee) \subseteq C$ is invariant under the flow of $V$ on $C$. Moreover, $\Phi$ intertwines the straight line flow on $T^\vee$, defined by the constant vector field $(0, 1) \in \mathfrak t \times \mathbb R$, with the flow of $V$ on $\widetilde\Phi(T^\vee)$. In other words, the restriction of the flow of $V$ to $\widetilde\Phi(T^\vee)$ is quasiperiodic and $\Phi$ is the change of variables which exhibits an integral curve of $V|\widetilde\Phi(T^\vee)$ as a straight line on the 3-dimensional torus $T^\vee$.$^4$

7.15.3 The spatial rotational shift

Recall that we have denoted an element of $E(2) \times S^1$ by $((B, b), C)$, where $B, C \in S^1 = SO(2, \mathbb R)$, viewed as those elements of $SO(3, \mathbb R)$ which leave the vertical vector $e_3$ in $\mathbb R^3$ fixed, $b \in \mathbb R^2 = \mathbb R^2 \times \{0\}$, and $(B, b)$ is the affine transformation of $\mathbb R^2$ into itself given by $x \mapsto Bx + b$. We call $B$ the spatial part of the element $((B, b), C) \in E(2) \times S^1$. Note that the mapping

$$\mu : E(2) \times S^1 \to SO(2, \mathbb R) : ((B, b), C) \mapsto B$$

is a group homomorphism. If $((B, b), C)$ is the shift element at $c \in C$ of a relative periodic solution, then $B$ is the spatial part of the shift element, which is called the spatial rotational shift of the relative periodic orbit. Because $\mu$ is a group homomorphism and shift elements on an $E(2) \times S^1$ orbit on $C$ are conjugate, the rotational shift is constant on the $E(2) \times S^1$-orbit, since $SO(2, \mathbb R)$ is commutative. Therefore the spatial rotational shift defines a real analytic $SO(2, \mathbb R)$-valued map $B : C^{\mathrm{per}} \to SO(2, \mathbb R)$ on the space $C^{\mathrm{per}}$ of relative periodic orbits of the vector field $V$ (4) on the constraint manifold $C$.

$^4$ In general a smooth function $f : \mathbb R \to \mathbb R$ is quasiperiodic with $k$ frequencies, if there is a smooth function $F : \mathbb R^k/\mathbb Z^k \to \mathbb R$ and constants $\nu_1, \ldots, \nu_k$, called the frequencies, such that $f(t) = F(t\nu_1, \ldots, t\nu_k)$ for all $t \in \mathbb R$.

Suppose that we are in the following situation. Let

$$\gamma_c : \mathbb R \to C : t \mapsto \bigl((A(t), a(t)), (\omega(t), \dot a(t))\bigr) = \gamma_c(t),$$

be an integral curve starting at $c \in C$ of the vector field $V$, which governs the motion of the rolling disk. Suppose that the projection of $\gamma_c$ under the reduction map $\pi$ gives an integral curve $\gamma^m : \mathbb R \to M : t \mapsto \gamma^m(t)$ of the fully reduced vector field $V$ on the fully reduced phase space $M$, which is periodic of period $\tau = \tau(c)$. We now prove

Lemma 7.15.3.45. Suppose that the spatial rotation part of $\gamma_c(\tau)$ is equal to id, that is, $A(\tau) = A(0)C^{-1}$, $a(\tau) = a(0) + b$, $\omega(\tau) = C\omega(0)$, and $\dot a(\tau) = \dot a(0)$ for some $b \in \mathbb R^2$ and $C \in SO(2, \mathbb R)$. Then

1. The map $\mathbb R \to \mathbb R^3 : t \mapsto a(t) - t\tau^{-1} b$ is $\tau$-periodic. In other words, if $b = 0$, then the motion of the center of mass of the disk is periodic with period $\tau$. If $b \neq 0$, then $a(t)$ moves rectilinearly with constant velocity plus a $\tau$-periodic perturbation.
2. The rotational motion $t \mapsto A(t)$ of the disk is quasiperiodic on a 2-dimensional torus, while the vectors $v(t) = A(t)e_3$ and $\nu(t)$, see (25), perform a periodic motion.

Proof. Recall that the map

$$(E(2) \times S^1) \times C \to C : (g, c) = \bigl(((B, b), C), ((A, a), (\omega, \dot a))\bigr) \mapsto \bigl((BAC^{-1}, Ba + b), (C\omega, B\dot a)\bigr)$$

defines the action of $E(2) \times S^1$ on the constraint manifold $C$. If $g = g(c)$ is the shift element at $c$, then after time $\tau$ the vector $u = A^{-1}e_3$ has become $(BAC^{-1})^{-1}e_3 = CA^{-1}B^{-1}e_3$ and the vector $A(\omega \times u_{\mathrm{hor}})$ has become

$$(BAC^{-1})\bigl(C\omega \times (CA^{-1}B^{-1}e_3)_{\mathrm{hor}}\bigr) = BAC^{-1}\bigl(C\omega \times C(A^{-1}B^{-1}e_3)_{\mathrm{hor}}\bigr) = BA\bigl(\omega \times (A^{-1}B^{-1}e_3)_{\mathrm{hor}}\bigr). \tag{219}$$

If $B = \mathrm{id}$, then the right hand side of (219) is $A(\omega \times (A^{-1}e_3)_{\mathrm{hor}}) = A(\omega \times u_{\mathrm{hor}})$. Because $\sigma_1$ is periodic with period $\tau$ on the fully reduced space, it follows that the right hand side of equation (33) for $\dot a$ is periodic of period $\tau$. Since the time integral of a periodic function is a linear function of $t$ plus a $\tau$-periodic function, we have proved point 1 of the lemma with $b$ equal to the integral of the right hand side of (33) over $t \in [0, \tau]$. Point 2 follows if we observe that disregarding the translational part of the motion of the disk yields a symmetry group $SO(2, \mathbb R) \times S^1$, where the first component of the shift element is id. From results of §15.4 below we see that the motion of the disk is quasiperiodic on a 2-dimensional torus. If we disregard the internal $S^1$ symmetry of the disk, then the reduced phase space $C/S^1$ is parametrized by $v$ and $\nu$. We obtain $\tau$-periodic motion because the disk's motion has returned to the same point in the fiber.

Let $p \in \mathbb R^2$. If $B \in SO(2, \mathbb R)$, then let $B^p : x \mapsto B(x - p) + p$ be the rotation about the point $p$. The mapping $B \mapsto B^p$ is an injective homomorphism, whose image is the set of all rotations about $p$, denoted $SO(2, \mathbb R)^p$. It is straightforward to see that $SO(2, \mathbb R)^p$ is a circle subgroup of $E(2)$, which depends analytically on $p$.

Lemma 7.15.3.46. Let $g = ((B, b), C) \in E(2) \times S^1$. Suppose that the spatial rotational part of $g$ is nontrivial, that is, $B \neq \mathrm{id}$. (Since $B \in SO(2, \mathbb R)$, the condition $B \neq \mathrm{id}$ implies $B - \mathrm{id}$ is invertible.) Then the centralizer $C_{E(2) \times S^1}(g)$ of $g$ in $E(2) \times S^1$ is $SO(2, \mathbb R)^{(\mathrm{id} - B)^{-1} b} \times S^1$.

Proof. Because every element in the second $S^1$ factor of $E(2) \times S^1$ commutes with every element of $E(2) \times S^1$, it suffices to show that $C_{E(2)}(g) = SO(2, \mathbb R)^{(\mathrm{id} - B)^{-1} b}$. Now the equation $(A, a)(B, b) = (AB, Ab + a)$ is the same as the equation $(B, b)(A, a) = (BA, Ba + b)$ if and only if $AB = BA$ and $Ab + a = Ba + b$. The first equality holds because $SO(2, \mathbb R)$ is commutative, whereas the second equality is equivalent to

$$(1 - A)b = (1 - B)a,$$

or $a = (1 - B)^{-1}(1 - A)b = (1 - A)(1 - B)^{-1}b$, since $AB = BA$. Therefore

$$(A, a)(x) = Ax + (1 - A)(1 - B)^{-1}b = A\bigl(x - (1 - B)^{-1}b\bigr) + (1 - B)^{-1}b,$$

that is, $(A, a)$ is equal to the rotation about the point $(1 - B)^{-1}b$. □
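The statement of lemma 7.15.3.46 is easy to test on numbers. The sketch below is illustrative only: it stores an element $(B, b)$ of $E(2)$ as a pair (rotation angle, translation vector), which is our own encoding, and all function names are hypothetical. It checks that the rotation about $p = (1 - B)^{-1}b$ commutes with $(B, b)$, while a generic element does not.

```python
import math

def rot(theta, v):
    """Apply the plane rotation by angle theta to the vector v."""
    c, s = math.cos(theta), math.sin(theta)
    return (c * v[0] - s * v[1], s * v[0] + c * v[1])

def act(g, x):
    """(B, b) acts on the plane by x -> B x + b."""
    theta, b = g
    y = rot(theta, x)
    return (y[0] + b[0], y[1] + b[1])

def compose(g, h):
    """Group law (A, a)(B, b) = (AB, A b + a); rotations are stored as angles."""
    (alpha, a), (beta, b) = g, h
    ab = rot(alpha, b)
    return (alpha + beta, (ab[0] + a[0], ab[1] + a[1]))

def fixed_point(g):
    """p = (1 - B)^{-1} b for g = (B, b) with B != id."""
    theta, b = g
    c, s = math.cos(theta), math.sin(theta)
    det = 2.0 * (1.0 - c)  # determinant of 1 - B, nonzero when B != id
    return (((1 - c) * b[0] - s * b[1]) / det,
            (s * b[0] + (1 - c) * b[1]) / det)

def rotation_about(p, alpha):
    """Rotation by alpha about p, written as (A, a) with a = (1 - A) p."""
    ap = rot(alpha, p)
    return (alpha, (p[0] - ap[0], p[1] - ap[1]))

def close(u, v, tol=1e-9):
    """Componentwise comparison of two plane vectors."""
    return abs(u[0] - v[0]) < tol and abs(u[1] - v[1]) < tol
```

With `g = (0.7, (1.0, -2.0))` and `h = rotation_about(fixed_point(g), 1.3)`, the products `compose(g, h)` and `compose(h, g)` coincide, whereas replacing `h` by a rotation with a different center, such as `(1.3, (0.3, 0.4))`, gives two different translation parts.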

For $c \in C^{\mathrm{per}}$ let $B(c) \in SO(2, \mathbb R)$ be the spatial rotational shift at $c$, that is, the spatial rotational part $B$ of the shift element $g(c) = ((B, b), C) \in E(2) \times S^1$ at $c$. Note that $B : C^{\mathrm{per}} \to SO(2, \mathbb R) : c \mapsto B(c)$ is a real analytic mapping. In addition, let $C^{\mathrm{per}}_{\mathrm{nt}} = \{c \in C^{\mathrm{per}} \mid B(c) \neq \mathrm{id}\}$ be the set of all $c \in C^{\mathrm{per}}$ with nontrivial spatial rotational shift. Let $U$ be a connected component of the open subset $C^{\mathrm{per}}$ of $C$. Then $U \setminus C^{\mathrm{per}}_{\mathrm{nt}}$ is the set of all $c \in U$ such that the spatial rotational shift $B(c) = \mu(g(c))$ is the identity, where

$$\mu : E(2) \times S^1 \to SO(2, \mathbb R) : ((B, b), C) \mapsto B$$

is the group homomorphism introduced above. Because the mapping $C^{\mathrm{per}} \to E(2) \times S^1 : c \mapsto g(c)$ is real analytic, the set $U \setminus C^{\mathrm{per}}_{\mathrm{nt}}$ is either equal to $U$ or is a proper real analytic subvariety of $U$. In §15.4 below we show that for every connected component $U$ of $C^{\mathrm{per}}$ there is an element $c$ of $U$ such that $g(c)$ has a nontrivial spatial rotational part. This proves

Proposition 7.15.3.47. The set $C^{\mathrm{per}}_{\mathrm{nt}}$ is the complement of a proper closed analytic subvariety of $C^{\mathrm{per}}$. Thus it is a dense open subset of $C^{\mathrm{per}}$. Consequently, the set $C^{\mathrm{per}}_{\mathrm{nt}}$ admits a real analytic fibration with fibers which are 3-tori that are invariant under the flow of the vector field $V$ and on which the motion of the disk is quasiperiodic.

We can say more. If we disregard the internal rotation of the disk, that is, if we pass to the reduced phase space $C/S^1$ of §2.2, which has coordinates $((v, a), (\nu, \dot a))$, then $(C/S^1)^{\mathrm{per}}_{\mathrm{nt}} = C^{\mathrm{per}}_{\mathrm{nt}}/S^1$ admits an analytic fibration with 2-tori as fibers that are invariant under the $S^1$ reduced flow of $V$ on $C^{\mathrm{per}}_{\mathrm{nt}}/S^1$ and on which the motion is quasiperiodic. Therefore $C^{\mathrm{per}}_{\mathrm{nt}}$ admits an analytic fibration with 3-tori as fibers which are invariant under the flow of $V$ and on which the motion is quasiperiodic. Also the point of contact of the disk, which is a smooth function of $\sigma_1$, $v$, and $a$, see (38), is a quasiperiodic function of $t$ with two frequencies that depends analytically on the initial conditions. Looking at the reduced space $C/E(2) = (S^2 \setminus \{\pm e_3\}) \times \mathbb R^3$, which is parametrized by $(u, \omega)$, see (6), we find that $(C/E(2))^{\mathrm{per}}_{\mathrm{nt}} = C^{\mathrm{per}}_{\mathrm{nt}}/E(2)$ admits a real analytic fibration with 2-tori as fibers that are invariant under the flow of $V$ and on which the motion is quasiperiodic.


Proof. To prove the proposition and the above observations we argue as follows. The statement about the motion on the $S^1$-reduced phase space follows from the observation that $C/S^1$ has an $E(2)$ symmetry group and that the $E(2)$-reduced space is $(C/S^1)/E(2) = C/(E(2) \times S^1)$. From proposition 7.15.4.48 below it follows that if $B \neq \mathrm{id}$ then the centralizer of $(B, b)$ in $E(2)$ is the circle group $(S^1)^* = SO(2, \mathbb R)^{(\mathrm{id} - B)^{-1} b}$. The theory of §15.2 then yields the existence of a quasiperiodic motion on 2-tori on $(C/S^1)^{\mathrm{per}}_{\mathrm{nt}}/(S^1)^*$. The construction in §15.2 leads to 2-tori in $(C/S^1)^{\mathrm{per}}_{\mathrm{nt}}$, which are invariant under the $(S^1)^*$-action. The 2-tori in $(C/S^1)^{\mathrm{per}}_{\mathrm{nt}} = C^{\mathrm{per}}_{\mathrm{nt}}/S^1$ are quotients of 3-tori in $C^{\mathrm{per}}_{\mathrm{nt}}$. The situation with the $E(2)$ reduced phase space is simpler. The space $C/E(2)$ is an $S^1$ principal bundle over the fully reduced phase space $C/(E(2) \times S^1) = (C/E(2))/S^1$. Here $S^1$ is a 1-dimensional torus and we do not need any conditions on the shift element in $S^1$. □

7.15.4 Near elliptic relative equilibria

In this subsection we discuss the behavior of the rolling disk near an elliptic relative equilibrium. Let $c \in C^{\mathrm{per}}$ be the starting point of a relative periodic solution

$$t \mapsto c(t) = \bigl((A(t), a(t)), (\omega(t), \dot a(t))\bigr) = \gamma_c(t)$$

of the equations of motion of the disk, that is, $t \mapsto \gamma_c(t)$ is an integral curve of $V$ starting at $c$. With $m = \pi(c)$ let $\tau = \tau(c)$ be the smallest positive period of the integral curve $t \mapsto \gamma^m(t) = \pi(\gamma_c(t))$ of the fully reduced vector field $V$. Then, using the notation of §4.2, there are $\eta, \zeta \in \mathbb R$ such that

$$A(\tau) = e^{\tau\eta E_3}\, A(0)\, e^{-\tau\zeta E_3},$$

where $e^{\beta E_3}$ is the spatial rotational shift and $\beta = \tau\eta$. In view of (21) it follows that

$$v(\tau) = A(\tau)e_3 = e^{\beta E_3} A(0) e^{-\tau\zeta E_3} e_3 = e^{\beta E_3} A(0) e_3 = e^{\beta E_3} v(0).$$

Using (35) we get $\chi(\tau) = \chi(0) + \beta = \chi(0) + \tau\eta$. Integrating (36) gives

$$\beta = \chi(\tau) - \chi(0) = \int_0^\tau \frac{d\chi}{dt}\, dt = \int_0^\tau (1 - \sigma_1^2(t))^{-1} \sigma_3(t)\, dt$$

and therefore

$$B(c) = e^{\beta E_3}, \quad\text{where } \beta = \int_0^\tau (1 - \sigma_1^2(t))^{-1} \sigma_3(t)\, dt. \tag{220}$$
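Formula (220) is a plain quadrature once $\sigma_1(t)$ and $\sigma_3(t)$ are known along one period of the reduced motion. As a rough sketch (the function name and the use of composite Simpson's rule are our choices, and the two callables stand in for a reduced solution that would have to be computed separately), one might evaluate it as follows.

```python
def rotational_shift_angle(sigma1, sigma3, tau, n=2000):
    """Approximate beta = int_0^tau sigma3(t) / (1 - sigma1(t)^2) dt.

    sigma1, sigma3: callables t -> float (assumed to describe one period
    of the reduced motion, with |sigma1(t)| < 1); tau: the period.
    Uses composite Simpson's rule with n (made even) subintervals.
    """
    if n % 2:
        n += 1
    h = tau / n

    def integrand(t):
        return sigma3(t) / (1.0 - sigma1(t) ** 2)

    total = integrand(0.0) + integrand(tau)
    for i in range(1, n):
        weight = 4 if i % 2 else 2
        total += weight * integrand(i * h)
    return total * h / 3.0
```

When both functions are constant, the integral reduces to $\tau\, (1 - \sigma_1^2)^{-1} \sigma_3$, which is the value of $\beta$ one expects at an equilibrium point of the reduced system.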

The set $E^{\mathrm{ell}}$ of elliptic relative equilibria of $V$ on $C$ is a proper real analytic subvariety of $C$. Now $C^{\mathrm{per}} \cup E^{\mathrm{ell}}$ is an open neighborhood of $E^{\mathrm{ell}}$ in $C$. Lemma 7.15.4.49 below implies that the period function $\tau : C^{\mathrm{per}} \to \mathbb R_{>0} : c \mapsto \tau(c)$ extends to an analytic function on $C^{\mathrm{per}} \cup E^{\mathrm{ell}}$, which we also denote by $\tau$. It follows that the spatial rotation map $B : C^{\mathrm{per}} \to SO(2, \mathbb R)$ extends to an analytic map $B : C^{\mathrm{per}} \cup E^{\mathrm{ell}} \to SO(2, \mathbb R)$. If $c_0$ is the starting point of an elliptic relative equilibrium of $V$ in $C$, then $m_0 = \pi(c_0)$ is an equilibrium point of the reduced vector field $V$ on $M$. Its coordinates $\sigma_1^0$, $\sigma_3^0$ in $M$ are constant functions of time. Therefore from (220) it follows that

$$B(c_0) = e^{\tau\eta E_3}, \quad\text{where } \eta = (1 - (\sigma_1^0)^2)^{-1} \sigma_3^0. \tag{221}$$

Let $U$ be a connected component of $C^{\mathrm{per}}$ in $C$. Recall that $C^{\mathrm{per}} \cup E^{\mathrm{ell}}$ is an open neighborhood of $E^{\mathrm{ell}}$ in $C$. Suppose that $U'$ is another connected component of $C^{\mathrm{per}}$. Then $\overline U \cap \overline{U'}$ in $C$ consists of hyperbolic or degenerate relative equilibria. Hence $U \cup (E^{\mathrm{ell}} \cap \overline U)$ is an open neighborhood of $E^{\mathrm{ell}} \cap \overline U$ in $C$. We now show that for each connected component $U$ of $C^{\mathrm{per}}$ the right hand side of (221) is a nonconstant real analytic function on $U$. This implies

Proposition 7.15.4.48. The set of points $c \in U$ at which the spatial rotational shift is equal to the identity is a proper analytic subvariety of $U$.

Before proving this proposition we consider the conservative one degree of freedom Newtonian system

$$M \ddot\varphi + V'(\varphi) = 0 \tag{222}$$

with mass $M$ and potential $V$, which is real analytic on an open interval $I \subseteq \mathbb R$. Suppose that $V$ has a nondegenerate minimum at $\varphi_0$, that is, $V'(\varphi_0) = 0$ and $V''(\varphi_0) > 0$. If $E \in J = (V(\varphi_0), V(\varphi_0) + \eta)$ and $\eta > 0$ is sufficiently small, then the solutions of (222) with energy $E$ near $\varphi = \varphi_0$ and $\frac{d\varphi}{dt} = 0$ oscillate in a small interval around $\varphi_0$ and hence are periodic of period $\tau = \tau(E)$.

Lemma 7.15.4.49. The function $\tau : J \to \mathbb R : E \mapsto \tau(E)$, which gives the period of the periodic solution of (222) near the nondegenerate critical point $\varphi_0$ of the real analytic potential $V$, can be extended to a complex analytic function in a complex neighborhood $\widetilde J$ of $J$. Moreover,

$$\lim_{E \downarrow V(\varphi_0)} \tau(E) = 2\pi\, M^{1/2}\, V''(\varphi_0)^{-1/2}. \tag{223}$$

If the potential $V$ depends real analytically on parameters $p$ near $p_0$, then the period function $\tau$ extends to an analytic function of $(p, E)$ in a neighborhood of $(p_0, V(\varphi_0, p_0))$.
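The limit (223) can be observed numerically on a concrete example. The sketch below is not taken from the source: it assumes the hypothetical potential $V(\varphi) = \frac12 k\varphi^2 + \frac14\varphi^4$, which has a nondegenerate minimum at $\varphi_0 = 0$ with $V''(0) = k$, and it evaluates the period integral $\tau(E) = 2\int_{\varphi_-}^{\varphi_+} \bigl(\tfrac{2}{M}(E - V(\varphi))\bigr)^{-1/2}\, d\varphi$ after the substitution $\varphi = A\sin\theta$, where $\pm A$ are the turning points. The substitution removes the endpoint singularities, so a simple midpoint rule suffices.

```python
import math

def period(amplitude, M=1.0, k=1.0, n=2000):
    """Period of M*phi'' + V'(phi) = 0 for V(phi) = k*phi**2/2 + phi**4/4.

    With turning points +-A and phi = A*sin(theta) one finds
        E - V(phi) = A^2 cos^2(theta) * (k/2 + A^2 (1 + sin^2(theta)) / 4),
    so the period integral becomes
        tau = 2 sqrt(M/2) * int_{-pi/2}^{pi/2} dtheta / sqrt(k/2 + A^2 (1 + sin^2(theta)) / 4).
    """
    A = amplitude
    h = math.pi / n
    total = 0.0
    for i in range(n):
        theta = -math.pi / 2 + (i + 0.5) * h  # midpoint of the i-th cell
        s2 = math.sin(theta) ** 2
        total += 1.0 / math.sqrt(k / 2 + A * A * (1 + s2) / 4)
    return 2.0 * math.sqrt(M / 2) * total * h
```

As the amplitude tends to zero, `period(A, M, k)` tends to $2\pi\sqrt{M/k}$, in agreement with (223); for larger amplitudes the quartic term stiffens the potential and the period drops below that value.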


Proof of lemma 7.15.4.49. Even though the limit formula (223) can be proved in an elementary fashion, we will give a sketch of its proof. From equation (88) recall that

$$\tau(E) = 2 \int_{\varphi_-}^{\varphi_+} \Bigl(\tfrac{2}{M}\,(E - V(\varphi))\Bigr)^{-1/2} d\varphi, \tag{224}$$

where $\varphi_- = \varphi_-(E)$ and $\varphi_+ = \varphi_+(E)$ are close to $\varphi_0$ with $\varphi_- < \varphi_0 < \varphi_+$. Moreover, $V(\varphi_-) = E = V(\varphi_+)$ with $V'(\varphi_-) < 0$, $V'(\varphi_+) > 0$, and $V(\varphi) < E$ for $\varphi$ in the interior of $I = [\varphi_-, \varphi_+]$. Thus the right hand side of (224) is convergent. Because $V$ is real analytic on $I \subseteq \mathbb R$, it has an extension to a complex analytic function on an open complex neighborhood $U$ of $I$. Let $0 < \varepsilon < \frac12(\varphi_+ - \varphi_-)$. Consider the curve $\gamma_\varepsilon$ in $\mathbb C$, which consists of the concatenation of the following four curves: 1) $\varphi$ traverses the closed interval $[\varphi_- + \varepsilon, \varphi_+ - \varepsilon]$, followed by 2) the circular arc $t \mapsto \varphi_+ + \varepsilon\, e^{it}$ with $t \in [-\pi, \pi]$, followed by 3) $\varphi$ traversing the interval $[\varphi_- + \varepsilon, \varphi_+ - \varepsilon]$ backward, and finished by 4) the circular arc $t \mapsto \varphi_- + \varepsilon\, e^{it}$ with $t \in [0, 2\pi]$. Because $V(\varphi_\pm) = E$ and $V'(\varphi_\pm) \neq 0$, from the Taylor expansion of $V$ at $\varphi_\pm$

$$V(\varphi) = V(\varphi_\pm) + V'(\varphi_\pm)(\varphi - \varphi_\pm) + O((\varphi - \varphi_\pm)^2) = E + V'(\varphi_\pm)(\varphi - \varphi_\pm) + O((\varphi - \varphi_\pm)^2)$$

we get

$$(E - V(\varphi))^{1/2} = \sqrt{\pm V'(\varphi_\pm)}\, \sqrt{\mp(\varphi - \varphi_\pm)} + O(|\varphi - \varphi_\pm|). \tag{225}$$

Therefore the complex analytic function $F = \bigl(\tfrac{2}{M}(E - V(\varphi))\bigr)^{-1/2}$ is double valued near $\varphi_\pm$. If $\varepsilon > 0$ is sufficiently small, then the two values of the integrand $F$ of (224) are interchanged along segments 2) and 4) of the curve $\gamma_\varepsilon$. While traversing all of $\gamma_\varepsilon$, the function $F$ returns to its original value.

Taking $I$ and $U$ sufficiently small we can arrange that $\varphi_\pm$ are the only zeroes of the function $\varphi \mapsto E - V(\varphi)$ in $U$. Consider the Riemann surface $X(E)$ given by

$$\bigl\{(\varphi, \psi) \in (U \setminus \{\varphi_\pm\}) \times \mathbb C \;\bigm|\; \tfrac{2}{M}\,(E - V(\varphi))\, \psi^2 = 1\bigr\}$$

with projection map $\rho : X(E) \to U \setminus \{\varphi_\pm\} : (\varphi, \psi) \mapsto \varphi$. Note that $\rho$ is a twofold covering. Because the integrand $F$ on the right hand side of (224) returns to its original value after traversing the path $\gamma_\varepsilon$, each of the two lifts $\widetilde\gamma_\varepsilon$ of $\gamma_\varepsilon$ to $X(E)$ is a closed curve. Also

$$\int_{\gamma_\varepsilon} \Bigl(\tfrac{2}{M}\,(E - V(\varphi))\Bigr)^{-1/2} d\varphi = \int_{\widetilde\gamma_\varepsilon} \rho^*(F\, d\varphi) = \int_{\widetilde\gamma_\varepsilon} \psi\, d\varphi,$$

where $\psi\, d\varphi$ is a holomorphic 1-form on $X(E)$. Because $X(E)$ has complex dimension 1, the 1-form $\psi\, d\varphi$ is closed. Since $\int_{\widetilde\gamma_\varepsilon} \psi\, d\varphi$ is homotopy invariant, it does not depend on $\varepsilon$ and is therefore equal to $\int_\gamma \psi\, d\varphi$, where $\gamma$ is a closed path in $U \setminus \{\varphi_\pm\}$. Because $\gamma$ avoids $\varphi_\pm$, it follows that $\tau(E) = \int_\gamma \psi\, d\varphi$ is a complex analytic function of $E$ in a complex neighborhood $\widetilde J$ of $V(\varphi_0)$. From (225) it follows that $\psi = \bigl(\tfrac{2}{M}(E - V(\varphi))\bigr)^{-1/2}$ is a meromorphic function on $X(E)$ with a simple pole at $\varphi_0$. The value of the integral $\int_\gamma \psi\, d\varphi$ is $2\pi i$ times the residue $\mathrm{Res}_{\varphi_0}$, which we compute as follows.

$$\mathrm{Res}_{\varphi_0} = \lim_{\varphi \to \varphi_0} (\varphi - \varphi_0)\, \frac{1}{i\, \bigl(\tfrac{2}{M}(V(\varphi) - V(\varphi_0))\bigr)^{1/2}} = \frac{1}{i} \sqrt{\frac{M}{2}}\, \lim_{\varphi \to \varphi_0} (\varphi - \varphi_0)\, \frac{1}{\sqrt{\tfrac12 V''(\varphi_0)(\varphi - \varphi_0)^2 + O((\varphi - \varphi_0)^3)}},$$

which comes from taking the square root of the Taylor series of the function $\varphi \mapsto V(\varphi) - V(\varphi_0)$ to second order about $\varphi_0$ and using the fact that $\varphi_0$ is a nondegenerate critical point of $V$,

$$= \frac{1}{i} \sqrt{\frac{M}{V''(\varphi_0)}}. \tag{226}$$

This gives (223).

Proof of proposition 7.15.4.48. To prove that the spatial rotational shift map $B$ is not constant on every connected component of $C^{\mathrm{per}}$, we begin by describing these components. Let $C^{\mathrm{flat}}$ be the set of starting points of integral curves of $V$, where the disk rises from and/or falls to a flat position in finite time. The value of the integral map $I$ (207) on $C^{\mathrm{flat}}$ is contained in $\widetilde\pi^{-1}(\ell_+ \cup \ell_-)$, where $\widetilde\pi = (\bar\sigma_3, \bar\sigma_4) : C \to \mathbb R^2$. The set $\widetilde\pi^{-1}(\ell_+ \cup \ell_-)$ contains all the periodic integral curves corresponding to the part of $\ell_+ \cup \ell_-$, which lies in the interior of the region bounded by the discriminant locus $\Delta$. Here $\varphi(t)$ oscillates in the highest occurring potential well described in proposition 7.12.5.36. The boundary $\partial C^{\mathrm{flat}}$ of $C^{\mathrm{flat}}$ consists of unstable (= hyperbolic) and degenerate relative equilibria for which $(\bar\sigma_3, \bar\sigma_4) \in \ell_+ \cup \ell_-$.


The set $C \setminus (C^{\mathrm{flat}} \cup \partial C^{\mathrm{flat}})$ is the largest open subset of $C$ where the flow of $V$ is complete. The set $C \setminus (C^{\mathrm{flat}} \cup \partial C^{\mathrm{flat}})$ is connected. For if $c, c' \in C$ and $\widetilde\pi(c)$, $\widetilde\pi(c')$ lie in different connected components $U$ and $U'$ of $\mathbb R^2 \setminus (\ell_+ \cup \ell_-)$, then we first move in $\widetilde\pi^{-1}(U)$ from $c$ to $c_0$ such that $\widetilde\pi(c_0) \in U$ and $\widetilde\pi(c_0)$ lies in the interior of the region bounded by $\Delta$. Moreover, $c_0$ lies on a periodic orbit for which $\varphi(t)$ oscillates in the highest potential well. Because the set of all such $c_0$ is connected, we can move $c_0$ to $c_0'$ such that $\widetilde\pi(c_0') \in U'$. Then we move $c_0'$ to $c'$ in $\widetilde\pi^{-1}(U')$. However, the open subset $C^{\mathrm{per}}$ of $C \setminus (C^{\mathrm{flat}} \cup \partial C^{\mathrm{flat}})$ is not connected. In fact, it has five connected components. For each of the four connected components $U$ of $\mathbb R^2 \setminus (\ell_+ \cup \ell_-)$, the motions over $U$ where $\varphi(t)$ oscillates in the lowest potential well form a connected component of $C^{\mathrm{per}}$, while those which oscillate in the highest well form the fifth connected component. Because an elliptic relative equilibrium corresponds to a nondegenerate local minimum of the potential function $V$, we have $I(E^{\mathrm{ell}}) = \Sigma_{\mathrm{nd}}$. Moreover, for each connected component $U$ of $C^{\mathrm{per}}$, the set $I(E^{\mathrm{ell}} \cap \overline U)$ is a nonempty open subset of $\Sigma^{\mathrm{stabl}}$, which contains one of the connected components of $\Sigma^{\mathrm{deg}} \setminus (\ell_+ \cup \ell_-)$ in its closure. Let a point in $\Sigma^{\mathrm{stabl}}$ approach a fixed point in $\Sigma^{\mathrm{deg}}$, which is not one of the two swallow tail points on the $\bar\sigma_4$-axis. In other words, we do not approach a rectilinear motion of the disk rolling at constant angular velocity. If $\sigma_3 = 0$ for a relative equilibrium, then $\sigma_1 = 0$, see (45). Hence $\bar\sigma_3 = \sigma_3 = 0$, which we have just excluded. Therefore $\sigma_3 \neq 0$ at our degenerate critical point. Because $V''(\varphi) \to 0$ as we approach a degenerate critical point, in view of (223) it follows that $\tau \to \infty$. Thus the spatial rotational shift $e^{\beta E_3}$ in (220) winds faster and faster around the circle $SO(2, \mathbb R)$.
This implies that the spatial rotational shift map $B$ is not constant on $I(E^{\mathrm{ell}} \cap U)$. Thus $B$ is not constant on $U$.

Corollary 7.15.4.50. The set $C^{\mathrm{per}}_{\mathrm t}$ of all $c \in C^{\mathrm{per}}$ such that $B(c) = \mathrm{id}$ is a proper real analytic subvariety of $C^{\mathrm{per}}$. Its complement $C^{\mathrm{per}}_{\mathrm{nt}}$, on which the motion is quasiperiodic, is a dense subset of $C^{\mathrm{per}}$.$^5$

$^5$ The observation that the spatial rotational shift runs infinitely fast on $SO(2, \mathbb R)$ when approaching a degenerate relative equilibrium implies that we have an accumulation of infinitely many sheets of $C^{\mathrm{per}}_{\mathrm t}$.

Let $c \in C^{\mathrm{per}}$ and let $g(c) = ((B(c), b(c)), C(c)) \in E(2) \times S^1$ be the shift element. It is an interesting problem to determine the asymptotic behavior of $(B(c), b(c))$ for $c$ near $E^{\mathrm{ell}}$. Recall that $C^{\mathrm{per}}_{\mathrm t}$ is the set of all $c \in C^{\mathrm{per}}$ such that $B(c) = \mathrm{id}$. Asymptotic analysis should give information about the structure of $C^{\mathrm{per}}_{\mathrm t}$ near $E^{\mathrm{ell}}$. If $c \in C^{\mathrm{per}}_{\mathrm t}$ and $b(c) \neq 0$, then the motion of the disk is a superposition of rectilinear motion with nonzero constant velocity $\tau(c)^{-1} b(c)$ and a periodic motion, see lemma 7.15.3.45.

We now give an alternative proof of proposition 7.15.4.48.

Proof. If $\bar\sigma_4 = 0$ and $|\bar\sigma_3| > \bigl(\frac{mgr}{I_1 + mr^2}\bigr)^{1/2}$, which is the critical angular speed of spinning of the vertical disk, see case 2 in §8.2, then $\sigma_1 = 0$ is the only critical point of $V_{\bar\sigma_3, \bar\sigma_4}$. It is a nondegenerate minimum and corresponds to the disk spinning vertically in a stable way. Because $\sigma_1 = 0$, we have $\sigma_3 = \bar\sigma_3$ and $\sigma_4 = \bar\sigma_4 = 0$. From (129) we obtain $V''_{\bar\sigma_3, \bar\sigma_4}(0) = (I_1 + mr^2)\,\bar\sigma_3^2 - mgr$. Because $M = I_1 + mr^2$, from (223) it follows that the limiting period of nearby periodic solutions of the reduced system is

$$\tau = 2\pi \Bigl(\bar\sigma_3^2 - \frac{mgr}{I_1 + mr^2}\Bigr)^{-1/2}.$$

Thus from (220) we get

$$\beta = \tau\eta = \tau\bar\sigma_3 = 2\pi \Bigl(1 - \frac{mgr}{I_1 + mr^2}\, \frac{1}{\bar\sigma_3^2}\Bigr)^{-1/2}. \tag{227}$$
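Formula (227) is simple enough to tabulate. The following sketch (function names and sample parameters are ours, and it returns the magnitude of $\beta$ as a function of $|\bar\sigma_3|$) shows the two regimes: for fast spinning the shift angle approaches $2\pi$, and near the critical spin rate it grows without bound.

```python
import math

def critical_spin(I1, m, g, r):
    """Critical angular speed (m g r / (I1 + m r^2))^(1/2) of the vertical disk."""
    return math.sqrt(m * g * r / (I1 + m * r * r))

def limiting_shift_angle(sigma3_bar, I1, m, g, r):
    """Magnitude of beta in (227), valid for |sigma3_bar| above the critical spin."""
    crit2 = m * g * r / (I1 + m * r * r)
    if sigma3_bar ** 2 <= crit2:
        raise ValueError("vertical spinning is not elliptic at this spin rate")
    return 2.0 * math.pi / math.sqrt(1.0 - crit2 / sigma3_bar ** 2)
```

For a uniform disk with unit mass and radius (so `I1 = 0.25`) and `g = 9.81`, the critical spin is about 2.8; `limiting_shift_angle(1000.0, 0.25, 1.0, 9.81, 1.0)` is within $10^{-3}$ of $2\pi$, and the value increases monotonically as the spin rate is lowered toward the critical one.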

As $|\bar\sigma_3| \downarrow \bigl(\frac{mgr}{I_1 + mr^2}\bigr)^{1/2}$, the right hand side of (227) increases to $\infty$ on $\Sigma^{\mathrm{ell}}$. Because the set of elliptic relative equilibria is connected and meets the boundary of each connected component of $C^{\mathrm{per}}$, this computation shows that the spatial rotational shift is a nonconstant real analytic function on each connected component of $C^{\mathrm{per}}$.

7.15.5 Nearly flat solutions

We use the notation of §10.1. Let $p = (\bar\sigma_3, \bar\sigma_4) \notin \ell_\pm$ be near $q \in \ell_\pm$. Suppose that $\sigma_1(p)$ is a critical point of $V_p$, which is near $\pm 1$. Then $\sigma_1(p)$ is a nondegenerate local minimum of $V_p$.

Lemma 7.15.5.51. Let $\tau$ be the limit of the period of periodic solutions of the reduced equations of motion near an equilibrium point in the reduced phase space with $\sigma_1 = \sigma_1(p)$. Then

$$\tau \sim \frac{2\pi}{\sqrt 3}\, (I_1 + mr^2)^{1/2}\, I_1^{-1/6}\, (mgr)^{-2/3}\, |\eta_\pm|^{1/3} \quad\text{as } \eta_\pm \to 0. \tag{228}$$

The limit of the spatial rotational shift of periodic solutions near the equilibrium point in the reduced phase space is $e^{\beta E_3}$, where

$$\beta = \tau\eta = (\mathrm{sgn}\, \eta_\pm)\, \frac{2\pi}{\sqrt 3}\, (1 + mr^2/I_1)^{1/2} = (\mathrm{sgn}\, \eta_\pm)\, \frac{2}{\sqrt 3}\, \Delta\chi \quad\text{as } \eta_\pm \to 0. \tag{229}$$

Here $\Delta\chi$ is defined in proposition 7.11.2.27.

Proof. With $V(\varphi) = V_{(\bar\sigma_3, \bar\sigma_4)}(\sin\varphi) = V_p(\sin\varphi)$ and $M = I_1 + mr^2$ we obtain $V'(\varphi) = V_p'(\sigma_1) \cos\varphi$ and $V''(\varphi) = V_p''(\sigma_1) \cos^2\varphi - V_p'(\sigma_1) \sin\varphi$. Using the notation of (95), we see that $\varphi$ is a critical point of $V$ if and only if $V_p'(\sigma_1) = 0$ if and only if $C\widetilde W = 0$. In our situation we find that

$$V''(\varphi) = mgr\; C^2\widetilde W \cos^2\varphi.$$

From the proof of lemma 7.14.7.44 we find that $x \sim \varepsilon$, $y \sim \mp a$, and $\cos\varphi \sim d^{1/3} |\varepsilon|^{2/3}$, where $d\,\varepsilon^2 = |\eta_\pm|^2/(I_1\, mgr)$. If we substitute these asymptotic relations into $C^2\widetilde W$ (149) we obtain

$$C^2\widetilde W \cos^2\varphi \sim 4d\, d^{-4/3} |\varepsilon|^{-8/3} \varepsilon^2 - d^{-1/3} |\varepsilon|^{-4/3} (4d + 2c)\, \varepsilon\, a + (d + c)^2 a^2 - d^{-1/3} |\varepsilon|^{-2/3} \sim 3\, d^{-1/3} |\varepsilon|^{-2/3}.$$

In view of the formula for $\tau$ given in lemma 7.15.4.49 with $M = I_1 + mr^2$ we obtain (228). According to (221) $\beta$ is obtained by multiplying (228) by

$$(1 - \sigma_1^2)^{-1} \sigma_3 = -\frac{I_3\, u\, x}{I_1 \cos^2\varphi} \sim (\mathrm{sgn}\, \eta_\pm)\, \frac{I_3\, u}{I_1}\, d^{-2/3} |\varepsilon|^{-1/3}, \tag{230}$$

where we have used (91), $\cos\varphi \sim d^{1/3} |\varepsilon|^{2/3}$, $x \sim \varepsilon$, and $\mathrm{sgn}\, \varepsilon = -\mathrm{sgn}\, \eta_\pm$. For the last equality above see the proof of lemma 7.10.1.22. In the product of (228) and (230) the powers of $\varepsilon$ cancel. Using $u = (mgr)^{1/2}/(I_3 + mr^2)^{1/2}$ and $d = I_3^2/(I_1 (I_3 + mr^2))$ we obtain (229).

From (228) it follows that if the disk is close to an almost flat relative equilibrium, then its angle with the vertical oscillates very rapidly with a frequency of order $|\eta_\pm|^{-1/3}$. This oscillation is in combination with the rapid motion around the rim by the point of contact which we observed in §4.5. It is quite remarkable that the spatial rotational shift approaches a constant, whose absolute value is equal to $(2/\sqrt 3)\, \Delta\chi$, where $\Delta\chi$ is the absolute value of the increase of the angle $\chi$ when the disk rises from and falls to the flat position, see (186). Note that $\beta$ changes sign together with $\eta_\pm$. For the uniform disk and the hoop the absolute value of the right hand side of (229) is $2\pi\sqrt{5/3}$ and $2\pi$, respectively. When the mass distribution becomes concentrated near the center of mass, that is, when $I_1/mr^2 \downarrow 0$, then the absolute value of the right hand side of (229) increases to $\infty$. When the right hand side of (229) is not an integral multiple of $2\pi$, that is, when $mr^2/I_1 \neq 3k^2 - 1$ for every $k \in \mathbb Z$, then all the motions of the disk which are sufficiently close to the nearly flat relative equilibrium are quasiperiodic as described in proposition 7.15.3.47. This holds for the uniform disk, whereas for the hoop further asymptotic analysis is needed in order to decide whether there are nearly flat motions, which have trivial spatial rotational shift and nonzero translational shift, and hence run off to infinity as described in lemma 7.15.3.45.

The fact that the absolute value of the right hand side of (229) is not equal to $\Delta\chi$ implies that the spatial rotational shift does not have a unique limit if we let both $\eta_\pm$ and $E - V_q(\pm 1)$ tend to zero, keeping the sign of $\eta_\pm$ constant. To see this note that if $\eta_\pm = 0$, that is, $p = q \in \ell_\pm$ and $E - V_q(\pm 1)$ is small, then the time interval $I$ between rising from and falling to the flat position is small, see the last paragraph in §11.1. As observed in (174), the quantity $(1 - \sigma_1^2)^{-1} \sigma_3 = \frac{d\chi}{dt}$ remains bounded. Therefore in the time interval $I$ the increase of $\chi$ is small.
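The limiting shift angle in (229) depends only on the ratio $mr^2/I_1$, so the two special cases quoted above and the resonance condition $mr^2/I_1 = 3k^2 - 1$ are easy to reproduce. A small illustrative sketch follows; the function names and the tolerance are our own choices.

```python
import math

def flat_limit_shift(ratio):
    """Absolute value of the right hand side of (229) for ratio = m r^2 / I1."""
    return 2.0 * math.pi / math.sqrt(3.0) * math.sqrt(1.0 + ratio)

def is_resonant(ratio, tol=1e-9):
    """True when the limit angle is a positive integer multiple of 2*pi,
    which happens exactly when ratio = 3 k^2 - 1 for some integer k >= 1."""
    turns = flat_limit_shift(ratio) / (2.0 * math.pi)
    k = round(turns)
    return k >= 1 and abs(turns - k) < tol
```

Here `flat_limit_shift(4.0)` (uniform disk, $I_1 = mr^2/4$) gives $2\pi\sqrt{5/3}$ and is not resonant, while `flat_limit_shift(2.0)` (hoop, $I_1 = mr^2/2$) gives $2\pi$, which is the resonant case $k = 1$.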
From (186) it follows that if $E - V_q(\pm 1)$ is small, $\eta_\pm \neq 0$, and $|\eta_\pm|$ is small enough compared to $E - V_q(\pm 1)$, then the increase $\beta$ of $\chi(t)$ during a period of the periodic solution of the reduced system is close to $(\mathrm{sgn}\, \eta_\pm)\, \Delta\chi$.

Let $\beta$ be a real number between $(\mathrm{sgn}\, \eta_\pm)\, \Delta\chi$ and $(\mathrm{sgn}\, \eta_\pm)\, (2/\sqrt 3)\, \Delta\chi$ and fix the sign of $\eta_\pm$. By the intermediate value theorem, it follows that in any continuous one parameter family of solutions, which goes from a falling flat solution that remains near the flat position to an elliptic relative equilibrium that remains near the flat position, there is at least one for which the increase of $\chi(t)$ during a period of the periodic solution of the reduced system is equal to $\beta$. Because the spatial rotational shift is equal to $e^{\beta E_3}$, the spatial rotational shift does not have a unique limit if we let both $\eta_\pm$ and $E - V_q(\pm 1)$ tend to zero, keeping the sign of $\eta_\pm$ fixed.

The argument in the preceding paragraph shows that if there is an integer $k$ such that $\Delta\chi < 2\pi k < (2/\sqrt 3)\, \Delta\chi$, that is, $3k^2 - 1 < mr^2/I_1 < 4k^2 - 1$, then for each choice of sign of $\eta_\pm$ in each continuous family as described above there is a solution for which the spatial rotational shift is the identity. Since $3(k + 1)^2 - 1 < 4k^2 - 1$ if $k \geq 7$, this last conclusion holds when

$$mr^2/I_1 \in (2, 3) \cup (11, 15) \cup (26, 35) \cup (47, 63) \cup (74, 99) \cup (107, 143) \cup (146, \infty).$$

In particular this conclusion holds if there is more mass inside the rim than for the hoop, that is, if $mr^2/I_1$ is slightly larger than 2.

7.16 Notes

The first papers on the rolling disk considered only the uniform disk. In 1872 Ferrers [43] obtained the fully reduced equations of motion (13) for the first time using Euler angles. The first serious analysis of these equations was given by Vierkandt [114] in 1892, who observed that almost all of its solutions were periodic. In 1897 Chaplygin [23] derived what we call Chaplygin’s equations (76) for the rolling disk. In §244 Routh [96] noted that these equations define Legendre functions and in 1899 Korteweg [62] observed that hypergeometric functions appear in the solutions of the fully reduced equations of motion of the disk. Later in 1900 Appell [3] made the same observation. In the past one hundred years there has been little interest in the rolling disk. In this period, many authors were content only to repeat the derivation of the equations of motion and the proof of integrability by quadratures and leave it at that. Only recently has interest arisen in the geometric behavior of the solutions of the rolling disk in phase space. Kemppainen [57] was the first to construct the two parameter family of one degree of freedom Hamiltonian systems (81) and (82) from the fully reduced system. Figures 7.10 and 7.11 give all the possible potential functions and level sets for the corresponding Hamiltonian. In [84] O’Reilly has an interesting discussion of the motion of the rolling disk. He finds several bifurcations numerically. See Srinivasan [110] for a discussion of some rocking and rolling motions of cans. The two frequency quasiperiodic behavior of the point of contact of the disk, shown in §15.3, is nicely illustrated in figure 8 of Borisov et al. [18].


This figure also exhibits some special cases where the spatial rotational shift is the identity and the motion of the disk is a combination of constant velocity rectilinear and periodic motion.

The motion of the rolling disk is incomplete because it runs off the constraint manifold in finite time by falling flat. A complete analogue of the disk is the body of revolution rolling without slipping on a horizontal plane under the influence of a uniform vertical gravitational field. The reduced dynamics of this problem was first analyzed by Chaplygin [23]. In [81] Moshchuk obtained the full motion $t \mapsto (A(t), a(t))$ of the body of revolution by integrating equation (29) and then (33). Because the rotational motion is quasiperiodic on 2-tori, we are confronted with the following difficulty: integrals of quasiperiodic functions need not be linear functions of time plus quasiperiodic functions. The reason is that small divisors appear in the coefficients of the termwise integrated Fourier series of a quasiperiodic function, which can make the series diverge. If we use reconstruction from the periodic solutions in the fully reduced phase space $M$ as in §15.4, then we do not meet the problem of small divisors. Moreover, we are able to obtain the stronger conclusion that the motion of the disk is quasiperiodic on 3-tori, if the spatial rotational shift is not the identity. If the spatial rotational shift is the identity, then the motion of the disk is a periodic perturbation of a rectilinear motion with constant velocity, see §15.3.

In figure 4.4 of [31] the cusps in a constant energy slice of the set of critical values of the integral map $I$ (207) appear to be rounded off compared with our figures in §14.1. One can find a picture of the critical surface in Borisov et al. [18]. Its downward cusps over the lines $\ell_\pm$ are clearly visible in their figures 3, 4, and 5. Their explanation of these cusps in terms of the singularities of the fully reduced differential equation at $\sigma_1 = \pm 1$ is a bit too soft for us. By the way, the authors of this paper do not prove what one sees in their computer generated figures.
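The small-divisor difficulty mentioned above can be made explicit in a short computation. The following is only a sketch in notation that is not the book's: $f$ is a quasiperiodic function with rationally independent frequency vector $\omega = (\omega_1, \omega_2)$ and Fourier coefficients $c_k$, and the series is integrated term by term as a formal operation.

```latex
f(t) = \sum_{k \in \mathbb{Z}^2} c_k \, e^{i \langle k, \omega \rangle t}
\quad \Longrightarrow \quad
\int_0^t f(s)\, ds
  = c_0 \, t
  + \sum_{k \neq 0} \frac{c_k}{i \langle k, \omega \rangle}
    \left( e^{i \langle k, \omega \rangle t} - 1 \right).
```

Because the frequencies are rationally independent, the denominators $\langle k, \omega \rangle = k_1 \omega_1 + k_2 \omega_2$ are nonzero for $k \neq 0$ but come arbitrarily close to zero. Unless the coefficients $c_k$ decay fast enough relative to these divisors, the formal series on the right need not converge, so the integral need not be a linear function of time plus a quasiperiodic function.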


Bibliography

[1] Abraham, R. and Robbin, J. (1967). Transversal mappings and flows, (W.A. Benjamin).
[2] Abramowitz, M. and Stegun, I. (1972). The Handbook of Mathematical Functions, (Dover).
[3] Appell, P. (1900). Sur l’intégration des équations du mouvement d’un corps pesant de révolution roulant par une arête circulaire sur un plan horizontal; cas particular du cerceau, Rend. Palermo 14 pp. 1–6.
[4] Appell, P. (1896). Traité de Méchanique rationelle, Tome I, II, Paris. (1904). 2nd edn.
[5] Arms, J.M., Cushman, R., and Gotay, M.J. (1991). A universal reduction procedure for Hamiltonian group actions, in The geometry of Hamiltonian systems, T.S. Ratiu, ed., (Birkhäuser) pp. 31–51.
[6] Arms, J.M., Marsden, J.E., and Moncrief, V. (1975). Symmetry and bifurcations of the momentum mappings, Commun. Math. Phys. 78, pp. 455–478.
[7] Arnol’d, V.I., Kozlov, V.V., and Neishtadt, A.I. (1988). Mathematical aspects of classical and celestial mechanics, in Dynamical Systems, III, ed. V.I. Arnol’d, (Springer Verlag).
[8] Arnol’d, V.I., Gusein-Zade, S.M., and Varchenko, A.N. (1985). Singularities of Differentiable Maps, Vol. I, (Birkhäuser).
[9] Aronszajn, N. (1967). Subcartesian and subriemannian spaces, Notices Amer. Math. Soc. 14 p. 111.
[10] Bates, L. (1998). Examples of singular nonholonomic reduction, Rep. Math. Phys. 42, pp. 231–247.
[11] Bates, L., Graumann, H., and MacDonnell, C. (1996). Examples of gauge conservation laws in nonholonomic systems, Rep. Math. Phys. 37, pp. 295–308.
[12] Bates, L. and Lerman, E. (1997). Proper group actions and symplectic stratified spaces, Pacific J. Math. 191, pp. 201–229.
[13] Bates, L. and Śniatycki, J. (1993). Nonholonomic reduction, Rep. Math. Phys. 32, pp. 99–115.
[14] Bierstone, E. (1975). Lifting isotopies from orbit spaces, Topology 14, pp. 245–252.
[15] Bierstone, E. (1980). The structure of orbit spaces and the singularities of equivariant mappings, Monografias de Matematica, 35, (Instituto de Matematica Pura e Aplicada, Rio de Janeiro).
[16] Bloch, A.M., Krishnaprasad, P.S., Marsden, J.E., and Murray, R.M. (1996). Nonholonomic systems with symmetry, Arch. Ration. Mech. Anal. 138, pp. 21–99.
[17] Bocharov, A.V. and Vinogradov, A.M. (1997). The Hamiltonian form of mechanics with friction, nonholonomic mechanics, invariant mechanics, the theory of refraction and impact, appendix II in The structure of Hamiltonian mechanics, B.A. Kupershmidt and A.M. Vinogradov (authors), Russ. Math. Surv. 42 pp. 177–243.
[18] Borisov, A.V., Mamaev, I.S., and Kilin, A.A. (2003). Dynamics of rolling disk, Regular and Chaotic Dynamics 8 pp. 201–212.
[19] Cantrijn, F., de León, M., and Martin de Diego, D. (1999). On almost Poisson structures in nonholonomic mechanics, Nonlinearity 12, pp. 721–737.
[20] Cartan, H. (1967). Généralités sur les espaces fibrés, I, Séminaire Henri Cartan, E.N.S. 1949/1950, No. 6. in Séminaire Cartan, Vol. I, (W.A. Benjamin).
[21] Carathéodory, C. (1933). Der Schlitten, Z. für Angew. Math. und Mech., 13, pp. 71–76.
[22] Cendra, H., Marsden, J.E., and Ratiu, T. (2001). Geometric Mechanics, Lagrangian Reduction, and Nonholonomic Systems, in Mathematics Unlimited - 2001 and beyond, B. Engquist and W. Schmid, eds., (Springer Verlag) pp. 221–273.
[23] Chaplygin, S.A. (1897). On the motion of a heavy body of revolution on a horizontal plane, Physics Section of the Imperial Friends of Physics, Anthropology and Ethnographics, (Moscow). (1954). Selected Works on Mechanics, (Moscow) pp. 413–425 (in Russian). English translation (2002) Regular and chaotic dynamics 7, pp. 119–130.
[24] Chaplygin, S.A. (1911). On the theory of nonholonomic systems. The reducing multiplier theorem, Mathematical Collection XXVII (1954). Selected Works on Mechanics, (Moscow) pp. 413–425 (in Russian).
[25] Chaplygin, S.A. (1903). On a sphere rolling on a horizontal plane, Mathematical Collection, XXIV. (1897) Physics Section of the Imperial Friends of Physics, Anthropology and Ethnographics, (Moscow). (1954). Selected Works on Mechanics, (Moscow) pp. 413–425 (in Russian). English translation (2002). Regular and chaotic dynamics 7, pp. 131–149.
[26] Chow, W.L. (1940/41). Über Systeme von linearen partiellen Differentialgleichungen erster Ordnung, Math. Ann. 117, pp. 98–115.
[27] Coddington, E.A. and Levinson, N. (1955). Theory of Ordinary Differential Equations, (McGraw-Hill).
[28] Copson, E.T. (1935). An Introduction to the Theory of Functions of a Complex Variable, (Oxford University Press).
[29] Courant, T.J. (1990). Dirac manifolds, Trans. Am. Math. Soc. 319, pp. 631–661.
[30] Cushman, R. and Bates, L. (1997). Global aspects of classical integrable systems, (Birkhäuser).
[31] Cushman, R., Hermans, J., and Kemppainen, D. (1996). The rolling disc, in Nonlinear dynamical systems and chaos (Groningen, 1995), Progress in Nonlinear Differential Equations and Their Applications, volume 19, pp. 21–60, (Birkhäuser).
[32] Cushman, R., Kemppainen, D., Śniatycki, J., and Bates, L. (1995). Geometry of nonholonomic constraints, Rep. Math. Phys. 36, pp. 275–286.
[33] Cushman, R. and Śniatycki, J. (2001). Differential structure of orbit spaces, Canad. J. Math. 53, pp. 715–755.
[34] Cushman, R. and Śniatycki, J. (2002). A nonholonomic oscillator, Rep. Math. Phys. 50, pp. 85–98.
[35] Cushman, R. and Śniatycki, J. (2002). Nonholonomic reduction for free and proper actions, Regular and Chaotic Dynamics 7, pp. 61–72.
[36] Dalsmo, M. and van der Schaft, A. (1999). On representations and integrability of mathematical structures on energy-conserving physical systems, SIAM J. Optimal Control 37, pp. 54–91.
[37] Dieudonné, J. (1944). Une généralisation des espaces compacts, J. Math. Pures Appl. 23, pp. 65–76.
[38] Dieudonné, J. (1970). Éléments d’analyse, III, (Gauthier-Villars).
[39] Dirac, P.A.M. (1950). Generalized Hamiltonian systems, Canad. J. Math. 2, pp. 129–148.
[40] Duistermaat, J.J. (1974). Oscillatory integrals, Lagrange immersion and unfoldings of singularities, Comm. Pure Appl. Math. 27 pp. 207–281.
[41] Duistermaat, J.J. (2004). Lectures on dynamical systems with symmetry given at Utrecht spring school on Lie groups, unpublished.
[42] Duistermaat, J.J. and Kolk, J.A.C. (1999). Lie groups, (Springer Verlag).
[43] Ferrers, N.M. (1872). Extension of Lagrange’s equations, Quart. J. Math. 12 pp. 1–5.
[44] Field, M. (1980). Equivariant dynamical systems, Trans. Amer. Math. Soc. 259 pp. 185–205.
[45] Foote, R.L. (1998). Geometry of Prytz planimeter, Rep. Math. Phys. 42 pp. 248–271.
[46] Friedman, A. (1963). Generalized functions and partial differential equations, (Prentice Hall).
[47] Gibbs, J. Willard. (1879). On the fundamental formulæ of dynamics, Amer. J. Math. II pp. 49–64. (1928). in Collected Works, vol. II, (New York). (1961). reprint The scientific papers of J. Willard Gibbs, (Dover).
[48] Goresky, M. and MacPherson, R. (1988). Stratified Morse Theory, (Springer Verlag).
[49] Guillemin, V. and Sternberg, S. (1984). Symplectic Techniques in Physics, (Cambridge University Press).
[50] Gunning, R.C. (1966). Lectures on Riemann surfaces, (Princeton University Press).
[51] Hamel, G. (1949). Theoretische Mechanik, (Springer Verlag).


[52] Helgason, S. (1962). Differential Geometry and Symmetric Spaces, (Academic Press).
[53] Hermans, J. (1995). Rolling rigid bodies with and without symmetries, Ph.D. thesis, University of Utrecht, Utrecht, the Netherlands.
[54] Hermans, J. (1995). A symmetric sphere rolling on a surface, Nonlinearity 8, pp. 295–317.
[55] Hirsch, M. (1976). Differential Topology, (Springer Verlag).
[56] Karapetian, V.A. (1980). On the problem of steady motions stability of nonholonomic systems, J. Appl. Math. Mech. 44, pp. 295–300.
[57] Kemppainen, D. (1996). Geometry of Rolling Disk, MSc. Thesis, University of Calgary, Alberta, Canada.
[58] Kobayashi, S. and Nomizu, K. (1963). Foundations of Differential Geometry, volume 1, (Interscience Publishers).
[59] Koiller, J. (1992). Reduction of some classical nonholonomic systems with symmetry, Arch. Rat. Mech. Anal. 118, pp. 113–148.
[60] Kolář, I., Michor, P.W., and Slovák, J. (1993). Natural Operations in Differential Geometry, (Springer Verlag).
[61] Koon, W.S. and Marsden, J.E. (1998). Poisson reduction for nonholonomic mechanical systems with symmetry, Rep. Math. Phys. 42 pp. 101–134.
[62] Korteweg, D.J. (1899). Über eine ziemlich verbreitete unrichtige Behandlungsweise eines Problems der rollenden Bewegung, Nieuw Archief voor Wiskunde 4, pp. 130–155.
[63] Kronecker, L. (1884). Näherungsweise ganzzahlige Auflösung linearer Gleichungen, Monatsber. Kön. Preuss. Akad. Wiss. zu Berlin pp. 1179–1193, 1271–1299. (1899). Werke, Band 3:1, (Teubner) pp. 47–109.
[64] Krupa, M. (1990). Bifurcation of relative equilibria, SIAM J. Math. Anal. 21, pp. 1453–1486.
[65] Lagrange, J.L. (1788). Méchanique Analytique, (Paris). (1965). reprint with notes from the third and fourth edition, (A. Blanchard).
[66] Lee, J.M. (2003). Introduction to Smooth Manifolds, Graduate Texts in Mathematics, vol. 218, (Springer-Verlag).
[67] Lewis, A.D. and Murray, R.M. (1995). Variational principles for constrained systems: theory and experiment, Internat. J. Non-Linear Mech. 30, pp. 793–815.
[68] Libermann, P. and Marle, C.M. (1987). Symplectic Geometry and Analytical Mechanics, (D. Reidel).
[69] Lie, S. (1888). Theorie der Transformationsgruppen, Unter Mitwirkung von Prof. Dr. F. Engel, Erster Abschnitt, (Teubner).
[70] Lindelöf, E. (1895). Sur le mouvement d’un corps de revolution roulant sur un plan horizontal, Acta Societatis Scientiarum Fennicae XX, number 10.
[71] Lusala, T. and Śniatycki, J., Stratified subcartesian spaces, http://arXiv.org/abs/0805.4807v1.
[72] Lusala, T. and Śniatycki, J. Consequences of the Palais slice theorem for differential spaces, in preparation.
[73] Marle, C.M. (1995). Reduction of constrained mechanical systems and stability of relative equilibria, Commun. Math. Phys. 174, pp. 295–318.


[74] Marle, C.M. (1998). Various approaches to conservative and nonconservative nonholonomic systems, Rep. Math. Phys. 42 pp. 211–229.
[75] Marsden, J.E. (1992). Lectures on Mechanics, (Cambridge University Press).
[76] Marsden, J.E., Misiolek, G., Ortega, J.-P., Perlmutter, M. and Ratiu, T. (2007). Hamiltonian Reduction by Stages, Lecture Notes in Mathematics 1913, (Springer Verlag).
[77] Marsden, J.E. and Weinstein, A. (1974). Reduction of symplectic manifolds with symmetry, Rep. Math. Phys. 5, pp. 121–130.
[78] Mather, J. (1967). Differential invariants, Topology 16, pp. 145–155.
[79] Meyer, K. (1973). Symmetries and integrals in mechanics, in Dynamical Systems, M. Peixoto (ed.), pp. 259–272, (Academic Press).
[80] Michel, L. (1974). Points critiques des fonctions invariantes sur une G-variété, Comptes Rendus Acad. Sci. Paris 272 pp. 433–436.
[81] Moshchuk, N.K. (1988). A qualitative analysis of the motion of a heavy solid of revolution on an absolutely rough plane, J. Appl. Math. Mech. 52 pp. 203–210.
[82] Neimark, J.I. and Fufaev, N.A. (1972). Dynamics of nonholonomic systems, Trans. of Moscow Math. Soc. 33, (Amer. Math. Soc.).
[83] Newton, I. (1686). Philosophiæ Naturalis Principia Mathematica, (London). (1999). The principia: mathematical principles of natural philosophy, translated by I.B. Cohen and A. Whitman, (University of California Press).
[84] O’Reilly, O.M. (1996). The dynamics of rolling disks and sliding disks, Nonlinear Dynamics 10 pp. 287–305.
[85] Ortega, J.P. (1998). Symmetry, Reduction, and Stability in Hamiltonian Systems, PhD. Thesis, University of California, Santa Cruz, CA.
[86] Ortega, J.P., private communication.
[87] Ortega, J.P. and Ratiu, T.S. (2004). Reduction of Hamiltonian systems with symmetry, (Birkhäuser).
[88] Palais, R. (1961). On existence of slices for actions of noncompact Lie groups, Ann. Math. 73 pp. 295–323.
[89] Pars, L.A. (1965). A Treatise on Analytical Dynamics, (Wiley). (1979). reprint (Oxbow Press).
[90] Patrick, G. (1995). Relative equilibria of Hamiltonian systems with symmetry: linearization, smoothness and drifts, J. Nonlinear Sci. 5, pp. 2089–2105.
[91] Pflaum, M.J. (2001). Analytic and geometric stratified spaces, Lecture Notes in Mathematics, 1768, (Springer Verlag).
[92] Poincaré, H. (1901). Sur une forme nouvelle des équations de la méchanique, Comptes Rendus Acad. Sci. Paris 132 pp. 369–371. Oeuvres t. VII pp. 218–219.
[93] Procesi, C. and Schwarz, G. (1985). Inequalities defining orbit spaces, Invent. math. 81 pp. 539–554.
[94] Roberts, M., Wulff, C. and Lamb, J.S.W. (2002). Hamiltonian systems near relative equilibria, J. Diff. Eq. 179, pp. 562–604.
[95] Rosenberg, R. (1977). Analytical Dynamics, (Plenum Press).


[96] Routh, E. (1860). A Treatise on the Dynamics of a System of Rigid Bodies, (MacMillan and Co.) (1960). reprint Advanced Rigid Body Dynamics, parts I and II, (Dover).
[97] Sard, A. (1942). The measure of critical values of differentiable maps, Bull. Amer. Math. Soc. 48, pp. 883–890.
[98] Schwarz, G.W. (1975). Smooth functions invariant under the action of a compact Lie group, Topology 14, pp. 63–68.
[99] Schwarz, G.W. (1980). Lifting smooth homotopies of orbit spaces, Inst. Hautes Études Sci. Publ. Math. 51, pp. 37–135.
[100] Sikorski, R. (1967). Abstract covariant derivative, Colloq. Math. 18, pp. 251–272.
[101] Sikorski, R. (1972). Wstęp do geometrii różniczkowej, (PWN).
[102] Sjamaar, R. and Lerman, E. (1991). Stratified symplectic spaces and reduction, Ann. Math. 134, pp. 375–422.
[103] Śniatycki, J. (1998). Non-holonomic Noether theorem and reduction of symmetries, Rep. Math. Phys. 42, pp. 5–23.
[104] Śniatycki, J. (2001). Almost Poisson spaces and non-holonomic singular reduction, Rep. Math. Phys. 48, pp. 235–248.
[105] Śniatycki, J. (2002). The momentum equation and the second order differential equation condition, Rep. Math. Phys. 49, pp. 371–394.
[106] Śniatycki, J. (2002). Integral curves of derivations on locally semi-algebraic differential spaces, in Proceedings of the Fourth International Conference on Dynamical Systems and Differential Equations, May 24-27, 2002 (Wilmington, NC, USA), eds. S. Hu and X. Lu, pp. 825–831.
[107] Śniatycki, J. (2003). Orbits of families of vector fields on subcartesian spaces, Ann. Inst. Fourier (Grenoble) 53, pp. 2257–2296.
[108] Śniatycki, J. and Cushman, R. (2007). Non-holonomic reduction, symmetries, constraints, and integrability, Reg. and Chaotic Dyn. 12, pp. 615–621.
[109] Souriau, J.M. (1997). Structure of Dynamical Systems, (Birkhäuser). translation of (1970). Structure des Systèmes Dynamiques, (Dunod).
[110] Srinivasan, M. (2008). Rocking and rolling: A can that appears to rock might actually roll, Phys. Rev. E. 78, 066609.
[111] Stefan, P. (1974). Accessible sets, orbits and foliations with singularities, Proc. London Math. Soc. 29, pp. 699–713.
[112] Sussmann, H. (1973). Orbits of families of vector fields and integrability of distributions, Trans. Amer. Math. Soc. 180, pp. 171–188.
[113] van der Schaft, A.J. and Maschke, B.M. (1994). On the Hamiltonian formulation of nonholonomic mechanical systems, Rep. Math. Phys. 34, pp. 225–233.
[114] Vierkandt, A. (1892). Über gleitende und rollende Bewegung, Monatsh. f. Math. u. Physik, 3, pp. 31–54, 97–134.
[115] Walker, G.T. (1895). On a curious dynamical property of celts, Proc. Cambridge Phil. Soc. 8, pp. 305–306.
[116] Walker, G.T. (1896). On a dynamical top, Quart. J. Pure Appl. Math. 28, pp. 175–184.


[117] Whitney, H. (1965). Local properties of analytic varieties, in Differential and combinatorial topology, ed. S.S. Cairns, (Princeton University Press) pp. 205–244.
[118] Whittaker, E.T. (1964). A Treatise on the Analytical Dynamics of Particles and Rigid Bodies, 4th edn. (Cambridge Univ. Press).
[119] Yoshimura, H. and Marsden, J.E. (2007). Reduction of Dirac structures and the Hamilton-Pontryagin principle, Rep. Math. Phys. 60, pp. 381–426.


Index

G-action H-symmetry type, 53 affine, 106 fixed point, 50 free, 50 infinitesimal generated by g, 50 isotropy group of, 50 isotropy type H, 53 orbit map, 51 orbit space of, 51 projection map of, 51 proper, 51 slice of, 54 E(2), 230, 284 E(3), 208, 284 R-action generated by flow, 50

subalgebra, 85 vector field, 27, 87 Poisson algebra Leibniz’ rule, 87 reduced, 86 annihilator, 12 symplectic, 13 Appell, P., 46, 47 Arms, J.M., 120 Arnol’d, V.I., 47, 306, 307 Aronszjan, N., 79 Bates, L., 47, 120, 204 Bierstone, E., 68, 80 Bloch, A.M., 48, 120 Bocharov, A.V., 47 body of revolution axial action, 246 induced action, 247 symmetry, 246 Chaplygin’s equations, 251 constants of motion, 253 fully reduced kinetic energy, 255 Newtonian system, 255 potential energy, 255 total energy, 254 vector field, 254 Gauss map equivariant, 246 invariant theory, 248

acceleration in Euclidean space, 1 Newtonian, 1 on manifold, 2 almost Hamiltonian vector field, 95 Poisson bracket, 24, 31, 84 Casimir, 44, 113 Hamilton’s equations, 25 map, 104 structure, 24 structure tensor, 27 395

396

inward normal, 244 reduced axial symmetry, 248 motion, 249 Borisov, A.V., 384, 385 canonical 1-form θQ , 9 symplectic form, 177 ωQ , 9 Cantrijn, F., 48 Carath´eodory’s sleigh, 173 E(2) symmetry, 187 almost Poisson structure matrix, 178 almost Poisson bracket in trivialization, 182 configuration space, 174 E(2), 183 constraint 1-form, 175 distribution, 175 nonholonomic, 176 constraint distribution trivialization, 179 distributional Hamiltonian vector field, 186 equations of motion, 177 almost Possion brackets, 183 kinetic energy, 174, 175, 183 in trivialization, 182 metric, 184 rotational, 175, 184 translational, 175, 184 Lagrange derivative, 176 Lagrange-d’Alembert principle trivialization, 180 Lagrangian in trivialization, 180, 182 moment of inertia, 175 generalized, 184 momentum equation, 189 reconstructed motion, 198 reduced asymptotic motions, 200

Index

equations of motion, 195 generalized distribution, 194 space, 189 symplectic form, 194 vector field, 195 reference position, 173 relative equilibria, 200 symplectic distribution in trivialization, 181 unconstrained Hamiltonian, 177 Lagrangian, 176 Carath´eodory, C., 203, 204 Cartan, H., 79 Chaplygin, S.A., 101, 203, 204, 262, 384, 385 Coddington, E.A., 79 Cohen, I.B., 1 condition second order equation, 35 conjugate subgroups, 53 class of, 53 connection 1-form, 100, 114 Christoffel symbols, 6 curvature, 101 2-form, 100 distribution horizontal, 114 vertical, 114, 124 Ehresmann, 98 horizontal distribution, 124 lift, 98, 100 Levi-Civita, 2, 6 mechanical, 121 principal bundle, 114, 124 constraint, 3 distribution, 177 function, 28, 177 holonomic, 3 manifold, 29 map, 29 nonholonomic linear, 3 nonlinear, 3

397

Index

perfect, 4 contact structure, 239 Copson, E.T., 302 curve quasiperiodic at most k frequencies, 129 frequency vector, 129 runaway, 132 Cushman, R., 47, 79, 80, 120, 121, 204, 385 Dalsmo, M., 47 de Diego, D.M., 48 de Le´ on, M., 48 derivation, 68 Dieudonne, J., 79 differential space, 56 cut off function, 59 derivation flow of, 69 diffeomorphism, 57 differential structure, 57 integral curve of derivation, 69 smooth functions C ∞ (Q), 59 manifold, 57 map, 57 stratified vector fields, 73 vector fields, 71 stratification, 73 locally smoothly trivial, 78 stratified vector field, 73 subcartesian, 62 locally compact, 62 subspace of, 57 tangent wedge, 78 Dirac, P.A.M., 48 distribution, 4 F , 12 F 0 , 12 F ω , 13 H, 14

in a trivialization, 16 symplectic, 14 H ω , 20 U , 93 constraint, 4 D, 5 generalized, 41 H, 27 symplectic, 26 integrable, 213 integral manifold of, 4 Lagrangian, 13 rank of, 14, 41 symplectic H, 21 Duistermaat, J.J., 51, 54–56, 79, 128, 140, 142, 150, 162, 164, 171, 306 dynamical system conservation law, 84 nonholonomic Chaplygin system, 97 reduced, 82, 83 singular, 83 smooth, 81 dynamics Hamilton-d’Alembert, 9 energy conservation, 6 function, 6 equations of motion distributional Hamiltonian, 22 geodesic, 11 momentum, 35, 36 Euler angles, 281 Ferrers, N.M., 46, 384 fiber derivative, 10 Field, M., 120, 171, 172 Foote, R.L., 204 force conservative, 4 external, 4 in Euclidean space, 1, 2 on manifold, 3 reaction of constraint, 4

398

Friedman, A., 80 Fufaev, N.A., 121, 204, 262 Gauss map, 210 generalized distribution accessible set, 41 Carath´eodory’s sleigh, 95 reachable set, 41 Gibbs, J.W., 47 Gotay, M.J., 120 Grassmann identity, 207 group Euclidean three dimensional, 205 two dimensional, 265 generated by, 153 Gunning, R., 79 Gusein-Zade, S.M., 306, 307 Hamel, G., 204 Hamiltonian function, 13 system distributional, 22 Hermans, J., 172, 262, 385 homogeneous function vector space of, 37 polynomial, 37 hypergeometric equation, 302 invariant polynomials Hilbert basis, 63 Hilbert map, 63 proper, 63 quasi homogeneous, 63 of linear action, 62, 248, 271 Kemppainen, D., 47, 48, 385 Kilin, A.A., 385 kinetic energy metric on Euclidean space Euclidean inner product, 1 on manifold Riemannian metric, 2 Koiller, J., 121, 204

Index

Kolar, I., 5 Kolk, J.A.C., 51, 54–56, 79, 128, 140, 142, 150, 162, 164 Koon, W.S., 48 Korteweg, D.J., 46, 121, 384 Krishnaprasad, P.S., 48, 120 Kronecker, L., 171 Krupa, M., 171 Lagrange derivative, 5 in a trivialization, 7 reduced, 98 Lagrange, J.L., 5 Lagrangian, 5, 7 in a trivialization, 7 reduced, 98 left interior product, 10 Legendre transformation, 177 Lerman, E., 120 Levinson, N., 79 Lewis, A.D., 46, 262 Libermann, P., 120 Lie algebra centralizer of element, 140 elliptic element of, 130, 140 set stable gse , 142 stable, 140 regular element of, 140 set regular greg , 142 set stable elliptic elements grse , 142 Lie derivative, 69 Lie group T , 129 E(2), 143, 174 group of rotations about point, 143 E(3), 205 Rn standard basis of, 7 Rk /Zk standard k-torus, 129 centralizer

399

Index

of element, 160 elliptic element of, 155, 161 stable, 161 Euclidean group rotations about a point, 165 three dimensional, 205, 208, 284 two dimensional, 164, 174, 230, 284 identity component G0 , 141 regular element of, 161 set of Greg , 163 regular elliptic set of stably elliptic Grse , 163 set of stably elliptic elements Gse , 163 tangent bundle of standard left trivialization, 9 torus T , 129 integer lattice of, 129 Weyl group of, 148 lift cotangent, 33 horizontal, 115 of an action to T Q, 113 tangent, 34 Lindel¨ of, E., 46 Lusala, T., 80, 92 Manaev, I.S., 385 Marle, C.M., 3, 47, 120 Marsden, J.E., 47, 48, 120, 121, 171 Maschke, B.M., 48 mass, 174 Mather, J., 80 Meyer, K., 120 Michel, L., 120 Michor, P.W., 5 momentum as coordinates, 38 equation

G-invariant, 117 function, 106 on T ∗ Q, 33 on T Q, 34 gauge, 112 map, 106 space, 9 Moncrief, V., 120 monodromy homomorphism, 149 Moshchuk, N.K., 385 motion dynamically admissible, 4, 21, 23 dynamically allowed, 4 restriction on constants of motion, 43 moving frame, 7 Murray, R.M., 46, 48, 120, 262 Neimark, J.I., 121, 204, 262 Neishtadt, A.I., 47 Newton’s equations in Euclidean space, 1, 2 nonholonomic system, 4 on manifold, 3 second law, 1 Newton, I., 1 nonholonomic bracket, 24 Dirac brackets, 28 formula for, 33 simple system, 42 normalizer of subgroup H, 53 O’Reilly, O.M., 384 one parameter group of homeomorphism, 82 subgoup E(2), 284 orbit space proper action subcartesian, 82 set of smooth vector fields X (M , C ∞ (M )), 76

400

stratification by orbit types, 67, 82 orbit type in orbit space, 53 differential subspace, 66 of isotropy subgroup H subgroups conjugate to H, 53 Ortega, J.P., 120, 121 Palais, R., 79 Pars, L.A., 3, 47 Pflaum, M.J., 56, 79, 80 Poincar´e, H., 7 Poisson algebra Jacobi identity, 25 bracket, 30 potential function, 4 principal fiber bundle structure group G, 51 principle d’Alembert, 5 Hamilton-d’Alembert, 12 Lagrange-d’Alembert, 6, 262 projection, 40 Procesi, C., 80 quasicoordinates, 7 Ratiu, T.S., 120 rattleback, 262 asymptotically stable motions, 262 reduced by stages singular, 91 flow, 83 Hamiltonian distributional, 90, 96, 195, 221, 235, 241 singular, 90 Lagrange derivative, 98 Lagrangian, 98 symplectic singular, 90 system reconstruction of, 125

Index

vector field almost Hamiltonian, 95 relative equilibrium, 126 family of quasiperiodic, 139 for nonfree action, 133 generator of, 128 in group-orbit, 134 runaway, 132 relative periodic orbit relative period, 153 set of, 165 shift element of, 153 rolling disk, 265 S 1 -action constraint manifold, 274 orbit space, 271, 274 S 1 -reconstruction fully reduced motion, 278 S 1 -reduced space equations of motion, 274 S 1 -symmetry orbit map, 271 E(2)-reduced energy, 270, 274 motion, 270 vector field, 270 E(2) × S 1 symmetry, 265 critical value surface Σ stable part, 356 bifurcation set exceptional lines, 267 center of mass horizontal velocity, 284 Chaplygin’s equations, 266, 290 discrete symmetries, 291 flow, 291 rescaled, 296 conservative Newtonian system, 266, 292 elliptic equilibrium point, 349 global description of solutions, 349 highest potential well, 349 hyperbolic equilibrium point, 349

Index

hyperbolic motion, 294 incomplete, 294 kinetic energy, 292 lowest potential well, 349 mathematical pendulum, 295 periodic orbits, 349 periodic solutions, 294 potential energy, 292 qualitative behavior, 293 constant of motion, 266 constant energy slice geometric features, 362 constraint holonomic, 269 manifold, 265, 268 nonholonomic, 269 critical value surface Σ, 267 constant energy slice, 358 cusped edge, 354 global structure, 356 regular part, 354 singular part, 354 standard swallow tail, 355 unstable part, 356 degeneracy locus, 266 cusp of, 266 global description of, 311 qualitative properties, 306 falls flat, 266, 322 codimension one, 323 limiting behavior, 324 family of potentials bifurcation set, 340, 341 critical points, 340 global qualitative description, 347 Morse data, 340 Morse function off degeneracy locus, 340 Morse index of critical point, 341 Morse type of critical point, 341 family of rescaled potentials qualitative properties, 306

401

fully reduced constants of motion, 291 energy, 266, 292 equations of motion, 270, 293 motion, 265 phase space, 296, 370 reduction map, 280 relative equilibria, 283 relative periodic solution, 370 space, 265, 292 vector field, 265, 271, 303, 370 gyroscopic stabilization, 267, 364 hoop, 268 integral map, 267, 352 critical value surface Σ, 353, 354 critical values, 267 degenerate part of Σ, 353 nondegenerate part of Σ, 353 regular values in image, 352 internal symmetry, 265 invariant theory, 271 inverse of Gauss map, 269 inward normal, 269 lying flat, 269 moment of inertia, 267 motion, 265 apparent singularities, 269 falls flat, 294 near elliptic relative periodic orbit, 376 rises, 294 near falling flat, 267 asymptotic behavior, 326 elastic refection, 326 motion, 332 point of contact fast motion, 337 uncommon oscillations, 328 not fall flat, 320 position of, 268 principal axis frame, 267 quasiperiodic motion, 372 reconstruction E(2)-reduced vector field, 276 fully reduced motion, 265, 270

402

reduction of E(2) × S 1 symmetry, 270 of E(2) symmetry, 270 of S 1 symmetry, 270, 271 of induced E(2)-action, 275 reference disk, 267, 268 relative equilibria, 265 criterion, 286 degeneracy locus, 304 degenerate, 304 derivative of potential, 303 elliptic, 267, 303 hyperbolic, 267, 303 leans inward, 289 leans outward, 289 stable, 303 unstable, 303 relative periodic orbit shift element, 371 spatial rotational shift, 373 rescaled Chaplygin’s equations hypergeometric equation, 302 asymptotic behavior, 299 dominant solution, 300 even solution, 301 odd solution, 301 recessive solution, 298, 300 rescaled Newtonian equations, 297 rescaled potential, 297 singularity A2 , 311, 343 singularity theory, 306 Ap+1 -singularity, 307 equivalent families of analytic functions, 306 family of analytic functions, 306 normal form, 307 standard cusp, 310 spatial rotational shift, 267 symmetry axis angular velocity, 284 uniform disk, 267 vertical bifurcations of motion, 303

Index

critical speed, 305 degenerate relative equilibria, 304 gyroscopic stabilization, 305 Rosenberg, R., 204, 262 Routh, E., 46, 384 Sard, A., 171 Schwarz, G., 64, 74, 80 second order equation condition, 36 semialgebraic subset of Rn , 63 sheaf of smooth functions, 58 locally defined, 58 Sikorski, R., 79, 121 singular foliation, 42 Sjamaar, R., 120 Slovak, J., 5 Sniatycki, J., 47, 48, 72, 78–80, 91, 92, 120, 121, 204 Souriau, J.M., 120, 121 Srinivasan, M., 384 Stefan, P., 42, 48, 121 stratification, 55 by orbit types minimal, 68 minimal, 67 primary, 68 strata of, 55 Whitney, 56 Whitney’s axioms, 56 strongly convex rigid body E(2) symmetry, 230 SO(2)-reduced 2-form, 237 distribution, 237 R2 -orbit space, 224 R2 -reduced 2-form, 226 distribution, 226 vector field, 224 body of revolution, 243 Chaplygin’s equations, 251 configuration space, 205 constraint

403

Index

accessible sets, 211 distribution, 211 holonomic, 210 manifold, 211 nonholonomic, 211 vector field on, 217 energy, 217 inverse of Gauss map, 210 inward unit normal, 210 kinetic energy, 206 metric, 219 rotational, 206 translational, 206 mass, 205 moment of inertia, 207 generalized, 208 principal, 207 moving body, 205 position space, 210 principle Lagrange-d’Alembert, 215 reference body, 205 surface of revolution, 244 symmetry, 243 distributional system, 231 dynamic, 243 geometric, 243 translational symmetry, 222 trivialization, 208 kinetic energy, 208 potential energy, 208 unconstrained Lagrangian, 208 unconstrained Lagrangian, 208 subcartesian space vector field locally complete family of, 71 orbit of a family of, 71 surface of revolution axial symmetry, 246 Gauss map, 244 Sussmann, H., 41, 48, 121 symmetry dynamical system, 81 Hamiltonian system

distributional, 84 singular reduction, 83 symplectic 2-form, 10 $H , 14 $H in a trivialization, 16 on H, 186 tangent bundle inverse of trivialization, 7 stratified, 77 trivial, 7 trivialization of, 7 cone, 77 wedge differential space, 78 theorem Chow, 213 Frobenius, 4 nonholonomic Noether, 43, 44, 105 nonholonomic reduction regular, 97 singular, 90 Schwarz, 64 singular reduction, 83 Stefan, 42 Sussman, 41 Tarski-Seidenberg, 63 tube, 54 topological space locally compact, 58 paracompact, 58 partition of unity, 58 van der Schaft, A.J., 47, 48 Varchenko, A.N., 306, 307 vector field aQ → , 17 aQ ↑ , 17 almost Poisson, 27 almost Hamiltonian invariant, 85 reduced, 95 distributional Hamiltonian


conservation law, 105 flow, 50 complete, 50 group property of, 50 reduced, 82 Hamiltonian, 10 G-invariant second order equation, 117 distributional, 21 invariant, 81 on D values in H, 21 orbit of family of, 41 reduced, 82 relative periodic orbit, 153 relative period, 153 singular reduction, 83 subperiodic orbit, 167 symmetry of, 81 velocity space, 9 Vierkandt, A., 384 Vinogradov, A.M., 47 Walker, G.T., 262 Weinstein, A., 120 Whitman, A., 1 Whitney, H., 80 Whittaker, E.T., 7, 46 Yoshimura, H., 47 Zariski cotangent space, 77 tangent bundle, 77 space, 77


